Tag Archives: data science
The Barclays Data Science Hackathon: Building Retail Recommender Systems based on Customer Shopping Behaviour
In the depths of the last cold, wet British winter, the Advanced Data Analytics team from Barclays escaped to a villa on Lanzarote, Canary Islands, for a one-week hackathon where they collaboratively developed a recommendation system on top of Apache Spark. The contest consisted of using Bristol customer shopping behaviour data to make personalised recommendations in a Kaggle-like competition, where each team’s goal was to build an MVP and then repeatedly iterate on it using common interfaces defined by a purpose-built framework.
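To illustrate what a "common interface" that every team iterates against might look like, here is a minimal sketch. It is plain Python rather than Spark, and every name in it (`Recommender`, `PopularityRecommender`, `fit`, `recommend`, the sample customers and businesses) is hypothetical; it is not the framework built for the hackathon. The popularity baseline stands in for a first MVP that a team could later replace with a stronger model behind the same two methods.

```python
from abc import ABC, abstractmethod
from collections import Counter


class Recommender(ABC):
    """Hypothetical shared interface: every team's model exposes fit and recommend."""

    @abstractmethod
    def fit(self, transactions):
        """Train on an iterable of (customer_id, business_id) pairs."""

    @abstractmethod
    def recommend(self, customer_id, n=3):
        """Return up to n business ids for the given customer."""


class PopularityRecommender(Recommender):
    """Baseline MVP: suggest the most visited businesses the customer has not used yet."""

    def fit(self, transactions):
        transactions = list(transactions)
        # Count how often each business appears across all customers.
        self._popularity = Counter(business for _, business in transactions)
        # Remember which businesses each customer has already visited.
        self._seen = {}
        for customer, business in transactions:
            self._seen.setdefault(customer, set()).add(business)
        return self

    def recommend(self, customer_id, n=3):
        seen = self._seen.get(customer_id, set())
        # most_common() yields businesses from most to least frequent.
        ranked = [b for b, _ in self._popularity.most_common() if b not in seen]
        return ranked[:n]


# Made-up sample data, for illustration only.
transactions = [
    ("alice", "cafe"),
    ("bob", "cafe"),
    ("bob", "bakery"),
    ("carol", "cafe"),
    ("carol", "bookshop"),
    ("carol", "bakery"),
]

model = PopularityRecommender().fit(transactions)
print(model.recommend("alice", n=2))
```

Because the evaluation code only needs `fit` and `recommend`, a team could swap this baseline for a collaborative-filtering model without touching the rest of the pipeline.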
Code should be developed in a proper IDE, making use of at least the advanced tools for refactoring, auto-completion, syntax highlighting and auto-formatting.
Notebooks should use routine libraries from the main codebase. As soon as code developed in a notebook proves reusable, it should be moved into the codebase.
Logical Data Warehouse for Data Science: map raw data directly from source to Spark in-memory with Tachyon
Common problems for large organizations dealing with Big Data and Data Science applications are data stored in infrastructure that does not scale for analysis and processing, and data governance and security policies. 1. Data often resides in a central data warehouse and RDBMS, of which many legacy applications …
The challenge here is: “Are you able to implement a basic solution that solves the end-to-end goal (not necessarily with the required quality) in a few days?”