Tag Archives: sparkz

The Barclays Data Science Hackathon: Building Retail Recommender Systems based on Customer Shopping Behaviour

In the depths of the last cold, wet British winter, the Advanced Data Analytics team from Barclays escaped to a villa on Lanzarote, Canary Islands, for a one-week hackathon where they collaboratively developed a recommendation system on top of Apache Spark. The contest consisted of using Bristol customer shopping behaviour data to make personalised recommendations in a Kaggle-like competition, where each team’s goal was to build an MVP and then repeatedly iterate on it using common interfaces defined by a purpose-built framework. Continue reading

Posted in Agile, Machine Learning, recommender systems, Scala, Spark

Robust and declarative machine learning pipelines for predictive buying

A proof of concept showing how to use Scala, Spark and the recent library Sparkz to build production-quality machine learning pipelines for predicting buyers of financial products.

The pipelines are implemented through custom declarative APIs that give us greater control, transparency and testability over the whole process.

The example follows the validation and evaluation principles defined in The Data Science Manifesto, available in beta at http://www.datasciencemanifesto.org. Continue reading

Posted in Big Data, Classification, Machine Learning, Scala, Spark