Robust and declarative machine learning pipelines for predictive buying

Proof of concept of how to use Scala, Spark and the recent library Sparkz for building production quality machine learning pipelines for predicting buyers of financial products.

The pipelines are implemented through custom declarative APIs that gives us greater control, transparency and testability of the whole process.

The example followed the validation and evaluation principles as defined in The Data Science Manifesto available in beta at http://www.datasciencemanifesto.org

Advertisements

About Gianmario

Data Scientist with experience on building data-driven solutions and analytics for real business problems. His main focus is on scaling machine learning algorithms over distributed systems. Co-author of the Agile Manifesto for Data Science (datasciencemanifesto.com), he loves evangelising his passion for best practices and effective methodologies amongst the data geeks community.
This entry was posted in Big Data, Classification, Machine Learning, Scala, Spark and tagged , , , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s