Tag Archives: best practices

Lessons learnt from building data-driven production systems at Barclays

In the last years at Barclays we learnt and tried a lot of stuff that made the Advanced Analytics team very successful inside a large organization where, as such, being a productive data scientist is a tough challenge.

Our team works on a mix of descriptive, predictive and prescriptive projects that make use of machine learning and big data technologies, mainly on top of Apache Spark. Even though we deliver per-request insights coming from manual analysis, we primarily build automated and scalable systems to be periodically used either internally for a better decision-making or customer-facing in the form of analytics services (e.g. via the web portal).
In this post series I want to share some of the best practices, tools, methodologies and workflows that we experimented and the lessons learnt from them. Continue reading

Advertisements
Posted in Agile, Big Data, Machine Learning | Tagged , , , , | 4 Comments

Coding practices for data products development

Code should be developed in a proper IDE and make use of advanced tools for re-factoring, auto-completion, syntax highlighting and auto-formatters; at least.

Notebooks should use routine libraries from the main codebase. As soon as some code is developed in a notebook and is reusable, it should be moved into a codebase. Continue reading

Posted in Agile, Machine Learning, Software Development | Tagged , , , , | 2 Comments

The ScrumBan Jira board

Let’s start with one of the core tool of the agile workflow. We use a Jira board for tracking and organizing all of our projects. We developed a custom board which uses the sprints concept of Scrum but in a more flexible way as in Kanban. Continue reading

Posted in Agile, Big Data | Tagged , , , , , | 4 Comments