A Distributed Genetic Evolutionary Tuning for Data Clustering: Part 1

This is my original post, published on the AgilOne blog in June 2013, about the framework developed for the self-tuning of data clustering algorithms.

To run a high-margin, sustainable business, any data analytics service provider has to deal with scalability, multi-tenancy and self-adaptability. Machine learning is a very powerful instrument for Big Data applications, but a bad choice of algorithm can lead to poor results in the intended analysis. One way to mitigate this is to automate the tuning process. Such a tuning process should require neither a priori knowledge of the data nor human intervention.

As a Big Data Engineer at AgilOne, I worked on solutions to the open problem of self-tuning. The work led to the development of TunUp: A Distributed Cloud-based Genetic Evolutionary Tuning for Data Clustering. The result is a solution that automatically evaluates and tunes data clustering algorithms, so that clustering-based analytics services can self-adapt and scale in a cost-efficient manner.

Evaluating clusters

For the initial work we chose K-Means as our clustering algorithm. K-Means is a simple but popular algorithm, widely used in many data mining applications.

TunUp is open source and available on GitHub: https://github.com/gm-spacagna/tunup

The original report is available at: http://www.academia.edu/5082681/TunUp_A_Distributed_Cloud-based_Genetic_Evolutionary_Tuning_for_Data_Clustering

About Gianmario

Data Scientist with experience in building data-driven solutions and analytics for real business problems. His main focus is on scaling machine learning algorithms over distributed systems. Co-author of the Agile Manifesto for Data Science (datasciencemanifesto.com), he loves evangelising his passion for best practices and effective methodologies among the data geek community.