One of the determinants of a good anomaly detector is a smart data representation that easily evinces deviations from the normal distribution. Traditional supervised approaches require strong assumptions about what is normal and what is not, plus a non-negligible effort in labeling the training dataset. Deep auto-encoders work very well at learning high-level abstractions and non-linear relationships in the data without requiring labels. In this talk we will review a few popular techniques from shallow machine learning and propose two semi-supervised approaches for novelty detection: one based on reconstruction error and another based on lower-dimensional feature compression.
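The reconstruction-error idea can be sketched in a few lines. The snippet below is an illustrative toy, not the talk's actual method: it uses PCA (a linear analogue of an auto-encoder's compress-and-reconstruct cycle) fit only on "normal" data, and flags as anomalous any point whose reconstruction error exceeds a quantile threshold. All data, the component count `k`, and the 99th-percentile threshold are made-up assumptions for the demo.

```python
import numpy as np

def fit_pca(X, k):
    # Center the data and keep the top-k principal components (via SVD),
    # which play the role of the auto-encoder's bottleneck.
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def reconstruction_error(X, mu, components):
    # Compress into the low-dimensional subspace, reconstruct,
    # and return the per-sample squared reconstruction error.
    Z = (X - mu) @ components.T        # encode / compress
    X_hat = Z @ components + mu        # decode / reconstruct
    return np.sum((X - X_hat) ** 2, axis=1)

rng = np.random.default_rng(0)
# "Normal" data lives near a 2-D plane embedded in 10-D space.
basis = rng.normal(size=(2, 10))
normal = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 10))
anomaly = 3.0 * rng.normal(size=(1, 10))  # an off-manifold point

mu, comps = fit_pca(normal, k=2)
err_normal = reconstruction_error(normal, mu, comps)
threshold = np.quantile(err_normal, 0.99)  # calibrated on normal data only
err_anom = reconstruction_error(anomaly, mu, comps)
print("anomalous:", err_anom[0] > threshold)
```

A deep auto-encoder replaces the linear projection with learned non-linear encode/decode networks, but the detection rule is the same: large reconstruction error signals a point the model of "normal" cannot explain.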
The amount of digital data has grown exponentially in recent years and, with the development of new technologies, is growing more rapidly than ever before. Simply recording data is one thing; being able to utilize it and turn it into profit is another. If we collect every piece of information we can gather from any source, our database will be populated with a lot of sparse, unstructured, and loosely correlated data. In this essay we summarize the approach proposed in Part IV, "Uncertain Knowledge and Reasoning", of the book "Artificial Intelligence: A Modern Approach" by Russell S. and Norvig P., showing how the problem of reasoning under uncertainty applies to data science, and in particular to the recent data-revolution scenario. The approach analyzes an extension of Bayesian networks called decision networks, which prove to be a simple but elegant model for reasoning in the presence of uncertainty.
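A decision network combines a Bayesian network's chance nodes with decision and utility nodes, and the agent picks the action with maximum expected utility. The sketch below is a minimal illustration under assumed numbers: a single chance node ("Rain") with an invented prior, a decision ("take" or "leave" an umbrella), and a made-up utility table.

```python
# Illustrative decision network: all probabilities and utilities
# are invented for the example, not taken from the book.
p_rain = {"rain": 0.3, "sun": 0.7}          # prior of the chance node
utility = {                                   # U(outcome, action)
    ("rain", "take"): 20, ("rain", "leave"): -100,
    ("sun",  "take"): 70, ("sun",  "leave"): 100,
}

def expected_utility(action):
    # EU(a) = sum over outcomes of P(outcome) * U(outcome, a)
    return sum(p * utility[(w, action)] for w, p in p_rain.items())

best = max(["take", "leave"], key=expected_utility)
print(best, expected_utility("take"), expected_utility("leave"))
# "take" wins: 0.3*20 + 0.7*70 = 55 beats 0.3*(-100) + 0.7*100 = 40
```

In a full decision network the prior would itself be computed by Bayesian-network inference conditioned on the evidence at hand, but the decision rule, maximizing expected utility over the actions, is exactly this loop.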