An anomaly is a pattern that does not conform to expected, or normal, behavior. When finding anomalies goes beyond human capability, whether because of the quantity, complexity, or speed of the data, or the infrequency of the anomalies themselves, machine learning can be used for confident anomaly detection. Machine learning anomaly detection is used extensively across many industries for tasks such as fraud detection, process control, manufacturing, cyber-security, fault detection in critical systems, and military surveillance.
A typical machine learning anomaly detection approach defines a region in the data that represents normal behavior, fits a model that describes that normal behavior, and then flags any observation that does not fit the model as an anomaly. However, common issues with data can make anomaly detection very challenging. With some ingenuity, these challenges can be overcome.
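As an illustration of this fit-then-flag approach (a minimal sketch, not SpaceTime's production system), one simple model of "normal" is a per-dimension Gaussian: fit a mean and standard deviation on clean training data, then flag any observation whose z-score exceeds a threshold. The data, the `z_thresh=4.0` cutoff, and the function names here are all illustrative assumptions.

```python
import numpy as np

def fit_normal_model(train):
    """Fit a simple per-dimension Gaussian model of 'normal' behavior."""
    return train.mean(axis=0), train.std(axis=0)

def is_anomaly(x, mean, std, z_thresh=4.0):
    """Flag an observation whose z-score exceeds the threshold in any dimension."""
    z = np.abs((x - mean) / std)
    return bool(np.any(z > z_thresh))

# Illustrative 'normal' training data: 1000 observations of 3 sensor channels.
rng = np.random.default_rng(0)
normal_data = rng.normal(loc=0.0, scale=1.0, size=(1000, 3))
mean, std = fit_normal_model(normal_data)

print(is_anomaly(np.array([0.1, -0.2, 0.3]), mean, std))  # typical point: prints False
print(is_anomaly(np.array([8.0, 0.0, 0.0]), mean, std))   # far outside 'normal': prints True
```

Anything this simple model was not trained on looks anomalous, which is exactly why the challenges below matter.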
- Defining a normal region that encompasses every possible normal behavior is very difficult. For example, one SpaceTime customer seeking to detect anomalies in a sophisticated renewable energy asset had less than a year of operating data to train their model, which made it hard to capture a broad swath of ‘normal’ operating conditions. To overcome this, we focused on short time periods immediately following the asset type’s maintenance intervals to help define ‘normal’, which provided a higher level of assurance that the training data did not contain operating anomalies.
- Normal behavior may also evolve over time, so that the current model of normal can become obsolete. To overcome this, a regularized model that reduces the variability of the estimated model and avoids overfitting to the training data can be applied within an Expectation-Maximization (EM) algorithm. Regularization deliberately adds bias to the estimated model parameters, which is essential for the fitted normal model to generalize well to time periods that were never observed before.
- Finally, noisy data (seemingly meaningless data with many random outliers) can lead to a high rate of false positives. To overcome this, SpaceTime has used a scoring technique based on the Irwin-Hall distribution that not only scores each observation for anomalies but also combines a sequence of observations to decide whether to trigger an alert. In other words, an isolated noisy observation will not trigger an alert, but a sequence of anomalous observations sustained over an extended period will.
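To make the regularization idea concrete, here is a sketch of one standard technique, shrinkage of the sample covariance toward a scaled identity matrix, of the kind that can be embedded in an EM loop. This is an assumed, generic illustration, not SpaceTime's actual estimator; the `alpha` shrinkage weight and the data are made up.

```python
import numpy as np

def shrinkage_covariance(X, alpha=0.1):
    """Regularized covariance estimate: blend the sample covariance with a
    scaled identity target. The deliberate bias (alpha > 0) reduces the
    variance of the estimate and keeps it well-conditioned, helping the
    'normal' model generalize to periods never seen in training."""
    emp = np.cov(X, rowvar=False)
    target = np.eye(emp.shape[0]) * np.trace(emp) / emp.shape[0]
    return (1.0 - alpha) * emp + alpha * target

# With fewer samples than dimensions, the raw sample covariance is singular,
# so a Gaussian 'normal' model built on it would be unusable.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 10))
reg = shrinkage_covariance(X, alpha=0.2)
# The regularized matrix has strictly positive eigenvalues and is invertible.
```

The same bias-for-variance trade applies to other model parameters; shrinkage of the covariance is simply the easiest case to show in a few lines.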
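The Irwin-Hall distribution is the distribution of a sum of independent Uniform(0, 1) variables, which makes it a natural way to combine a window of per-observation p-values: one noisy outlier barely moves the sum, but a sustained run of small p-values drives it improbably low. The sketch below is an assumed illustration of that idea, with a made-up window size and `threshold`; it is not SpaceTime's production scoring code.

```python
import math

def irwin_hall_cdf(x, n):
    """CDF of the sum of n independent Uniform(0, 1) random variables."""
    total = 0.0
    for k in range(int(math.floor(x)) + 1):
        total += (-1) ** k * math.comb(n, k) * (x - k) ** n
    return total / math.factorial(n)

def alert(p_values, threshold=0.01):
    """Trigger an alert only when the window of per-observation p-values
    is collectively improbable, i.e. its sum falls in the extreme lower
    tail of the Irwin-Hall distribution."""
    return irwin_hall_cdf(sum(p_values), len(p_values)) < threshold

# One random outlier in an otherwise ordinary window: no alert.
print(alert([0.5, 0.6, 0.001, 0.4, 0.7]))  # prints False
# A sustained run of anomalous observations: alert.
print(alert([0.01] * 5))                   # prints True
```

Because the decision depends on the whole window rather than any single score, noise is absorbed while persistent anomalies still surface quickly.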
Anomaly detection, in its most general form, is not an easy problem to solve, and beginning a project with imperfect data makes it even more challenging. But with an understanding of the project’s subject, the business it supports, the assets’ characteristics and operations, and how the data relates to all of these, proven statistical techniques can overcome many of the common issues with data.
Read the follow-on post Confident Anomaly Detection: Put Anomalies To Work.