There are multiple ways, depending on what kind of data you have and whether or not it contains historical instances of labelled anomalies. If you do not have any historical labels of faults, you could go for unsupervised anomaly detection techniques. With historical labels of faults, there are a plethora of options: you could use supervised learning techniques to train models for anomaly prediction, or use things like Normal Behaviour Models, Autoencoders etc. to classify normal/anomalous behaviour in your dataset.
If your data has a temporal nature (e.g. it is a time-series of measurements from sensors over time), then Recurrent Neural Networks (RNNs) and their variants such as GRUs and LSTMs might be helpful. You could check out our paper, Deep learning with knowledge transfer for explainable anomal...
which uses, for example, LSTMs in conjunction with a gradient boosted decision tree model for anomaly prediction. But this is just one example; there are hundreds (probably thousands) of open-source libraries available in MATLAB, R, Python etc. which you could use for anomaly prediction. Just determine the domain (supervised/unsupervised) and your focus area (nature of data) before you begin, and then apply various models; there is no "one model fits all", in line with the No Free Lunch theorem prevalent in AI. Hope this helps.
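To give the time-series case above a concrete flavour, here is a generic Python sketch of one common LSTM-based recipe: train an LSTM to forecast the next value from a short history, then flag points with unusually large forecast errors. This is not the method of the referenced paper; the layer sizes, window length and threshold are illustrative assumptions only.

```python
import numpy as np
from tensorflow import keras

# Toy sensor signal: a smooth periodic trend plus noise, with one injected spike.
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 60, 2000)) + rng.normal(0, 0.05, 2000)
series[1500] += 3.0                      # artificial anomaly

def make_windows(s, width=30):
    """Turn a 1-D series into (history window, next value) pairs."""
    X = np.stack([s[i:i + width] for i in range(len(s) - width)])
    y = s[width:]
    return X[..., None].astype("float32"), y.astype("float32")

X, y = make_windows(series)

# Small LSTM forecaster: 30-step history -> next value.
model = keras.Sequential([
    keras.layers.Input(shape=(30, 1)),
    keras.layers.LSTM(16),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=3, batch_size=64, verbose=0)

# Large forecast errors indicate anomalous behaviour.
errors = np.abs(model.predict(X, verbose=0).ravel() - y)
print(np.where(errors > errors.mean() + 3 * errors.std())[0])  # should include index 1470 (the spike)
```

The same idea works with GRUs or simpler forecasters; the essential design choice is that large deviations between forecast and observation are treated as anomalies.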
Consider a customer logging into their mobile phone account on their service provider's website to view additional details. The website times out when they place an order. When the customer calls technical support, it may be unclear where in the application stack the error occurs, and why.
-- Does the error occur in the front-end or the back-end of the software?
-- Is the network overloaded, or is a database server locking up?
The standard technical support approach of manually searching through log files to diagnose the problem can be lengthy and labor-intensive.
By automating root cause analysis with an anomaly detection framework and building preventive models, the service provider can better understand the state of its systems and resolve incidents more quickly.
Before implementing an anomaly detection system, it is necessary to set reasonable expectations. In their book Anomaly Detection for Monitoring, Preetam Jinka and Baron Schwartz discuss what an ideal anomaly detector would do, how anomaly detectors are used, and what one can realistically expect from a real-world anomaly detector.
1.1- The Ideal Anomaly Detection System Would:
-- Provide root cause assessments that are easy to grasp, so that service providers know exactly how to address the problem at hand.
-- In practice, however, no anomaly detector can give 100 percent correct yes/no answers.
-- There will always be false positives and false negatives, and trade-offs between the two.
-- Nor can an anomaly detector provide 100 percent accurate root cause analysis, largely because of low signal-to-noise ratios and the similarity between performance metrics.
-- Service providers therefore often have to infer causality by combining anomaly detection findings with their own domain expertise.
1.2- This Task is Made Complicated by Additional Challenges:
1.2.1 -- The data available for training and testing models can be small and unlabelled (i.e. we do not know which data points are anomalies). Machine learning algorithms usually require large volumes of data, yet anomalies are, by definition, statistically rare (the probability of abnormal behavior is much smaller than that of normal behavior), so datasets are frequently imbalanced (many more normal observations than anomalous ones).
1.2.2 -- Anomaly detectors may be deployed on rapidly evolving, dynamic systems. As the underlying system changes, anomaly detectors must adapt their behavior over time.
1.3- Univariate Anomaly Detection: a baseline univariate anomaly detector can be built by following these steps:
1.3.1 -- First, calculate the arithmetic mean m and standard deviation s of the metric over a sliding window. For instance, calculate the mean and standard deviation of network latency over the last two hours.
1.3.2 -- Second, for each new value v, compute the z-score z = (v - m) / s (the z-score measures how many standard deviations the value lies from the metric's mean).
1.3.3 -- Third, if the z-score exceeds a chosen threshold, mark the point as an anomaly. A z-score threshold of three, corresponding to three standard deviations from the mean (assuming normally distributed data, only about three out of 1000 data points lie more than three standard deviations from the mean), is a good starting point. In practice, both statistical and domain considerations help determine the threshold value. A minimal sketch of this baseline detector is given below.
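For concreteness, here is a minimal Python sketch of the z-score baseline in steps 1.3.1-1.3.3. The function name, window length and threshold are illustrative choices, not values from the original text.

```python
import numpy as np

def zscore_anomalies(values, window=120, threshold=3.0):
    """Flag points whose z-score relative to a trailing window exceeds the threshold.

    `values` is a 1-D metric series (e.g. network latency sampled once per minute,
    so window=120 corresponds to roughly two hours).
    """
    values = np.asarray(values, dtype=float)
    flags = np.zeros(len(values), dtype=bool)
    for i in range(window, len(values)):
        hist = values[i - window:i]          # trailing window of recent behaviour
        m, s = hist.mean(), hist.std()
        if s == 0:                           # avoid division by zero on flat signals
            continue
        z = (values[i] - m) / s
        flags[i] = abs(z) > threshold
    return flags

# Example: a latency series with one injected spike.
latency = np.random.normal(50, 5, size=500)
latency[400] = 120                           # artificial anomaly
print(np.where(zscore_anomalies(latency))[0])  # should include index 400
```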
In many systems, health is defined by several metrics together. Creating an independent anomaly detector for each metric is an uncomplicated extension of the single-metric approach, but it does not address potential correlations and/or causal relationships between the metrics.
For instance, latency and traffic levels can be expected to be correlated. A network latency spike on its own may appear anomalous, but it is expected when it accompanies a spike in network traffic. In other words, high network latency may only be anomalous when traffic is low.
If system health is assessed by several correlated metrics, anomalies can be detected by means of machine learning methods. When the data are not labelled, as is common in the multivariate case (i.e., apart from obvious system failures, we do not know at any point in time whether the system's behavior was anomalous), unsupervised learning approaches such as Robust Covariance, One-Class SVM and Isolation Forests can be applied. These algorithms essentially work by grouping related data points together and treating points that fall outside these groups as anomalies.
The Robust Covariance technique assumes that normal data points follow a Gaussian distribution and accordingly estimates the structure of the joint distribution (i.e., the mean and covariance of a multivariate Gaussian).
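As an illustration (not part of the original answer), scikit-learn provides implementations of such unsupervised detectors. A minimal sketch on synthetic latency/traffic data, with all numbers chosen purely for demonstration, might look like this:

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest

# Synthetic, correlated "normal" behaviour: latency rises with traffic.
rng = np.random.default_rng(0)
traffic = rng.normal(100, 20, size=1000)
latency = 0.5 * traffic + rng.normal(0, 5, size=1000)
X = np.column_stack([traffic, latency])

# A suspicious point: high latency while traffic is low.
X_test = np.array([[60.0, 90.0]])

# Robust Covariance (assumes roughly Gaussian normal data).
ee = EllipticEnvelope(contamination=0.01).fit(X)
print(ee.predict(X_test))   # -1 means anomaly, +1 means normal

# Isolation Forest (no distributional assumption).
iso = IsolationForest(contamination=0.01, random_state=0).fit(X)
print(iso.predict(X_test))  # -1 means anomaly, +1 means normal
```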
Another common method for multivariate anomaly detection is neural-network-based autoencoders. Autoencoders learn efficient representations of complex datasets by encoding high-dimensional, multivariate data into a lower-dimensional representation in an unsupervised fashion. For instance, a dataset might consist of cat images.
An autoencoder trained on such a dataset learns to reconstruct the images from a small set of features (e.g. the cat's color and pose). If trained on a dataset consisting entirely of cats, the autoencoder performs well, with low reconstruction error. When confronted with dog images, however, we expect a higher reconstruction error. Similarly, an autoencoder trained on normal network data learns what normal behavior looks like. The reconstruction error is expected to be low for normal data points; when it is high for a new data point, that point is flagged as an anomaly.
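To make the reconstruction-error idea concrete, here is a minimal Keras sketch on a toy multivariate metrics matrix. The layer sizes, training settings and error threshold are illustrative assumptions, not values from the original text:

```python
import numpy as np
from tensorflow import keras

# Toy "normal" multivariate data (e.g. 4 system metrics, already scaled).
rng = np.random.default_rng(0)
X_normal = rng.normal(0, 1, size=(5000, 4)).astype("float32")

# Small fully connected autoencoder: 4 -> 2 -> 4 (2-dimensional bottleneck).
autoencoder = keras.Sequential([
    keras.layers.Input(shape=(4,)),
    keras.layers.Dense(2, activation="relu"),    # bottleneck encoding
    keras.layers.Dense(4, activation="linear"),  # reconstruction
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X_normal, X_normal, epochs=10, batch_size=64, verbose=0)

# Reconstruction error on new points; a large error suggests an anomaly.
X_new = np.array([[0.1, -0.2, 0.3, 0.0],        # normal-looking point
                  [8.0, -7.5, 9.0, 6.0]],       # far outside the training data
                 dtype="float32")
recon = autoencoder.predict(X_new, verbose=0)
errors = np.mean((X_new - recon) ** 2, axis=1)
threshold = 3.0                                  # illustrative threshold
print(errors, errors > threshold)
```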
Once anomaly detection models have been established, the next step is incorporating them into a production system. This can pose data engineering challenges, since anomalies must be detected continuously, in real time, on a potentially large volume of streaming data. The output of an anomaly detector can then feed an automated root cause analysis system and, ultimately, a predictive maintenance system. These topics will be the subject of future blog posts.
In a function, the anomaly is a singularity. Therefore, using the data, try to find a function that resembles the data and then look for its singularities.
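One hedged way to read this suggestion: fit a simple function to the data and flag points where the data departs sharply from the fit (large residuals). A minimal NumPy sketch, with the polynomial degree and threshold chosen only for illustration:

```python
import numpy as np

# Toy signal: a smooth trend plus noise, with one injected anomaly.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = np.sin(x) + rng.normal(0, 0.1, size=x.size)
y[120] += 2.0                              # artificial "singularity"

# Fit a low-degree polynomial as the "function that resembles the data".
coeffs = np.polyfit(x, y, deg=7)
fit = np.polyval(coeffs, x)

# Points far from the fit (large residuals) are candidate anomalies.
residuals = y - fit
z = (residuals - residuals.mean()) / residuals.std()
print(np.where(np.abs(z) > 3)[0])          # should include index 120
```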
Hi Taimoor Muzaffar Gondal, I think that Joyjit Chatterjee has hit the nail on the head. The nature of anomalies is like the old saying (which Donald Rumsfeld used): "we don't know what we don't know".