Supervised Machine Learning for Anomaly Detection using ML.NET
Machine learning algorithms can be trained with one of two approaches: supervised or unsupervised learning. Supervised learning means that the algorithm is trained with labeled data; for anomaly detection, each row of the training data set includes a label indicating whether or not the input is an anomaly. The algorithm learns the relationship between inputs and outputs from these labels. Unsupervised learning algorithms are instead trained without labels to make sense of unorganized data. Once trained, an algorithm can be used to make predictions on new, unlabeled data.
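In ML.NET, a labeled training set like this is typically modeled as a row class whose boolean Label column marks each example as an anomaly or not. A minimal sketch, using hypothetical sensor columns (the class name and column names are illustrative, not required by ML.NET):

```csharp
using Microsoft.ML.Data;

// Hypothetical row schema for a labeled anomaly-detection data set.
// Each row pairs feature values with a boolean label; ML.NET's binary
// classification trainers look for a column named "Label" by default.
public class SensorReading
{
    [LoadColumn(0)]
    public float Temperature { get; set; }

    [LoadColumn(1)]
    public float Vibration { get; set; }

    [LoadColumn(2), ColumnName("Label")]
    public bool IsAnomaly { get; set; }
}
```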
The anomaly detection algorithms available for ScaleOut Digital Twins are binary classification algorithms trained with labeled data indicating whether each input constitutes an anomaly. These algorithms are provided by Microsoft's ML.NET library.
Available Algorithms
The algorithms used for anomaly detection are called binary classification algorithms because they classify the elements of a set into two groups (normal or abnormal in this case). ML.NET offers many binary classification algorithms, and each algorithm has characteristics that make it better suited for specific applications. You can select one or more algorithms for training and evaluation to find the best fit for your application.
Here is the list of the ML.NET binary classification algorithms that can be used with ScaleOut Digital Twins (source: ML.NET documentation):
AveragedPerceptron: linear binary classification model trained with the averaged perceptron
FastForest: decision tree binary classification model using Fast Forest
FastTree: decision tree binary classification model using FastTree
LbfgsLogisticRegression: linear logistic regression model trained with the L-BFGS method
LightGbm: boosted decision tree binary classification model using LightGBM
LinearSvm: linear binary classification model trained with Linear SVM
SdcaLogisticRegression: binary logistic regression classification model using the stochastic dual coordinate ascent method
SgdCalibrated: logistic regression model trained using a parallel stochastic gradient descent method
SymbolicSgdLogisticRegression: linear binary classification model trained with symbolic stochastic gradient descent
FieldAwareFactorizationMachine: field-aware factorization machine model trained using a stochastic gradient method
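As a sketch, each of these algorithms corresponds to a trainer factory on ML.NET's BinaryClassification.Trainers catalog. The core trainers ship with the main Microsoft.ML package; FastForest/FastTree, LightGbm, and SymbolicSgdLogisticRegression come from the Microsoft.ML.FastTree, Microsoft.ML.LightGbm, and Microsoft.ML.Mkl.Components NuGet packages, respectively:

```csharp
using Microsoft.ML;

var mlContext = new MLContext(seed: 1);

// Each factory below returns a binary classification trainer that can be
// appended to a feature-engineering pipeline for training and evaluation.
var trainers = mlContext.BinaryClassification.Trainers;

var averagedPerceptron = trainers.AveragedPerceptron();
var fastForest         = trainers.FastForest();
var fastTree           = trainers.FastTree();
var lbfgsLogistic      = trainers.LbfgsLogisticRegression();
var lightGbm           = trainers.LightGbm();
var linearSvm          = trainers.LinearSvm();
var sdcaLogistic       = trainers.SdcaLogisticRegression();
var sgdCalibrated      = trainers.SgdCalibrated();
var symbolicSgd        = trainers.SymbolicSgdLogisticRegression();
var fieldAwareFm       = trainers.FieldAwareFactorizationMachine();
```

Each factory accepts optional parameters (label and feature column names, iteration counts, regularization, and so on) with sensible defaults, so several candidates can be built and compared with little code.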
Averaged Perceptron
The averaged perceptron algorithm is an extension of the standard perceptron that updates the model’s weights based on misclassified examples during training, but instead of using the final weight vector, it averages all the weight vectors from each iteration to reduce variance and improve prediction accuracy.
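As an illustration of how this trainer plugs into an ML.NET pipeline, the sketch below loads a hypothetical labeled CSV file, concatenates its feature columns, and fits an averaged perceptron (the file path, schema class, column names, and iteration count are all illustrative):

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row schema: two sensor features plus a boolean label.
public class SensorReading
{
    [LoadColumn(0)] public float Temperature { get; set; }
    [LoadColumn(1)] public float Vibration { get; set; }
    [LoadColumn(2), ColumnName("Label")] public bool IsAnomaly { get; set; }
}

public static class Training
{
    public static ITransformer TrainAveragedPerceptron(string path)
    {
        var mlContext = new MLContext(seed: 1);
        IDataView data = mlContext.Data.LoadFromTextFile<SensorReading>(
            path, hasHeader: true, separatorChar: ',');

        // Gather the feature columns into one vector, then train.
        // Raising numberOfIterations averages more weight vectors.
        var pipeline = mlContext.Transforms
            .Concatenate("Features", "Temperature", "Vibration")
            .Append(mlContext.BinaryClassification.Trainers.AveragedPerceptron(
                labelColumnName: "Label",
                featureColumnName: "Features",
                numberOfIterations: 10));

        return pipeline.Fit(data);
    }
}
```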
See also
Fast Forest
The Fast Forest classification algorithm is an ensemble method that constructs a large number of decision trees using random feature subsets at each split, and accelerates training and prediction by employing techniques like data binning and histogram-based node splitting, making it well-suited for large-scale and high-dimensional data classification tasks.
See also
Fast Tree
The Fast Tree classification algorithm is a gradient boosting method that builds an ensemble of decision trees in a stage-wise manner, optimizing for both speed and accuracy by using efficient histogram-based techniques for finding optimal split points, making it particularly effective for large datasets with high-dimensional features.
See also
L-BFGS Logistic Regression
L-BFGS Logistic Regression is an optimization-based classification algorithm that uses the Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method to efficiently handle large-scale datasets by approximating the inverse Hessian matrix to find the optimal weights for the logistic regression model.
See also
Light Gradient Boosted Machine
Light Gradient Boosted Machine (LightGBM) is a highly efficient gradient boosting framework that uses a histogram-based learning method, optimized for speed and memory usage, and is particularly well-suited for handling large-scale datasets with high dimensionality and complex patterns.
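As a sketch of how LightGBM's main knobs are exposed in ML.NET (this trainer requires the Microsoft.ML.LightGbm package; the values shown are illustrative starting points, not tuned recommendations):

```csharp
using Microsoft.ML;

var mlContext = new MLContext();

// Histogram-based boosting parameters; the defaults are often reasonable,
// but these are the knobs most commonly tuned for anomaly detection.
var lightGbm = mlContext.BinaryClassification.Trainers.LightGbm(
    labelColumnName: "Label",
    featureColumnName: "Features",
    numberOfLeaves: 31,             // per-tree complexity
    minimumExampleCountPerLeaf: 20, // guards against overfitting rare anomalies
    learningRate: 0.1,
    numberOfIterations: 100);       // number of boosting rounds
```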
See also
Linear SVM
The Linear SVM algorithm is a supervised learning method that finds the optimal hyperplane to linearly separate data into classes by maximizing the margin between the closest points of the classes, making it effective for binary classification tasks with linearly separable data.
See also
SDCA Logistic Regression
SDCA Logistic Regression is a scalable classification algorithm that uses Stochastic Dual Coordinate Ascent (SDCA) to efficiently optimize the logistic loss function, making it suitable for large-scale and sparse datasets.
See also
Stochastic Gradient Descent Calibrated
Stochastic Gradient Descent Calibrated (SGD Calibrated) is an iterative optimization algorithm that combines stochastic gradient descent with an additional calibration step, such as Platt scaling, to improve the probability estimates of linear classifiers for better predictive accuracy.
See also
Symbolic Stochastic Gradient Descent
Symbolic Stochastic Gradient Descent is a variant of stochastic gradient descent that leverages symbolic computation to optimize the training of machine learning models by efficiently managing and computing gradients through symbolic expressions. Like other linear classifiers, it makes its predictions by finding a separating hyperplane.
See also
Field-Aware Factorization Machine
Field-Aware Factorization Machine is a machine learning model designed to capture interactions between features in different fields, enhancing predictive performance by incorporating field-specific factorization in addition to general feature interactions.
See also
Understanding Training Metrics
It is important to understand how to evaluate an algorithm as you go through the training process. While the accuracy of a trained model is an important metric, other metrics, such as precision and recall, should also be evaluated for each application to avoid issues like false positives.
Please refer to the Microsoft ML.NET documentation to understand the different metrics that the tool will compute and display after the candidate ML algorithms you select have been trained.
Definition of metrics used by Microsoft ML.NET
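As a sketch, these metrics (accuracy, precision, recall, F1, and AUC) can be read off the metrics object that ML.NET returns from evaluation; the helper below assumes a previously trained model and a held-out, labeled test set:

```csharp
using System;
using Microsoft.ML;

public static class Evaluation
{
    // Reports standard binary classification metrics on held-out data.
    // Evaluate() expects calibrated probabilities (e.g. from LightGbm or
    // SdcaLogisticRegression); for uncalibrated trainers such as
    // AveragedPerceptron or LinearSvm, use EvaluateNonCalibrated instead.
    public static void ReportMetrics(
        MLContext mlContext, ITransformer model, IDataView testData)
    {
        IDataView predictions = model.Transform(testData);
        var metrics = mlContext.BinaryClassification.Evaluate(predictions);

        Console.WriteLine($"Accuracy:  {metrics.Accuracy:P2}");
        // Precision: of the rows flagged as anomalies, how many really were.
        Console.WriteLine($"Precision: {metrics.PositivePrecision:P2}");
        // Recall: of the true anomalies, how many were caught.
        Console.WriteLine($"Recall:    {metrics.PositiveRecall:P2}");
        Console.WriteLine($"F1 score:  {metrics.F1Score:P2}");
        Console.WriteLine($"AUC:       {metrics.AreaUnderRocCurve:P2}");
    }
}
```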