
ML Algorithms

Intended audience: Data Scientists, Developers, Administrators

AO Platform: 4.3

Overview

This topic applies only to ML Models of type Regression, Classification, and Clustering. For Bayesian ML Models, the ML Algorithms page is not available; instead, the user will see a Learning Algorithm page. See Learning Algorithm.

The AO Platform is provided with the following out-of-the-box ML Algorithms that can be used to create Models; many more are expected to be added over time. For a full description of the ML Algorithms and their technical details, see https://scikit-learn.org/stable/.

All properties are prepopulated with default values; see the links for each ML Algorithm in the tables below, and the short usage sketches that follow each table.
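Because these algorithms are backed by scikit-learn, the prepopulated defaults correspond to the scikit-learn estimator defaults. A minimal sketch for inspecting them, assuming a standard scikit-learn installation (illustrative only, not part of the AO Platform API):

```python
# Minimal sketch: inspect the default property values an ML Algorithm
# starts from. Assumes scikit-learn is installed; illustrative only,
# not part of the AO Platform API.
from sklearn.linear_model import Lasso

lasso = Lasso()            # every property left at its default
print(lasso.get_params())  # e.g. {'alpha': 1.0, 'fit_intercept': True, ...}
```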

ML Algorithm Types

Regression

| ML Algorithm | Category | Description | Additional Documentation |
| --- | --- | --- | --- |
| Lasso Regression | Supervised Learning | Linear model trained with an L1 prior as regularizer (aka the Lasso). Technically, the Lasso model optimizes the same objective function as the Elastic Net with l1_ratio=1.0 (no L2 penalty). | Link |
| Linear Regression | Supervised Learning | LinearRegression fits a linear model with coefficients w = (w1, …, wp) to minimize the residual sum of squares between the observed targets in the dataset and the targets predicted by the linear approximation. | Link |
| Logistic Regression | Supervised Learning | This class implements regularized logistic regression using the ‘liblinear’ library and the ‘newton-cg’, ‘sag’, ‘saga’, and ‘lbfgs’ solvers. Note that regularization is applied by default. It can handle both dense and sparse input. Use C-ordered arrays or CSR matrices containing 64-bit floats for optimal performance; any other input format will be converted (and copied). | Link |
| Ridge Regression | Supervised Learning | This model solves a regression problem where the loss function is the linear least squares function and regularization is given by the L2-norm. The estimator has built-in support for multivariate regression (i.e., when y is a 2D array of shape (n_samples, n_targets)). | Link |
| Support Vector Regression (SVR) | Supervised Learning | The free parameters in the model are C and epsilon. The implementation is based on libsvm, and the fit time complexity is more than quadratic in the number of samples, which makes it hard to scale to datasets with more than a couple of tens of thousands of samples. | Link |
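A rough usage sketch for the Regression-type algorithms above, assuming their standard scikit-learn classes (the toy dataset and hyperparameter values here are illustrative assumptions, not platform defaults):

```python
# Sketch: fit the Regression-type ML Algorithms on synthetic data.
# Assumes scikit-learn is installed; dataset and settings are illustrative.
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, LogisticRegression, Ridge
from sklearn.svm import SVR

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

models = [
    Lasso(alpha=0.1),         # same objective as Elastic Net with l1_ratio=1.0
    LinearRegression(),       # minimizes the residual sum of squares
    Ridge(alpha=1.0),         # least squares with an L2-norm penalty
    SVR(C=1.0, epsilon=0.1),  # libsvm-based; fit time grows faster than quadratically
]
for model in models:
    model.fit(X, y)
    print(type(model).__name__, round(model.score(X, y), 3))

# LogisticRegression is a classifier despite its name, so fit it on
# binarized targets; regularization is applied by default (C=1.0).
clf = LogisticRegression(solver="lbfgs").fit(X, y > y.mean())
print("LogisticRegression", round(clf.score(X, y > y.mean()), 3))
```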

Classification

| ML Algorithm | Category | Description | Additional Documentation |
| --- | --- | --- | --- |
| AdaBoost Classifier | Supervised Learning | An AdaBoost classifier is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset, with the weights of incorrectly classified instances adjusted so that subsequent classifiers focus more on difficult cases. This class implements the algorithm known as AdaBoost-SAMME. | Link |
| Decision Tree Classifier | Supervised Learning | A class capable of performing multi-class classification on a dataset. | Link |
| K Neighbors Classifier | Supervised Learning | Classifier implementing the k-nearest neighbors vote. | Link |
| Random Forest Classifier | Supervised Learning | A random forest is a meta-estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (the default); otherwise the whole dataset is used to build each tree. | Link |
| Support Vector Classifier (SVC) | Supervised Learning | The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of samples; for large datasets, consider using LinearSVC or SGDClassifier instead, possibly after a Nystroem transformer. Multiclass support is handled according to a one-vs-one scheme. | Link |
| XGBoost Classifier | Supervised Learning | eXtreme Gradient Boosting for classification. Gradient boosting builds an additive model in a forward stage-wise fashion and allows for the optimization of arbitrary differentiable loss functions. In each stage, n_classes_ regression trees are fit on the negative gradient of the binomial or multinomial deviance loss function; binary classification is a special case where only a single regression tree is induced. | Link |
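A rough usage sketch for the Classification-type algorithms above, assuming their common Python implementations (scikit-learn classes plus XGBClassifier from the xgboost package; the dataset and hyperparameters are illustrative assumptions, not platform defaults):

```python
# Sketch: fit the Classification-type ML Algorithms on synthetic data.
# Assumes scikit-learn and the xgboost package are installed; the
# dataset and hyperparameters are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

classifiers = [
    AdaBoostClassifier(n_estimators=50),       # meta-estimator, reweights hard cases
    DecisionTreeClassifier(max_depth=5),
    KNeighborsClassifier(n_neighbors=5),       # k-nearest neighbors vote
    RandomForestClassifier(n_estimators=100),  # bootstrap sub-samples by default
    SVC(kernel="rbf"),                         # libsvm-based; one-vs-one for multiclass
    XGBClassifier(n_estimators=100),           # gradient-boosted regression trees
]
for clf in classifiers:
    clf.fit(X, y)
    print(type(clf).__name__, round(clf.score(X, y), 3))
```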

Clustering

| ML Algorithm | Category | Description | Additional Documentation |
| --- | --- | --- | --- |
| DBSCAN Clustering | Unsupervised Learning | Performs DBSCAN clustering from a vector array or distance matrix. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) finds core samples of high density and expands clusters from them. Good for data that contains clusters of similar density. | Link |
| K Means Clustering | Unsupervised Learning | The K Means algorithm clusters data by trying to separate samples into n groups of equal variance, minimizing a criterion known as the inertia, or within-cluster sum-of-squares. This algorithm requires the number of clusters to be specified. It scales well to large numbers of samples and has been used across a large range of application areas in many different fields. | Link |
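A rough usage sketch for the Clustering-type algorithms above, assuming the standard scikit-learn classes (eps, min_samples, and n_clusters are illustrative choices, not platform defaults):

```python
# Sketch: run the Clustering-type ML Algorithms on synthetic data.
# Assumes scikit-learn is installed; parameter values are illustrative.
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# DBSCAN finds core samples of high density and expands clusters from
# them; the number of clusters is not specified up front, and the
# label -1 marks noise points.
db = DBSCAN(eps=0.8, min_samples=5).fit(X)
print("DBSCAN labels:", set(db.labels_))

# K Means requires the number of clusters; it minimizes the inertia
# (within-cluster sum of squared distances to each centroid).
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("K Means inertia:", round(km.inertia_, 2))
```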

