Layer 1: Tabular Anomaly Detection

PyOD has 43 tabular detectors covering probabilistic, linear, proximity, ensemble, and deep learning approaches. All use the same fit/predict/decision_function API.

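from pyod.models.iforest import IForest

clf = IForest()
clf.fit(X_train)                               # unsupervised: no labels are passed
y_train_scores = clf.decision_scores_          # outlier scores of the training data
y_test_scores = clf.decision_function(X_test)  # outlier scores of unseen data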
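
To make the split between raw scores and 0/1 labels concrete, the NumPy sketch below shows one way labels can be derived from scores. It assumes labels come from cutting the training scores at the (1 - contamination) percentile, which is understood to be PyOD's default behaviour but should be checked against the installed version. The random scores and all variable names are stand-ins for illustration, not PyOD output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for clf.decision_scores_ and clf.decision_function(X_test).
train_scores = rng.normal(size=200)
test_scores = rng.normal(size=100)

contamination = 0.1  # assumed share of outliers in the training data

# Assumption: the cut-off is the (1 - contamination) percentile of the
# training scores; anything scoring above it is labelled an outlier.
threshold = np.percentile(train_scores, 100 * (1 - contamination))

train_labels = (train_scores > threshold).astype(int)  # 0: inlier, 1: outlier
test_labels = (test_scores > threshold).astype(int)    # same cut-off on new data

print(train_labels.sum(), "of", train_labels.size, "training points flagged")
```

Under this assumption the threshold is learned once from the training data and reused for new data, so the share of flagged test points need not equal the contamination value.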


All Tabular Examples

Probabilistic: ECOD, COPOD, ABOD, MAD, SOS, QMCD, KDE, Sampling, GMM

Linear Models: PCA, KPCA, MCD, CD, OCSVM, LMDD

Proximity-Based: LOF, COF, CBLOF, LOCI, HBOS, HDBSCAN, KNN, SOD, ROD

Outlier Ensembles: IForest, INNE, DIF, Feature Bagging, LSCP, XGBOD, LODA, SUOD

Neural Networks: AutoEncoder, VAE, DeepSVDD, SO_GAAL, MO_GAAL, AnoGAN, ALAD, AE1SVM, DevNet


Example Walkthrough

Full example: knn_example.py

  1. Import and generate data:

from pyod.models.knn import KNN
from pyod.utils.data import generate_data, evaluate_print

contamination = 0.1
X_train, X_test, y_train, y_test = generate_data(
    n_train=200, n_test=100, contamination=contamination)

  2. Fit and predict:

clf = KNN()
clf.fit(X_train)

y_train_pred = clf.labels_                  # 0: inlier, 1: outlier
y_train_scores = clf.decision_scores_       # raw scores
y_test_pred = clf.predict(X_test)
y_test_scores = clf.decision_function(X_test)

  3. Evaluate:

evaluate_print('KNN', y_test, y_test_scores)
# Example output (varies with the random data): KNN ROC:0.9989, precision @ rank n:0.9
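
The two numbers reported in the evaluation step can be read as follows. ROC is the area under the ROC curve, i.e. the probability that a randomly chosen outlier scores higher than a randomly chosen inlier. Precision @ rank n is taken here to mean the fraction of true outliers among the n highest-scoring samples, with n equal to the number of true outliers. The pure-Python sketch below illustrates both definitions on a hand-made example. The function names and data are hypothetical, and PyOD's own implementation may handle ties differently.

```python
def precision_at_rank_n(y_true, scores):
    """Fraction of true outliers among the n top-scored samples,
    where n is the number of true outliers (assumed definition)."""
    n = sum(y_true)
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:n]) / n


def roc_auc(y_true, scores):
    """Probability that a random outlier outscores a random inlier;
    ties count as half a win."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hand-made example: two outliers (label 1), four inliers (label 0).
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]

print(precision_at_rank_n(y_true, scores))  # 0.5: one of the top two is an outlier
print(roc_auc(y_true, scores))              # 0.875: 7 of 8 outlier/inlier pairs ordered correctly
```

The pairwise AUC loop is quadratic in the number of samples and is meant only to show the definition. For real evaluations a library routine such as scikit-learn's roc_auc_score is the better choice.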