ADEngine
pyod.utils.ad_engine.ADEngine is PyOD’s anomaly detection lifecycle engine. It provides three layers of capability:
Knowledge queries – list detectors, explain detectors, get benchmarks
Detection lifecycle – profile, plan, run, analyze, explain, iterate, report
Session workflow (V3) – start → plan → run → analyze → iterate → report with typed state
See Layer 2: ADEngine Lifecycle Orchestration for usage examples and Layer 3: Agentic Investigation for the agentic workflow.
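The session workflow listed above can be sketched as a small driver function. This is a minimal sketch rather than PyOD's own code: run_session is a hypothetical helper name, and it assumes only that engine exposes the start, plan, run, analyze, and report methods documented on this page.

```python
def run_session(engine, X, data_type=None, priority="balanced"):
    """Drive one pass of the session workflow and return (state, report).

    `engine` is expected to be an ADEngine-like object. `run_session` is a
    hypothetical helper for illustration, not part of PyOD.
    """
    state = engine.start(X, data_type=data_type)   # profile the data
    state = engine.plan(state, priority=priority)  # select detectors
    state = engine.run(state)                      # run every planned detector
    state = engine.analyze(state)                  # quality assessment
    return state, engine.report(state, format="text")
```

With PyOD installed, the intended call would look like run_session(ADEngine(random_state=42), X); iterate can then be applied to the returned state if the first pass needs adjusting.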
pyod.utils.ad_engine module
ADEngine: anomaly detection lifecycle engine.
Handles data profiling, detection planning, detector construction, and knowledge queries. Works as a standalone Python API (no LLM required) or as the backend for MCP/agent interfaces.
- class pyod.utils.ad_engine.ADEngine(knowledge_dir: str | None = None, random_state: int | None = None)
Bases: object
Anomaly detection lifecycle engine.
Parameters
- knowledge_dir : str or None
Path to the knowledge base directory. If None, uses the bundled knowledge base.
- random_state : int or None, optional
Random seed forwarded to every detector that declares an explicit random_state parameter when the engine instantiates it from a plan. Detectors without random_state in their signature (e.g., ABOD, KNN, LOF, SOD) are deterministic by construction (distance, angle, or density based, with no internal sampling) and need no seed. With this set, the shallow-detector pipeline is reproducible: a run-to-run audit of the shipped shallow detectors found that every one either honors the seed or is deterministic by construction, with no nondeterministic cases. Deep detectors additionally depend on framework-level seeding (e.g., torch.manual_seed). Set this to a fixed integer for byte-identical flagged sets across re-runs on the same input.
- analyze(state: InvestigationState) -> InvestigationState
Analyze detection results with quality assessment.
Computes per-detector analysis, consensus analysis, quality metrics (separation, agreement, stability), and selects the best detector.
Parameters
state : InvestigationState
Returns
state : InvestigationState
- analyze_results(result: dict, X: Any = None, top_k: int = 10) -> dict
Analyze detection results.
Parameters
- result : dict
Output of run_detection().
- X : array-like or None
Original training data for feature-level analysis.
- top_k : int
Number of top anomalies to return.
Returns
analysis : dict
- build_detector(plan: dict) -> Any
Build and return an unfitted detector from a plan.
Parameters
- plan : dict (DetectionPlan)
Output of plan_detection().
Returns
detector : BaseDetector
- compare_detectors(names: list[str] | None = None, data_type: str | None = None, top_k: int = 3) -> list[dict]
Compare detectors.
When names is provided, returns explanations for those detectors in input order.
When names is omitted and data_type has a benchmark-backed ranking in the KB, returns up to top_k detectors ranked by that benchmark, then appends remaining shipped detectors in catalog order until top_k is reached. Two ranking sources are supported: top-level overall_top_5 for benchmarks whose names match PyOD detector names (currently tabular via ADBench); per-detector benchmark_rank metadata when the benchmark lists paper method names (currently time_series via TSB-AD, sorted ascending by the best matching rank key). For modalities without an applicable ranking (graph, text, image, multimodal) or when no data_type is given, falls back to the catalog order from list_detectors.
Parameters
- names : list of str or None
Explicit list of detector names to compare.
- data_type : str or None
Filter by data type.
- top_k : int
Number of detectors to return when not using explicit names.
Returns
comparison : list of dict
- contamination_diagnostics(state: InvestigationState, threshold_sweep: list[float] | None = None) -> dict
Diagnostic helper for contamination calibration.
Reports the contamination value the run actually used, the actual flagged rate from the consensus, the score-percentile distribution, and (optionally) a threshold sweep showing what fraction would be flagged at each candidate contamination value. The agent can use these numbers to choose a sensible next contamination before iterating.
This helper does NOT estimate contamination automatically and does NOT mutate state. It is purely a read-only diagnostic the agent uses to inform a subsequent engine.iterate(state, {'action': 'adjust_contamination', 'value': <rate>}) call.
Parameters
- state : InvestigationState
Must be in the 'analyzed' phase.
- threshold_sweep : list of float or None
Optional sequence of candidate contamination values in (0, 1). For each value c, the result includes the corresponding threshold (the (1 - c) quantile of consensus scores) and the resulting flagged rate. Use this to preview how the flagged set would change before deciding to iterate. Values outside (0, 1) are skipped.
Returns
- diagnostics : dict
Keys:
effective_contamination (float or None): contamination value from the primary plan’s params, or None if the plan has no contamination set.
flagged_rate (float): actual fraction flagged by the consensus labels.
score_percentiles (dict[int, float]): consensus-score percentiles at the 50th, 75th, 90th, 95th, and 99th.
threshold_sweep (list of dict, optional): present only when threshold_sweep was passed; each entry has contamination, threshold, and flagged_rate.
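The threshold sweep described above can be illustrated in plain Python. This is a minimal sketch under stated assumptions, not the engine's implementation: the helper names are hypothetical, the quantile uses linear interpolation, and a sample counts as flagged when its score is strictly above the threshold. The engine's own quantile and tie conventions may differ.

```python
def quantile(sorted_vals, q):
    """Quantile of an ascending list using linear interpolation (assumed convention)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def threshold_sweep(scores, candidates):
    """For each candidate contamination c, report the (1 - c) quantile and flagged rate."""
    ordered = sorted(scores)
    rows = []
    for c in candidates:
        if not 0 < c < 1:
            continue  # values outside (0, 1) are skipped
        threshold = quantile(ordered, 1 - c)
        flagged = sum(s > threshold for s in scores)  # strict comparison is an assumption
        rows.append({
            "contamination": c,
            "threshold": threshold,
            "flagged_rate": flagged / len(scores),
        })
    return rows


# Ten evenly spaced consensus scores; 1.5 is an invalid candidate and is dropped.
sweep = threshold_sweep([float(v) for v in range(1, 11)], [0.2, 1.5, 0.5])
```

Each returned row has the same three keys as an entry of the documented threshold_sweep list, which makes it easy to compare candidate contamination values side by side before iterating.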
- detect(X_train: Any, X_test: Any = None, data_type: str | None = None, priority: str = 'balanced') -> dict
One-shot anomaly detection: profile -> plan -> run -> analyze.
Parameters
- X_train : array-like
Training data.
- X_test : array-like or None
Optional test data.
- data_type : str or None
Explicit data type override.
- priority : str
'speed', 'accuracy', or 'balanced'.
Returns
- result : dict
Output of run_detection() enriched with analysis. Compatible with all Tier B methods (analyze_results, explain_findings, suggest_next_step, generate_report).
- explain_detector(name: str) -> dict
Explain a detector.
Parameters
- name : str
Detector short name (e.g. 'ECOD').
Returns
info : dict
- explain_findings(result: dict, indices: list[int] | None = None, top_k: int = 5, X: Any = None, feature_names: list[str] | None = None) -> list[dict]
Explain why specific samples were flagged as anomalies.
Parameters
- result : dict
Output of run_detection().
- indices : list of int or None
Specific sample indices. If None, explains the top-k anomalies.
- top_k : int
Number of top anomalies to explain if indices is None.
- X : array-like or None
Original data for feature-level explanations.
- feature_names : list of str or None
Optional feature labels in column order, threaded through to feature_contributions so each contributing feature has a human-readable name. When omitted, names default to f'feature_{column_index}'.
Returns
- explanations : list of dict
Each entry has 'index', 'score', 'percentile', 'label', 'narrative'. When X is provided, also includes 'contributing_features': a list of dicts with 'feature', 'name', 'value', 'mean', 'z_score', and 'direction'.
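To make the shape of a 'contributing_features' entry concrete, here is a minimal sketch of per-feature z-scores for one row. It is an illustration only and makes several assumptions that the reference above does not state: the helper name is hypothetical, 'feature' is taken to be the column index, the z-score uses the population standard deviation of each column, and 'direction' is encoded as 'high' or 'low'.

```python
from statistics import fmean, pstdev


def contributing_features(X, row_index, feature_names=None, top_n=3):
    """Rank the features of one row by absolute z-score (illustrative only)."""
    columns = list(zip(*X))  # column-wise view of the row-major data
    row = X[row_index]
    entries = []
    for j, column in enumerate(columns):
        mean = fmean(column)
        std = pstdev(column)
        z = (row[j] - mean) / std if std > 0 else 0.0  # constant columns contribute nothing
        entries.append({
            "feature": j,  # assumed to be the column index
            "name": feature_names[j] if feature_names else f"feature_{j}",
            "value": row[j],
            "mean": mean,
            "z_score": z,
            "direction": "high" if z >= 0 else "low",  # assumed encoding
        })
    entries.sort(key=lambda e: abs(e["z_score"]), reverse=True)
    return entries[:top_n]


X = [[1.0, 10.0], [2.0, 10.0], [3.0, 40.0]]
unnamed = contributing_features(X, row_index=2)
named = contributing_features(X, row_index=2, feature_names=["age", "amount"])
```

The only difference between the two calls is the 'name' field: without feature_names it falls back to the feature_{column_index} pattern described above.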
- generate_report(result: dict, analysis: dict, format: str = 'text') -> str
Generate a summary report.
Parameters
- result : dict
Output of run_detection().
- analysis : dict
Output of analyze_results().
- format : str
'text' (markdown) or 'json'.
Returns
report : str
- get_benchmarks(benchmark: str = 'all') -> dict
Get benchmark results.
Parameters
- benchmark : str
Benchmark name, or 'all' for everything.
Returns
benchmarks : dict
- get_kb_for_routing(profile: dict, top_k: int = 3, constraints: dict | None = None) -> dict
Return a structured KB snapshot for caller-driven detector selection.
This is the agent-facing companion to plan_detection(). plan_detection consumes the KB through hand-coded rules and returns a single plan; get_kb_for_routing exposes the KB directly so a caller (LLM agent, MCP tool client, …) can reason over each detector’s strengths, weaknesses, complexity, and benchmark rank, then call make_plan() to commit a plan.
Parameters
- profile : dict
Output of profile_data(). Must include data_type; n_samples / n_features are passed through unchanged.
- top_k : int, default 3
The number of detectors the caller intends to select. The KB snapshot itself is returned in full (filtered + sorted); the field is included in the returned dict so the response-format hint can reference it.
- constraints : dict or None, optional
{'exclude_detectors': list[str], 'data_type_strict': bool}. exclude_detectors is a hard filter. data_type_strict (default True) drops detectors whose KB data_types field does not include profile['data_type'].
Returns
- dict
{'task_profile': {...}, 'available_detectors': [...], 'top_k_requested': int, 'response_format_hint': str, 'n_available': int}.
Notes
Pure function; no LLM calls, no state mutation.
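The two constraints described above can be sketched as a plain filter over a list of detector records. This is an illustration under assumptions: filter_detectors is a hypothetical helper, each record is assumed to carry a 'name' key alongside the documented data_types field, and "TextDet" is a made-up entry used only to exercise the data-type filter.

```python
def filter_detectors(detectors, profile, constraints=None):
    """Apply exclude_detectors and data_type_strict to a list of KB-style records."""
    constraints = constraints or {}
    excluded = set(constraints.get("exclude_detectors", []))
    strict = constraints.get("data_type_strict", True)  # documented default is True
    kept = []
    for det in detectors:
        if det["name"] in excluded:
            continue  # hard filter
        if strict and profile["data_type"] not in det.get("data_types", []):
            continue  # detector does not declare support for this data type
        kept.append(det)
    return kept


catalog = [
    {"name": "ECOD", "data_types": ["tabular"]},
    {"name": "KNN", "data_types": ["tabular"]},
    {"name": "TextDet", "data_types": ["text"]},  # hypothetical entry
]
profile = {"data_type": "tabular"}
strict_names = [d["name"] for d in filter_detectors(
    catalog, profile, {"exclude_detectors": ["KNN"]})]
loose_names = [d["name"] for d in filter_detectors(
    catalog, profile, {"exclude_detectors": ["KNN"], "data_type_strict": False})]
```

Note that exclude_detectors applies regardless of data_type_strict, which is what makes it a hard filter.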
- investigate(X: Any, data_type: str | None = None, priority: str = 'balanced') -> InvestigationState
One-shot investigation: start → plan → run → analyze.
Parameters
- X : array-like
Input data.
data_type : str or None
priority : str
Returns
state : InvestigationState
- iterate(state: InvestigationState, feedback: str | dict) -> InvestigationState
Iterate based on feedback.
Structured dicts execute immediately. Natural-language (NL) strings are parsed with a confidence estimate; ambiguous feedback triggers 'confirm_with_user'.
Most actions require phase 'analyzed'. The 'recover' action also accepts phase 'detected' so the agent can substitute failed detectors immediately after run() without first calling analyze().
Parameters
state : InvestigationState
feedback : str or dict
Returns
state : InvestigationState
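A structured-dict iteration can be combined with contamination_diagnostics as follows. This is a hedged sketch: recalibrate and target_rate are hypothetical names, and the code relies only on the documented threshold_sweep entries (contamination, threshold, flagged_rate) and the documented 'adjust_contamination' feedback dict.

```python
def recalibrate(engine, state, candidates, target_rate):
    """Pick the candidate contamination whose flagged rate is closest to target_rate.

    `engine` is expected to be an ADEngine-like object and `state` an
    analyzed InvestigationState. `recalibrate` itself is a hypothetical helper.
    """
    diag = engine.contamination_diagnostics(state, threshold_sweep=candidates)
    sweep = diag.get("threshold_sweep", [])
    if not sweep:
        return state  # nothing to choose from, so leave the state untouched
    best = min(sweep, key=lambda row: abs(row["flagged_rate"] - target_rate))
    feedback = {"action": "adjust_contamination", "value": best["contamination"]}
    return engine.iterate(state, feedback)  # structured dicts execute immediately
```

Passing a dict rather than a natural-language string avoids the parsing step and the possibility of a 'confirm_with_user' round trip.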
- list_detectors(data_type: str | None = None, status: str = 'shipped') -> list[dict]
List available detectors.
Parameters
- data_type : str or None
Filter by data type (e.g. 'tabular', 'text').
- status : str
Filter by status. Use 'all' to list everything.
Returns
detectors : list of dict
- make_plan(detector_choices: list, justifications: list | None = None, params: list | None = None) -> dict
Commit a caller-driven detector plan and return a DetectionPlan.
Companion to get_kb_for_routing(). The caller (LLM agent, rule engine, human script) selects len(detector_choices) detectors, and this method validates names against the KB, fills per-detector defaults, and packages the result as a pyod.utils._kb_router.make_plan()-shaped dict so existing consumers (build_detector, run, downstream MCP clients) keep working unchanged.
Parameters
- detector_choices : list of str
Ordered list of detector class names. detector_choices[0] is the primary; the rest become alternatives in plan order. Length must be >= 1. Names must match KB entries (case-sensitive) with status='shipped'; otherwise ValueError is raised.
- justifications : list of str, optional
Parallel to detector_choices. One short sentence per choice. None is accepted and yields autogenerated reasons.
- params : list of dict, optional
Parallel to detector_choices. Per-detector constructor kwargs. None -> KB defaults overlaid with the engine’s contamination resolution.
Returns
- dict
Closed-schema DetectionPlan: {'detector_name', 'params', 'reason', 'evidence', 'confidence', 'alternatives', 'note'}.
Raises
- ValueError
If detector_choices is empty or any name is unknown / not status='shipped' in the KB.
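The get_kb_for_routing / make_plan pairing can be sketched as a single caller-driven routing step. This is a minimal sketch, not PyOD code: commit_plan and choose are hypothetical names, and choose stands in for whatever selection logic the caller supplies (an LLM, a rule engine, or a script) that maps the available_detectors list to an ordered list of detector names.

```python
def commit_plan(engine, profile, choose, top_k=3, exclude=()):
    """Fetch the KB snapshot, let the caller choose, then commit the plan.

    `engine` is expected to be an ADEngine-like object. `choose` is a
    caller-supplied callable returning an ordered list of detector names.
    """
    kb = engine.get_kb_for_routing(
        profile,
        top_k=top_k,
        constraints={"exclude_detectors": list(exclude)},
    )
    names = choose(kb["available_detectors"])[:top_k]
    if not names:
        # make_plan is documented to reject an empty list, so fail early.
        raise ValueError("the chooser returned no detectors")
    reasons = [f"selected by caller (choice {i + 1})" for i in range(len(names))]
    return engine.make_plan(detector_choices=names, justifications=reasons)
```

Because make_plan treats the first name as the primary and the rest as alternatives, the order returned by choose determines the shape of the resulting plan.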
- plan(state: InvestigationState, priority: str = 'balanced', constraints: dict | None = None) -> InvestigationState
Plan detection: select top-N detectors.
Wraps plan_detection() and extracts primary + alternatives into state.plans (up to 3 detectors, v1 limit).
Parameters
state : InvestigationState
priority : str
constraints : dict or None
Returns
state : InvestigationState
- plan_detection(profile: dict, priority: str = 'balanced', constraints: dict | None = None, *, top_k: int = 3, llm_client=None, llm_strict: bool | None = None) -> dict
Plan a detection pipeline.
Parameters
- profile : dict
Output of profile_data().
- priority : str
'speed', 'accuracy', or 'balanced'.
- constraints : dict or None
Optional: {'exclude_detectors': […]}
- top_k : int, default 3
Number of detectors in the returned plan (primary + top_k - 1 alternatives). Default 3 preserves the v3.5.2 behaviour (valid[1:3] produced two alternatives plus the primary). Values < 1 are clamped to 1.
- llm_client : callable or None, default None
Optional (prompt: str) -> str callable (see pyod.utils._llm.LLMCallable). When provided, routing consults the LLM with the KB context and parses its response into a plan via pyod.utils._llm.parse_routing_response(). If the LLM call or parser raises, falls back to rule routing with a RuntimeWarning (see llm_strict). When None (default), v3.5.2 rule routing is unchanged.
- llm_strict : bool or None, default None
Per-call control for LLM-routing failure mode. True re-raises any exception from llm_client or the response parser; False falls back to rule routing with a RuntimeWarning; None defers to the PYOD3_LLM_STRICT environment variable ("1" re-raises, anything else falls back). The explicit kwarg takes precedence so concurrent callers in the same process can choose independently.
Returns
plan : dict (DetectionPlan, closed schema)
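The llm_strict precedence and the fallback behaviour described above can be sketched in isolation. This is a minimal illustration of the documented rules, not the engine's code: resolve_llm_strict, route_with_fallback, rule_router, and llm_router are hypothetical names, and the environ argument exists only so the environment lookup can be exercised without touching the real process environment.

```python
import os
import warnings


def resolve_llm_strict(llm_strict=None, environ=None):
    """Explicit kwarg wins; None defers to PYOD3_LLM_STRICT ("1" means strict)."""
    if llm_strict is not None:
        return bool(llm_strict)
    env = os.environ if environ is None else environ
    return env.get("PYOD3_LLM_STRICT") == "1"


def route_with_fallback(profile, rule_router, llm_router=None,
                        llm_strict=None, environ=None):
    """Try LLM routing first, then re-raise or fall back to rule routing."""
    if llm_router is None:
        return rule_router(profile)  # no LLM client: rule routing only
    try:
        return llm_router(profile)
    except Exception as exc:
        if resolve_llm_strict(llm_strict, environ):
            raise  # strict mode surfaces the failure to the caller
        warnings.warn(
            f"LLM routing failed ({exc!r}); falling back to rule routing",
            RuntimeWarning,
        )
        return rule_router(profile)
```

Resolving strictness per call, rather than reading the environment once at import time, is what lets two callers in the same process pick different failure modes.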
- profile_data(X: Any, data_type: str | None = None) -> dict
Profile the input data.
Parameters
- X : array-like, list, or dict
Input data.
- data_type : str or None
Explicit override. One of 'tabular', 'text', 'image', 'audio', 'time_series', 'multimodal', 'graph'.
Returns
profile : dict
- report(state: InvestigationState, format: str = 'text') -> str | dict
Generate investigation report.
Text format wraps generate_report() for the best detector, prepending session-level context. JSON format returns a native dict.
Parameters
state : InvestigationState
format : str
'text' or 'json'.
Returns
report : str or dict
- run(state: InvestigationState) -> InvestigationState
Run detection with all planned detectors.
Wraps run_detection() per plan. Computes consensus via rank normalization and majority vote. Records errors per detector without stopping.
Parameters
state : InvestigationState
Returns
state : InvestigationState
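Rank normalization followed by a majority vote can be sketched as below. The reference above names the two techniques but not their details, so everything specific here is an assumption: ranks are scaled to [0, 1], consensus scores are the mean of the normalized ranks, ties keep their input order, and a sample is flagged only when strictly more than half of the detectors flag it. The function names are hypothetical.

```python
def rank_normalize(scores):
    """Map scores to ranks scaled into [0, 1]; ties keep input order (assumption)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    for rank, idx in enumerate(order):
        ranks[idx] = rank / (n - 1) if n > 1 else 0.0
    return ranks


def consensus(score_lists, label_lists):
    """Average rank-normalized scores and take a strict majority vote on labels."""
    normalized = [rank_normalize(scores) for scores in score_lists]
    n_samples = len(score_lists[0])
    n_detectors = len(score_lists)
    scores = [
        sum(det[i] for det in normalized) / n_detectors
        for i in range(n_samples)
    ]
    labels = [
        int(2 * sum(det[i] for det in label_lists) > n_detectors)
        for i in range(n_samples)
    ]
    return scores, labels


# Three detectors with scores on very different scales, three samples each.
scores, labels = consensus(
    [[0.1, 0.2, 0.9], [5.0, 1.0, 7.0], [0.3, 0.1, 0.2]],
    [[0, 0, 1], [0, 0, 1], [1, 0, 0]],
)
```

Normalizing by rank before averaging keeps a detector with a large raw score range from dominating the consensus.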
- run_detection(X_train: Any, plan: dict, X_test: Any = None) -> dict
Execute a detection plan.
Parameters
- X_train : array-like
Training data.
- plan : dict (DetectionPlan)
Output of plan_detection().
- X_test : array-like or None
Optional test data.
Returns
- result : dict
Keys: 'plan', 'scores_train', 'labels_train', 'threshold', 'n_anomalies', 'anomaly_ratio', 'detector', 'runtime_seconds', 'score_summary'. If X_test is given, also 'scores_test' and 'labels_test'.
- start(X: Any, data_type: str | None = None) -> InvestigationState
Start an investigation session.
Profiles the data and returns an InvestigationState.
Parameters
- X : array-like, Data, list, or dict
Input data (any modality).
- data_type : str or None
Explicit type override.
Returns
state : InvestigationState
- suggest_next_step(result: dict, analysis: dict, feedback: str | None = None) -> dict
Suggest what to try next.
Parameters
- result : dict
Output of run_detection().
- analysis : dict
Output of analyze_results().
- feedback : str or None
User feedback like 'too many false positives'.
Returns
- suggestion : dict
Keys: 'action', 'reason', optionally 'new_plan', 'threshold_adjustment'.
- validate(state: InvestigationState, y: Any) -> dict
Hindsight validation of consensus and per-detector results.
Computes label-based metrics from y against the consensus labels and each successful detector, plus a consensus-vs-best-detector diagnostic so the agent can see whether consensus actually helped.
Pure functional; does not mutate state. Use after analyze when held-out labels become available (e.g., a labeled cohort opened post-hoc for hindsight evaluation). For routine unsupervised detection runs, this method is unnecessary.
Parameters
- state : InvestigationState
Must be in the 'analyzed' phase.
- y : array-like, shape (n_samples,)
Held-out binary labels (0 = inlier, 1 = anomaly). Length must match the consensus.
Returns
- validation : dict
Keys:
consensus (dict): label_metrics for the consensus labels and scores.
per_detector (dict[str, dict]): label_metrics per successful detector, keyed by detector name.
best_detector (dict or None): label_metrics for the detector picked by analyze as best (or None when state.analysis does not name one).
consensus_vs_best (dict): comparison summary with keys consensus_f1, best_detector_f1 (or None), and consensus_helped (True if consensus F1 is at least the best-detector F1; None when no best detector).
false_positives (list[int]): row indices flagged by consensus but inlier in y.
false_negatives (list[int]): row indices not flagged by consensus but anomaly in y.
Raises
- ValueError
If state is not in the 'analyzed' phase, if the consensus is missing (all detectors failed), or if len(y) does not match the consensus length.
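The consensus_vs_best, false_positives, and false_negatives fields can be illustrated with plain lists. This is a hedged sketch of the documented semantics only: hindsight_summary and f1 are hypothetical helpers, F1 is defined as 0.0 when there are no positives at all (an assumption), and the other documented keys (consensus, per_detector, best_detector) are omitted for brevity.

```python
def f1(y_true, y_pred):
    """Binary F1 with anomaly = 1; returns 0.0 when undefined (assumption)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def hindsight_summary(y, consensus_labels, best_labels=None):
    """Compare consensus labels with held-out labels y and, optionally, the best detector."""
    if len(y) != len(consensus_labels):
        raise ValueError("len(y) does not match the consensus length")
    consensus_f1 = f1(y, consensus_labels)
    best_f1 = f1(y, best_labels) if best_labels is not None else None
    return {
        "consensus_vs_best": {
            "consensus_f1": consensus_f1,
            "best_detector_f1": best_f1,
            # True when consensus F1 is at least the best-detector F1.
            "consensus_helped": None if best_f1 is None else consensus_f1 >= best_f1,
        },
        "false_positives": [i for i, (t, p) in enumerate(zip(y, consensus_labels))
                            if t == 0 and p == 1],
        "false_negatives": [i for i, (t, p) in enumerate(zip(y, consensus_labels))
                            if t == 1 and p == 0],
    }


summary = hindsight_summary(
    y=[0, 1, 1, 0, 1],
    consensus_labels=[0, 1, 0, 1, 1],
    best_labels=[0, 1, 0, 0, 0],
)
```

The index lists point back into the original rows, so false positives and false negatives can be inspected directly in the input data.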
pyod.utils.investigation module
Investigation state for ADEngine session workflow.
- class pyod.utils.investigation.InvestigationState(phase: str, iteration: int = 0, history: list = <factory>, data: object = None, profile: dict = <factory>, plans: list = <factory>, results: list = <factory>, consensus: dict = None, analysis: dict = None, quality: dict = None, next_action: dict = <factory>)
Bases: object
Typed state object for an ADEngine investigation session.
Tracks the full workflow: profiling, planning, detection, analysis, and iteration. Each session method updates the state and sets next_action to guide the agent.
Attributes
- phase : str
One of PHASES: 'profiled', 'planned', 'detected', 'analyzed'.
- iteration : int
Current iteration (0 = first run).
- history : list
List of HistoryEntry dicts.
- data : object
Reference to input data (not copied).
- profile : dict
Output of profile_data().
- plans : list
List of DetectionPlan dicts (top-N).
- results : list
List of DetectorResult dicts.
- consensus : dict or None
ConsensusResult dict.
- analysis : dict or None
InvestigationAnalysis dict.
- quality : dict or None
QualityAssessment dict.
- next_action : dict
NextAction dict guiding the agent.
pyod.utils.knowledge module
Knowledge base for PyOD’s intelligent agent layer.
Loads structured JSON files containing algorithm metadata, benchmark results, routing rules, and paper citations.
- class pyod.utils.knowledge.KnowledgeBase(knowledge_dir=None)
Bases: object
Loader and accessor for PyOD’s structured knowledge base.
Reads JSON files from the knowledge directory and provides query methods for algorithm metadata, benchmarks, and routing.
Parameters
- knowledge_dir : str or None
Path to knowledge directory. If None, uses the bundled directory shipped with PyOD.
- property algorithms
- property benchmarks
- list_by_data_type(data_type, status='shipped')
List algorithms supporting a given data type.
- property papers
- property routing_rules