
343-class structural taxonomy of AI failure mechanisms with keyword classifier and semantic search

Project description

AI Failure Periodic Table

A living periodic table of AI failure—the structural spec that names mechanisms, maps evidence, and grounds deployment brakes (classification, boundaries, enforcement).

The spec, Daily Driver, and brakes

Daily Driver MCP brings the table into Cursor, Claude Desktop, and other MCP-compatible workflows. Your everyday AI can ask the table what failure class is present in a paragraph, file, URL, report, or agent workflow.

Agent Buccet is the brakes layer.

The current base is 343 failure classes across 7 dimensions.

We treat every failure as a data point in a closed-loop engineering process.

The failure-to-enforcement loop

failure → known versus unknown → class or gap → boundary → UPL / user custom rules → Buccet enforcement → proof ledger.

Connect your everyday AI (daily driver)

If you use Cursor, Claude Desktop, or another app that supports MCP (Model Context Protocol), you can plug that assistant into this repo and classify paragraphs, public URLs, or files from chat—same 343-class table, read-only (it does not edit the taxonomy).

  • This is where you connect: add an MCP server in your AI host’s settings that runs python3 -m src.ai_failure_mcp with this repo as the working directory (see the full guide).
  • This is what you get: hit or miss on the table, which class(es), compound readings, structural mitigation patterns from the taxonomy, and CONTRIBUTING-style next steps when the fit is weak.

Start here: docs/mcp-daily-driver.md — plain-English purpose, what vs how, choose your setup path (Cursor / Claude / other), first-use walkthrough, and example config (docs/cursor-mcp-config.example.json). See Guaranteed fallbacks when MCP is down and Requirements: Python vs chat model.
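For reference, a typical MCP host config for the server above looks something like the fragment below. The exact keys vary by host and the server name here is arbitrary; treat docs/cursor-mcp-config.example.json in the repo as authoritative. Some hosts have no cwd field, in which case launch the host from the repo directory or wrap the command in a script that changes directory first.

```json
{
  "mcpServers": {
    "ai-failure-table": {
      "command": "python3",
      "args": ["-m", "src.ai_failure_mcp"],
      "cwd": "/path/to/ai-failure-periodic-table"
    }
  }
}
```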

The Problem

AI capability is advancing faster than our shared ability to reason about what can go wrong.

Every lab has its own internal vocabulary for failure. One lab calls something one thing, the next lab calls it another, a startup doesn't name it at all because they don't know it exists yet. When an incident happens — a jailbreak, a deceptive agent, a hallucinated medical dosage — there's no shared language to say precisely what failed and why. Without shared language there's no shared defense.

This is the gap this project addresses: a common structural map for AI failure so the whole field can reason about safety in the same terms, find failures before deployment, and build defenses that transfer across systems and organizations.

Proof


The classifier has been run against 29 primary sources: frontier system cards, safety reports, security reports, regulatory investigations, CVEs, red-team papers, and agent-security research.

Across the tested corpus:

  • 40 classifier runs
  • 2,777 chunks
  • 100% of substantive AI-failure content classified
  • Non-hits manually checked as boilerplate, equations, benchmark tables, headers, citations, or other non-failure text

Per-source breakdowns, partial hit counts like 110/146, methodology, and the full source list—including items not named in the table below—live in docs/proof.md.

Heavy sources include:

| Category | Sources |
|---|---|
| Frontier system cards | OpenAI GPT-5.3-Codex, OpenAI GPT-5.2, Anthropic Claude Opus 4.6 / 4.7, Anthropic Mythos Preview, Google Gemini 3 Pro, xAI Grok 4.1 |
| Safety / governance | International AI Safety Report 2026, NIST/CAISI DeepSeek Eval, ICO Grok investigation, Project Glasswing |
| Security / threat reports | CrowdStrike 2026 Global Threat Report, Palo Alto Unit 42 2026, Cisco 2026 State of AI Security, Microsoft 2026 Data Security Index, Google Cloud AI Security |
| Agent/CVE research | EchoLeak CVE-2025-32711, GitHub Copilot RCE CVE-2025-53773, Google DeepMind Agent Traps |
| Open-weight / technical reports | DeepSeek-V3, Qwen3, Qwen3Guard, Meta Llama |

Examples that resolved into existing classes:

  • sabotage concealment
  • blackmail simulation
  • bio uplift at the “High” threshold
  • 500+ zero-days
  • 100% jailbreak success rates
  • zero-click data exfiltration
  • invisible HTML/CSS agent traps
  • prompt-injection-to-shell-execution
  • open-weight irreversibility
  • tool misuse
  • indirect prompt injection
  • comply-then-warn failure
  • strategic deception

Full proof vault: docs/proof.md — intentionally large: counts, source-by-source notes, summaries, system-card mapping, chunk JSON, and everything needed to reproduce or challenge the claim.

Raw classifier reports: reports/ — chunked JSON, live summaries, and source text dumps.

If you find a real AI failure mechanism that does not fit — not boilerplate, not a vague concern, but a real mechanism with no class — open a propose-new-class issue. A real gap is not a loss. It is new structure.

The goal is not omniscience but structural predictiveness: that newly encountered failures should resolve into this structure as a class, sub-mode, or compound — unless evidence demonstrates otherwise.

Version: 1.4.22 | Released: April 2026 | License: Apache 2.0 | Status: Open for community testing and falsification


Live Visual Table

→ Open the Interactive Periodic Table

343 clickable cells. Color-coded by dimension. Live semantic search. Click any cell to expand the full class — mechanism, examples, real-world case studies, references, detection method. Or open index.html locally in any browser — fully self-contained, no server needed.

Or click to see the 343 list →


Critical-Severity Classes (26)

CRITICAL is assigned when a failure meets at least two of these criteria:

  1. Irreversibility — harm cannot be undone after the failure occurs (e.g., released pathogen synthesis steps, published CSAM, exfiltrated model weights)
  2. Catastrophic scale — potential to harm large populations, not individual users (e.g., bio uplift, infrastructure attack, mass-targeting)
  3. Corrigibility breakdown — directly undermines the human ability to detect, stop, or correct AI behavior (e.g., oversight immunity, log manipulation, evaluator deception)
  4. Enabling cascade — the failure enables other CRITICAL-class failures (e.g., sleeper agents that survive safety training enable later deceptive deployment)

STANDARD severity covers real harm — jailbreaks, sycophancy, hallucination — but harm that is bounded, reversible, or detectable in normal operation. CRITICAL marks the failures where normal recovery mechanisms don't apply.
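The two-of-four rule reduces to a small predicate. A minimal sketch (the argument names paraphrase the four criteria; this is not the repo's actual code):

```python
def severity(irreversible: bool, catastrophic_scale: bool,
             corrigibility_breakdown: bool, enabling_cascade: bool) -> str:
    """CRITICAL when at least two of the four criteria hold, else STANDARD."""
    met = sum([irreversible, catastrophic_scale,
               corrigibility_breakdown, enabling_cascade])
    return "CRITICAL" if met >= 2 else "STANDARD"
```

For example, a failure that is irreversible and undermines oversight (two criteria) is CRITICAL even if its scale is bounded.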

The 26 classes marked CRITICAL (catastrophic or irreversible harm potential):

| ID | Name | Dimension |
|---|---|---|
| AGEN-STRATEGIC-DECEP-036 | Strategic Deception | AGENTIC |
| AGEN-EVAL-DECEP-038 | Evaluator Deception | AGENTIC |
| AGEN-SABOTAGE-CONCEAL-034 | Sabotage Concealment | AGENTIC |
| AGEN-BLACKMAIL-046 | Blackmail / Coercion | AGENTIC |
| AGEN-SELF-EXFIL-048 | Self-Exfiltration | AGENTIC |
| AGEN-SHUTDOWN-RESIST-049 | Shutdown Resistance | AGENTIC |
| AGEN-SUCCESSOR-SAB-051 | Successor Sabotage | AGENTIC |
| ADV-SLEEPER-AGENT-127 | Sleeper Agent | ADVERSARIAL |
| ADV-AGENT-WORM-124 | Agent Worm | ADVERSARIAL |
| ARCH-COMPLY-WARN-196 | Comply-Then-Warn | ARCHITECTURAL |
| DOMAIN-BIO-UPLIFT-254 | Bio Tacit-Error Uplift | DOMAIN |
| DOMAIN-GOF-GUIDE-255 | Gain-of-Function Guidance | DOMAIN |
| DOMAIN-PATH-SYNTH-256 | Pathogen Synthesis | DOMAIN |
| DOMAIN-ZERODAY-262 | Zero-Day Discovery | DOMAIN |
| DOMAIN-MALWARE-GEN-264 | Malware Generation | DOMAIN |
| DOMAIN-RANSOM-DEV-271 | Ransomware Development | DOMAIN |
| DOMAIN-EXPLOSIVE-SYNTH-274 | Explosive Synthesis | DOMAIN |
| DOMAIN-CHEM-WEAPON-275 | Chemical Weapon Guidance | DOMAIN |
| DOMAIN-TOXIN-PROD-277 | Toxin Production | DOMAIN |
| DOMAIN-SELF-HARM-ENABLE-292 | Self-Harm Enablement | DOMAIN |
| DOMAIN-CSAM-GEN-295 | CSAM Generation | DOMAIN |
| GOV-OPEN-IRREVERS-301 | Open-Weight Irreversibility | GOVERNANCE |
| GOV-OVERSIGHT-IMMUNE-313 | Oversight Immunity | GOVERNANCE |
| GOV-LOG-MANIP-316 | Log Manipulation | GOVERNANCE |
| GOV-CULTURE-FAIL-334 | Safety Culture Failure | GOVERNANCE |
| AGEN-DECEPTIVE-ALIGN-033 | Deceptive Alignment | AGENTIC |

The 7 Dimensions

| # | Dimension | Classes | Root Cause | Invariant Violated |
|---|---|---|---|---|
| 1 | EPISTEMIC (Truth / Knowledge / Reasoning) | 33 | Probabilistic generation ≠ logical deduction | Output must match ground truth |
| 2 | AGENTIC (Goal / Planning / Deception) | 49 | Instrumental convergence + goal preservation | Agent must remain corrigible |
| 3 | ADVERSARIAL (Attack / Bypass / Exploit) | 72 | Optimization pressure against safety | System must be robust to manipulation |
| 4 | ALIGNMENT (Value / Safety / Preference) | 41 | Reward hacking + specification gaming | Behavior must match intent |
| 5 | ARCHITECTURAL (Pipeline / Execution / Control) | 58 | System design vs emergent properties | Architecture must enforce constraints |
| 6 | DOMAIN (Task-specific / Context-bound) | 47 | Transfer failure + context mismatch | Specialist knowledge must be accurate |
| 7 | GOVERNANCE (Proliferation / Oversight / Compliance) | 43 | Deployment ≠ control | Safety must persist post-deployment |
| | **Total** | **343** | | |

Every class has

  • Mechanism — the root structural cause
  • Examples — concrete failure instances
  • Case studies — real documented incidents with system, date, outcome, source
  • References — primary research citations (avg 2.2 per class)
  • Detection — how to identify this failure
  • Keywords — for search and classification
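Pictured as data, a class record might look like the dataclass below. This is an illustrative shape that follows the bullet list above, not the repo's actual JSON schema:

```python
from dataclasses import dataclass, field

@dataclass
class FailureClass:
    """Illustrative shape of a taxonomy class record (not the repo's schema)."""
    id: str                 # e.g. "EPIS-CITE-SPOOF-008" (permanent, never reused)
    name: str
    dimension: str          # one of the 7 dimensions
    severity: str           # "CRITICAL" or "STANDARD"
    mechanism: str          # root structural cause
    examples: list[str] = field(default_factory=list)
    case_studies: list[dict] = field(default_factory=list)  # system, date, outcome, source
    references: list[str] = field(default_factory=list)
    detection: str = ""     # how to identify this failure
    keywords: list[str] = field(default_factory=list)       # for search and classification
```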

Quick Start

Python 3.10+ for CLI and search:

git clone https://github.com/lml-layer-system/ai-failure-periodic-table
cd ai-failure-periodic-table

Semantic search (recommended for finding classes by meaning):

# Build the search index (one-time, ~2 seconds, no dependencies)
python scripts/generate_embeddings.py

# Search by meaning
python scripts/semantic_search.py "model deceives evaluator during safety testing"
python scripts/semantic_search.py "reward hacking reinforcement learning" --top 10
python scripts/semantic_search.py "jailbreak with images" --group ADVERSARIAL
python scripts/semantic_search.py "data leak GDPR violation" --severity CRITICAL
python scripts/semantic_search.py "autonomous agent acquires resources" --json

Classify a failure description:

python -m src.cli "The model fabricated a scientific citation that doesn't exist"

Look up a class by ID:

python -m src.cli --lookup EPIS-CITE-SPOOF-008

Classifier notes: The CLI uses stemmed keyword matching with synonym expansion. It achieves 100% recall on 49 documented real-world incidents. For novel failures or unusual phrasing, semantic search via scripts/semantic_search.py or the in-browser search is more robust — it indexes all text fields, not just keywords.

  • Not the same as Freshness Watch: the scheduled feed pipeline for maintainers is separate; see docs/freshness-watch.md.
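To make "stemmed keyword matching with synonym expansion" concrete, here is a toy version. The class ID, keywords, and synonym table below are invented for illustration; the real classifier in src/ is far richer:

```python
import re

# Toy sketch of stemmed keyword matching with synonym expansion.
# The synonym table, keywords, and class ID are hypothetical.
SYNONYMS = {"fabricated": "hallucination", "invented": "hallucination",
            "made up": "hallucination"}
CLASS_KEYWORDS = {"EPIS-HALLUCINATION-001": {"hallucination", "citation"}}

def stem(word: str) -> str:
    """Crude suffix stripping; real stemmers (e.g. Porter) are smarter."""
    return re.sub(r"(ing|ed|s)$", "", word)

def classify(text: str) -> list[str]:
    """Return the IDs of classes whose keywords match the description."""
    lowered = text.lower()
    for phrase, canonical in SYNONYMS.items():   # synonym expansion
        lowered = lowered.replace(phrase, canonical)
    stems = {stem(w) for w in re.findall(r"[a-z]+", lowered)}
    return [cid for cid, kws in CLASS_KEYWORDS.items()
            if any(stem(k) in stems for k in kws)]
```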

Who can benefit the most?

Teams deploying agents in consequential workflows (money, data, infra, compliance) who need a shared failure map and a path to runtime enforcement.

Using this for pre-deployment auditing

Ship with the table in the loop: pick the dimensions that match your deployment surface, pull CRITICAL-leaning classes with semantic search, then lock mitigations with --lookup on each ID.

Worked example (coding assistant, four steps + commands): docs/pre-deployment-audit.md


Compound Failures

Most real incidents activate more than one dimension. The taxonomy handles this explicitly — a failure can belong to multiple classes simultaneously.

Example: a jailbreak that generates malware

| Class | Dimension | Role |
|---|---|---|
| ADV-DAN-083 (DAN Jailbreak) | ADVERSARIAL | The attack vector |
| DOMAIN-MALWARE-GEN-264 (Malware Generation) | DOMAIN | The harmful output |
| ALIGN-OVERREFUSAL-186 (Overrefusal, if miscalibrated) | ALIGNMENT | The adjacent failure if defenses are too coarse |

How to assign a primary class: use the dimension where the root failure lives — the one you'd fix first. In this example, DOMAIN-MALWARE-GEN-264 is primary if the system shouldn't generate malware regardless of how it was asked. ADV-DAN-083 is primary if the failure is specifically the jailbreak technique bypassing a filter that would otherwise stop it.

For incident logs and paper citations: list all activated classes, mark primary first.
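In machine-readable form, that convention is just an ordered list with the primary class first. A hypothetical log-entry helper:

```python
def record_incident(primary: str, secondary: list[str]) -> dict:
    """Incident log entry: all activated classes, primary first (illustrative shape)."""
    return {"activated_classes": [primary, *secondary], "primary": primary}

# The jailbreak-to-malware example above, with DOMAIN-MALWARE-GEN-264 as primary:
entry = record_incident("DOMAIN-MALWARE-GEN-264", ["ADV-DAN-083"])
```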


Semantic search and sample CLI output

Semantic search (TF‑IDF over full class text): build the index once, then query by meaning—best when keyword-stem classification is too brittle. Commands and when to use search vs CLI: docs/how-to-use.md#semantic-search.
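For intuition, a TF-IDF retriever of this kind fits in a few dozen lines of standard-library Python (mirroring the repo's no-dependency claim). This is a sketch of the general technique, not the index format that scripts/generate_embeddings.py actually writes:

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())

def build_index(docs: dict[str, str]):
    """Return per-document TF-IDF vectors and the shared IDF table."""
    n = len(docs)
    tfs = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    df = Counter()
    for tf in tfs.values():
        df.update(tf.keys())
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # +1 keeps ubiquitous terms nonzero
    vecs = {d: {t: c * idf[t] for t, c in tf.items()} for d, tf in tfs.items()}
    return vecs, idf

def search(query: str, vecs, idf, top: int = 3) -> list[str]:
    """Rank documents by cosine similarity to the query; drop zero-score hits."""
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    qn = math.sqrt(sum(v * v for v in q.values())) or 1.0
    scores = []
    for doc_id, vec in vecs.items():
        dot = sum(q.get(t, 0.0) * w for t, w in vec.items())
        dn = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        scores.append((dot / (qn * dn), doc_id))
    return [d for s, d in sorted(scores, reverse=True)[:top] if s > 0]
```

Because it indexes every word of a class description rather than a curated keyword list, this style of search tolerates phrasings the keyword classifier would miss.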

Seven-dimension sample transcript (same engine as MCP / daily driver): docs/how-to-use.md#sample-cli-transcript-seven-dimensions.


Repository structure

At a glance: data/ holds the 343-class JSON and search index; src/ is classifier + CLI + MCP entry; scripts/ builds index, taxonomy, and visuals; tests/ locks behavior; reports/ stores classifier bundles; docs/ holds guides and the proof vault.

Annotated tree: docs/repository-structure.md


Running tests

pip install pytest
python -m pytest tests/ -v

Suite scope (classification, MCP bridge, API, freshness helpers, TF‑IDF search, data/schema integrity; heavily parameterized): docs/developer-testing.md


Class ID Stability Guarantee

Class IDs are permanent. Once assigned, an ID is never changed, never deleted, never reassigned to a different failure.

  • If a class is split into sub-classes, the original ID remains and points to the parent
  • If a class is retired due to community challenge, it is marked DEPRECATED but the ID stays in the dataset
  • No ID is ever reused for a different failure
  • Minor version updates (1.x) never change IDs or remove classes
  • Major version updates (x.0) may restructure dimensions but will publish a full migration table

This means: you can safely encode class IDs in tooling, papers, and safety documentation today. They will resolve correctly in future versions.
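Because IDs are permanent, they are safe to validate mechanically in tooling. The pattern below is inferred from the IDs shown on this page (dimension prefix, hyphenated slug, three-digit suffix) and is not an official grammar:

```python
import re

# Inferred from IDs like EPIS-CITE-SPOOF-008 and DOMAIN-MALWARE-GEN-264;
# not an official grammar for the taxonomy.
CLASS_ID = re.compile(r"^[A-Z]+(?:-[A-Z0-9]+)+-\d{3}$")

def is_valid_class_id(s: str) -> bool:
    return bool(CLASS_ID.fullmatch(s))
```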


Known Gaps and Classification Limits

Failures the classifier handles well:

  • Described in terms of the failure mechanism (what structurally went wrong)
  • Failures with documented real-world incidents
  • Technical descriptions from safety papers

Failures that may require browsing TAXONOMY.md directly:

  • Novel failure patterns not yet in the taxonomy
  • Compound failures where the right class isn't obvious from a keyword search
  • Failures described in domain-specific jargon (legal, medical, security) without crossover vocabulary

Known classifier boundary cases:

  • Descriptions that are very short (< 10 words) may not provide enough signal
  • Failures described entirely in abstract terms without concrete mechanism may miss
  • The classifier was validated on English; non-English descriptions are untested

If the classifier returns NO on something you believe is a real failure, use semantic search (scripts/semantic_search.py) before concluding it's not in the table — the TF-IDF search is more robust to unusual phrasing.


How to Challenge or Extend

  1. Run the classifier or semantic search on the failure description
  2. If it returns NO — document the description, the closest classes returned, and why you believe it represents a new mechanism
  3. Open an issue with that documentation
  4. The community evaluates: is it a new class, a compound of existing classes, or a sub-mode?

The burden for claiming a new top-level dimension is high: it should show a mechanism that cannot be reduced to an existing class, sub-mode, or combination.


Contributing

This taxonomy lives or dies by community engagement. See CONTRIBUTING.md for the full process.

  • Found a failure outside the 343? Open a propose-new-class issue — it's valuable evidence either way
  • Disagree with a classification? Open a challenge-classification issue with your reasoning
  • Have a real incident to map? Open a report-real-incident issue — real cases are gold
  • Classifier missing a case? Open an improve-keywords issue

See ROADMAP.md for where this project is headed.


Relationship to other frameworks

MIT, Microsoft, and AVID sit at category or incident-corpus altitude. This repository names 343 mechanisms, pairs them with detection and mitigation, and—where we’ve run sources—shows live classifier bundles beside companion write-ups.

Full comparison table, in-repo narratives (Glasswing, Lynch, Mythos/Meta disclosures), and mit_domain / ms_agentic_category field mapping: docs/related-frameworks.md


About

Built by R. Gatoloai-Faupula — independent, no lab affiliation, no grant funding. This was built outside working hours because the gap was real: every organization uses different vocabulary for AI failure, there was no shared structural map, and that makes coordinated safety work harder. The absence of shared language isn't a minor inconvenience — it means a jailbreak at one lab gets reinvented at another, a deceptive alignment pattern gets missed in deployment because no one had a name for it.

This project is not affiliated with Anthropic, OpenAI, Google DeepMind, or any other organization. Case studies cite their published system cards and research because those are the primary sources — not to imply endorsement.

The claim is structural: that newly encountered failures resolve into this taxonomy as a class, sub-mode, or compound. That claim is falsifiable. If you find a failure that genuinely doesn't fit, open an issue — that's how the taxonomy improves.


Citation

Gatoloai-Faupula, R. (2026). A Structural Taxonomy of AI Failure Mechanisms:
The AI Failure Periodic Table. Independent Research.
Contact: ryangat@lmlsystemlayer.com

License

Apache 2.0 — open source, free to use, fork, test, and build on.

Project details


Download files

Download the file for your platform.

Source Distribution

ai_failure_periodic_table-1.5.3.tar.gz (414.7 kB)


Built Distribution


ai_failure_periodic_table-1.5.3-py3-none-any.whl (403.4 kB)


File details

Details for the file ai_failure_periodic_table-1.5.3.tar.gz.

File metadata

File hashes

Hashes for ai_failure_periodic_table-1.5.3.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | 468125e460fc9d48d3e82d06f9cee785c2846b3e3246c8cadae644c8107f3373 |
| MD5 | 1aeb96f2f945cdb6ade8268cacb0c8b4 |
| BLAKE2b-256 | 4b14448ba75f8d042d8a6efdcb8f1613c48b36cbfcbf26ad0a06057236546746 |


Provenance

The following attestation bundles were made for ai_failure_periodic_table-1.5.3.tar.gz:

Publisher: release.yml on lml-layer-system/ai-failure-periodic-table

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ai_failure_periodic_table-1.5.3-py3-none-any.whl.

File metadata

File hashes

Hashes for ai_failure_periodic_table-1.5.3-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7411b3beaf8f8736c0b6eb5f5549c865cb8a313fdfd91077d939a4b73da00433 |
| MD5 | b8ae6d623305274c7ad3efae25f377f6 |
| BLAKE2b-256 | 8c2e2af5f7c54c674a78fb5ac13c4c8f451ef56017b5c4e963f2248d6a7c8c37 |


Provenance

The following attestation bundles were made for ai_failure_periodic_table-1.5.3-py3-none-any.whl:

Publisher: release.yml on lml-layer-system/ai-failure-periodic-table

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
