madcop — the supply chain cop that goes mad for anomalies. Pluggable LangGraph framework: detect, diagnose, decide.

madcop

mad + cop — the supply chain cop that goes mad for anomalies. Pluggable LangGraph framework: from "detect" to "diagnose" to "decide", with self-evolution.

madcop is a pluggable supply chain anomaly framework. It turns raw telemetry (orders, shipments, warehouse readings, contracts) into decision prompts with full causal chains. Where most tools stop at "alert fired", madcop walks the chain back to the human decision that made the anomaly possible.

The name is short for mad cop — a cop that goes mad for anomalies. Not in a punitive sense, but in the sense of "won't let a single anomaly go untraced to its source".

What's new in v0.6.0

v0.5.0	v0.6.0
Fixed linear graph (ingest → detect → counterfactual → decision → summarise)	Plan-execute-replan loop with 4 modes (flash / standard / pro / ultra)
Single LLM call per session	Multi-model orchestration — auto router picks T1/T2/T3 per step, manual override in `~/.madcop/config.yaml`
Single LLM provider (OpenAI-compat)	5 default providers registered (nvidia_nim, nvidia_glm, zhipu, openai, deepseek) + custom add
2-layer memory (working + episodic)	4-layer memory — L1 working / L2 episodic / L3 semantic / L4 reflective + cross-layer retriever with time-decay
No self-growth	3-mechanism growth engine: episodic→semantic distillation, feedback reflection, meta-pattern mining
Ad-hoc eval	EvalRunner v2 — `EvalTrend` (cross-run regression), `RobustnessProbe` (4 input perturbations), `AdversarialChecker` (safety smoke tests)
214 tests	382 tests
No scratchpad / compactor	`Scratchpad` (cross-step state on disk) + `ContextCompactor` (sliding window + summarization)
No cost tracking	`CostTracker` (per-call + per-run cost) with token estimation for CJK + ASCII
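The CostTracker row mentions token estimation for mixed CJK and ASCII text. The sketch below shows one way such an estimate could work. It is not madcop's implementation: the function names `estimate_tokens` and `estimate_cost_usd` are hypothetical, and the ratios (about one token per CJK character, about one token per four other characters) are illustrative assumptions.

```python
import math

# Assumed ratios, for illustration only (not madcop's actual constants).
CJK_TOKENS_PER_CHAR = 1.0
OTHER_CHARS_PER_TOKEN = 4.0


def is_cjk(ch: str) -> bool:
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def estimate_tokens(text: str) -> int:
    """Rough token estimate for text that mixes CJK and ASCII."""
    cjk = sum(1 for ch in text if is_cjk(ch))
    other = len(text) - cjk
    return math.ceil(cjk * CJK_TOKENS_PER_CHAR + other / OTHER_CHARS_PER_TOKEN)


def estimate_cost_usd(text: str, usd_per_1k_tokens: float) -> float:
    """Per-call cost from the token estimate and a per-1k-token price."""
    return estimate_tokens(text) / 1000 * usd_per_1k_tokens
```

A per-run cost would then be the sum of the per-call estimates.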

Quick taste

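# PlanExecuteConfig is used below; it is assumed to be exported from madcop.agent
# alongside the other loop classes.
from madcop.agent import PlanExecuteLoop, PlanExecuteConfig, TrivialPlanner, FnStepExecutor, ExecutionMode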
from madcop.strategy import ModelRouter, ProviderRegistry
from madcop.memory import MemoryStore, EpisodicMemory, SemanticMemory, ReflectiveMemory, GrowthEngine

# 1. Plan-execute-replan loop
loop = PlanExecuteLoop(
    planner=TrivialPlanner(),
    executor=my_executor,           # you provide one
    config=PlanExecuteConfig(mode=ExecutionMode.PRO),
)
result = loop.run("diagnose OMS cancel spike")
print(result.final_output)         # multi-step report
print(f"cost: ${result.total_cost_usd:.4f}")

# 2. Multi-model router (auto + manual)
router = ModelRouter(ProviderRegistry.default(), mode="auto")
tier = router.classify(task_signals)  # T1 reasoning / T2 balanced / T3 fast

# 3. 4-layer memory + 3-mechanism growth
store = MemoryStore(path="~/.madcop/memory.db")
epi = EpisodicMemory(store)
sem = SemanticMemory(store)
ref = ReflectiveMemory(store)
engine = GrowthEngine(epi, sem, ref, my_llm)
facts = engine.distill_episode(my_episode)         # M1
refl  = engine.record_feedback(epi, rating=5)      # M2
metas = engine.mine_meta_patterns()                # M3
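
The v0.6.0 table describes a cross-layer retriever with time-decay. A minimal sketch of that idea, assuming an exponential half-life decay applied to a relevance score computed elsewhere. `MemoryItem`, `decayed_score`, `retrieve`, and the 30-day half-life are hypothetical and not taken from madcop's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    layer: str        # e.g. "L2" (episodic) or "L3" (semantic)
    text: str
    relevance: float  # query similarity in [0, 1], computed elsewhere
    age_days: float


def decayed_score(item: MemoryItem, half_life_days: float = 30.0) -> float:
    """Relevance halves for every `half_life_days` of age."""
    return item.relevance * 0.5 ** (item.age_days / half_life_days)


def retrieve(items, k=3, half_life_days=30.0):
    """Rank items from any layer by decayed score and keep the top k."""
    ranked = sorted(items, key=lambda it: decayed_score(it, half_life_days), reverse=True)
    return ranked[:k]
```

With this scoring, a fresh but moderately relevant memory can outrank an older, more relevant one, which is the behaviour a time-decay retriever is after.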

Design philosophy (v0.6.0)

Five principles shaped v0.6.0. They're not features — they're decisions about what kind of software madcop wants to be.

1. Personal-first, not team-first. madcop runs on a laptop. One process, one SQLite file, one operator. No gateway, no Redis, no Kubernetes. If you need a multi-tenant agent platform, you're looking for the wrong tool — and that's fine, those exist.

2. Local-first, no cloud lock-in. Memory is a SQLite file at ~/.madcop/memory.db. Trends are a JSONL file. Eval results are JSON. You can cat everything, grep everything, and back up everything with rsync. There is no Langfuse or LangSmith to log into.

3. Self-growth over time. As far as we know, madcop is the only mainstream AI agent framework that treats the memory layer as the primary deliverable rather than an afterthought. With the 3-mechanism growth engine, the longer you use madcop, the more it knows about your domain, your preferences, and your meta-strategies. This is the core of the release, not a checkbox feature.

4. Cost-aware routing as a first-class concern. Every step of every run can pick a different model. The auto router scores each step on 4 signals (structural / domain / context / user) and picks T1 (reasoning) / T2 (balanced) / T3 (fast). Manual override per provider in ~/.madcop/config.yaml. Built because shipping "always-call-gpt-4" demos is a lie.

5. The harness is small enough to read in one sitting. The whole plan-execute-replan loop is ~90 lines. The router is ~300 lines. The memory layer is 6 modules averaging 200 lines each. We picked this deliberately — every line of indirection is a line you can't debug.
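
Principle 4 describes an auto router that scores each step on four signals and picks a tier. The sketch below illustrates that shape under stated assumptions: each signal is a number in [0, 1], the score is a plain average, and the thresholds 0.66 and 0.33 are made up for the example. `TaskSignals` and `classify` here are hypothetical stand-ins, not madcop's `ModelRouter`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskSignals:
    structural: float  # each signal assumed to be normalised to [0, 1]
    domain: float
    context: float
    user: float


def classify(signals: TaskSignals, override: Optional[str] = None) -> str:
    """Return "T1" (reasoning), "T2" (balanced), or "T3" (fast)."""
    if override is not None:
        # A manual override (for example, one read from a config file) wins.
        return override
    score = (signals.structural + signals.domain + signals.context + signals.user) / 4
    if score >= 0.66:
        return "T1"
    if score >= 0.33:
        return "T2"
    return "T3"
```

Routing per step rather than per run is what lets cheap steps use a fast model while only the hard steps pay for a reasoning model.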

What's new in v0.7.0

v0.7.0 adds a sub-agent layer to the v0.6.0 plan-execute loop. The lead agent can now dispatch steps to specialised sub-agents that run in parallel, in isolated contexts, and cannot recursively spawn more sub-agents.

The pieces:

SubagentSpec: a frozen dataclass describing a sub-agent (name, description, system_prompt, tools, disallowed_tools, max_turns, timeout). Two ship with v0.7.0: general-purpose (multi-step reasoning, inherits parent tools) and bash (shell command execution, tools = ("bash",)).
SubagentResult + SubagentStatus: a race-safe state machine built around try_set_terminal(). The first writer of a terminal status wins; late writes are no-ops. The four terminal states are COMPLETED, FAILED, CANCELLED, TIMED_OUT.
SubagentExecutor: runs sub-agents on a ThreadPoolExecutor capped at 3 workers (the cap is clamped to [1, 4]). Each sub-agent gets a deep copy of the parent's context, so nothing leaks back. Cancellation is cooperative: set holder.cancel_event and the runner checks it between LLM calls.
PlanStep.subagent: set this on any plan step to dispatch the step to a sub-agent instead of running it inline. The lead agent's plan-execute loop routes sub-agent steps through the executor; inline steps go through the v0.6.0 path.
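
The "first writer wins" rule for terminal states can be sketched with a lock-guarded setter. This is an illustration of the idea, not madcop's `SubagentResult`: the class names are hypothetical, and the non-terminal states PENDING and RUNNING are assumptions (the text above only names the four terminal states).

```python
import threading
from enum import Enum


class SketchStatus(Enum):
    PENDING = "pending"      # assumed non-terminal state
    RUNNING = "running"      # assumed non-terminal state
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"
    TIMED_OUT = "timed_out"


TERMINAL = {
    SketchStatus.COMPLETED,
    SketchStatus.FAILED,
    SketchStatus.CANCELLED,
    SketchStatus.TIMED_OUT,
}


class SketchResult:
    def __init__(self):
        self.status = SketchStatus.PENDING
        self.output = None
        self._lock = threading.Lock()

    def try_set_terminal(self, status, output=None) -> bool:
        """Set a terminal status once; return False if one was already set."""
        if status not in TERMINAL:
            raise ValueError(f"{status} is not a terminal status")
        with self._lock:
            if self.status in TERMINAL:
                return False  # late write: leave the first result untouched
            self.status = status
            self.output = output
            return True
```

If a timeout fires just before a worker finishes, the worker's late COMPLETED write returns False and the TIMED_OUT result stands, so callers never observe a status flipping between two terminal states.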

Three things we deliberately did not do:

Sub-agents cannot spawn sub-agents. The task tool is hard-coded as disallowed; this prevents recursive explosions.
We did not implement custom sub-agents from user config. That's v0.7.1.
We did not build an async executor. The thread pool is enough for personal use; if you need asyncio, open an issue.

from madcop.agent import PlanExecuteLoop, PlanStep, Plan, ExecutionMode
from madcop.agent.subagent import SubagentExecutor, FnRunner, ExecutorConfig

# 1. Build a runner — real impl wraps your LLM client; here we use a
#    function for tests + simple integrations.
runner = FnRunner(lambda **kw: f"[{kw['agent'].name}] {kw['prompt']}")

# 2. Build the executor. max_concurrent caps the thread pool.
executor = SubagentExecutor(runner=runner, config=ExecutorConfig(max_concurrent=3))

# 3. Write a plan that mixes inline + sub-agent steps.
plan = Plan(steps=[
    PlanStep(name="ingest", action="gather data"),               # inline
    PlanStep(name="analyse", action="classify findings",         # sub-agent
             subagent="general-purpose"),
    PlanStep(name="report",  action="build CSV",                 # sub-agent
             subagent="bash"),
])

# 4. Run the loop. Sub-agent steps are dispatched in parallel where
#    the plan allows; inline steps run sequentially.
loop = PlanExecuteLoop(my_planner, my_inline_executor)
result = loop.run("diagnose something")

Run the demo: python examples/v070_subagent_demo.py. It dispatches two sub-agents in parallel and merges the results into one report.
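
For readers who want to see the executor behaviour without installing anything, here is a standard-library sketch of the three properties described earlier: a worker cap clamped to [1, 4], a deep-copied context per sub-agent, and cooperative cancellation through a shared event. The function names are hypothetical, and the loop inside `run_subagent` merely stands in for successive LLM calls.

```python
import copy
import threading
from concurrent.futures import ThreadPoolExecutor


def clamp_workers(requested: int) -> int:
    """Clamp the requested pool size to the range [1, 4]."""
    return max(1, min(4, requested))


def run_subagent(name, prompt, context, cancel_event):
    # `context` is a private deep copy, so mutating it cannot leak to the parent.
    context["notes"].append(f"{name} started")
    for _ in range(3):  # stands in for successive LLM calls
        if cancel_event.is_set():  # cooperative check between calls
            return f"[{name}] cancelled"
    return f"[{name}] {prompt}"


def dispatch(tasks, parent_context, max_concurrent=3, cancel_event=None):
    """Run (name, prompt) tasks in parallel and return outputs in task order."""
    cancel_event = cancel_event or threading.Event()
    with ThreadPoolExecutor(max_workers=clamp_workers(max_concurrent)) as pool:
        futures = [
            pool.submit(run_subagent, name, prompt,
                        copy.deepcopy(parent_context), cancel_event)
            for name, prompt in tasks
        ]
        return [f.result() for f in futures]


parent = {"notes": []}
outputs = dispatch(
    [("general-purpose", "classify findings"), ("bash", "build CSV")], parent
)
```

After the call, `parent["notes"]` is still empty because each sub-agent only touched its own copy.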

What madcop actually does

Five CLI demos, all runnable after pip install madcop. The output below is real — these are the screenshots rendered straight from the CLI.

1. Counterfactual cost simulation — "if we'd acted 1h earlier, we save ¥72"

[Screenshot: counterfactual demo]

A TMS shipment was 50% late (planned 4h, actual 6h). madcop simulates 5 interventions and recommends expedite_1h — net saving ¥72 vs. doing nothing. The other options (2h expedite, reroute, 4h expedite) all cost more than they save, even though they "feel safer".
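
The comparison behind such a recommendation can be sketched as "delay cost avoided minus intervention cost", with the hours recovered capped at the actual delay. Everything numeric below is invented for illustration (the per-hour delay cost and each intervention's price are not madcop's figures), and `Intervention`, `net_saving`, and `recommend` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intervention:
    name: str
    cost: float             # price of the intervention, in yuan (illustrative)
    hours_recovered: float  # delay it could remove


def net_saving(iv: Intervention, delay_hours: float, delay_cost_per_hour: float) -> float:
    """Delay cost avoided minus intervention cost; recovery is capped at the delay."""
    recovered = min(iv.hours_recovered, delay_hours)
    return recovered * delay_cost_per_hour - iv.cost


def recommend(interventions, delay_hours, delay_cost_per_hour):
    """Pick the intervention with the highest net saving."""
    return max(interventions, key=lambda iv: net_saving(iv, delay_hours, delay_cost_per_hour))


# Planned 4h, actual 6h: a 2h delay. All costs below are made-up example values.
options = [
    Intervention("do_nothing", 0, 0),
    Intervention("expedite_1h", 40, 1),
    Intervention("expedite_2h", 250, 2),
    Intervention("reroute", 300, 2),
    Intervention("expedite_4h", 600, 4),
]
best = recommend(options, delay_hours=2, delay_cost_per_hour=100)
```

With these example values the cheap one-hour expedite is the only option with a positive net saving; the larger interventions recover more time but cost more than the delay they remove.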

2. Anomaly replay — "if every recommendation had been adopted, savings = 96.2%"

[Screenshot: replay demo]

Re-running madcop over a historical event log quantifies the total ROI: actual loss ¥6,400 → simulated loss ¥240 → ¥6,160 saved (96.2%). This is the single number every supply chain manager wants but most monitoring tools can't produce.
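
The headline figure is simple arithmetic over the replayed events. A small sketch, assuming each event reduces to an (actual loss, simulated loss) pair; `replay_roi` is a hypothetical helper, and the single pair below is just the totals quoted above.

```python
def replay_roi(events):
    """events: iterable of (actual_loss, simulated_loss) pairs."""
    events = list(events)
    actual = sum(a for a, _ in events)
    simulated = sum(s for _, s in events)
    saved = actual - simulated
    pct = 100.0 * saved / actual if actual else 0.0
    return actual, simulated, saved, pct


# Totals from the replay demo: actual loss 6,400 yuan, simulated loss 240 yuan.
actual, simulated, saved, pct = replay_roi([(6400, 240)])
```

Here `saved` is 6160 and `pct` is 96.25, which the demo reports as 96.2%.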

3. Decision diff — "operator ignored this recommendation 11 of 14 times"

[Screenshot: decisions demo]

When madcop keeps recommending the same action but humans keep rejecting or ignoring it, that's a fatigue signal. The same (rule, subject) signature appearing ≥2 times with ≥50% ignore rate gets flagged. Above: tms.leadtime.overrun|SHIP-7 recommended 14× but accepted only 3× (79% ignored).
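
The flagging rule above (same signature at least 2 times, ignore rate at least 50%) can be sketched as a small aggregation. The record fields `rule`, `subject`, and `accepted` are assumptions about the decision log, and `fatigue_signals` is a hypothetical name.

```python
from collections import defaultdict


def fatigue_signals(decisions, min_count=2, min_ignore_rate=0.5):
    """Return {signature: (recommended, ignored, ignore_rate)} for flagged signatures."""
    stats = defaultdict(lambda: [0, 0])  # signature -> [recommended, ignored]
    for d in decisions:
        sig = f"{d['rule']}|{d['subject']}"
        stats[sig][0] += 1
        if not d["accepted"]:
            stats[sig][1] += 1
    flagged = {}
    for sig, (n, ignored) in stats.items():
        rate = ignored / n
        if n >= min_count and rate >= min_ignore_rate:
            flagged[sig] = (n, ignored, rate)
    return flagged


# Recreate the example: recommended 14 times, accepted 3 times.
log = [
    {"rule": "tms.leadtime.overrun", "subject": "SHIP-7", "accepted": i < 3}
    for i in range(14)
]
log.append({"rule": "oms.cancel.spike", "subject": "SKU-1", "accepted": True})
flagged = fatigue_signals(log)
```

Only `tms.leadtime.overrun|SHIP-7` is flagged: 11 of its 14 recommendations were ignored, a rate of about 79%. The second signature is hypothetical and is skipped because it appears once.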

4. Root-cause analysis — anomaly → 5-step chain → contract decision

[Screenshot: rca demo]

Every anomaly can be traced back through a typed property graph: from the temperature reading on a specific shipment, through the supplier, to the contract clause (PASSIVE — "notify within 30 min or pay 0.5%"), to the BD decision that accepted the concession three months earlier.
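
A trace like this amounts to following "caused by" edges until no further cause is recorded. The sketch below uses a plain dictionary instead of madcop's `KnowledgeGraph`; every node id and type name is hypothetical, chosen only to mirror the chain described above (5 nodes, 4 edges).

```python
# node id -> node type (all ids and type names are illustrative)
NODES = {
    "reading:temp-001": "TemperatureReading",
    "shipment:SHIP-1": "Shipment",
    "supplier:S-1": "Supplier",
    "clause:notify-30min": "ContractClause",
    "decision:bd-concession": "Decision",
}

# node id -> the node one step closer to the root cause
CAUSED_BY = {
    "reading:temp-001": "shipment:SHIP-1",
    "shipment:SHIP-1": "supplier:S-1",
    "supplier:S-1": "clause:notify-30min",
    "clause:notify-30min": "decision:bd-concession",
}


def trace(start, edges=CAUSED_BY):
    """Walk from an anomaly to its root cause; an unknown subject yields a 1-node chain."""
    chain, seen = [start], {start}
    while chain[-1] in edges:
        nxt = edges[chain[-1]]
        if nxt in seen:
            raise ValueError(f"cycle detected at {nxt}")
        chain.append(nxt)
        seen.add(nxt)
    return chain


def explain(chain, nodes=NODES):
    """Render a chain as a readable string with node types."""
    return " -> ".join(f"{nodes.get(n, 'Unknown')}({n})" for n in chain)


chain = trace("reading:temp-001")
```

The resulting chain has five steps and ends at the decision node, which is the part an alert-only tool never reaches.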

5. LangGraph orchestration — detect → replan → counterfactual → diff → summary

[Screenshot: agent demo]

The full pipeline as a typed state machine. Each node is a pure function (no LLM call) so the whole thing runs air-gapped.

Architecture (6 layers, 1 graph)

┌──────────────────────────────────────────────────────────────┐
│  L6  Replay + Decision Diff   — "if we'd listened" ROI,      │
│                                operator-fatigue detection     │
├──────────────────────────────────────────────────────────────┤
│  L5  LangGraph                — 6-node state machine          │
│                                (ingest → detect → replan →    │
│                                 cf → diff → summarise)        │
├──────────────────────────────────────────────────────────────┤
│  L4  Counterfactual + Planning — cost simulation, safety     │
│                                 stock, EOQ, ABC classification│
├──────────────────────────────────────────────────────────────┤
│  L3  CUSUM anomaly engine     — Page 1954 SPC + category     │
│                                 baselines (pharma 0.02,       │
│                                 apparel 0.30, ...)            │
├──────────────────────────────────────────────────────────────┤
│  L2  RCA + Multi-zone WMS     — frozen / refrigerated /      │
│                                 controlled / ambient bands    │
├──────────────────────────────────────────────────────────────┤
│  L1  Unified Data Layer       — OMS/TMS/WMS/BMS adapters,    │
│                                 UTC-validated, severity-rated │
└──────────────────────────────────────────────────────────────┘
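
The L3 layer names Page's CUSUM with category baselines. A minimal one-sided (upper) CUSUM sketch follows. The baseline 0.02 is the pharma value from the diagram; the slack `k`, the threshold `h`, the reset-after-alarm behaviour, and the sample series are assumptions made for the example, not madcop's settings.

```python
def cusum_alarms(xs, baseline, k, h):
    """Upper CUSUM: S_t = max(0, S_{t-1} + x_t - baseline - k); alarm when S_t > h.

    Returns the indices at which an alarm fires. The statistic is reset
    to zero after each alarm (an assumption of this sketch).
    """
    s, alarms = 0.0, []
    for i, x in enumerate(xs):
        s = max(0.0, s + (x - baseline - k))
        if s > h:
            alarms.append(i)
            s = 0.0
    return alarms


# A rate that sits near the 0.02 baseline, then shifts persistently to 0.05.
series = [0.02, 0.021, 0.019, 0.02, 0.05, 0.05, 0.05, 0.05]
alarms = cusum_alarms(series, baseline=0.02, k=0.01, h=0.05)
```

The statistic stays at zero while the series hovers near the baseline, then accumulates about 0.02 per observation after the shift and crosses `h` on the third shifted point (index 6). That is the persistent-shift behaviour a fixed per-point threshold can miss when the threshold is set above 0.05.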

What's shipped in v0.4.0

Layer	Component	Status
L1	`UnifiedEvent` with UTC + severity + source/event_type validation	✅
L1	`BaseAdapter` contract + WMS mock (cold-chain, 4 zones)	✅
L2	`Detector` + 5 rules (cold-chain temp/duration, OMS CUSUM, TMS lead, BMS score)	✅
L2	`KnowledgeGraph` + `trace()` + `explain()` RCA	✅
L2	Cold-chain seed graph (5 nodes, 4 edges)	✅
L3	CUSUM change-point detector with category-specific baselines	✅
L4	Counterfactual cost engine (5 canned interventions, pure functions)	✅
L4	Planning primitives: safety_stock / reorder_point / EOQ / ABC	✅
L5	LangGraph StateGraph orchestrator (6 nodes, no LLM dependency)	✅
L6	Replay engine (historical ROI quantification)	✅
L6	Decision diff (operator-fatigue detection)	✅
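
The L4 planning primitives correspond to standard inventory formulas. The sketch below uses the common textbook forms; madcop's own function signatures and parameter names may differ, and the example inputs are arbitrary.

```python
import math


def safety_stock(z: float, sigma_daily: float, lead_time_days: float) -> float:
    """z * sigma_d * sqrt(L): buffer against demand variability over the lead time."""
    return z * sigma_daily * math.sqrt(lead_time_days)


def reorder_point(daily_demand: float, lead_time_days: float, ss: float) -> float:
    """Expected demand during the lead time plus safety stock."""
    return daily_demand * lead_time_days + ss


def eoq(annual_demand: float, order_cost: float, holding_cost_per_unit: float) -> float:
    """Economic order quantity: sqrt(2 * D * S / H)."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost_per_unit)


ss = safety_stock(z=1.65, sigma_daily=10, lead_time_days=4)      # about 33 units
rop = reorder_point(daily_demand=20, lead_time_days=4, ss=ss)    # about 113 units
q = eoq(annual_demand=1200, order_cost=50, holding_cost_per_unit=3)  # 200 units
```

All three are pure functions of their inputs, which matches the "pure functions" framing of the L4 layer.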

Installation

pip install madcop

That's it. Requires Python 3.10+. langgraph is installed automatically as a hard dependency (it is used by the L5 orchestrator).

Quick start

# L1+L2: see the raw event stream and detect anomalies
python -m madcop run coldchain
python -m madcop run anomalies

# L2: trace each anomaly to a root-cause decision
python -m madcop run rca

# L3+L4: cost-simulate interventions on a TMS anomaly
python -m madcop run counterfactual

# L5: run the full LangGraph agent end-to-end
python -m madcop run agent

# L6: replay historical events and quantify the ROI of adopting every recommendation
python -m madcop replay examples/replay_sample.json

# L6: detect "operator fatigue" from a decision log
python -m madcop decisions examples/decisions_sample.jsonl

Tests

pip install -e ".[dev]"
pytest

149 tests, all passing (Python 3.10–3.12, macOS / Linux). CI runs on every push via GitHub Actions. Coverage:

L1 contract (UTC validation, event type / source system consistency)
L2 detector (every rule, windowed-rule state machine, multi-zone bands)
L2 RCA graph (forward/reverse traversal, empty chain, unknown subject)
L3 CUSUM (Siegmund ARL₀→h, category baselines, persistent-shift detection)
L4 counterfactual (TMS vs OMS branches, intervention capping, recommend logic)
L4 planning (safety stock / ROP / EOQ formulas, ABC Pareto cutoffs)
L5 LangGraph graph (node wiring, end-to-end with empty / WMS / synthetic events)
L6 replay (ROI totals, top savings, JSON event loader with case normalisation)
L6 decision diff (signature aggregation, ignore-rate filtering, JSONL I/O)
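
As an illustration of the L1 contract checks listed above, here is a sketch of an event type that rejects non-UTC timestamps and mismatched source/event-type pairs. It is not madcop's `UnifiedEvent`: the class name, the severity values, and the "event type starts with the lower-cased source" convention are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

SOURCES = {"OMS", "TMS", "WMS", "BMS"}
SEVERITIES = {"info", "warning", "critical"}  # assumed severity scale


@dataclass(frozen=True)
class SketchEvent:
    source: str
    event_type: str  # assumed form: "<source in lower case>.<name>"
    timestamp: datetime
    severity: str

    def __post_init__(self):
        if self.source not in SOURCES:
            raise ValueError(f"unknown source system: {self.source}")
        if not self.event_type.startswith(self.source.lower() + "."):
            raise ValueError("event_type does not match source system")
        # Naive datetimes return None here, so they are rejected as well.
        if self.timestamp.utcoffset() != timedelta(0):
            raise ValueError("timestamp must be timezone-aware UTC")
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")


event = SketchEvent(
    source="TMS",
    event_type="tms.leadtime.overrun",
    timestamp=datetime(2026, 6, 25, 12, 0, tzinfo=timezone.utc),
    severity="warning",
)
```

Validating at construction time means every downstream layer can assume UTC timestamps and consistent source/event-type pairs without re-checking them.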

Why "madcop"?

The name came out of a request to name "the agent that goes mad for anomalies", and the obvious answer was mad + cop. The product is a cop that goes mad for anomalies, not in a punitive sense but in the sense of "won't let a single anomaly go untraced to its source."

License

MIT. See LICENSE.

Contact

Lin Ruihan · chuiniu@me.com

Release history

Version	Released	Notes
1.3.0rc3	Jun 26, 2026	pre-release
1.3.0rc2	Jun 26, 2026	pre-release
1.3.0rc1	Jun 26, 2026	pre-release
1.2.0rc2	Jun 26, 2026	pre-release
1.2.0rc1	Jun 26, 2026	pre-release
1.1.0rc1	Jun 26, 2026	pre-release
1.0.0rc1	Jun 26, 2026	pre-release
0.9.0	Jun 26, 2026	
0.8.0	Jun 26, 2026	
0.7.2	Jun 25, 2026	
0.7.1	Jun 25, 2026	
0.7.0	Jun 25, 2026	this version
0.6.0	Jun 25, 2026	
0.5.0	Jun 25, 2026	
0.4.0	Jun 24, 2026	
0.3.0	Jun 24, 2026	
0.2.0	Jun 24, 2026	
0.1.0	Jun 23, 2026	

Download files

Source Distribution

madcop-0.7.0.tar.gz (122.6 kB view details)

Uploaded Jun 25, 2026 Source

Built Distribution

madcop-0.7.0-py3-none-any.whl (116.8 kB view details)

Uploaded Jun 25, 2026 Python 3

File details

Details for the file madcop-0.7.0.tar.gz.

File metadata

Download URL: madcop-0.7.0.tar.gz
Upload date: Jun 25, 2026
Size: 122.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.3

File hashes

Hashes for madcop-0.7.0.tar.gz
Algorithm	Hash digest
SHA256	`9b4e7265043502fa3b755467ec61b94cd7b9c388415bebf58f8dfde8aaf3c631`
MD5	`e7523542541351ebb555797779b38f33`
BLAKE2b-256	`bf31f247a0d7482275db11c2ef04856aeae7e97c36f13e29318ab639b9597af1`

File details

Details for the file madcop-0.7.0-py3-none-any.whl.

File metadata

Download URL: madcop-0.7.0-py3-none-any.whl
Upload date: Jun 25, 2026
Size: 116.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.3

File hashes

Hashes for madcop-0.7.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d4a1ba6c87137332112b5eff30502a2400a829dadba3482fdf2f5b64326fb3d8`
MD5	`efb1fa5067f4da833ecd17d35ad91be2`
BLAKE2b-256	`287f71315e2c238df77887d381d3fbb6242f493e88579ee0b8afb38adcacc0c4`
