
Bolt-on correction primitive for AI coding agents.


Functional Scars

Stop explaining the same fix every session. Make corrections persist.


Status: alpha (v0.8.0). Install with pip install fscars. The core engine, the Claude Code and Codex (native hooks) adapters, 5 starter scars, and the validation layers (fscars.validation, a three-tier loop for turning observations into auditable outcomes) are working. Codex PreToolUse blocking is deterministic on the surfaces Codex supports (Bash / apply_patch / MCP), but it remains a guardrail, not a complete boundary. Read CHANGELOG.md for the current state.


Why this exists

A junior engineer reads the textbooks and learns the fundamentals — that is the floor. What turns the junior into a senior is the weight that mistakes leave behind: the migration that ran half-applied in production, the timezone bug that shipped to a customer, the build that broke at 2am. Those scars become heavier than any chapter of the book; they bend future decisions in a way pure knowledge cannot.

AI coding agents come into your project with a strong prior — billions of tokens of training, especially on code. But the way your assistant behaves on your codebase is not just that prior; it is shaped by every correction you make along the way. The catch is that those corrections rarely survive: the next session starts from training again, and the model regresses to its statistical default in any area where the correction carries less weight than the prior. A functional scar is the anchor that gives your correction enough weight to bend the next decision.


What is a Functional Scar?

A scar is what an operator's correction becomes when you make it deterministic. Not text presented to the model — code that runs outside the model, intercepts the moment of risk, and pushes back.

| | System prompt | Memory / KB | Hook | Functional Scar |
| --- | --- | --- | --- | --- |
| Where does the rule live? | In context | In context | In code outside the model | In code outside the model |
| Does the model decide whether it applies? | Yes | Yes | No | No |
| Does it survive /compact? | Partial | Yes | Yes | Yes |
| Does it learn from its own fires? | No | No | Manual | Yes |
| Built directly from a real correction? | No | No | Manual | Yes, by design |

Functional Scars complement memory and skills; they do not compete with them. The companion paper, Lucy Syndrome in LLM Agents, explains the underlying framework: five invariants that distinguish corrections that persist from those that decay.

This repository is the first installable implementation of those invariants.


Quick start

pip install fscars            # latest release from PyPI
cd your-project
fscar init                    # creates .fscars/, scaffolds 5 starter scars, wires Claude Code
fscar init --adapter codex    # same, but registers native Codex hooks (.codex/hooks.json)
fscar list                    # the 5 starters, now live under .fscars/scars/

fscar init copies the starter scars into .fscars/scars/ in your project. They are yours to edit or delete. The hook entrypoint loads that directory at runtime, so a freshly initialized project already fires on the first oversized write (the large_write_review starter).

Three quick wins to try right away:

# 1) Web dev — kill timezone regressions in handler code
fscar list | grep utc-timestamps

# 2) Data science — require explicit UTF-8 in pandas.read_csv
fscar list | grep csv-encoding

# 3) Marketing copy — block "we don't do X" framing
fscar list | grep avoid-negative-framing

Once installed, every Claude Code tool call passes through the engine. When a scar matches, the engine emits an additionalContext reminder (or blocks the call when the scar's severity is block) and writes one JSON line to .fscars/logs/fires.jsonl.
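Because every fire is one JSON line, the log is easy to inspect with a few lines of Python. The sketch below counts fires per scar. The field name scar_id is an assumption made for illustration, so check an actual line in .fscars/logs/fires.jsonl for the real keys before relying on it; fscar stats remains the supported way to get these numbers.

```python
import json
from collections import Counter
from pathlib import Path


def count_fires(log_path: Path, key: str = "scar_id") -> Counter:
    """Count fires per scar in a JSON Lines log.

    `key` is a hypothetical field name; adjust it to the real log schema.
    """
    counts: Counter = Counter()
    if not log_path.exists():
        return counts
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip a partially written line
            if isinstance(record, dict):
                counts[str(record.get(key, "<unknown>"))] += 1
    return counts


if __name__ == "__main__":
    # Prints nothing if the log does not exist yet.
    for scar, n in count_fires(Path(".fscars/logs/fires.jsonl")).most_common():
        print(f"{n:6d}  {scar}")
```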


Commands

| Command | Description |
| --- | --- |
| fscar init | Initialize .fscars/ and register the hook entrypoint |
| fscar list | Show registered scars + fire counts |
| fscar log [-n N] | Show the most recent fires (filter by --scar, --session) |
| fscar stats | Compute fire counts, latency p50/p99, tokens added |
| fscar disable <scar_id> | Disable without deleting (use --enable to restore) |
| fscar doctor | Diagnose installation and hook wiring |
| fscar validate | Run Capa 4 deterministic rules over observed opportunities |
| fscar dashboard | Render markdown + HTML metrics from fires + opportunities |
| fscar audit | Validate + cross-link fires↔opportunities + render dashboard |
| fscar --version | Print the installed version |

The hook entrypoint is python -m fscars.run_hook: a single command across every event type, with no per-scar hook scripts.

For Codex, fscar init --adapter codex registers that entrypoint as a native command hook in .codex/hooks.json for every parity event and keeps an AGENTS.md block as an operational fallback. Run /hooks in the Codex CLI once to trust the hooks.

fscars registers every documented Codex hook. Three are blocking surfaces:

  • PreToolUse: deny a Bash / apply_patch / MCP call before it runs.
  • PermissionRequest (event_type = PermissionRequest): deny the approval of a request.
  • SubagentStop (event_type = SubagentStop): keep a subagent from stopping until a condition is met.

The rest inject context: SubagentStart, PreCompact, and PostCompact (each matched to its exact output schema). The tool-use surfaces remain guardrails, not a complete boundary: WebSearch and other non-shell/non-MCP tools are not intercepted.


How it works

┌─────────────────────────────────────────────────────────────┐
│                          fscars.core                        │
│   payload · scar · engine · log · store · fire (Pydantic)   │
└──────────────────────┬──────────────────────────────────────┘
                       │
        ┌──────────────┴──────────────┐
        │  fscars.adapters/           │
        │   claude_code (v0.1)        │
        │   codex (native hooks)      │
        │   cursor (community)        │
        └──────────────┬──────────────┘
                       │
                       ▼
       .claude/settings.json wired with one entrypoint:
              python -m fscars.run_hook

       Codex projects register the same entrypoint natively in
       .codex/hooks.json (+ AGENTS.md fallback / audit contract)

The engine reads stdin, parses through the right adapter, dispatches to every matching scar, and emits the combined additionalContext plus exit code. A failure inside any scar is swallowed — the host harness must never crash because of fscars.
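The dispatch step can be pictured with a small sketch. This is not the fscars engine: the Scar dataclass, the dispatch function, the severity strings, and the choice of exit code 2 as the blocking signal are all assumptions made for illustration. What it does show is the property described above: a scar that raises is skipped, and the remaining scars still contribute their messages.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Scar:
    """Hypothetical scar shape: an id, a severity, and a deterministic check."""
    scar_id: str
    severity: str  # "remind" or "block" (illustrative values)
    check: Callable[[dict], Optional[str]]  # returns a message when it fires


def dispatch(payload: dict, scars: list[Scar]) -> tuple[str, int]:
    """Run every scar, combine the messages, and derive an exit code."""
    messages: list[str] = []
    exit_code = 0
    for scar in scars:
        try:
            message = scar.check(payload)
        except Exception:
            # A broken scar must never take the host harness down.
            continue
        if message is None:
            continue
        messages.append(f"[{scar.scar_id}] {message}")
        if scar.severity == "block":
            exit_code = 2  # assumed blocking signal for this sketch
    return "\n".join(messages), exit_code


def large_write(payload: dict) -> Optional[str]:
    """Fire when the written content is longer than 200 lines."""
    content = payload.get("tool_input", {}).get("content", "")
    if content.count("\n") + 1 > 200:
        return "Write exceeds 200 lines; self-review before continuing."
    return None


def broken(payload: dict) -> Optional[str]:
    """Simulates a scar with a bug in it."""
    raise RuntimeError("bug inside a scar")


scars = [
    Scar("large-write-review", "remind", large_write),
    Scar("always-broken", "block", broken),
]
context, code = dispatch({"tool_input": {"content": "x\n" * 300}}, scars)
print(code)     # 0: the reminder fired, the broken scar was swallowed
print(context)
```

Swallowing every exception is a deliberate trade-off: a buggy scar silently stops protecting you, which is one reason a diagnostic command such as fscar doctor is useful alongside it.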


Cookbook

fscar init scaffolds the first five of these into .fscars/scars/; the table below is the full catalog shipped in the cookbook package. Use fscar init --all to scaffold every scar (including import_aware_imports.py), or fscar init --no-scars to wire the hook without copying any:

| File | What it does |
| --- | --- |
| large_write_review.py | Reminds the operator to self-review writes over 200 lines |
| utc_timestamps.py | Pushes back on time.Now() / new Date() in handler files |
| csv_encoding.py | Requires explicit encoding="utf-8" in pandas.read_csv |
| avoid_negative_framing.py | Blocks "we don't do X" patterns in marketing copy |
| subagent_coverage_report.py | Reminds the operator to ask subagents for a coverage report |
| import_aware_imports.py | AST-based detection of writes that import a watched package; see cookbook_import_aware.md |
| _template.py | Copy-paste starting point for new scars |

See cookbook/scars/README.md for the contract and the 5-invariant checklist.
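To give a feel for what a deterministic check looks like, here is a minimal sketch of the idea behind csv_encoding.py, written independently of the shipped file. It uses the standard ast module to find read_csv calls that pass no encoding keyword at all (it does not verify that the value is "utf-8"). The function name and return shape are assumptions; the real scar contract is the one documented in cookbook/scars/README.md.

```python
import ast


def read_csv_calls_missing_encoding(source: str) -> list[int]:
    """Return line numbers of read_csv(...) calls with no `encoding=` keyword."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return []  # not parseable Python: stay silent rather than guess
    lines: list[int] = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handles both `pd.read_csv(...)` and a bare `read_csv(...)`.
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name != "read_csv":
            continue
        keywords = {kw.arg for kw in node.keywords}
        # kw.arg is None for **kwargs; treat that as "possibly set".
        if "encoding" not in keywords and None not in keywords:
            lines.append(node.lineno)
    return sorted(lines)


sample = '''import pandas as pd
a = pd.read_csv("a.csv")
b = pd.read_csv("b.csv", encoding="utf-8")
c = pd.read_csv("c.csv", **opts)
'''
print(read_csv_calls_missing_encoding(sample))  # [2]
```

The check is binary and needs no model judgment, which is what makes a correction like this a reasonable candidate for a scar.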


Validation layers

Once you have observation in place, the next problem is precision: out of every hundred fires, how many actually prevented an error? fscars.validation is a three-tier loop developed in production during May 2026 that downgrades the labelling problem from "operator stares at thousands of rows" to "operator confirms an edge slice automation cannot resolve":

  1. Capa 4 ("capa" is Spanish for "layer"): deterministic rules per scar. Free, predictable, resolves most clearly-true and clearly-false opportunities.
  2. Capa 3: LLM classifier (subprocess to the local claude CLI) for what Capa 4 leaves ambiguous. Configurable threshold, parallel workers.
  3. Capa 5: cross-link the observed opportunities to actual hook fires so coverage stops being a proxy.
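The control flow of the three tiers can be sketched as follows. Every name here (validate, the opportunity dictionaries, the threshold semantics) is hypothetical and is not the fscars.validation API, and the classifier is a plain callable standing in for the subprocess call to the claude CLI. The point is the ordering: deterministic rules first, a classifier only for what they leave undecided, and a cross-link to real fires at the end.

```python
from typing import Callable, Optional

Opportunity = dict  # illustrative shape: {"id": ..., "text": ...}


def validate(
    opportunities: list[Opportunity],
    rule: Callable[[Opportunity], Optional[bool]],
    classifier: Callable[[Opportunity], float],
    fired_ids: set[str],
    threshold: float = 0.8,
) -> list[dict]:
    """Label each opportunity, then cross-link it to actual fires."""
    results = []
    for opp in opportunities:
        verdict = rule(opp)                  # Capa 4: deterministic rule
        source = "rule"
        if verdict is None:                  # Capa 3: only the ambiguous rest
            score = classifier(opp)
            source = "classifier"
            if score >= threshold:
                verdict = True
            elif score <= 1 - threshold:
                verdict = False
            else:
                source = "operator"          # left for a human to confirm
        results.append({
            "id": opp["id"],
            "verdict": verdict,
            "source": source,
            "fired": opp["id"] in fired_ids,  # Capa 5: cross-link to fires
        })
    return results


def rule(opp: Opportunity) -> Optional[bool]:
    """Toy rule: decide the obvious cases, return None when unsure."""
    text = opp["text"]
    if "encoding=" in text:
        return False
    if "read_csv(" in text:
        return True
    return None


rows = validate(
    [{"id": "o1", "text": 'pd.read_csv("a.csv")'},
     {"id": "o2", "text": "load(path)"}],
    rule=rule,
    classifier=lambda opp: 0.5,   # stub: always undecided
    fired_ids={"o1"},
)
for row in rows:
    print(row)
```

In this run the first opportunity is resolved by the rule and is linked to a fire, while the second falls through both automated tiers and lands in the operator's slice.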

A fscars.dashboard module renders the resulting metrics as markdown + self-contained HTML; fscars.io.safe_jsonl guards concurrent pipeline writes with file-locked atomic merges.
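For readers unfamiliar with the pattern, a locked atomic merge of a JSONL file can be sketched as below. This is an illustration of the general technique, not the fscars.io.safe_jsonl implementation: the function names, the sidecar lock file, and the merge key are all assumptions. The lock here is an exclusively created file (portable across platforms), and the write goes to a temporary file that replaces the original in one step.

```python
import json
import os
import tempfile
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def lock(path: Path, timeout: float = 5.0):
    """Cross-process lock via an exclusively created sidecar file."""
    lock_path = path.with_name(path.name + ".lock")
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not lock {path}")
            time.sleep(0.05)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)


def merge_jsonl(path: Path, new_rows: list[dict], key: str = "id") -> None:
    """Merge rows into a JSONL file by `key`, replacing the file atomically."""
    with lock(path):
        merged: dict = {}
        if path.exists():
            for line in path.read_text(encoding="utf-8").splitlines():
                if line.strip():
                    row = json.loads(line)
                    merged[row[key]] = row
        for row in new_rows:
            merged[row[key]] = row           # newer rows win
        fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            for row in merged.values():
                fh.write(json.dumps(row) + "\n")
        os.replace(tmp, path)                # atomic when on the same filesystem
```

One known weakness of this simple lock is that a crashed process leaves the .lock file behind; a production version would need stale-lock detection or an OS-level file lock.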

The CLI shortcut for the common case:

fscar audit --classifiers myapp.scars:register --period 30d

Full architecture, examples, and the cross-link / outcome marker details: docs/advanced_validation.md.


When NOT to use fscars

A scar only works when the correction satisfies the five invariants. If your fix is:

  • Subjective ("I prefer tabs over spaces") — use .editorconfig or a linter.
  • Proportional ("use async when it makes sense") — leave it to the model's judgment.
  • One-off (the case has not repeated) — wait for the second occurrence first.
  • Non-binary (cannot be checked deterministically) — keep it in your knowledge base.

These are the four cases the paper explicitly excludes. Adding a scar there creates noise without preventing anything.


Platforms

Currently supported:

  • Claude Code (Anthropic) — full adapter, all event types
  • Codex CLI (OpenAI) — native hooks via .codex/hooks.json (deterministic PreToolUse deny on Bash / apply_patch / MCP), with an AGENTS.md fallback / audit contract; see docs/codex_integration_plan.md

On the roadmap:

  • Cursor, Aider, Continue.dev — community adapters welcome

The core engine is platform-agnostic. Each adapter is a small glue layer (~300 LoC) that translates between platform-specific JSON shapes and the canonical HookPayload.
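As a rough picture of what such a glue layer does, the sketch below maps a Claude Code hook input object onto a canonical payload. The HookPayload fields and the from_claude_code function are assumptions made for illustration (the real model lives in fscars.core and may differ); the input keys session_id, hook_event_name, tool_name, and tool_input follow the Claude Code hook input format.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class HookPayload:
    """Illustrative canonical payload; the real model's fields may differ."""
    platform: str
    event_type: str
    session_id: str
    tool_name: Optional[str] = None
    tool_input: dict[str, Any] = field(default_factory=dict)


def from_claude_code(raw: dict[str, Any]) -> HookPayload:
    """Translate a Claude Code hook JSON object into the canonical shape."""
    return HookPayload(
        platform="claude_code",
        event_type=raw.get("hook_event_name", "unknown"),
        session_id=raw.get("session_id", ""),
        tool_name=raw.get("tool_name"),
        tool_input=dict(raw.get("tool_input") or {}),
    )


payload = from_claude_code({
    "session_id": "abc123",
    "hook_event_name": "PreToolUse",
    "tool_name": "Write",
    "tool_input": {"file_path": "app.py", "content": "print('hi')\n"},
})
print(payload.event_type, payload.tool_name)  # PreToolUse Write
```

Keeping the translation in one small function per platform is what lets scars be written once against the canonical shape.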


The research behind this

Functional Scars is the reference implementation of the framework described in Lucy Syndrome in LLM Agents: A Practitioner Framework for Cross-Session Correction Persistence (Del Puerto, 2026). The paper analyzes 163 findings from 17 production session logs, identifies 5 persistence invariants, and proposes a 3-layer implementation model.

If you want the why, read the paper. If you want the how, you are in the right place.

The first derivative essay From Memory to Scar (May 2026) extends the four-layer progression with Anthropic's Managed Agents Memory beta as a working example of Layer 3 industrialized.


Related projects

Part of a small cluster for operating LLM coding agents in production:

  • lucy-syndrome — the companion research: five persistence invariants and the framework fscars implements.
  • callus — per-author voice calibration: score and rewrite drafts against your own raw voice instead of a generic AI detector.

Contributing

See CONTRIBUTING.md. New adapters and cookbook scars are especially welcome. Contributors and acknowledgments — including the Codex-authored Codex adapter — live in CONTRIBUTORS.md.

git clone https://github.com/Vdp89/fscars
cd fscars
pip install -e ".[dev]"
pytest -q
ruff check fscars cookbook tests

License

Apache 2.0 — see LICENSE.


Built on research first published as Lucy Syndrome in LLM Agents · companion repo
