SQLite-backed rule catcher for emergent LLM DSLs.

Project description

RuleCatcher

rulecatcher is a small SQLite-backed CLI for catching emergent syntax rules from symbolic text.

The first version is intentionally simple:

Read one or more text files.
Learn repeated next-token expectations from history.
Store those expectations as candidate rules with evidence and confidence.
Let a human adopt or reject the rules.
Lint future text against the adopted rule stack.
Suggest normalizations when the stack predicts a different next token.

This is not a full parser or reasoning engine. It is the first runtime loop for:

observe pattern
-> propose rule
-> adopt/reject
-> stack rule
-> lint future text
-> normalize drift

Why SQLite

The system needs durable internal state for:

candidate rules
rule evidence
adoption status
decision history
source artifacts
stack history

That becomes database-shaped quickly, so the MVP uses SQLite as the source of truth.

Commands

rulecatcher catch examples/runes.txt
cat examples/runes.txt | rulecatcher catch - --label shell-demo
rulecatcher review
rulecatcher review --scope syntax --rule-type next_kind
rulecatcher review --scope syntax --json
rulecatcher adopt 1 --actor silas --reason "stable core syntax"
rulecatcher reject 7 --source review-loop --reason "too noisy"
rulecatcher stack --scope global
rulecatcher explain 3 --json
rulecatcher explain --scope alpha --rule-type next_token --prefix-token state --prefix-token '=>' --expected-token open
rulecatcher report --scope alpha --json
rulecatcher export-scope --scope alpha > alpha.rulecatcher.json
rulecatcher import-scope alpha.rulecatcher.json --scope beta --replace-scope --json
rulecatcher compare scope:alpha scope:beta --json
rulecatcher compare alpha.rulecatcher.json scope:alpha
rulecatcher triage --scope global
rulecatcher triage --scope global --focus all
rulecatcher triage --scope global --all
rulecatcher triage --scope alpha --recommendation review --json
rulecatcher apply-triage --scope alpha --recommendation adopt --dry-run --json
rulecatcher apply-triage --scope alpha --recommendation adopt --focus core --actor silas
rulecatcher apply-triage --scope alpha --recommendation adopt --actor silas
rulecatcher history --scope global --json
rulecatcher govern --scope global
rulecatcher govern --scope alpha --recommendation review --json
rulecatcher graph --scope global --format mermaid
rulecatcher graph --scope syntax --layer rules --rule-type next_kind --format table
rulecatcher graph --scope alpha --layer metasystem --format dot
rulecatcher graph --scope alpha --layer metasystem --json
rulecatcher conflicts --scope syntax --json
rulecatcher lint examples/runes_broken.txt
printf 'state => closed\n' | rulecatcher lint - --label incoming-check
printf 'door closed\n' | rulecatcher lint - --scope syntax --json
rulecatcher normalize examples/runes_broken.txt
rulecatcher normalize examples/runes_broken.txt --in-place
printf 'state => closed\nmode => safe\n' | rulecatcher session --scope alpha --label live --format jsonl
printf 'door => open\nwindow => closed\n' | rulecatcher session --scope beta --label learn-beta --learn --format jsonl
printf 'state => open\nstate => open\nstate => open\n' | rulecatcher session --scope alpha --label teach --learn --triage --format jsonl
printf 'state => open\nstate => open\nstate => open\n' | rulecatcher session --scope alpha --label auto --learn --triage --apply-triage adopt --format jsonl

All commands accept --db to choose a database path. The default is .rulecatcher.db in the current working directory. Commands that read or apply rules also accept --scope, so multiple emergent DSLs can live in one database without contaminating each other. Use catch --replace-scope when you want to discard a scope and rebuild it from a new artifact set instead of accumulating more history.

Rule Types in the MVP

The MVP learns two rule families:

next_token: for a repeated prefix history, expect a specific next token
next_kind: for a repeated syntax-class prefix, expect a specific token class

Example:

state => open
state => open
state => open

This can induce a rule like:

["state", "=>"] -> "open"

If future text says:

state => closed

then lint can flag the drift and normalize can suggest open.

Class rules are what let the tool generalize beyond exact strings. If it sees enough examples of:

identifier => identifier

it can learn reusable syntax expectations like:

[<IDENTIFIER>] -> <OPERATOR>
[<IDENTIFIER>, <OPERATOR>] -> <IDENTIFIER>

That means novel text such as door closed can be flagged as structurally broken even if door and closed never appeared in the corpus.

Keyword-Aware Tokenization

By default the tokenizer classifies every word as <IDENTIFIER>. For notations where certain words are structural keywords (e.g. state, as in a state-graph DSL), that conflation hides grammar: a declaration like state x ... and a transition like name --> ... share the [<BOS>, IDENTIFIER] prefix and pull the line-initial rule's confidence down.

Pass --keyword (repeatable) to catch to lift chosen words into their own <KEYWORD> class:

rulecatcher catch grammar.txt --scope dsl --keyword state --keyword as

The keyword set is persisted per scope, so lint, normalize, and session --learn reconstruct the same token classes automatically — you only declare keywords at catch time. Omitting --keyword on a later catch reuses the scope's stored keywords; the default (no keywords ever set) is unchanged from prior behaviour.

Example

See examples/runes.txt and examples/runes_broken.txt.

Installation

For normal development:

python3 -m pip install --no-build-isolation -e .

The --no-build-isolation form matters in offline or restricted environments where pip cannot download build dependencies.

You can also run the module directly:

python3 -m rulecatcher --help

Machine Use

Most governance commands now support --json, which is the preferred mode when another LLM or automation wants to consume the tool programmatically.

High-value machine surfaces:

review --json
stack --json
explain --json
report --json
import-scope --json
compare --json
triage --json
apply-triage --json
history --json
govern --json
conflicts --json
graph --json
lint --json
normalize --json
session --format jsonl

This lets another model consume stable rule objects, explicit graph nodes and edges, lint violations, normalization suggestions, and conflict groups without scraping human-oriented text.

Scope Report

rulecatcher report is the one-call orientation surface for a live scope.

Example:

rulecatcher report --scope alpha --json
rulecatcher report --scope alpha

The report aggregates:

scope summary counts
active adopted stack
pending triage pressure
health attention on adopted rules
conflict groups
recent decision history
derived next-action items

This is the shell surface for asking: "what matters in this scope right now, and what should I do next?"

Rule Explain

rulecatcher explain answers the question "why this rule?" in one call.

It can resolve a rule by local id:

rulecatcher explain 3 --json

Or by stable signature:

rulecatcher explain --scope alpha --rule-type next_token --prefix-token state --prefix-token '=>' --expected-token open

The explanation payload includes:

the canonical rule object
rule health and recommendation
supporting evidence rows
decision history for the same rule signature
competing rules with the same prefix
observed transition alternatives for that prefix

This is the shell surface for asking: "what made this rule appear, how stable is it, and what else could have happened here?"

Structural Repair Hints

When a generalized next_kind rule catches a novel structural syntax break, rulecatcher normalize now surfaces grounded concrete token hints from the rule's observed evidence instead of only the abstract expected class.

For example, if a scope learned that identifiers are usually followed by =>, then input like:

door closed

can yield a suggestion such as:

consider inserting one of ['=>'] before 'closed'

These hints are intentionally advisory. They help another LLM repair syntax using observed evidence, but they do not pretend that a class-rule violation is always safe to auto-rewrite.

Portability

rulecatcher export-scope and rulecatcher import-scope make a learned language stack portable.

The snapshot includes:

artifacts bound to the scope
observed transitions
candidate and adopted rules with evidence
rule health counters
decision history

This gives another LLM a shareable JSON bundle for replaying a scope elsewhere. Use import-scope --replace-scope when you want the target scope to be rebuilt from the snapshot instead of merged into the existing local state.

Evolution Compare

rulecatcher compare lets another LLM inspect how one language state differs from another.

It accepts two references:

a local scope name such as alpha or scope:alpha
an exported snapshot path such as alpha.rulecatcher.json or snapshot:alpha.rulecatcher.json

Example:

rulecatcher compare scope:alpha scope:beta --json
rulecatcher compare alpha-before.rulecatcher.json scope:alpha

The comparison payload is signature-based rather than id-based, so it survives export/import and can report:

rules added, removed, or changed
observed transition deltas
decision-history additions and removals
artifact-path changes across corpora

This is the shell surface for asking: "what language did I have before, what language do I have now, and what actually changed?"

Candidate Triage

rulecatcher triage closes the gap between "a pattern exists" and "should this become part of the language stack?"

It only inspects pending rules and turns them into explicit proposals with recommendations:

adopt: strong candidate with no known conflict
review: candidate needs a deliberate decision because the evidence is mixed or it conflicts with the active stack
reject: candidate is currently weaker than a competing pending rule for the same prefix

By default, triage only emits the frontier proposals and then auto-focuses on core rules that are not mostly BOS/EOL boundary framing. Use --focus all when you want the full proposal set, including boundary-heavy rules. --focus core forces only the core subset.

This is the surface another LLM can use to ask: "you seem to be inventing this rule; do you want to keep it?"

Batch Application

rulecatcher apply-triage lets another LLM move from recommendation to action without issuing one adopt or reject call per rule.

It selects the same filtered proposal set that triage would expose, then applies the chosen recommendation family through the normal decision pathway so history still records provenance and rationale. Use --dry-run to preview the exact proposal set before mutating the stack. It respects the same focus policy as triage, so the default application path also prefers core rules over boundary-only boilerplate.

Decision History

rulecatcher history makes the evolution of the language stack inspectable.

Every explicit adopt or reject decision is now persisted with optional actor, source, and reason metadata. When adopting a rule automatically displaces an older adopted rule for the same prefix, that displacement is also recorded as an automatic history event.

This lets another LLM answer questions like:

"why is this rule in the stack?"
"what replaced the previous rule?"
"was this change deliberate or automatic?"

Rule Health

rulecatcher govern turns accumulated lint outcomes into a health surface for the active rule stack.

Recommendation heuristics in this version:

untested: the rule has never been evaluated during linting or live sessions
tentative: the rule has only clean hits so far, but not enough repetitions to call it stable
healthy: the rule has at least three clean hits and no violations
watch: the rule has started drifting and should be monitored
review: the rule is breaking often enough that the stack should probably be reconsidered

This is the governance surface that answers: "the history predicted this next, but reality did something else; is that rule still worth keeping?"

Live Sessions

rulecatcher session is the stream-oriented surface for real-time use.

It reads stdin line by line and emits one event per input line plus a final summary event. With --learn, the full stream is also folded back into the scope as a new learnable artifact after the line events are emitted. With --triage, the stream also emits a pending-rule proposal event so newly caught patterns can be surfaced immediately. With --apply-triage, a session can also batch-apply selected recommendation families directly after triage. Use --triage-focus when you want to force core or all proposal visibility in live runs.

Notes

The tokenizer is intentionally light and dependency-free.
This version optimizes for the governance loop, not perfect language generalization.
Graph and hypergraph projections can be layered on later from the SQLite store.

Metasystem-Transition Graph

rulecatcher graph now builds an explicit graph object first, then renders it for the shell.

Example:

rulecatcher graph --format mermaid
rulecatcher graph --scope syntax --layer rules --rule-type next_kind --format table
rulecatcher graph --scope alpha --layer metasystem --format dot
rulecatcher graph --scope alpha --layer metasystem --json

Available layers:

observed: raw prefix-to-next-token transitions
rules: explicit stacked rule geometry
metasystem: observed transitions plus rules, decision history, and rule health

The --json surface now includes graph.nodes, graph.edges, and summary, so another LLM can traverse the learned syntax geometry directly instead of reconstructing it from flat rows. --format dot gives a shell-friendly export path for Graphviz or other downstream tooling.

Project details

Release history Release notifications | RSS feed

0.1.1

Jun 29, 2026

This version

0.1.0

Jun 29, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rulecatcher-0.1.0.tar.gz (53.0 kB view details)

Uploaded Jun 29, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

rulecatcher-0.1.0-py3-none-any.whl (48.7 kB view details)

Uploaded Jun 29, 2026 Python 3

File details

Details for the file rulecatcher-0.1.0.tar.gz.

File metadata

Download URL: rulecatcher-0.1.0.tar.gz
Upload date: Jun 29, 2026
Size: 53.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for rulecatcher-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`a00893690651b74266ae820f7574ec1fbeb7abcae035fd103de8edaf1ea87780`
MD5	`b440525b51f98051b0cc8e8078bdd77f`
BLAKE2b-256	`d444ac3f84ca7e4bcfa6d5dcfd420b3b93b9dff0d68cbbe8b321001a4cd8923e`

See more details on using hashes here.

File details

Details for the file rulecatcher-0.1.0-py3-none-any.whl.

File metadata

Download URL: rulecatcher-0.1.0-py3-none-any.whl
Upload date: Jun 29, 2026
Size: 48.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for rulecatcher-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`83cfc6a407fde67de54e793b800ada921edcd1b804d91d73365a10ca5c071994`
MD5	`fd01a7eef0179c3db8944d47426d08d2`
BLAKE2b-256	`30ec880d7828917d14e0d6ac43a10fe5aa9ea3204565c16a6aa1d8ee8ad75c23`

See more details on using hashes here.

rulecatcher 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

RuleCatcher

Why SQLite

Commands

Rule Types in the MVP

Keyword-Aware Tokenization

Example

Installation

Machine Use

Scope Report

Rule Explain

Structural Repair Hints

Portability

Evolution Compare

Candidate Triage

Batch Application

Decision History

Rule Health

Live Sessions

Notes

Metasystem-Transition Graph

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes