SQLite-backed rule catcher for emergent LLM DSLs.
Project description
RuleCatcher
rulecatcher is a small SQLite-backed CLI for catching emergent syntax rules from symbolic text.
The first version is intentionally simple:
- Read one or more text files.
- Learn repeated next-token expectations from history.
- Store those expectations as candidate rules with evidence and confidence.
- Let a human adopt or reject the rules.
- Lint future text against the adopted rule stack.
- Suggest normalizations when the stack predicts a different next token.
This is not a full parser or reasoning engine. It is the first runtime loop for:
observe pattern
-> propose rule
-> adopt/reject
-> stack rule
-> lint future text
-> normalize drift
Why SQLite
The system needs durable internal state for:
- candidate rules
- rule evidence
- adoption status
- decision history
- source artifacts
- stack history
That becomes database-shaped quickly, so the MVP uses SQLite as the source of truth.
Commands
rulecatcher catch examples/runes.txt
cat examples/runes.txt | rulecatcher catch - --label shell-demo
rulecatcher review
rulecatcher review --scope syntax --rule-type next_kind
rulecatcher review --scope syntax --json
rulecatcher adopt 1 --actor silas --reason "stable core syntax"
rulecatcher reject 7 --source review-loop --reason "too noisy"
rulecatcher stack --scope global
rulecatcher explain 3 --json
rulecatcher explain --scope alpha --rule-type next_token --prefix-token state --prefix-token '=>' --expected-token open
rulecatcher report --scope alpha --json
rulecatcher export-scope --scope alpha > alpha.rulecatcher.json
rulecatcher import-scope alpha.rulecatcher.json --scope beta --replace-scope --json
rulecatcher compare scope:alpha scope:beta --json
rulecatcher compare alpha.rulecatcher.json scope:alpha
rulecatcher triage --scope global
rulecatcher triage --scope global --focus all
rulecatcher triage --scope global --all
rulecatcher triage --scope alpha --recommendation review --json
rulecatcher apply-triage --scope alpha --recommendation adopt --dry-run --json
rulecatcher apply-triage --scope alpha --recommendation adopt --focus core --actor silas
rulecatcher apply-triage --scope alpha --recommendation adopt --actor silas
rulecatcher history --scope global --json
rulecatcher govern --scope global
rulecatcher govern --scope alpha --recommendation review --json
rulecatcher graph --scope global --format mermaid
rulecatcher graph --scope syntax --layer rules --rule-type next_kind --format table
rulecatcher graph --scope alpha --layer metasystem --format dot
rulecatcher graph --scope alpha --layer metasystem --json
rulecatcher conflicts --scope syntax --json
rulecatcher lint examples/runes_broken.txt
printf 'state => closed\n' | rulecatcher lint - --label incoming-check
printf 'door closed\n' | rulecatcher lint - --scope syntax --json
rulecatcher normalize examples/runes_broken.txt
rulecatcher normalize examples/runes_broken.txt --in-place
printf 'state => closed\nmode => safe\n' | rulecatcher session --scope alpha --label live --format jsonl
printf 'door => open\nwindow => closed\n' | rulecatcher session --scope beta --label learn-beta --learn --format jsonl
printf 'state => open\nstate => open\nstate => open\n' | rulecatcher session --scope alpha --label teach --learn --triage --format jsonl
printf 'state => open\nstate => open\nstate => open\n' | rulecatcher session --scope alpha --label auto --learn --triage --apply-triage adopt --format jsonl
All commands accept --db to choose a database path. The default is .rulecatcher.db in the current working directory.
Commands that read or apply rules also accept --scope, so multiple emergent
DSLs can live in one database without contaminating each other.
Use catch --replace-scope when you want to discard a scope and rebuild it
from a new artifact set instead of accumulating more history.
Rule Types in the MVP
The MVP learns two rule families:
next_token: for a repeated prefix history, expect a specific next tokennext_kind: for a repeated syntax-class prefix, expect a specific token class
Example:
state => open
state => open
state => open
This can induce a rule like:
["state", "=>"] -> "open"
If future text says:
state => closed
then lint can flag the drift and normalize can suggest open.
Class rules are what let the tool generalize beyond exact strings. If it sees enough examples of:
identifier => identifier
it can learn reusable syntax expectations like:
[<IDENTIFIER>] -> <OPERATOR>
[<IDENTIFIER>, <OPERATOR>] -> <IDENTIFIER>
That means novel text such as door closed can be flagged as structurally
broken even if door and closed never appeared in the corpus.
Keyword-Aware Tokenization
By default the tokenizer classifies every word as <IDENTIFIER>. For notations
where certain words are structural keywords (e.g. state, as in a state-graph
DSL), that conflation hides grammar: a declaration like state x ... and a
transition like name --> ... share the [<BOS>, IDENTIFIER] prefix and pull
the line-initial rule's confidence down.
Pass --keyword (repeatable) to catch to lift chosen words into their own
<KEYWORD> class:
rulecatcher catch grammar.txt --scope dsl --keyword state --keyword as
The keyword set is persisted per scope, so lint, normalize, and
session --learn reconstruct the same token classes automatically — you only
declare keywords at catch time. Omitting --keyword on a later catch
reuses the scope's stored keywords; the default (no keywords ever set) is
unchanged from prior behaviour.
Example
See examples/runes.txt and examples/runes_broken.txt.
Installation
For normal development:
python3 -m pip install --no-build-isolation -e .
The --no-build-isolation form matters in offline or restricted environments
where pip cannot download build dependencies.
You can also run the module directly:
python3 -m rulecatcher --help
Machine Use
Most governance commands now support --json, which is the preferred mode when
another LLM or automation wants to consume the tool programmatically.
High-value machine surfaces:
review --jsonstack --jsonexplain --jsonreport --jsonimport-scope --jsoncompare --jsontriage --jsonapply-triage --jsonhistory --jsongovern --jsonconflicts --jsongraph --jsonlint --jsonnormalize --jsonsession --format jsonl
This lets another model consume stable rule objects, explicit graph nodes and edges, lint violations, normalization suggestions, and conflict groups without scraping human-oriented text.
Scope Report
rulecatcher report is the one-call orientation surface for a live scope.
Example:
rulecatcher report --scope alpha --json
rulecatcher report --scope alpha
The report aggregates:
- scope summary counts
- active adopted stack
- pending triage pressure
- health attention on adopted rules
- conflict groups
- recent decision history
- derived next-action items
This is the shell surface for asking: "what matters in this scope right now, and what should I do next?"
Rule Explain
rulecatcher explain answers the question "why this rule?" in one call.
It can resolve a rule by local id:
rulecatcher explain 3 --json
Or by stable signature:
rulecatcher explain --scope alpha --rule-type next_token --prefix-token state --prefix-token '=>' --expected-token open
The explanation payload includes:
- the canonical rule object
- rule health and recommendation
- supporting evidence rows
- decision history for the same rule signature
- competing rules with the same prefix
- observed transition alternatives for that prefix
This is the shell surface for asking: "what made this rule appear, how stable is it, and what else could have happened here?"
Structural Repair Hints
When a generalized next_kind rule catches a novel structural syntax break,
rulecatcher normalize now surfaces grounded concrete token hints from the
rule's observed evidence instead of only the abstract expected class.
For example, if a scope learned that identifiers are usually followed by =>,
then input like:
door closed
can yield a suggestion such as:
consider inserting one of ['=>'] before 'closed'
These hints are intentionally advisory. They help another LLM repair syntax using observed evidence, but they do not pretend that a class-rule violation is always safe to auto-rewrite.
Portability
rulecatcher export-scope and rulecatcher import-scope make a learned
language stack portable.
The snapshot includes:
- artifacts bound to the scope
- observed transitions
- candidate and adopted rules with evidence
- rule health counters
- decision history
This gives another LLM a shareable JSON bundle for replaying a scope elsewhere.
Use import-scope --replace-scope when you want the target scope to be rebuilt
from the snapshot instead of merged into the existing local state.
Evolution Compare
rulecatcher compare lets another LLM inspect how one language state differs
from another.
It accepts two references:
- a local scope name such as
alphaorscope:alpha - an exported snapshot path such as
alpha.rulecatcher.jsonorsnapshot:alpha.rulecatcher.json
Example:
rulecatcher compare scope:alpha scope:beta --json
rulecatcher compare alpha-before.rulecatcher.json scope:alpha
The comparison payload is signature-based rather than id-based, so it survives export/import and can report:
- rules added, removed, or changed
- observed transition deltas
- decision-history additions and removals
- artifact-path changes across corpora
This is the shell surface for asking: "what language did I have before, what language do I have now, and what actually changed?"
Candidate Triage
rulecatcher triage closes the gap between "a pattern exists" and "should this
become part of the language stack?"
It only inspects pending rules and turns them into explicit proposals with recommendations:
adopt: strong candidate with no known conflictreview: candidate needs a deliberate decision because the evidence is mixed or it conflicts with the active stackreject: candidate is currently weaker than a competing pending rule for the same prefix
By default, triage only emits the frontier proposals and then auto-focuses on
core rules that are not mostly BOS/EOL boundary framing. Use --focus all when
you want the full proposal set, including boundary-heavy rules. --focus core
forces only the core subset.
This is the surface another LLM can use to ask: "you seem to be inventing this rule; do you want to keep it?"
Batch Application
rulecatcher apply-triage lets another LLM move from recommendation to action
without issuing one adopt or reject call per rule.
It selects the same filtered proposal set that triage would expose, then
applies the chosen recommendation family through the normal decision pathway so
history still records provenance and rationale. Use --dry-run to preview the
exact proposal set before mutating the stack. It respects the same focus policy
as triage, so the default application path also prefers core rules over
boundary-only boilerplate.
Decision History
rulecatcher history makes the evolution of the language stack inspectable.
Every explicit adopt or reject decision is now persisted with optional
actor, source, and reason metadata. When adopting a rule automatically
displaces an older adopted rule for the same prefix, that displacement is also
recorded as an automatic history event.
This lets another LLM answer questions like:
- "why is this rule in the stack?"
- "what replaced the previous rule?"
- "was this change deliberate or automatic?"
Rule Health
rulecatcher govern turns accumulated lint outcomes into a health surface for
the active rule stack.
Recommendation heuristics in this version:
untested: the rule has never been evaluated during linting or live sessionstentative: the rule has only clean hits so far, but not enough repetitions to call it stablehealthy: the rule has at least three clean hits and no violationswatch: the rule has started drifting and should be monitoredreview: the rule is breaking often enough that the stack should probably be reconsidered
This is the governance surface that answers: "the history predicted this next, but reality did something else; is that rule still worth keeping?"
Live Sessions
rulecatcher session is the stream-oriented surface for real-time use.
It reads stdin line by line and emits one event per input line plus a final
summary event. With --learn, the full stream is also folded back into the
scope as a new learnable artifact after the line events are emitted. With
--triage, the stream also emits a pending-rule proposal event so newly caught
patterns can be surfaced immediately. With --apply-triage, a session can also
batch-apply selected recommendation families directly after triage. Use
--triage-focus when you want to force core or all proposal visibility in
live runs.
Notes
- The tokenizer is intentionally light and dependency-free.
- This version optimizes for the governance loop, not perfect language generalization.
- Graph and hypergraph projections can be layered on later from the SQLite store.
Metasystem-Transition Graph
rulecatcher graph now builds an explicit graph object first, then renders it
for the shell.
Example:
rulecatcher graph --format mermaid
rulecatcher graph --scope syntax --layer rules --rule-type next_kind --format table
rulecatcher graph --scope alpha --layer metasystem --format dot
rulecatcher graph --scope alpha --layer metasystem --json
Available layers:
observed: raw prefix-to-next-token transitionsrules: explicit stacked rule geometrymetasystem: observed transitions plus rules, decision history, and rule health
The --json surface now includes graph.nodes, graph.edges, and summary,
so another LLM can traverse the learned syntax geometry directly instead of
reconstructing it from flat rows. --format dot gives a shell-friendly export
path for Graphviz or other downstream tooling.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file rulecatcher-0.1.0.tar.gz.
File metadata
- Download URL: rulecatcher-0.1.0.tar.gz
- Upload date:
- Size: 53.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a00893690651b74266ae820f7574ec1fbeb7abcae035fd103de8edaf1ea87780
|
|
| MD5 |
b440525b51f98051b0cc8e8078bdd77f
|
|
| BLAKE2b-256 |
d444ac3f84ca7e4bcfa6d5dcfd420b3b93b9dff0d68cbbe8b321001a4cd8923e
|
File details
Details for the file rulecatcher-0.1.0-py3-none-any.whl.
File metadata
- Download URL: rulecatcher-0.1.0-py3-none-any.whl
- Upload date:
- Size: 48.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
83cfc6a407fde67de54e793b800ada921edcd1b804d91d73365a10ca5c071994
|
|
| MD5 |
fd01a7eef0179c3db8944d47426d08d2
|
|
| BLAKE2b-256 |
30ec880d7828917d14e0d6ac43a10fe5aa9ea3204565c16a6aa1d8ee8ad75c23
|