Learn rule-based models from examples and LLM interactions
Project description
RuleChef
Learn rule-based models from examples using LLM-powered synthesis.
Replace expensive LLM calls with fast, deterministic, inspectable rules.
What is RuleChef?
RuleChef learns regex, Python code, and spaCy patterns from labeled examples using LLM-powered synthesis. You provide examples, RuleChef generates rules, and those rules run locally without any LLM at inference time.
Why rules instead of LLMs?
- Cost: Rules cost nothing to run. No API calls, no tokens.
- Latency: Sub-millisecond per query vs hundreds of ms for LLM calls.
- Determinism: Same input always produces the same output.
- Inspectability: You can read, edit, and debug every rule.
- No drift: Rules don't change unless you change them.
Installation
pip install rulechef
Extras:
pip install rulechef[grex] # Regex pattern suggestions from examples
pip install rulechef[spacy] # spaCy token/dependency matcher patterns
pip install rulechef[agentic] # LLM-powered coordinator for adaptive learning
pip install rulechef[all] # Everything
Quick Start
Extraction
Extract answer spans from text:
from openai import OpenAI
from rulechef import RuleChef, Task, TaskType
client = OpenAI()
task = Task(
name="Q&A Extraction",
description="Extract answer spans from context",
input_schema={"question": "str", "context": "str"},
output_schema={"spans": "List[Span]"},
type=TaskType.EXTRACTION,
)
chef = RuleChef(task, client)
chef.add_example(
{"question": "When?", "context": "Built in 1991"},
{"spans": [{"text": "1991", "start": 9, "end": 13}]}
)
chef.add_example(
{"question": "When?", "context": "Released in 2005"},
{"spans": [{"text": "2005", "start": 12, "end": 16}]}
)
chef.learn_rules()
result = chef.extract({"question": "When?", "context": "Founded in 1997"})
print(result) # {"spans": [{"text": "1997", ...}]}
Named Entity Recognition (NER)
from pydantic import BaseModel
from typing import List, Literal
class Entity(BaseModel):
text: str
start: int
end: int
type: Literal["DRUG", "DOSAGE", "CONDITION"]
class NEROutput(BaseModel):
entities: List[Entity]
task = Task(
name="Medical NER",
description="Extract drugs, dosages, and conditions",
input_schema={"text": "str"},
output_schema=NEROutput,
type=TaskType.NER,
)
chef = RuleChef(task, client)
chef.add_example(
{"text": "Take Aspirin 500mg for headache"},
{"entities": [
{"text": "Aspirin", "start": 5, "end": 12, "type": "DRUG"},
{"text": "500mg", "start": 13, "end": 18, "type": "DOSAGE"},
{"text": "headache", "start": 23, "end": 31, "type": "CONDITION"},
]}
)
chef.learn_rules()
Classification
task = Task(
name="Intent Classification",
description="Classify banking customer queries",
input_schema={"text": "str"},
output_schema={"label": "str"},
type=TaskType.CLASSIFICATION,
text_field="text",
)
chef = RuleChef(task, client)
chef.add_example({"text": "what is the exchange rate?"}, {"label": "exchange_rate"})
chef.add_example({"text": "I want to know the rates"}, {"label": "exchange_rate"})
chef.add_example({"text": "my card hasn't arrived"}, {"label": "card_arrival"})
chef.learn_rules()
result = chef.extract({"text": "current exchange rate please"})
print(result) # {"label": "exchange_rate"}
Transformation
task = Task(
name="Invoice Parser",
description="Extract company and amount from invoices",
input_schema={"text": "str"},
output_schema={"company": "str", "amount": "str"},
type=TaskType.TRANSFORMATION,
)
chef = RuleChef(task, client)
chef.add_example(
{"text": "Invoice from Acme Corp for $1,500.00"},
{"company": "Acme Corp", "amount": "$1,500.00"}
)
chef.learn_rules()
Core Concepts
Task Types
| Type | Output | Use Case |
|---|---|---|
EXTRACTION |
{"spans": [...]} |
Find text spans (untyped) |
NER |
{"entities": [...]} |
Find typed entities with labels |
CLASSIFICATION |
{"label": "..."} |
Classify text into categories |
TRANSFORMATION |
Custom dict | Extract structured fields |
Rule Formats
| Format | Best For | Example |
|---|---|---|
RuleFormat.REGEX |
Keyword patterns, structured text | \b\d{4}\b |
RuleFormat.CODE |
Complex logic, multi-field extraction | def extract(input_data): ... |
RuleFormat.SPACY |
Linguistic patterns, POS/dependency | [{"POS": "PROPN", "OP": "+"}] |
from rulechef import RuleFormat
# Only generate regex rules (fastest, most portable)
chef = RuleChef(task, client, allowed_formats=[RuleFormat.REGEX])
# Only code rules (most flexible)
chef = RuleChef(task, client, allowed_formats=[RuleFormat.CODE])
Buffer-First Architecture
Examples go to a buffer first, then get committed to the dataset during learn_rules(). This enables batch learning and coordinator-driven decisions:
chef.add_example(input1, output1) # Goes to buffer
chef.add_example(input2, output2) # Goes to buffer
chef.add_correction(input3, wrong_output, correct_output) # High-priority signal
chef.learn_rules() # Buffer -> Dataset -> Synthesis -> Refinement
Corrections & Feedback
Corrections are the highest-value training signal -- they show exactly where the current rules fail:
result = chef.extract({"text": "some input"})
# Result was wrong! Correct it:
chef.add_correction(
{"text": "some input"},
model_output=result,
expected_output={"label": "correct_label"},
feedback="The rule matched too broadly"
)
# Task-level guidance
chef.add_feedback("Drug names always follow 'take' or 'prescribe'")
# Rule-level feedback
chef.add_feedback("This rule is too broad", level="rule", target_id="rule_id")
chef.learn_rules() # Re-learns with corrections prioritized
Evaluation
RuleChef includes built-in evaluation with entity-level precision, recall, and F1:
# Dataset-level evaluation
eval_result = chef.evaluate()
# Prints: Exact match, micro/macro P/R/F1, per-class breakdown
# Per-rule evaluation (find dead or harmful rules)
metrics = chef.get_rule_metrics()
# Shows: per-rule TP/FP/FN, sample matches, identifies dead rules
# Delete a bad rule
chef.delete_rule("rule_id")
Advanced Features
Synthesis Strategy
For multi-class tasks, RuleChef can synthesize rules one class at a time for better coverage:
# Auto-detect (default): per-class if >1 class, bulk otherwise
chef = RuleChef(task, client, synthesis_strategy="auto")
# Force per-class synthesis
chef = RuleChef(task, client, synthesis_strategy="per_class")
# Force single-prompt bulk synthesis
chef = RuleChef(task, client, synthesis_strategy="bulk")
Agentic Coordinator
The AgenticCoordinator uses LLM calls to guide the refinement loop, focusing on weak classes:
from rulechef import RuleChef, AgenticCoordinator
coordinator = AgenticCoordinator(client, model="gpt-4o-mini")
chef = RuleChef(task, client, coordinator=coordinator)
chef.learn_rules(max_refinement_iterations=10)
# Coordinator analyzes per-class metrics each iteration,
# tells the synthesis prompt which classes to focus on,
# and stops early when performance plateaus.
Rule Pruning
With prune_after_learn=True, the agentic coordinator audits rules after learning -- merging redundant rules and removing pure noise. A safety net reverts if F1 drops:
coordinator = AgenticCoordinator(client, prune_after_learn=True)
chef = RuleChef(task, client, coordinator=coordinator)
chef.learn_rules()
# After synthesis+refinement:
# 1. LLM analyzes rules + per-rule metrics
# 2. Merges similar patterns (e.g. two regexes → one)
# 3. Removes precision=0 rules (pure false positives)
# 4. Re-evaluates — reverts if F1 drops
In the CLI: learn --agentic --prune.
Incremental Patching
After the initial learn, you can patch existing rules without full re-synthesis:
chef.learn_rules() # Initial synthesis
chef.add_correction(...) # Add corrections
chef.learn_rules(incremental_only=True) # Patch, don't re-synthesize
Observation Mode
Collect training data from any LLM -- no task definition needed:
# Works with any LLM provider (Anthropic, Groq, local models, etc.)
chef = RuleChef(client=client, model="gpt-4o-mini") # No task needed
chef.add_observation({"text": "what's the exchange rate?"}, {"label": "exchange_rate"})
chef.learn_rules() # Auto-discovers the task schema
For raw LLM interactions where you don't know the schema:
chef.add_raw_observation(
messages=[{"role": "user", "content": "classify: what's the rate?"}],
response="exchange_rate",
)
chef.learn_rules() # Discovers task + maps observations + learns rules
For OpenAI-compatible clients, auto-capture with monkey-patching:
wrapped = chef.start_observing(openai_client, auto_learn=False)
response = wrapped.chat.completions.create(...) # Observed automatically
chef.learn_rules()
chef.stop_observing()
Pydantic Output Schemas
Use Pydantic models for type-safe, validated outputs with automatic label extraction:
from pydantic import BaseModel
from typing import List, Literal
class Entity(BaseModel):
text: str
start: int
end: int
type: Literal["PERSON", "ORG", "LOCATION"]
class Output(BaseModel):
entities: List[Entity]
task = Task(..., output_schema=Output, type=TaskType.NER)
# RuleChef automatically discovers labels: ["PERSON", "ORG", "LOCATION"]
grex: Regex Pattern Suggestions
When use_grex=True (default), grex analyzes your training examples and adds regex pattern hints to the synthesis prompt. The LLM sees concrete patterns alongside the raw examples, producing better rules — especially for structured data like dates, IDs, and amounts:
DATA EVIDENCE FROM TRAINING:
- DATE (5 unique): "2024-01-15", "2024-02-28", "2023-12-01", ...
Exact pattern: (2023\-12\-01|2024\-01\-15|2024\-02\-28|...)
Structural pattern: \d{4}\-\d{2}\-\d{2}
Install with pip install rulechef[grex]. Disable with use_grex=False.
Benchmark: Banking77
On the Banking77 intent classification dataset (5-class subset, 5-shot per class, regex-only):
| Metric | Value |
|---|---|
| Accuracy | 60.5% |
| Micro Precision | 100% |
| Macro F1 | 71.7% |
| Rules learned | 108 |
| Per-query latency | 0.19ms |
With agentic coordinator guiding 15 refinement iterations against a dev set. Zero false positives — rules never give a wrong answer, they just abstain when unsure. Full results and learned rules: benchmarks/results_banking77.json. Reproduce: python benchmarks/benchmark_banking77.py --classes beneficiary_not_allowed,card_arrival,disposable_card_limits,exchange_rate,pending_cash_withdrawal --shots 5 --max-iterations 15 --agentic.
CLI
Interactive CLI for quick experimentation across all task types:
export OPENAI_API_KEY=your_key
rulechef
The CLI walks you through a setup wizard (task name, type, labels, model, base URL) and drops you into a command loop:
Commands:
add Add a training example
correct Add a correction
extract Run extraction on input
learn Learn rules (--iterations N, --incremental, --agentic, --prune)
evaluate Evaluate rules against dataset
rules List learned rules (rules <id> for detail)
delete Delete a rule by ID
feedback Add feedback (task/rule level)
generate Generate synthetic examples with LLM
stats Show dataset statistics
help Show commands
quit Exit
Works with any OpenAI-compatible API (Groq, Together, Ollama, etc.) via the base URL prompt.
License
Apache 2.0 -- see LICENSE.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file rulechef-0.1.2.tar.gz.
File metadata
- Download URL: rulechef-0.1.2.tar.gz
- Upload date:
- Size: 809.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d713f2df170ec59e4215738b90da33f37859f01be5e55fae4f361f63cdd958c4
|
|
| MD5 |
e0bcbd5ff547718a65039fcfb522142d
|
|
| BLAKE2b-256 |
8169fbe581bb424ef7066e2cd38dd79e9f235574416fe660a650fdfc61193ad9
|
File details
Details for the file rulechef-0.1.2-py3-none-any.whl.
File metadata
- Download URL: rulechef-0.1.2-py3-none-any.whl
- Upload date:
- Size: 85.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f507686b948f2058cb6cdc73985de7f589b9694862bfcb822b67c5277ba801d4
|
|
| MD5 |
ae613ff96ed61d4eac1ee9fed2abd5c2
|
|
| BLAKE2b-256 |
fa53c3a4c7429c7b57fe4f12f27d74afa88e4ac7e35146545fe0a2e1a41a32b5
|