# Apprentice

Adaptive model distillation with coaching — progressively replace expensive API calls with a fine-tuned local model. Start with frontier API models, progressively train a local model, then withdraw the expensive dependency while maintaining quality guarantees.
## How It Works
Apprentice manages the full lifecycle of distilling knowledge from remote frontier models (Claude, GPT, etc.) into specialized local models:
- Phase 1 — Cold Start: Every request goes to the remote API. Responses are collected as training data.
- Phase 2 — Reinforcement: The local model begins attempting responses alongside the remote. Outputs are compared via the confidence engine.
- Phase 3 — Steady State: The local model handles most requests. Adaptive sampling periodically checks quality against the remote, adjusting frequency based on correlation.
The caller submits a request and gets a response. They don't know whether it came from a local model, a remote API, or a blend of both.
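The phase lifecycle can be sketched as a small state machine. This is a hypothetical illustration of the routing logic described above — names like `Phase`, `next_phase`, and `route` are not the package's actual API:

```python
import random
from enum import Enum

class Phase(Enum):
    COLD_START = 1      # everything goes remote; responses become training data
    REINFORCEMENT = 2   # local runs alongside remote; outputs are compared
    STEADY_STATE = 3    # mostly local; periodic shadow checks against remote

def next_phase(phase, num_examples, correlation,
               min_examples=50, min_correlation=0.85):
    """Advance once enough data (Phase 2) or quality (Phase 3) has accumulated."""
    if phase is Phase.COLD_START and num_examples >= min_examples:
        return Phase.REINFORCEMENT
    if phase is Phase.REINFORCEMENT and correlation >= min_correlation:
        return Phase.STEADY_STATE
    return phase

def route(phase, sample_rate=0.05):
    """Decide where a single request goes in each phase."""
    if phase is Phase.COLD_START:
        return "remote"
    if phase is Phase.REINFORCEMENT:
        return "dual"  # both models answer; the confidence engine compares them
    # Steady state: occasionally sample the remote model to re-verify quality.
    return "dual" if random.random() < sample_rate else "local"
```

The thresholds mirror the `confidence_thresholds` keys shown in the configuration section: an example count gates Phase 2, a correlation score gates Phase 3.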
## Installation

```bash
pip install -e .
```
## Quick Start

```python
import asyncio

from apprentice import Apprentice

async def main():
    # Initialize from config
    app = await Apprentice.create("apprentice.yaml")

    # Send a request — routing is automatic
    response = await app.run("classify_ticket", {
        "text": "My payment didn't go through",
        "metadata": {"source": "email"},
    })
    print(response.result)  # {"category": "billing", "priority": 2}
    print(response.source)  # "local", "remote", or "dual"

    await app.close()

asyncio.run(main())
```
## Configuration
See examples/apprentice.yaml for a complete example. Key sections:
```yaml
tasks:
  - name: classify_ticket
    prompt_template: "Classify: {text}"
    evaluator: structured_match
    match_fields: [category, priority]

confidence_thresholds:
  phase2: 50     # examples before Phase 2
  phase3: 0.85   # correlation for Phase 3

remote:
  provider: anthropic
  model: claude-sonnet-4-5-20250929
  api_key: env:ANTHROPIC_API_KEY

local:
  backend: ollama
  base_model: llama3.1:8b

budget:
  monthly_limit_usd: 150.00
```
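To make the evaluator configuration concrete, here is a minimal sketch of what a `structured_match` scorer over the configured `match_fields` might look like. This is a hypothetical illustration, not the package's actual evaluator interface:

```python
def structured_match(expected: dict, actual: dict, match_fields: list[str]) -> float:
    """Score agreement on the configured fields only (1.0 = full match)."""
    if not match_fields:
        return 0.0
    # Count fields where the local and remote outputs agree exactly.
    hits = sum(expected.get(field) == actual.get(field) for field in match_fields)
    return hits / len(match_fields)
```

With `match_fields: [category, priority]`, two responses that agree on `category` but disagree on `priority` would score 0.5 under this sketch.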
## Architecture
25 components organized in two layers — 18 leaf implementations with zero cross-dependencies, wired together by 7 integration compositions:
### Leaf Components
| Component | Purpose |
|---|---|
| config_loader | Load and validate YAML configuration |
| task_registry | Manage task type definitions and schemas |
| data_models | Shared Pydantic models across all components |
| remote_api_client | Multi-provider API abstraction (Anthropic, OpenAI, etc.) |
| local_model_server | Local model inference (Ollama, vLLM, llama.cpp) |
| evaluators | Response quality scoring (exact match, semantic, structured) |
| phase_manager | Phase 1/2/3 lifecycle and transitions |
| rolling_window | Sliding window correlation tracking |
| sampling_scheduler | Adaptive sampling frequency control |
| training_data_store | Training example collection and management |
| fine_tuning_orchestrator | Fine-tuning pipeline (LoRA, OpenAI, HuggingFace) |
| model_validator | Pre-promotion model quality validation |
| budget_manager | Multi-window spend tracking and enforcement |
| router | Request routing (local, remote, dual) |
| apprentice_class | Core Apprentice class — run, status, report |
| cli | Command-line interface |
| audit_log | Structured event logging (JSONL) |
| report_generator | Reports, metrics, and observability |
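To illustrate the confidence machinery, here is a minimal sketch in the spirit of the rolling_window component: a sliding window of local-vs-remote agreement scores whose mean serves as the correlation signal for phase transitions. The class and its names are hypothetical, not the package's actual implementation:

```python
from collections import deque

class RollingWindow:
    """Track local/remote agreement over the last `size` dual-run comparisons."""

    def __init__(self, size: int = 100):
        # deque with maxlen automatically evicts the oldest score.
        self.scores: deque[float] = deque(maxlen=size)

    def record(self, score: float) -> None:
        """Add one evaluator score (0.0–1.0) from a dual run."""
        self.scores.append(score)

    @property
    def correlation(self) -> float:
        """Mean agreement over the window; 0.0 when no data yet."""
        if not self.scores:
            return 0.0
        return sum(self.scores) / len(self.scores)
```

With a window of size 3, recording the scores 1.0, 1.0, 0.5, 1.0 evicts the oldest score and yields a correlation of (1.0 + 0.5 + 1.0) / 3 ≈ 0.83 — above the example `phase3: 0.85` threshold only once agreement improves further.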
### Integration Compositions
| Composition | Children | Purpose |
|---|---|---|
| config_and_registry | config_loader, task_registry, data_models | Configuration + type system |
| confidence_engine | evaluators, phase_manager, rolling_window | Quality tracking pipeline |
| external_interfaces | remote_api_client, local_model_server | External service adapters |
| training_pipeline | training_data_store, fine_tuning_orchestrator, model_validator | Training lifecycle |
| unified_interface | apprentice_class, cli | User-facing API + CLI |
| reporting | audit_log, report_generator | Observability layer |
| root | all 6 compositions above | Full system composition root |
## CLI

```bash
apprentice run config.yaml     # Start the system
apprentice status config.yaml  # Show current phase, confidence, budget
apprentice report config.yaml  # Generate summary report
```
## Development

```bash
make dev         # Install with dev + lint dependencies
make test        # Run all 2,064 tests
make test-quick  # Stop on first failure
make lint        # Run ruff linter
make lint-fix    # Auto-fix lint issues
make clean       # Remove build artifacts
```
## Built With
This project was built using Pact — a contract-first multi-agent software engineering framework. Pact decomposed the task into 25 components, generated contracts and tests for each, then implemented them using iterative Claude Code sessions that write code, run tests, and fix failures autonomously.
## License
MIT
## File details: apprentice_ai-0.2.0.tar.gz

- Download URL: apprentice_ai-0.2.0.tar.gz
- Size: 394.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

| Algorithm | Hash digest |
|---|---|
| SHA256 | 69227ecd89bf89c94d6f97cd3fad2dd693c49543f099abe03ff68441499c7a8c |
| MD5 | 689ea81cda8c3c3c241c939fbded3b8c |
| BLAKE2b-256 | 9f296d2480ffe704169ee9c3ab1b9befb9e80b35bb1bcf695dafb1dd7df929c0 |
### Provenance

The following attestation bundle was made for apprentice_ai-0.2.0.tar.gz:

Publisher: publish.yml on jmcentire/apprentice

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: apprentice_ai-0.2.0.tar.gz
- Subject digest: 69227ecd89bf89c94d6f97cd3fad2dd693c49543f099abe03ff68441499c7a8c
- Sigstore transparency entry: 970406555
- Permalink: jmcentire/apprentice@756af1c140ce7d5a25ba25cf32bdfeb98bbb00a6
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/jmcentire
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@756af1c140ce7d5a25ba25cf32bdfeb98bbb00a6
- Trigger Event: push
## File details: apprentice_ai-0.2.0-py3-none-any.whl

- Download URL: apprentice_ai-0.2.0-py3-none-any.whl
- Size: 179.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

| Algorithm | Hash digest |
|---|---|
| SHA256 | 22718e400a53e9cc3a142fac658a435f31a4d92d67c46025e9ce5f8f4137b0ad |
| MD5 | 318bea8e633f8c4aaede03408bc59904 |
| BLAKE2b-256 | ca021ee5270c770b263f056bcfdd2dc8d4438fc8d3daf67dd8a5dd92d76b80a0 |
### Provenance

The following attestation bundle was made for apprentice_ai-0.2.0-py3-none-any.whl:

Publisher: publish.yml on jmcentire/apprentice

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: apprentice_ai-0.2.0-py3-none-any.whl
- Subject digest: 22718e400a53e9cc3a142fac658a435f31a4d92d67c46025e9ce5f8f4137b0ad
- Sigstore transparency entry: 970406565
- Permalink: jmcentire/apprentice@756af1c140ce7d5a25ba25cf32bdfeb98bbb00a6
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/jmcentire
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@756af1c140ce7d5a25ba25cf32bdfeb98bbb00a6
- Trigger Event: push