BYOK reasoning amplification SDK + CLI coding agent. Plan-first, token-efficient, works with any LLM.
Project description
essarion-build
A BYOK reasoning amplification SDK + CLI coding agent, by Essarion. Bring your own model provider; essarion-build provides the reasoning loop, the grounding context, the bundled software-development skills, and the structured outputs. The CLI agent gives you an interactive, plan-first coding experience powered by the same SDK.
essarion-build is not a wrapper for any single LLM. It is a deliberate plan → draft → self-check pipeline that turns whatever model you wire in into a more thoughtful coder. The default is tuned to amplify cheap models — making a small, fast GPT reason about coding the way a senior engineer would, while spending a fraction of the tokens of a single-shot generation from a bigger model.
┌──────────────────────────────────────────┐
│ essarion-build SDK (this repo) │
│ │
your task ─► │ ┌──────────┐ ┌──────────┐ ┌────────┐ │ ─► structured
│ │ plan │ →│ draft │ →│selfcheck│ │ output
│ └──────────┘ └──────────┘ └────────┘ │ (plan, code,
│ + bundled skills + repo + docs + diffs │ defense, …)
│ │
└──────────────────────────────────────────┘
▲
any LLM provider (BYOK)
OpenRouter · Anthropic · OpenAI · Gemini
Ollama (local) · custom
The essarion CLI coding agent
Type essarion to launch the interactive coding agent. It uses the same SDK
under the hood — the agent IS the SDK, dressed in a TUI:
$ essarion
essarion build · CLI coding agent
by Essarion · amplifies any LLM with senior-engineer reasoning
session 20260528-195838-5e4b
cwd /home/you/work/myproject
model openrouter/openai/gpt-4o-mini
budget $0.000 / $1.00
skills 54 bundled, picker mode auto
type your task to begin · /help for commands · /quit to exit
────────────────────────────────────────────────────────────
you: review src/auth.py for JWT alg=none confusion
skills: auth_security web_security secure_coding scope_discipline error_handling
auto-loaded: src/auth.py
── plan ──
┌──────────────────────────────────────────────────────────────┐
│ plan │
│ 1. Verify the JWT lib's alg=none handling │
│ 2. Add an explicit allow-list of allowed algorithms │
│ 3. Reject tokens whose alg header is missing │
│ │
│ tradeoffs │
│ • chosen: whitelist (HS256 only) — closes alg=none family │
│ • rejected: blacklist — every new algorithm becomes risky │
│ │
│ verdict │
│ do not ship without resolving step 2 │
└──────────────────────────────────────────────────────────────┘
approve plan? (Enter=approve, e=edit, s=skip-to-draft, c=cancel) _
Why use the agent (over Claude Code / Codex / Aider / Cursor)
- Adaptive reasoning depth. Defaults to
effort="auto"— a tiny triage call sizes every task and routes trivial work to a 1-call plan while reserving the deep critique→revise loop for tasks with real stakes. You see the depth it chose. Override live with/effort deep. This is the whole bet: better reasoning, paid for only where it counts. - Plan-first interactivity. You see the plan and verdict BEFORE the draft is paid for. Edit it, reject it, or send it back. No other agent does this — most jump straight to writing code.
- Live token-budget meter + projected cost. Every session has a
configurable USD budget. Before each turn the agent prints a projected
cost based on your current context size. After each turn you see the
actual spend.
/cost <path>lets you estimate against a hypothetical context before sending it. - Smart skill selection. The 54 bundled skills aren't all loaded every turn — a fast keyword picker chooses the 3-5 most relevant ones. Big context savings on every call.
- Multi-model arbitrage. Plan + selfcheck on a cheap model;
--escalate <bigger-model>only kicks in if selfcheck rejects. Cheap by default, smart when it matters. - Project-aware.
essarion initcreates.essarion/{config.toml, sessions/, memory.md}per repo. The agent auto-detects the project root from.essarion/,.git/,pyproject.toml, etc. Per-project memory (/remember <fact>) and config flow into every turn. - Inline tool execution during planning. The model can emit
<tool_call name="read_file">…</tool_call>inside its plan; the agent runs the read-only tool (read_file, grep, glob, list_dir, find_files), folds the result back as a note, and re-plans. Up to 3 rounds. No user friction. - Background tasks.
/bg npm run devruns in parallel. The agent keeps working; completion notices fire between turns. /quit cleanly kills non-detached tasks via SIGTERM → SIGKILL on the process group. - Streamed draft output.
/stream onshows code as it's written, token by token. - Auto-verify + undo. Configure
[verify].auto=trueand the agent runs your test suite after every applied change. If it fails,/undoreverts the last change. - Reasoning-trace persistence. Every session saved to
<project>/.essarion/sessions/(or~/.essarion/sessions/). Replay withessarion --resume <id>. - The whole SDK is yours. Anything you can do in the agent, you can
do in code — same
reason(),generate(),Conversationcalls.
Quick commands
essarion # interactive REPL
essarion --task "review src/auth.py" # one-shot non-interactive
essarion --provider anthropic --model claude-sonnet-4-6
essarion --budget 5.00 --escalate claude-sonnet-4-6 # cheap+escalate
essarion --resume 20260528-195838-5e4b # continue a saved session
essarion --skills all # load every skill (vs auto)
essarion --effort deep # force deep reasoning every turn
essarion --effort quick # force minimal reasoning (cheapest)
# Subcommands fall through to the original CLI
essarion skills # list bundled skills
essarion providers
essarion reason "task" --json
essarion generate "task" --stream
Project folders
Run essarion init inside a repo and you'll get a .essarion/
directory with:
config.toml— per-project defaults (provider, model, budget, skills mode)sessions/— saved sessions live with the project instead of in~.gitignore— keeps sessions out of git
Once initialized, any time you launch essarion from anywhere inside
the project tree, the agent walks up to the project root, anchors the
sandbox there, and loads .essarion/config.toml. No .essarion/?
The agent still finds the project root via .git/, pyproject.toml,
package.json, Cargo.toml, go.mod, etc., and falls back to
~/.essarion/sessions/ for storage.
Background tasks
Long-running commands shouldn't block the agent. Start them in the background and they run in parallel while you keep planning:
you: /bg npm run dev
started [a3f9c1] pid=14852 · npm run dev
you: /bg pytest -q
started [b7e221] pid=14855 · pytest -q
you: /bg
background tasks
┏━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┓
┃ id ┃ status ┃ name ┃ elapsed ┃ exit ┃
┡━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━┩
│ a3f9c1 │ running │ npm run dev │ 12.4s │ │
│ b7e221 │ done │ pytest -q │ 4.7s │ 0 │
└────────┴─────────┴─────────────┴─────────┴──────┘
you: review src/auth.py
── plan ──
…(plan as normal while npm run dev keeps serving)…
[bg] [b7e221] pytest -q → done (exit 0, 4.7s) ← notice flushes here
/bg <cmd>— start one/bg detached <cmd>— start one that survives/quit/bg— list every task with status/bg show <id>— recent stdout/stderr/bg wait <id> [seconds]— block until done/bg kill <id>— terminate/bg clear— forget finished tasks
The same tools are registered with the SDK's tool registry, so the
model can call start_background / check_background /
wait_background / kill_background / list_background itself
during reasoning. The footer always shows bg N running when tasks
are alive; completion notices print between turns; /quit cleanly
kills every non-detached task (SIGTERM → grace → SIGKILL via process
group, so dev-server children die too).
Slash commands (inside the REPL)
Type /help inside the agent for the categorized view. The headline ones:
session
| Command | Description |
|---|---|
/whoami |
one-screen status: project + model + memory + budget + bg tasks |
/history |
list this session's turns |
/save / /load |
persist / list saved sessions |
/quit |
exit (saves; kills non-detached bg tasks) |
planning & workflows
| Command | Description |
|---|---|
/ask <q> |
quick reason() only, no draft phase (Q&A) |
/review <target> |
shortcut: workflows.review(<target>) |
/fix <bug> / /tests <target> / /refactor <target> |
other workflows |
/security <target> / /perf <target> |
security / performance review |
/docs <target> / /pr <target> / /explain <target> |
docs, PR description, code explanation |
models & cost
| Command | Description |
|---|---|
/model <p>/<m> |
switch provider/model mid-session |
/escalate <m> |
set escalation model (cheap → strong on reject) |
/budget [N] |
show or set USD budget |
/cost |
session cost ledger (per turn + total) |
/cost <path> |
estimate the cost of a turn against a path/dir |
/stream [on|off] |
toggle streamed draft output (token-by-token) |
skills & memory
| Command | Description |
|---|---|
/skills [auto|all|none] |
switch picker mode |
/remember <fact> |
append to .essarion/memory.md (per-project) |
/forget <pattern> |
remove facts matching a substring |
project & files
| Command | Description |
|---|---|
/cd <path> |
change sandbox directory |
/pwd |
print sandbox cwd |
changes & verify
| Command | Description |
|---|---|
/diff |
unified diff of every change this session |
/undo |
revert the most recent agent-applied change |
/commit [msg] |
git-commit the session's touched files |
/verify [cmd] |
run the project's check command (tests/lint) |
background
| Command | Description |
|---|---|
/bg <cmd> |
start a background shell command |
/bg [show|wait|kill|clear] <id> |
manage tasks |
safety
| Command | Description |
|---|---|
/yolo |
toggle auto-approval of side-effect tools |
Install
pip install essarion-build
Set your provider key. The default is OpenRouter, which gives you access to ~any model through one API:
export OPENROUTER_API_KEY=sk-or-...
Or pick another:
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...
# Ollama needs no key; runs locally
Quick start
from essarion_build import Context, reason, generate
ctx = (
Context()
.with_all_skills() # 54 bundled coding-practice skills
.add_repo("./") # ground in your codebase
.add_docs("https://datatracker.ietf.org/doc/html/rfc7519")
)
# Pure reasoning — returns a plan, no code yet
r = reason("harden JWT signature check", context=ctx)
print(r.plan) # the numbered reasoning trace
print(r.tradeoffs) # what was considered and rejected
print(r.verdict) # "ship" or "do not ship without X"
print(r.usage.total_tokens) # token cost across the whole loop
# Reason + produce code — returns reasoning AND a snippet
g = generate("harden JWT signature check", context=ctx)
print(g.code) # the proposed change
print(g.reasoning) # the underlying Reasoning object
print(g.defense) # one-paragraph "why this is safe to ship"
print(g.usage.total_tokens) # token cost across plan + draft + selfcheck
Adaptive reasoning effort — deep when it matters, cheap when it doesn't
The headline feature. A one-line rename shouldn't cost the same as hardening a JWT validator. The effort parameter spends tokens proportional to task difficulty:
| effort | reason() calls | what it does |
|---|---|---|
quick |
1 | plan only — trivial tasks |
standard |
2 | plan + adversarial self-check (default) |
deep |
4 | plan → critique the plan → revise it → self-check |
max |
6 | + explore an alternative plan → synthesize the best of both |
auto |
1 triage + above | a tiny triage call sizes the task 1–5, then routes to quick/standard/deep automatically |
from essarion_build import reason, Context
ctx = Context().with_all_skills().add_repo("./")
# Let Essarion size the task. Trivial → 1-2 calls; security-critical → 4.
r = reason("harden JWT signature check", context=ctx, effort="auto")
print(r.effort) # e.g. "deep" — triage decided this one was worth it
# Or pin the depth yourself.
r = reason("rename `cfg` to `config`", context=ctx, effort="quick") # 1 call
r = reason("design the migration", context=ctx, effort="max") # 6 calls
Why this is cheap AND deep: the extra calls in deep/max refine the plan — which is short — before any code is written. You pay a few hundred tokens to catch a design flaw, not thousands to regenerate a bad draft. The auto triage call caps its own output, so sizing a task costs almost nothing.
Set a global default or seed from the environment:
import essarion_build
essarion_build.configure(effort="auto") # every call sizes itself
# or: export ESSARION_EFFORT=auto
The essarion CLI agent defaults to effort="auto" — it sizes every task you give it and tells you the depth it chose. Change it live with /effort deep.
Bundled software-dev skills
A core idea: cheap models cost less but reason worse. essarion_build closes the gap by injecting senior-engineer skills into the context — short, focused markdown briefs the model reads alongside the task.
from essarion_build import Context, list_skills, load_skill
list_skills()
# [
# 'accessibility', 'api_design', 'auth_security', 'caching', 'cli_design',
# 'cloud_infra', 'code_review', 'code_smells', 'code_style', 'concurrency',
# 'data_modeling', 'database_design', 'debugging', 'dependency_injection',
# 'dependency_management', 'documentation', 'dx', 'error_handling',
# 'event_driven', 'feature_flags', 'git_workflow', 'go_idioms',
# 'incident_response', 'internationalization', 'kubernetes',
# 'llm_integration', 'logging', 'microservices', 'migrations',
# 'observability', 'performance', 'python_idioms', 'react_patterns',
# 'refactoring', 'release_engineering', 'rust_idioms', 'scope_discipline',
# 'secure_coding', 'sql_idioms', 'state_management', 'testing',
# 'typescript_idioms'
# ]
# Pick the ones relevant to your task...
ctx = Context().with_skills(["secure_coding", "auth_security", "error_handling"])
# ...or load them all (recommended default for coding tasks)
ctx = Context().with_all_skills()
# Or read one inline
print(load_skill("secure_coding"))
Each skill is a short, actionable brief. secure_coding covers input validation, output encoding, secret handling, and crypto defaults. scope_discipline covers staying within scope. testing covers what to test and how. The full set is bundled with the package; no network calls.
Custom skills
Need project-specific guidance? Inject your own:
ctx = (
Context()
.with_all_skills()
.with_custom_skill("house_style", open("./docs/style.md").read())
.with_skills_dir("./team_skills") # every *.md file in the dir
)
High-level workflows
For the common cases, skip the prompt engineering — call a workflow.
from essarion_build import workflows, Context
# Review a target with the review-default skill set
r = workflows.review("src/auth.py", repo="./")
# Fix a bug end-to-end (plan → patch → defense)
g = workflows.fix_bug("payment endpoint hangs on null email", repo="./")
# Generate tests for a public surface
g = workflows.write_tests("class JWTValidator", repo="./")
# Refactor with behavior-preserving guarantees
g = workflows.refactor("UserService.god_method", repo="./")
# Write docs in the existing house style
g = workflows.docs("public Context API", repo="./")
Each workflow picks a sensible default skill set, frames the task, and runs the loop. They're thin on purpose — drop down to reason() / generate() when you need full control.
Recipes
For the most common asks, skip prompt engineering — pull a recipe:
from essarion_build import Context, recipes, reason
task, skills = recipes.audit_for_race_conditions("the booking flow")
ctx = Context().with_skills(skills).add_repo("./src/booking")
r = reason(task, context=ctx)
Recipes ship for: race conditions, N+1 queries, type-annotation passes,
runbooks, API design, data migrations, hot-path optimization, endpoint
hardening, schema design. See essarion_build.recipes.list_recipes().
Streaming
See progress in real time. Yields a phase_start/token/phase_end/usage/complete event sequence:
from essarion_build import stream_generate, Context
for event in stream_generate("write a JWT validator", context=Context().with_all_skills()):
if event.kind == "token":
print(event.text, end="", flush=True)
elif event.kind == "phase_end":
print(f"\n--- {event.phase} done ---")
elif event.kind == "complete":
print(f"\nFinal usage: {event.usage}")
Providers with native token streams (Anthropic) emit fine-grained tokens; buffered providers emit one chunk per phase. The UI surface is the same either way.
Async API
areason() / agenerate() mirror the sync API:
import asyncio
from essarion_build import Context, agenerate
async def main():
g = await agenerate("write a JWT validator", context=Context().with_all_skills())
print(g.code)
asyncio.run(main())
Parallel batches
For independent tasks, run them concurrently:
from essarion_build import batch_generate, run_batch, Context
tasks = [
"review src/auth.py",
"review src/billing.py",
"review src/notifications.py",
]
results = run_batch(batch_generate(tasks, context=Context().with_all_skills(),
max_concurrency=4))
print(f"{len(results.ok)} succeeded, {len(results.errors)} failed")
One task's failure doesn't fail the rest — failures are returned as Exception instances in the result list.
Multi-turn conversations
When tasks build on each other, use Conversation. Each turn's plan + verdict is auto-summarized into the context so the next call can refer back:
from essarion_build import Conversation, Context
conv = Conversation(context=Context().with_all_skills().add_repo("./"))
# Turn 1: agree on the schema
r1 = conv.reason("design a users-and-orgs schema with row-level multitenancy")
# Turn 2: write the migration — the prior plan + verdict are in the context
g2 = conv.generate("write the SQL migration for the schema above")
# Turn 3: tests — same context, plus turns 1+2
g3 = conv.generate("write integration tests for the migration")
print(conv.usage.total_tokens) # aggregated across all 3 turns
print(len(conv.history)) # 3
forked = conv.fork() # branch for what-if scenarios
Diff-focused context
Reviewing a change? Don't load the whole repo; load the diff:
import subprocess
from essarion_build import workflows, Context
diff = subprocess.check_output(["git", "diff", "main"]).decode()
ctx = Context().with_all_skills().add_diff(diff, title="main..HEAD")
r = workflows.review("the change above", context=ctx)
Token budgeting and usage tracking
Every reason() and generate() result carries a usage field with prompt, completion, total, and provider-reported cached token counts:
r = reason("...", context=ctx)
print(r.usage)
# Usage(prompt_tokens=2618, completion_tokens=373, total_tokens=2991, cached_tokens=0)
Cap the per-call budget without changing the module default:
g = generate("...", context=ctx, max_tokens=1500)
Pre-flight estimate before sending:
print(ctx.estimate_tokens()) # ~1.2k? send it. ~120k? trim.
print(ctx.total_chars()) # raw character count
Or set the budget globally:
import essarion_build
essarion_build.configure(max_tokens=2000)
The runtime divides the budget across the 2 calls (reason) or 3 calls (generate) in the loop, plus any one-shot tag-repair retries (see below). Usage from those retries is included in the total.
Response cache
For dev iteration, skip duplicate provider calls:
from essarion_build import LiteRuntime, ResponseCache, CachingProvider, build_provider
provider = build_provider(name="anthropic", api_key=None, model="claude-sonnet-4-6")
cache = ResponseCache("./.essarion-cache")
cached_provider = CachingProvider(provider, cache, provider_name="anthropic")
# Use it as you would any provider:
rt = LiteRuntime(cached_provider)
Identical (system, messages, max_tokens) tuples are served from disk; cache misses populate it.
Post-generation validators
Cheap, deterministic checks on generated code (no model calls):
from essarion_build.validators import validate
g = generate("write a Python function …", context=ctx)
issues = validate(g.code, kind="python")
for i in issues:
print(f"{i.severity}: {i.message} (line {i.line})")
Built-in validators: python (syntax, bare except, mutable defaults, dangerous calls, TODO markers), json (RFC 8259 strictness), diff (unified-diff header check). Register your own with register_validator(kind, fn).
Cheap-model survival kit
Small models drop XML tags. When the model returns a selfcheck without the <defense> tag, the runtime asks once for just the missing tag(s) and merges the result. If even the repair pass fails, you get a typed ReasoningFormatError — not a silently empty defense field.
You don't have to opt in; this happens automatically inside LiteRuntime.
The @reasoned decorator
Mark functions you want the essarion-build CLI to enumerate. In normal Python execution the original body runs unchanged — the decorator just records the function in a module-level registry.
from essarion_build import reasoned
@reasoned(context=ctx)
def parse_jwt(token: str) -> Claims:
"""Parse a JWT and return validated claims."""
... # body is yours; the CLI uses this entry for future reason+generate runs
BYOK and providers
The Provider seam keeps essarion_build model-agnostic. v0.3 ships six concrete providers; the model you run is your choice:
| Provider | Env var | Default model | Notes |
|---|---|---|---|
openrouter (default) |
OPENROUTER_API_KEY |
openai/gpt-4o-mini |
Routes to ~any model behind one OpenAI-compatible API. The cheap-default story. |
anthropic |
ANTHROPIC_API_KEY |
(provide one, e.g. claude-sonnet-4-6) |
Direct to the Claude API. Uses prompt caching on the system block. Streaming supported. |
openai |
OPENAI_API_KEY |
(provide one, e.g. gpt-4o-mini) |
Direct to OpenAI. |
gemini |
GEMINI_API_KEY or GOOGLE_API_KEY |
(provide one, e.g. gemini-2.0-flash) |
Direct to Google Gemini. |
ollama |
(none) | (provide one, e.g. llama3.2) |
Local OSS models. Set OLLAMA_BASE_URL if not on localhost:11434. |
stub |
(none) | n/a | In-memory scripted responses for tests. |
Switch globally or per-call:
import essarion_build
# Stay on OpenRouter but use a stronger model
essarion_build.configure(model="anthropic/claude-sonnet-4.6") # OpenRouter slug
# Or switch provider entirely
essarion_build.configure(provider="anthropic", model="claude-sonnet-4-6")
# Per-call override
generate("...", provider="openai", model="gpt-4o-mini", max_tokens=1500)
Register a custom provider
from essarion_build import register_provider
class _MyProvider:
def __init__(self, *, api_key=None, model: str) -> None:
self.model = model
def complete(self, *, system, messages, max_tokens):
... # return ProviderResponse(text=..., usage=Usage(...))
register_provider("my-llm", lambda *, api_key, model: _MyProvider(api_key=api_key, model=model))
# Now usable just like a built-in:
generate("...", provider="my-llm", model="my-model-v1")
Async siblings (register_async_provider, build_async_provider, AsyncProvider protocol) follow the same shape.
Structured output (JSON-schema mode)
When you want a typed payload instead of free text, use generate_json:
from pydantic import BaseModel
from essarion_build import Context, generate_json
class ReviewFinding(BaseModel):
file: str
line: int
severity: str # "info" | "warning" | "error"
description: str
suggested_fix: str
parsed, gen = generate_json(
"review src/auth.py for a single finding",
schema=ReviewFinding,
context=Context().with_skill("code_review"),
)
finding = ReviewFinding(**parsed)
print(finding.severity, finding.suggested_fix)
If the model emits invalid JSON (or fails Pydantic validation), the
runtime automatically reframes the task with the validation error and
asks once more. After the second failure, SchemaValidationError is
raised with the last output for inspection.
Pass a raw JSON schema dict if you don't want Pydantic in the mix:
parsed, gen = generate_json("…", schema={"type": "object", "required": ["a", "b"]}, …)
Async sibling: agenerate_json(...).
Compaction
When your Context exceeds the token budget:
from essarion_build import Context, compact, truncate_files, keep_only_files
ctx = Context().with_all_skills().add_repo("./src")
print(ctx.estimate_tokens()) # e.g. 380_000
# Drop low-signal sections (repo files first, then docs) to fit a budget.
ctx = compact(ctx, max_tokens=40_000)
# Or cap each file body individually with a truncation marker.
ctx = truncate_files(ctx, max_chars_per_file=4_000)
# Or filter the file set by glob.
ctx = keep_only_files(ctx, patterns=["src/auth/*", "src/billing/*"])
compact() never drops skills, notes, or diffs — they are the
high-signal content.
Evals
A model-driven SDK without evals is a benchmark waiting to regress.
essarion_build.evals is a thin harness for running a labeled benchmark
against any callable runner:
from essarion_build.evals import EvalCase, contains_all, run_eval
from essarion_build import Context, reason
CASES = [
EvalCase(task="audit JWT alg=none confusion", expected="whitelist, alg=none"),
EvalCase(task="audit a SQL injection", expected="parameterized, prepared"),
]
def runner(task: str):
r = reason(task, context=Context().with_skill("secure_coding"))
return r.plan, r.usage # tuple → usage flows into the report
report = run_eval(CASES, runner, contains_all, name="security-v1")
print(report.summary()) # → "security-v1: 2/2 passed (100%, mean 1.00). Tokens: …"
# Compare against a baseline for CI gating
delta = report.delta(baseline_report)
if delta["regressed"]:
raise SystemExit(f"Regressions: {delta['regressed']}")
Built-in scorers: exact_match, contains_all, keyword_overlap.
Roll your own — it's just a function returning Score.
Tools (model-side)
essarion_build ships a small, opt-in tool surface that works on every
provider (no native tool-use required):
from essarion_build import register_tool, run_tools_in_plan, Context, tool_manifest
@register_tool("read_file", description="read a file from disk")
def _read(path: str) -> str:
return open(path).read()
ctx = Context().with_all_skills().add_note(tool_manifest())
# The model emits <tool_call name='read_file'>{"path": "…"}</tool_call>;
# you evaluate them before the next turn:
plan_with_results = run_tools_in_plan(plan_text, allow={"read_file"})
The allow set is a security boundary — unknown or disallowed tools
become inline <tool_result error="true">…</tool_result> instead of
running. Use this for the small-tool case (look up the schema, fetch a
URL); for full agent loops, build directly on Provider.complete().
CLI
The package installs an essarion-build console command. Useful for one-off coding tasks, CI scripts, and editor integrations:
# List bundled skills
essarion-build skills
# Print a skill's body
essarion-build skills --show secure_coding
# List recognized providers
essarion-build providers
# List bundled workflows
essarion-build workflows
# Estimate token cost of a context before sending
essarion-build estimate --repo ./ --json
# Run a reason() loop
essarion-build reason "review the auth flow" --repo ./src --json
# Stream a generate() loop's output
essarion-build generate "write a JWT validator" --repo ./src --stream
# Pipe a task from stdin
git diff main | essarion-build reason "review this change" -
Testing your essarion-build code
Wire StubProvider in for deterministic, no-network tests of your own workflows:
from essarion_build import Context, LiteRuntime, StubProvider, reason
def test_my_workflow():
stub = StubProvider(responses=[
"<plan>1</plan><tradeoffs>-</tradeoffs><verdict>ship</verdict>",
"<verdict>ship</verdict>",
])
r = reason("anything", context=Context(), _runtime=LiteRuntime(stub))
assert "ship" in r.verdict
assert stub.call_count == 2
Async sibling: AsyncStubProvider + AsyncLiteRuntime. See essarion_build.testing for helpers.
Error handling
Provider failures map to typed exceptions all rooted at EssarionError:
| Exception | When |
|---|---|
ProviderAuthError |
HTTP 401/403 (bad or missing key) |
ProviderRateLimitError |
HTTP 429 after exhausting retries |
ProviderHTTPError |
Other non-2xx, or network errors after retries |
ProviderResponseError |
Provider returned 2xx but the body was unparseable |
ReasoningFormatError |
Model output still missing required tags after one repair pass |
CloudRuntimeNotAvailable |
runtime="cloud" requested (not yet shipped) |
ProviderNotAvailable |
Unknown provider name |
ContextError |
Bad input to a Context method |
Transient HTTP errors (429, 5xx, connection errors) get up to 2 retries with exponential backoff before surfacing.
Lite vs Cloud
Both runtimes implement the same protocol:
| Runtime | What it does | Status in v0.3 |
|---|---|---|
LiteRuntime (default) |
Drives the 3-step reasoning loop locally via your provider key. Fast to set up. | Available |
CloudRuntime |
Sends the task to build.essarion.com for a heavier reasoning loop, longer context, and real Sourcipedia grounding. |
Stub — raises CloudRuntimeNotAvailable |
generate("...", runtime="cloud") # raises in v0
essarion_build.configure(runtime="cloud") # configure now, callable when Cloud ships
Interop hooks (stubs in v0.3)
These exist on Context so the API surface is right; implementations land when upstream APIs are exposed.
ctx.add_sourcipedia_topic("jwt") # placeholder source entry
ctx.add_agent_skill("auth_review") # Anthropic Agents skill manifest reference
from essarion_build.auth import from_platform_api # raises NotImplementedError in v0
Out of scope for v0.3
No plugin loader (custom providers + custom skills cover the same surface), no embeddings/RAG (use the Context.add_repo(include=...) filter), no telemetry, no model-side tool use.
License
Apache-2.0.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file essarion_build-0.3.0.tar.gz.
File metadata
- Download URL: essarion_build-0.3.0.tar.gz
- Upload date:
- Size: 247.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
91c91cf4b774dd458d80f418f4be7ce696a3129b8a40d88400f70315eaa9d920
|
|
| MD5 |
5452dcd0c4caf23f7fc245a7f4d7336d
|
|
| BLAKE2b-256 |
2b5823da52ad5cb588e335c77a4d07f10f121a08fe0b928398d29b90b1ba5d45
|
Provenance
The following attestation bundles were made for essarion_build-0.3.0.tar.gz:
Publisher:
publish.yml on hapi-developer/essarion_build
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
essarion_build-0.3.0.tar.gz -
Subject digest:
91c91cf4b774dd458d80f418f4be7ce696a3129b8a40d88400f70315eaa9d920 - Sigstore transparency entry: 1662692305
- Sigstore integration time:
-
Permalink:
hapi-developer/essarion_build@f001de7f6ae0607e0bc5f625f1b69ee520c443ba -
Branch / Tag:
refs/tags/v0.3.0 - Owner: https://github.com/hapi-developer
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@f001de7f6ae0607e0bc5f625f1b69ee520c443ba -
Trigger Event:
push
-
Statement type:
File details
Details for the file essarion_build-0.3.0-py3-none-any.whl.
File metadata
- Download URL: essarion_build-0.3.0-py3-none-any.whl
- Upload date:
- Size: 219.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
73aef82c55c58b0612eecfb01fdef2b350c569dc2ebb3170a298615a32eb830a
|
|
| MD5 |
516c3c09c78ac4b0e0fbc0ca78aee9e1
|
|
| BLAKE2b-256 |
2695028caec12b043ddb5796ef51cb1137a174b85b7f2b4fe769eb0b84eb5d24
|
Provenance
The following attestation bundles were made for essarion_build-0.3.0-py3-none-any.whl:
Publisher:
publish.yml on hapi-developer/essarion_build
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
essarion_build-0.3.0-py3-none-any.whl -
Subject digest:
73aef82c55c58b0612eecfb01fdef2b350c569dc2ebb3170a298615a32eb830a - Sigstore transparency entry: 1662692453
- Sigstore integration time:
-
Permalink:
hapi-developer/essarion_build@f001de7f6ae0607e0bc5f625f1b69ee520c443ba -
Branch / Tag:
refs/tags/v0.3.0 - Owner: https://github.com/hapi-developer
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@f001de7f6ae0607e0bc5f625f1b69ee520c443ba -
Trigger Event:
push
-
Statement type: