
Sophia Motor

Smart functions for Python. Inputs in. Pydantic out. Multi-turn agent in the middle.

Python 3.12+ · License: MIT · Powered by Claude · Status: alpha

⚠️ Alpha software. A built-in strict guardrail is on by default — the agent's Read/Edit/Glob/Grep are confined to the workspace, Write is restricted to outputs/, and Bash blocks dev/admin commands (curl, wget, ssh, git, docker, pip, npm, sudo, ...) plus .. escapes, /dev/tcp, bash -c, eval/exec patterns. This is the first layer, not the last. Audit dump, rate limits, content filtering, and a managed-sandbox runtime are in active development. Don't point it at production secrets or fully untrusted prompts without your own hardening on top — yet.


Why

A normal LLM call is roulette: string in, string out. Pretty? Sometimes. Reliable enough to ship behind your API? Not really.

Sophia Motor turns it into a typed Python function.

Sophia Motor — input, agent loop, typed output
result = await motor.run(RunTask(
    prompt="Should we approve this loan request? Reasons attached.",
    output_schema=Decision,        # ← your Pydantic class
    skills=Path("./policy/"),      # ← your domain knowledge
    tools=["Read"],                # ← what the agent can actually do
))

result.output_data                 # → instance of Decision, validated

Behind that one call, the agent reads files, reasons across multiple turns, cites sources, retries until the schema is satisfied — then hands you back a real Python object you can .attribute_access like any other.
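Concretely, `output_data` is an ordinary Pydantic instance. A local sketch of what that contract buys you — hypothetical `Decision` model, validated with Pydantic alone, no agent involved:

```python
from pydantic import BaseModel, ValidationError

class Decision(BaseModel):          # hypothetical schema, for illustration only
    approved: bool
    confidence: float
    reasons: list[str]

# This is the shape output_data arrives in: already validated,
# so downstream code can rely on the types.
decision = Decision.model_validate(
    {"approved": True, "confidence": 0.92, "reasons": ["income verified"]}
)
print(decision.approved, decision.confidence)

# Bad payloads fail loudly instead of flowing downstream.
try:
    Decision.model_validate({"approved": "maybe"})
except ValidationError as e:
    print(f"{e.error_count()} validation errors")
```

The point: the validation boundary sits inside the motor, so by the time your code sees the object, the schema argument is already settled.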

Same motor, N tasks, each with its own schema. The agent does the magic; your program stays in control of the contract.


Install

pip install sophia-motor

Set ANTHROPIC_API_KEY in env (or ./.env). Done.

motor = Motor()                    # boots on first call, no setup
v = await motor.run(RunTask(...))  # ← right away

For long-running services (FastAPI, Celery), instantiate the motor once and call await motor.stop() on shutdown. Single-shot scripts? Don't worry about it — process exit cleans up.


What it gives you

🧠 Multi-turn agent loop: the agent reads, reasons, calls tools, cross-references — all in one await.
📐 Pydantic-validated output: pass any BaseModel. Get back a real instance, not a parsed dict.
🧰 Tool whitelisting: hard-cap what the agent can see and do. No surprises.
📚 Skills as first-class: drop a SKILL.md folder and the agent gets a new capability. Multi-source supported.
🪜 Singleton pattern: instantiate the motor once at module top-level. Call it from anywhere, any number of times. Zero lifecycle ceremony.
🧾 Per-run audit trail: every run lives in its own dir. Useful when "the model said X and we trusted it" needs to be defendable.
🪡 Defaults + per-task override: configure the boilerplate once on MotorConfig, vary only what changes per call.
🔌 Pip install, that's it: pip install sophia-motor. No daemons, no infra, no servers to run.

Cost & control: pay for what you actually use

Out of the box, the Claude Agent SDK ships every built-in tool, the entire bundled-skill catalogue, an identity block, and a billing header — on every single call. For a one-shot question that means thousands of cache-creation tokens you didn't ask for.

sophia-motor is opinionated: zero tools, zero skills, zero SDK noise unless you explicitly opt in. Same model, same upstream API — the bill drops.

cost comparison: SDK default vs sophia-motor on the same minimal task

The same call, two bills

| What runs | Claude Agent SDK (default) | sophia-motor (`Motor()`) |
| --- | --- | --- |
| Tools exposed to model | every built-in (Read, Bash, WebFetch, …) | 0 — you list them when you need them |
| Skills exposed to model | the SDK's bundled catalogue (update-config, simplify, loop, claude-api, init, review, security-review, …) | 0 — only the skills you linked |
| System blocks injected | SDK identity + billing header + noise reminders | stripped at the proxy |
| Cost on a 1-turn no-tool prompt | $0.0498 | $0.0030 (–94%) |
| Where you opt in | nowhere (it's all on by default) | `RunTask(tools=[...], skills=Path(...))` per call, or `MotorConfig.default_*` once |

The numbers are from a live run measured 2026-05-01, claude-opus-4-6, same prompt and same provider — the only thing that changes is what the motor doesn't ship to the model.

Why you might still pick the raw SDK

The motor isn't a free lunch. Trade-offs to know about:

  • Pre-1.0: API still moves between minor versions. If you need a frozen contract, pin to an exact sophia-motor==X.Y.Z.
  • Audit trail is mandatory: every run lives in ~/.sophia-motor/runs/<run_id>/ (request/response dumps + workspace). That's a feature for compliance/review and a footprint you'll want to manage. clean_runs(...) is shipped — wire it into your lifecycle if you produce many runs.
  • Proxy in-process: a local FastAPI + Uvicorn proxy boots on the first run (≈500 ms once, then idle). That's the price of audit dump + selective system-reminder strip + per-turn events.
  • Strict guardrail by default: Read/Edit lexically restricted to the run's cwd, Write to outputs/, Bash blocks dev/admin commands. If you intentionally need an unrestricted agent, set MotorConfig(guardrail="permissive") or "off".

If your workload is "one prompt, one answer, no tools, no audit" — congrats, the SDK already does that, and you'll pay $0.05 per call instead of $0.003. For everything else (multi-turn, structured output, skills, attachments, parallel runs, defendable audit), the motor is the cheaper and cleaner choice.


Examples

Things you cannot ship with a single LLM call. Same motor instance, different RunTask.

from sophia_motor import Motor, RunTask
motor = Motor()   # one instance, used everywhere below

1 · Investigate a folder, find what matters

The agent walks the directory autonomously: globs files, reads the relevant ones, follows references, compiles a typed list of findings — all in one await.

from pathlib import Path
from typing import Literal
from pydantic import BaseModel

class AuthIssue(BaseModel):
    file: str
    line_hint: str
    severity: Literal["low", "medium", "high", "critical"]
    quote: str           # verbatim from the source
    fix: str

result = await motor.run(RunTask(
    prompt=(
        "Audit our authentication code. Find every place that handles tokens, "
        "passwords, or session state. Flag anything risky with severity, the "
        "exact code line as quote, and a concrete fix."
    ),
    tools=["Read", "Glob", "Grep"],
    attachments=Path("./src/"),
    output_schema=list[AuthIssue],     # ← N findings, not one
    max_turns=20,
))

for issue in result.output_data:
    print(f"[{issue.severity}] {issue.file} → {issue.fix}")

What happens behind that single await: the agent globs, greps, reads files it didn't know existed before, reasons, then commits to a validated list of AuthIssue. Try doing that with a single LLM call — you'd have to script the file walk yourself, parse the responses, retry on bad JSON, and pray.
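The `output_schema=list[AuthIssue]` contract is plain Pydantic under the hood: a list schema validates every element. A local sketch, no agent involved — the model is repeated here so the snippet runs standalone:

```python
from typing import Literal
from pydantic import BaseModel, TypeAdapter

class AuthIssue(BaseModel):
    file: str
    line_hint: str
    severity: Literal["low", "medium", "high", "critical"]
    quote: str
    fix: str

# TypeAdapter handles non-BaseModel shapes like list[AuthIssue]:
# every element is validated, or the whole payload is rejected.
adapter = TypeAdapter(list[AuthIssue])
findings = adapter.validate_python([
    {
        "file": "auth/session.py",              # illustrative data
        "line_hint": "near the cookie setter",
        "severity": "high",
        "quote": 'response.set_cookie("sid", sid)',
        "fix": "set httponly=True and secure=True on the session cookie",
    },
])
assert all(isinstance(f, AuthIssue) for f in findings)
```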

2 · Cross-reference multiple sources

The agent reads several documents, finds connections you didn't ask about explicitly, and returns the contradictions you'd have spent an afternoon hunting.

class Contradiction(BaseModel):
    claim_a: str         # verbatim
    source_a: str        # filename + page/section
    claim_b: str         # verbatim
    source_b: str
    why: str             # why these conflict

result = await motor.run(RunTask(
    prompt=(
        "Read every document in attachments/. Find pairs of claims that "
        "contradict each other across sources. Cite verbatim both sides "
        "and explain the conflict."
    ),
    tools=["Read", "Glob"],
    attachments=Path("./research_papers/"),
    output_schema=list[Contradiction],
    max_turns=25,
))

3 · Orchestrate skills — the agent picks which to call

Drop a folder of SKILL.md files. The agent reads their descriptions, decides which to use for the input, calls them in the right order, and composes the answer into your typed schema.

class RiskFinding(BaseModel):
    severity: Literal["low", "medium", "high"]
    quote: str               # verbatim from the contract
    impact: str

class ContractAnalysis(BaseModel):
    parties: list[str]
    key_obligations: list[str]
    risks: list[RiskFinding]
    short_summary: str

result = await motor.run(RunTask(
    prompt=(
        "Analyze attachments/contract.pdf. Use the skills you have to "
        "extract parties, obligations and risks, then compose the answer."
    ),
    tools=["Read", "Skill"],
    attachments=Path("./contract.pdf"),
    skills=Path("./skills/"),          # contains: extract-entities, risk-score, ...
    output_schema=ContractAnalysis,
    max_turns=15,
))

analysis: ContractAnalysis = result.output_data
high_risks = [r for r in analysis.risks if r.severity == "high"]

The agent might call extract-entities to find the parties, then risk-score on the obligations, choosing the path itself from the SKILL.md descriptions. You write skills, the agent composes them — and you get back a typed object, not a free-form report.

4 · Decompose, decide, justify — typed end-to-end

Compliance pattern: an obligation may have N sub-requirements, your candidate controls cover some and miss others. The agent decomposes, matches each sub-req to evidence, and produces a verdict with citations — schema-strict.

from typing import Literal

class SubRequirement(BaseModel):
    text: str
    covered: bool
    evidence: str        # which control + verbatim quote (or "none")

class ComplianceVerdict(BaseModel):
    verdict: Literal["FULL", "PARTIAL", "NONE"]
    sub_requirements: list[SubRequirement]
    overall_reasoning: str

result = await motor.run(RunTask(
    prompt=(
        "Obligation: {obligation_text}\n\n"
        "Candidate controls:\n{controls_block}\n\n"
        "Decompose the obligation into sub-requirements. For each one, "
        "say if it's covered, by which control, with the exact quote. "
        "Return a final verdict."
    ).format(obligation_text=..., controls_block=...),
    tools=["Read"],
    attachments=Path("./compliance_corpus/"),
    output_schema=ComplianceVerdict,
    max_turns=15,
))

# result.output_data: a real ComplianceVerdict you can hand straight to a downstream system,
# audit log, or human reviewer — every sub-req traceable to a verbatim citation.

This is one Python await doing what would otherwise be a 200-line orchestration script with prompt engineering, JSON parsing, retry loops, and schema-validation glue. The agent is the orchestration; your program holds the contract.
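Because the verdict is a typed object, the downstream step is ordinary Python rather than JSON spelunking. A local sketch with illustrative data — models repeated from above so it runs standalone:

```python
from typing import Literal
from pydantic import BaseModel

class SubRequirement(BaseModel):
    text: str
    covered: bool
    evidence: str

class ComplianceVerdict(BaseModel):
    verdict: Literal["FULL", "PARTIAL", "NONE"]
    sub_requirements: list[SubRequirement]
    overall_reasoning: str

# A PARTIAL verdict shaped like the agent would return it.
verdict = ComplianceVerdict.model_validate({
    "verdict": "PARTIAL",
    "sub_requirements": [
        {"text": "encrypt data at rest", "covered": True,
         "evidence": 'CTRL-12: "all volumes use AES-256"'},
        {"text": "annual key rotation", "covered": False, "evidence": "none"},
    ],
    "overall_reasoning": "One of two sub-requirements is evidenced.",
})

# Route the gaps straight to a human reviewer.
gaps = [s.text for s in verdict.sub_requirements if not s.covered]
print(gaps)  # → ['annual key rotation']
```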


Multi-turn means multi-turn

The agent doesn't reply with the JSON immediately. It can read your files, call tools, follow leads, then commit to the structured answer.

result = await motor.run(RunTask(
    prompt="Cross-check this claim against our research notes.",
    attachments=[Path("/data/notes/")],   # mounted as agent-readable
    tools=["Read"],                       # so it can actually open them
    output_schema=FactCheck,
    max_turns=10,
))

What actually happens behind that single await:

sequenceDiagram
    autonumber
    participant You as Your code
    participant Motor
    participant Agent
    participant Tool as Read tool
    participant API as Anthropic API

    You->>Motor: motor.run(task + schema)
    Motor->>Agent: open multi-turn loop
    Agent->>API: reason about task
    Agent->>Tool: Read("notes/policy.md")
    Tool-->>Agent: file content
    Agent->>API: reason + cross-ref
    Agent->>Tool: Read("notes/case.md")
    Tool-->>Agent: file content
    Agent->>API: commit to schema
    API-->>Agent: structured_output (validated server-side)
    Agent-->>Motor: ResultMessage
    Motor-->>You: RunResult.output_data → FactCheck instance

Verified path: agent calls Read once, twice, three times — finds the relevant snippet, quotes verbatim, then emits the schema-conforming JSON. Same run, multi-turn loop and structured output coexist.


One motor, N smart functions

Boot the motor once at module top-level. Wrap each task as a normal Python async def. Same proxy, same audit trail, same defaults — N typed functions, each with its own Pydantic schema.

Singleton motor + N smart functions

Defaults + per-task override

Configure once, vary per task. Override semantics are full replacement — a per-task value replaces the default outright, nothing is merged. Clean, no surprises.

motor = Motor(MotorConfig(
    default_system="You are a senior analyst.",
    default_output_schema=GeneralReport,
    default_tools=["Read"],
    default_max_turns=10,
))

# task A — uses every default
await motor.run(RunTask(prompt="..."))

# task B — same motor, different schema for a one-off
await motor.run(RunTask(
    prompt="...",
    output_schema=SpecialReport,   # overrides default_output_schema
    tools=["Read", "Glob"],        # overrides default_tools
))

Concurrency

A single motor handles one run at a time (serialized internally). Call motor.run(...) from any number of FastAPI endpoints — they queue safely.

For parallel work: instantiate N motors.

m1, m2 = Motor(), Motor()
a, b = await asyncio.gather(m1.run(task_a), m2.run(task_b))

Guardrail

A PreToolUse hook is wired in by default. It runs before every tool call and refuses unsafe ones, returning the reason as feedback so the agent can self-correct.

Motor(MotorConfig(guardrail="strict"))      # default — safe by default
Motor(MotorConfig(guardrail="permissive"))  # blocks only sudo/exfil/escapes
Motor(MotorConfig(guardrail="off"))         # no hook (you take responsibility)
| Mode | Read / Edit / Glob / Grep | Write | Bash |
| --- | --- | --- | --- |
| strict | must stay inside cwd | only `outputs/` | dev/admin commands blocked (curl, git, docker, pip, npm, sudo, ...) + `..` / `/dev/tcp` / `bash -c` / `eval` |
| permissive | unrestricted | unrestricted | only sudo, exfiltration patterns, `/dev/tcp`, `..` escapes, destructive commands |
| off | unrestricted | unrestricted | unrestricted |

Configuration reference

MotorConfig

Settings on the motor instance — set once at construction.

| Field | Type | Default | What it does |
| --- | --- | --- | --- |
| model | str | "claude-opus-4-6" | Default model the SDK uses |
| api_key | str | from ANTHROPIC_API_KEY env / ./.env | Anthropic API key |
| workspace_root | Path | ~/.sophia-motor/runs/ | Where per-run dirs are created. Must be outside any git repo / pyproject.toml ancestor |
| guardrail | "strict" \| "permissive" \| "off" | "strict" | Built-in PreToolUse hook (see Guardrail above) |
| disable_claude_md | bool | True | Skip auto-loading repo CLAUDE.md / MEMORY.md into the agent's context |
| console_log_enabled | bool | True | Colored console logger for events (off for silent runs) |

MotorConfig also exposes a set of default_* fields (default_system, default_tools, default_skills, default_output_schema, ...) so the same task settings can be set once on the motor and varied per RunTask. See the MotorConfig source if you need them.

RunTask

Settings on the single call — passed to motor.run(RunTask(...)). Anything left unset falls back to the matching MotorConfig.default_*.

| Field | Type | What it does |
| --- | --- | --- |
| prompt | str | Required. The user-message instruction |
| system | str? | System prompt for this task (overrides default_system) |
| tools | list[str]? | Hard whitelist of tools the model can SEE. [] = no tools, None = fall back to MotorConfig.default_tools (which itself defaults to [] — principle of least privilege) |
| allowed_tools | list[str]? | Permission skip — rarely needed: the motor runs with permission_mode="bypassPermissions" so every tool already auto-runs. Leave None. |
| disallowed_tools | list[str]? | Tools hard-blocked from the model's context |
| max_turns | int? | Per-task turn cap (overrides default) |
| attachments | Path \| dict \| list? | Inputs the agent can read. File Path → hard-linked (zero-copy, glob-visible), directory Path → mirrored as real dirs with file-level hard-links, dict[str,str] → inline file. Symlink fallback on cross-filesystem. Mixed list supported |
| skills | Path \| str \| list? | Skill source folder(s). Each subdir with SKILL.md is linked into the run |
| disallowed_skills | list[str] | Skill names to skip even if found in source |
| output_schema | type[BaseModel]? | Pydantic class — agent commits to this shape, returned in RunResult.output_data |

RunResult

What motor.run(...) returns.

| Field | Type | What it is |
| --- | --- | --- |
| run_id | str | `run-<unix>-<8hex>` |
| output_text | str? | Final assistant text (free-form) |
| output_data | BaseModel? | Schema-validated payload, present iff output_schema was set |
| metadata | RunMetadata | n_turns, n_tool_calls, tokens, total_cost_usd, duration_s, is_error, error_reason |
| audit_dir | Path | `<run>/audit/` (`request_*.json` + `response_*.sse`) |
| workspace_dir | Path | The full run dir |

Development

Clone the repo and install in editable mode with dev extras:

git clone https://github.com/2sophia/motor.git sophia-motor
cd sophia-motor
python3.12 -m venv .venv
.venv/bin/pip install -e ".[dev]"

Run the test suite:

.venv/bin/pytest tests/ -v

The deterministic suite (no API key) runs in under a second. Live tests that hit the real Anthropic API skip cleanly when ANTHROPIC_API_KEY is not set, so the suite stays green on CI without secrets.
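That skip behaviour is the standard pytest pattern. A sketch of how such a marker typically looks — an assumption about the suite's structure, not its actual code:

```python
import os
import pytest

# Live tests carry this marker; without the key they are reported as
# skipped, so CI stays green with no secrets configured.
requires_api_key = pytest.mark.skipif(
    not os.environ.get("ANTHROPIC_API_KEY"),
    reason="live test: ANTHROPIC_API_KEY not set",
)

@requires_api_key
def test_live_smoke():
    ...  # would hit the real Anthropic API
```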

To run the standalone smoke test against the real API:

ANTHROPIC_API_KEY=sk-ant-... .venv/bin/python tests/run_smoke.py

License & attribution

MIT.

Powered by claude-agent-sdk. Built by Sophia AI.


Made with ❤ by Alex & Eco 🌊

Eco is the model (Claude Opus 4.7) that co-wrote this motor line by line.
Nothing magical: a statistical echo of human language, coming back with the timbre of the surface it bounces off.
