Subshell + MLX LLM-calling backends (Claude/Codex CLI, local MLX) shared across tools.

spawnllm

License: MIT

spawnllm centralizes the LLM-calling plumbing that small tools keep re-inventing. It drives the claude and codex CLIs as subshells, with structured Pydantic output, model tiers, and faithful error capture, and it runs local Apple-Silicon MLX models with adapter fusion, prompt-cache reuse, and batched generation. Depend on it once and each tool keeps only its domain logic instead of its own copy of the backends.

Install

No install needed — run everything through uvx:

uvx spawnllm --help

uvx fetches spawnllm into a throwaway environment and runs it. To add it to a project instead:

uv add spawnllm

For the local MLX engine (Apple Silicon only), pull the extra:

uv add "spawnllm[mlx]"

Quickstart

See which backends are installed and authenticated, and which one auto-selection picks:

uvx spawnllm status
claude: ready
codex: ready
selected: claude
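
As a rough illustration of what an installed-check plus auto-selection can look like, the sketch below probes PATH with shutil.which and picks the first hit. It is not spawnllm's implementation: the function names and the candidate order are assumptions, and it does not check authentication, which the real status command also reports.

```python
import shutil

# Hypothetical candidate order; spawnllm's real preference order is not
# stated in this README.
CANDIDATES = ("claude", "codex")


def installed_backends(candidates=CANDIDATES):
    """Map each CLI name to True if an executable with that name is on PATH."""
    return {name: shutil.which(name) is not None for name in candidates}


def select_backend(status):
    """Return the first backend marked available, or None if there is none."""
    for name, available in status.items():
        if available:
            return name
    return None


if __name__ == "__main__":
    status = installed_backends()
    for name, available in status.items():
        print(f"{name}: {'installed' if available else 'missing'}")
    print(f"selected: {select_backend(status)}")
```

The output depends on what is installed on the machine, so none is shown here.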

Make a request by passing a prompt as the argument, or piping it over stdin:

uvx spawnllm call --backend claude "What is 2+2? Reply with just the number."
4

--model small|medium|large selects the tier, which each backend maps to a concrete model. The claude backend resolves small to Haiku, medium to Sonnet, and large to Opus. Add --agent to let the call use tools.
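
The tier mapping can be pictured as a small lookup table. This is a minimal sketch, not the library's code: CLAUDE_TIERS and resolve_tier are hypothetical names, and the values are the model family names given above rather than the concrete model ids spawnllm passes to the CLI.

```python
# Tier-to-family table for the claude backend, as described above.
# Hypothetical name; the concrete model ids are not listed in this README.
CLAUDE_TIERS = {"small": "Haiku", "medium": "Sonnet", "large": "Opus"}


def resolve_tier(tier, table=CLAUDE_TIERS):
    """Look up a tier name, failing loudly on anything unrecognised."""
    try:
        return table[tier]
    except KeyError:
        expected = ", ".join(sorted(table))
        raise ValueError(f"unknown tier {tier!r}; expected one of: {expected}") from None


print(resolve_tier("medium"))  # Sonnet
```

Keeping the table per backend is what lets the same --model value mean something sensible for both claude and codex.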

From Python

call_sync runs one request and returns the response (its async companion, call, has the same signature). With no backend argument, call_sync auto-selects the first installed, authenticated CLI:

from spawnllm import call_sync

print(call_sync("Reply with just the word: pong"))
# pong

Pin a backend and tier explicitly, or pass a Pydantic model to get a validated object back instead of text:

from pydantic import BaseModel

from spawnllm import call_sync, ClaudeCliBackend


class Capital(BaseModel):
    country: str
    capital: str


result = call_sync(
    "What is the capital of France?",
    backend=ClaudeCliBackend(),
    model="large",
    response_model=Capital,
)
print(result.capital)  # Paris

When you don't pin a backend, set specialty= to scope auto-selection by task. The debugging and review specialties route to Codex, and general routes to Claude.
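
One way to read "scope auto-selection" is that the specialty narrows the candidate backends before the first available one is picked. The sketch below models that reading only. SPECIALTY_ROUTES and pick_backend are hypothetical names, the string spellings of the specialties are assumed, and returning None when the routed backend is unavailable is an assumption rather than documented behavior.

```python
# Routing table taken from the paragraph above. Tuples leave room for
# more than one candidate per specialty (an assumption).
SPECIALTY_ROUTES = {
    "debugging": ("codex",),
    "review": ("codex",),
    "general": ("claude",),
}


def pick_backend(specialty, available):
    """Return the first routed backend that is also in `available`, else None."""
    for name in SPECIALTY_ROUTES[specialty]:
        if name in available:
            return name
    return None


print(pick_backend("review", {"claude", "codex"}))  # codex
print(pick_backend("debugging", {"claude"}))  # None
```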

Spec-driven runs

For full control, build a RunSpec and execute it with run_sync (or its async companion run). A RunSpec takes a literal provider model id, with no tier mapping, and passes per-provider flags through provider_configs. The call retries transient 529/overloaded/rate-limit failures with backoff and returns a RunResult with raw stdout, stderr, and exit code:

from spawnllm import run_sync, RunSpec, ClaudeConfig, ClaudeCliBackend

result = run_sync(
    RunSpec(
        prompt="What is 2+2? Reply with just the number.",
        model="opus",
        provider_configs={"claude": ClaudeConfig(permission_mode="bypassPermissions")},
    ),
    backend=ClaudeCliBackend(),
)
print(result.stdout)  # 4
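
The retry behavior mentioned above follows a common pattern: classify the failure, and if it looks transient, wait and try again with a growing delay. Below is a generic sketch of exponential backoff under stated assumptions. The marker strings, attempt count, and delays are illustrative, and run_with_backoff is a hypothetical helper, not spawnllm's retry policy.

```python
import time

# Substrings treated as transient. An assumption, not spawnllm's actual list.
TRANSIENT_MARKERS = ("529", "overloaded", "rate limit")


def is_transient(stderr):
    """Heuristic: does the captured stderr look like a retryable failure?"""
    text = stderr.lower()
    return any(marker in text for marker in TRANSIENT_MARKERS)


def run_with_backoff(attempt, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `attempt()` until it succeeds, fails permanently, or tries run out.

    `attempt` returns (exit_code, stdout, stderr). The delay doubles after
    each transient failure: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for n in range(max_attempts):
        code, out, err = attempt()
        out_of_tries = n == max_attempts - 1
        if code == 0 or not is_transient(err) or out_of_tries:
            return code, out, err
        sleep(base_delay * 2**n)


if __name__ == "__main__":
    # Fake child process: one overloaded failure, then success.
    replies = iter([(1, "", "API Error: 529 overloaded"), (0, "4\n", "")])
    delays = []
    print(run_with_backoff(lambda: next(replies), sleep=delays.append))
    print(delays)  # [1.0]
```

Passing sleep in as a parameter keeps the example fast to run and easy to test, since the delays can be recorded instead of waited out.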

What problems does this solve?

Every tool that shells out to claude or codex rebuilds the same plumbing: argv construction, stdin/stdout piping, stderr teeing, and turning non-zero exits into useful errors. spawnllm implements it once.
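
For a sense of the plumbing in question, here is a minimal standard-library sketch: send a prompt on stdin, capture stdout and stderr, and raise an error that carries the child's stderr on a non-zero exit. CliError and run_cli are hypothetical names, not spawnllm's API, and this version captures stderr rather than teeing it live.

```python
import subprocess
import sys


class CliError(RuntimeError):
    """Non-zero exit from the child process, with its stderr preserved."""

    def __init__(self, argv, returncode, stderr):
        super().__init__(f"{argv[0]} exited with {returncode}: {stderr.strip()}")
        self.argv = argv
        self.returncode = returncode
        self.stderr = stderr


def run_cli(argv, prompt, timeout=60.0):
    """Run `argv`, send `prompt` on stdin, and return stdout as text."""
    proc = subprocess.run(
        argv,
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise CliError(argv, proc.returncode, proc.stderr)
    return proc.stdout


if __name__ == "__main__":
    # A Python one-liner stands in for the real CLI so the sketch runs anywhere.
    echo = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
    print(run_cli(echo, "pong").strip())  # PONG
```

Even this small version has to decide on timeouts, text decoding, and what an error should contain, which is the kind of detail that drifts when each tool keeps its own copy.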

Structured output is boilerplate too. A Pydantic model becomes a JSON-schema constraint and a parsed, validated result, identically for both CLI backends.
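
The two halves of that round trip can be shown with Pydantic alone (v2 API assumed). The sketch derives a JSON schema from the model and validates a reply string back into it. How spawnllm hands the schema to each CLI is not shown here, and the reply strings are made up for illustration.

```python
import json

from pydantic import BaseModel, ValidationError


class Capital(BaseModel):
    country: str
    capital: str


# 1. The model yields a JSON schema that can be used as an output constraint.
schema = Capital.model_json_schema()
print(json.dumps(schema["properties"], indent=2))

# 2. The raw reply text is parsed and validated back into the model.
reply = '{"country": "France", "capital": "Paris"}'  # illustrative reply
result = Capital.model_validate_json(reply)
print(result.capital)  # Paris

# 3. A reply that breaks the schema raises instead of passing through silently.
try:
    Capital.model_validate_json('{"country": "France"}')
except ValidationError as exc:
    print(f"rejected: {exc.error_count()} error(s)")
```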

Local MLX is fiddly. Adapter fusion, prompt-cache reuse, worker-thread lifecycle, and batched single-token generation live behind one engine instead of in every consumer.

Behavior drift goes away with the duplication: two tools that call the same models invoke them and parse their output the same way because they share the backend layer, not a pair of diverging copies.

Docs

Read the docs for the full guide and API reference.
