Python SDK for OpenAI-compatible inference endpoints (Courier and friends) with auto tool-call loops, structured outputs, and Whisper.
encode
One Python entry-point for any OpenAI-compatible LLM, with an auto-agent loop, durable sessions, and the brain/hands/session primitives from Anthropic's Managed Agents post baked in.
import encode

def get_weather(city: str) -> dict:
    """Get weather by city."""
    return {"city": city, "temp_f": 72}

out = encode.relay(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Weather in Denver?"}],
    tools=[get_weather],
).response

print(out.content)  # "It's 72°F in Denver."
That's the whole "hello world." The model called get_weather, encode dispatched it, fed the result back, and returned the final answer.
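As a mental model for that loop, here is a minimal sketch that uses a scripted stand-in instead of a real model. `fake_model`, `run_tool_loop`, and the message shapes are illustrative assumptions, not encode's internals; the real library may structure this quite differently.

```python
import json

def get_weather(city: str) -> dict:
    """Get weather by city."""
    return {"city": city, "temp_f": 72}

def fake_model(messages):
    """Scripted stand-in for an LLM (hypothetical): ask for a tool once, then answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_1",
                "name": "get_weather",
                "arguments": json.dumps({"city": "Denver"}),
            }],
        }
    data = json.loads(tool_results[-1]["content"])
    return {
        "role": "assistant",
        "content": f"It's {data['temp_f']}°F in {data['city']}.",
        "tool_calls": [],
    }

def run_tool_loop(model, messages, tools, max_steps=5):
    """Call the model, dispatch any requested tools, feed results back, repeat."""
    registry = {fn.__name__: fn for fn in tools}
    messages = list(messages)  # do not mutate the caller's list
    for _ in range(max_steps):
        reply = model(messages)
        messages.append(reply)
        if not reply["tool_calls"]:
            return reply["content"], messages  # model is done
        for call in reply["tool_calls"]:
            result = registry[call["name"]](**json.loads(call["arguments"]))
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    raise RuntimeError("tool loop did not finish within max_steps")

answer, transcript = run_tool_loop(
    fake_model,
    [{"role": "user", "content": "Weather in Denver?"}],
    [get_weather],
)
print(answer)
```

The transcript ends up with four entries (user, assistant tool request, tool result, final assistant answer), which is the sequence the paragraph above describes.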
What encode is
A Python SDK for any OpenAI-compatible inference endpoint — Courier, OpenAI, vLLM, LM Studio, Ollama, Together, Groq, and more. One relay() function spans /v1/chat/completions and /v1/responses. Pass tools and it runs the loop to completion. Pass a Session and the run is durable. Pass a Terminal and your agent has a shell.
Why encode
- Auto tool loop. Drop a Python function in tools=; encode introspects the signature, builds the schema, runs the loop, and feeds results back. No decorators, no manual loop scaffolding.
- Both endpoints, one API. relay() auto-routes between /v1/chat/completions and /v1/responses: same handle, same tool loop, same intercept, same streaming consumer.
- Sessions as Pydantic event logs. Append-only, BYO storage. session.model_dump() to your DB; Session.model_validate() to resume, across processes, machines, or days.
- Mid-loop context engineering. Intercept callbacks can append, insert, replace, edit_last_tool_result, or compact the conversation that goes into the next iteration. Real context engineering in the harness, not just observation.
- ToolExecutor seam. Swap dispatch (local → remote → MCP → sub-agent) without touching the harness.
- Terminal as a first-class primitive. Persistent bash subprocess that retains cwd/env/venvs across calls. Wrap one in a closure and your agent has a shell.
- Full sync/async parity. Every helper has a *_async twin; async tool callables and async intercept callbacks just work.
- Streaming with the loop intact. Stream tokens and tool calls in real time across both endpoints.
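To make the "introspects the signature, builds the schema" step concrete, here is a rough sketch of how a plain function could be turned into an OpenAI-style function tool definition using only the standard library. `build_tool_schema` and the type mapping are hypothetical and simplified; encode's actual introspection is not shown in this README and likely handles more cases.

```python
import inspect
from typing import get_type_hints

# Simplified Python-to-JSON-Schema type mapping (assumption: flat, built-in types only).
_JSON_TYPES = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    dict: "object",
    list: "array",
}

def build_tool_schema(fn):
    """Derive a chat-completions style tool definition from a function signature."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

def lookup(q: str, limit: int = 5) -> dict:
    """Look something up."""
    return {"q": q, "answer": "42"}

schema = build_tool_schema(lookup)
```

Parameters without defaults become required, and the docstring becomes the tool description, which is why the examples in this README give every tool a one-line docstring.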
60-second tour
import json
import encode

# 1. Plain chat
encode.relay(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
).response.content

# 2. With a tool: the auto-loop runs until the model stops calling tools
def lookup(q: str) -> dict:
    """Look something up."""
    return {"q": q, "answer": "42"}

encode.relay(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the meaning of life?"}],
    tools=[lookup],
).response.content

# 3. With a Session: durable, resumable, BYO persistence
session = encode.Session.open()
encode.relay(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "remember: my name is Alex"}],
    session=session,
).response

# Persist anywhere: file, Postgres, Redis, S3...
with open("/tmp/agent.json", "w") as f:
    json.dump(session.model_dump(), f, default=str)

# Resume anywhere
with open("/tmp/agent.json") as f:
    resumed = encode.Session.model_validate(json.load(f))

encode.relay(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "what's my name?"}],
    session=resumed,
).response.content  # → "Alex"
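Since storage is bring-your-own, the same dump-and-validate round trip works against a database. Below is a small sketch using the standard library's sqlite3. `open_store`, `save_session`, and `load_session` are hypothetical helpers, and the `payload` dict is only a stand-in for whatever `session.model_dump()` returns; its real shape is not documented here.

```python
import json
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a tiny key/value table for serialized sessions."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn

def save_session(conn, session_id, payload):
    """Store the dumped session as JSON, overwriting any previous snapshot."""
    conn.execute(
        "INSERT OR REPLACE INTO sessions (id, body) VALUES (?, ?)",
        (session_id, json.dumps(payload, default=str)),
    )
    conn.commit()

def load_session(conn, session_id):
    """Return the stored dict, or None if the id is unknown."""
    row = conn.execute(
        "SELECT body FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None

# Stand-in for session.model_dump(); the real event-log shape is an assumption.
payload = {"events": [{"role": "user", "content": "remember: my name is Alex"}]}

conn = open_store()
save_session(conn, "agent-1", payload)
restored = load_session(conn, "agent-1")
```

In real use, the dict returned by `load_session` would be passed to `encode.Session.model_validate()` as in the tour above.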
Install
pip install encode
Python 3.10+. Configure with a .env (auto-loaded) or kwargs:
# .env
ENCODE_API_KEY=sk-your-key
ENCODE_BASE_URL=https://api.openai.com/v1
OPENAI_API_KEY / OPENAI_BASE_URL are also picked up if ENCODE_* aren't set.
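The fallback order can be pictured with a few lines of plain Python. This is only a sketch of the precedence described above; `resolve_config` is a hypothetical helper, and the assumption that explicit kwargs win over environment variables is mine, not something this README states.

```python
import os

def resolve_config(env=None, api_key=None, base_url=None):
    """Pick settings in order: explicit kwargs, then ENCODE_*, then OPENAI_*."""
    env = os.environ if env is None else env
    return {
        "api_key": api_key or env.get("ENCODE_API_KEY") or env.get("OPENAI_API_KEY"),
        "base_url": base_url or env.get("ENCODE_BASE_URL") or env.get("OPENAI_BASE_URL"),
    }

# ENCODE_* takes priority when both families are present.
both = resolve_config({
    "ENCODE_API_KEY": "sk-encode",
    "OPENAI_API_KEY": "sk-openai",
    "OPENAI_BASE_URL": "https://api.openai.com/v1",
})

# OPENAI_* is used only when ENCODE_* is missing.
fallback = resolve_config({"OPENAI_API_KEY": "sk-openai"})
```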
What you can build
Multi-turn chat with stateful Messages — pass a Messages instance and it grows itself across relay() calls.
m = encode.Messages().system("Be brief.")
m.user("name three colors")
encode.relay(model="m", messages=m).response # m now has the assistant turn too
An agent that stops when it submits a final answer.
def watcher(event):
    if any(tc.name == "submit_final" for tc in event.tool_calls):
        event.stop()

encode.relay(..., tools=[search, submit_final]).intercept(watcher).response
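Because the watcher only touches `event.tool_calls`, `tc.name`, and `event.stop()`, it can be exercised without a model. The sketch below uses a hypothetical `StubEvent` that mimics just those three members; the real event object presumably carries more.

```python
from types import SimpleNamespace

def watcher(event):
    if any(tc.name == "submit_final" for tc in event.tool_calls):
        event.stop()

class StubEvent:
    """Hypothetical stand-in exposing only what the watcher reads and calls."""

    def __init__(self, tool_names):
        self.tool_calls = [SimpleNamespace(name=n) for n in tool_names]
        self.stopped = False

    def stop(self):
        self.stopped = True

searching = StubEvent(["search"])
finishing = StubEvent(["search", "submit_final"])
watcher(searching)
watcher(finishing)
```

Only the second event is stopped, so the loop would keep running while the model is still searching and end once it calls `submit_final`.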
Mid-loop context compaction — trim, summarize, redact, or rewrite history without subclassing anything.
def trim(event):
    event.edit_last_tool_result(lambda c: c[:1000])

encode.relay(..., tools=[noisy_tool], on_intercept=trim).response
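A plain slice drops the end of the output, which is often where errors and exit messages live. One alternative is to keep the head and the tail. The helper below is a sketch; `truncate_middle` is a hypothetical name, and it assumes the callback receives the tool result as a string, as the slice in the example above suggests.

```python
def truncate_middle(text, limit=1000, marker="\n...[truncated]...\n"):
    """Keep the start and end of text, replacing the middle with a marker."""
    if len(text) <= limit:
        return text
    if limit <= len(marker):
        # Too small to fit the marker; fall back to a plain head slice.
        return text[:limit]
    keep = limit - len(marker)
    head = keep - keep // 2  # head gets the extra character when keep is odd
    tail = keep // 2
    return text[:head] + marker + (text[-tail:] if tail else "")

noisy = "a" * 600 + "b" * 600
trimmed = truncate_middle(noisy, limit=100)
```

Under that assumption it could be passed straight through, for example `event.edit_last_tool_result(truncate_middle)`.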
A bash sandbox tool.
class BashSandbox:
    def __init__(self):
        self._term = None  # created lazily on the first command

    def as_tool(self):
        sandbox = self

        def bash(command: str) -> dict:
            """Run a bash command in a persistent shell."""
            sandbox._term = sandbox._term or encode.Terminal()
            r = sandbox._term.run(command, timeout=10.0)
            return {"output": r.output, "exit_code": r.exit_code, "cwd": r.cwd}

        return bash

encode.relay(..., tools=[BashSandbox().as_tool()]).response
Docs
→ docs.md is the entry point — concept map + a link grid into focused topic pages:
quickstart · concepts · relay() · messages · tools · intercept · sessions · executors · terminal · streaming · structured output · whisper · async · errors · cookbook
Status & compatibility
- Python 3.10+
- macOS / Linux for Terminal (pexpect-backed)
- OpenAI-compatible endpoints: OpenAI, Courier, vLLM, LM Studio, Ollama, Together, Groq, and others
License
MIT
File details
Details for the file courier_encode-0.1.3.tar.gz.
File metadata
- Download URL: courier_encode-0.1.3.tar.gz
- Upload date:
- Size: 128.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.12 {"installer":{"name":"uv","version":"0.11.12","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1a5b3e8e4dac5a55a11dbd966e35e608917a89a0fc18bfa85a3f743fe8a2cd6a |
| MD5 | 02efddc372bab1252a26e0ee7f45480c |
| BLAKE2b-256 | 761a8c42f18f99a05be8166e6ec68dcc1589923fdcdbfb296c5494bd12febfe1 |
File details
Details for the file courier_encode-0.1.3-py3-none-any.whl.
File metadata
- Download URL: courier_encode-0.1.3-py3-none-any.whl
- Upload date:
- Size: 49.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.12 {"installer":{"name":"uv","version":"0.11.12","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d513bf18fd30f8e6f0afb52826e3f6471c3bac218d527ab7250e407e70d7edf0 |
| MD5 | a133e9a6a15356c74fbfd9e663c0a24e |
| BLAKE2b-256 | 05dc2791ce9d395425458edc6b25c88bded44031d2dc56bf847082534def3c6c |