Reliability-first orchestration framework for LLM workflows
tracechain
Observability for LLM workflows. Trace runs, steps, and model calls with full support for streaming, async, retries, tool calling, and multi-turn conversations.
from tracechain import workflow, step, observe_llm

@workflow(name="rag_pipeline")
def rag_pipeline(query: str) -> str:
    docs = retrieve(query)
    return generate(query, docs)

@step(name="retrieve", retries=2)
def retrieve(query: str) -> list[str]:
    return vector_db.search(query)

@step(name="generate")
def generate(query: str, docs: list[str]) -> str:
    with observe_llm("gpt_call", model="gpt-4o", prompt=query) as obs:
        resp = openai_client.chat.completions.create(
            model="gpt-4o",
            messages=[system_msg, *build_context(docs), user_msg(query)],
            tools=[citation_tool],  # tool calling is fully supported
        )
        obs.record(resp)  # auto-extracts tokens and cost
    return resp.choices[0].message.content
Installation
pip install tracechain # core only
pip install 'tracechain[openai]' # + OpenAI client
pip install 'tracechain[anthropic]' # + Anthropic client
pip install 'tracechain[otel]' # + OpenTelemetry export
pip install 'tracechain[all]' # everything
Requires Python ≥ 3.9. No mandatory dependencies beyond httpx and python-dotenv.
Zero-infrastructure quickstart (local mode)
No backend required. Data goes to a local SQLite file.
TRACECHAIN_MODE=local python my_app.py
# Or in code:
from tracechain import TraceChainClient
client = TraceChainClient(mode="local", db_path="./traces.db")
Local mode behaves the same as the HTTP backend: the schema, the decorator API, and evaluations are identical.
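Because local mode writes to an ordinary SQLite file, Python's standard sqlite3 module can open it directly. The sketch below lists whatever tables the file contains. It makes no assumption about TraceChain's schema, and the helper name list_tables is hypothetical.

```python
import sqlite3


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in a SQLite file (schema-agnostic)."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    # Each row is a one-element tuple holding the table name.
    return [name for (name,) in rows]
```

Calling list_tables("./traces.db") after a traced run should show the tables that were created. Note that sqlite3.connect creates an empty file if the path does not exist yet, so check the path first when that matters.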
Core concepts
@workflow — the outermost boundary
Wraps a function representing one end-to-end execution. Creates a run record with input, output, total cost, and total tokens.
from tracechain import workflow

@workflow(name="summarize")
def summarize(text: str) -> str:
    ...

# Async works identically:
@workflow(name="summarize")
async def summarize(text: str) -> str:
    ...
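One common way for a single decorator to accept both sync and async functions is to branch on inspect.iscoroutinefunction. The following is a minimal sketch of that pattern, not TraceChain's implementation. The names traced and RUNS are hypothetical stand-ins for the real decorator and run store.

```python
import asyncio
import functools
import inspect
import time

RUNS = []  # hypothetical in-memory stand-in for a run store


def traced(name: str):
    """Decorator factory that records the name, output, and duration of each call."""
    def decorator(fn):
        def _record(start: float, output) -> None:
            RUNS.append({
                "name": name,
                "output": output,
                "duration_s": time.perf_counter() - start,
            })

        if inspect.iscoroutinefunction(fn):
            @functools.wraps(fn)
            async def async_wrapper(*args, **kwargs):
                start = time.perf_counter()
                result = await fn(*args, **kwargs)
                _record(start, result)
                return result
            return async_wrapper

        @functools.wraps(fn)
        def sync_wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            _record(start, result)
            return result
        return sync_wrapper
    return decorator


@traced(name="shout")
def shout(text: str) -> str:
    return text.upper()


@traced(name="shout_async")
async def shout_async(text: str) -> str:
    await asyncio.sleep(0)  # yield to the event loop once
    return text.upper()
```

The async wrapper stays a coroutine function, so callers can still await it, and both paths share the same recording helper.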
@step — named, timed, retriable substeps
Each step creates a child record under the current run. Failed steps are retried with exponential backoff and jitter.
from tracechain import step

@step(name="fetch_docs", retries=3, retry_delay=0.5, retry_max_delay=10.0)
def fetch_docs(query: str) -> list[dict]:
    return requests.get(f"/search?q={query}").json()
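To make the retry timing concrete, here is a sketch of capped exponential backoff with "full jitter". The parameter names retry_delay and retry_max_delay come from the decorator above, but the exact formula TraceChain uses is not documented here, so the doubling schedule and the uniform jitter are assumptions. The helper names are hypothetical.

```python
import random
from typing import List, Optional


def backoff_ceiling(attempt: int, retry_delay: float = 0.5,
                    retry_max_delay: float = 10.0) -> float:
    """Upper bound on the wait before retry `attempt` (0-based): doubles, then caps."""
    return min(retry_max_delay, retry_delay * (2 ** attempt))


def backoff_delay(attempt: int, retry_delay: float = 0.5,
                  retry_max_delay: float = 10.0,
                  rng: Optional[random.Random] = None) -> float:
    """Full jitter: pick uniformly between 0 and the capped exponential ceiling."""
    source = rng if rng is not None else random
    return source.uniform(0.0, backoff_ceiling(attempt, retry_delay, retry_max_delay))


# Ceilings for six consecutive retries with the values from the example above.
ceilings: List[float] = [backoff_ceiling(n) for n in range(6)]
print(ceilings)  # [0.5, 1.0, 2.0, 4.0, 8.0, 10.0]
```

Jitter spreads retries from many clients over time so they do not hit a recovering service in lockstep. The cap keeps late retries from waiting unboundedly long.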
observe_llm() — watch any LLM call
The primary API for tracing model calls. You own the call; TraceChain just observes. Supports any model, any client, any parameters.
from tracechain import observe_llm

with observe_llm("chat", model="gpt-4o", provider="openai", prompt=user_message) as obs:
    resp = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=conversation_history,  # multi-turn
        tools=tools,  # function calling
        response_format={"type": "json_object"},
    )
    obs.record(resp)  # auto-extracts from ChatCompletion

# Async:
async with observe_llm("chat", model="claude-3-5-sonnet", provider="anthropic") as obs:
    resp = await async_client.messages.create(...)
    obs.record(resp)  # auto-extracts from Anthropic Message

# Custom / local LLM: pass values manually
with observe_llm("ollama", model="llama3", provider="ollama") as obs:
    result = ollama.generate(model="llama3", prompt=prompt)
    obs.record(
        response=result["response"],
        input_tokens=result["prompt_eval_count"],
        output_tokens=result["eval_count"],
    )
Auto-extraction supports OpenAI ChatCompletion and Anthropic Message objects via duck-typing — no hard import of either library required.
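Duck-typed extraction can be sketched as attribute probing: OpenAI ChatCompletion objects expose usage.prompt_tokens and usage.completion_tokens, while Anthropic Message objects expose usage.input_tokens and usage.output_tokens. The function below illustrates the idea under those assumptions. It is not TraceChain's extractor, and extract_usage is a hypothetical name.

```python
from types import SimpleNamespace
from typing import Any, Optional, Tuple


def extract_usage(resp: Any) -> Tuple[Optional[int], Optional[int]]:
    """Return (input_tokens, output_tokens) by probing attributes, without importing any SDK."""
    usage = getattr(resp, "usage", None)
    if usage is None:
        return None, None
    # OpenAI-style usage object
    if hasattr(usage, "prompt_tokens"):
        return usage.prompt_tokens, getattr(usage, "completion_tokens", None)
    # Anthropic-style usage object
    if hasattr(usage, "input_tokens"):
        return usage.input_tokens, getattr(usage, "output_tokens", None)
    return None, None


# Stand-in objects shaped like the two response types (no SDK needed).
openai_like = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=12, completion_tokens=30))
anthropic_like = SimpleNamespace(usage=SimpleNamespace(input_tokens=7, output_tokens=9))
```

Because only attribute names are checked, any object with the same shape works, which is also why responses from other clients need the manual obs.record(...) form shown above.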
Streaming
with observe_llm("stream", model="gpt-4o", is_stream=True) as obs:
    response_text = ""
    for chunk in openai_client.chat.completions.create(..., stream=True):
        token = chunk.choices[0].delta.content or ""
        if token:
            obs.on_chunk()  # records time-to-first-token on first call
            response_text += token
    obs.record(response=response_text, input_tokens=in_t, output_tokens=out_t)
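Time-to-first-token can be measured by starting a monotonic clock when the request begins and stopping it on the first chunk only. The sketch below shows that idea in isolation. FirstTokenTimer is a hypothetical class, and how obs.on_chunk() is implemented internally is an assumption.

```python
import time
from typing import Optional


class FirstTokenTimer:
    """Records elapsed milliseconds until the first on_chunk() call; later calls only count."""

    def __init__(self) -> None:
        self._start = time.perf_counter()
        self.ttft_ms: Optional[float] = None
        self.chunks = 0

    def on_chunk(self) -> None:
        self.chunks += 1
        if self.ttft_ms is None:  # only the first chunk sets the measurement
            self.ttft_ms = (time.perf_counter() - self._start) * 1000.0


# Simulated stream: a list of tokens stands in for a real streaming response.
timer = FirstTokenTimer()
text = ""
for token in ["Hel", "lo", "!"]:
    timer.on_chunk()
    text += token
```

time.perf_counter is used rather than time.time because it is monotonic and unaffected by system clock adjustments.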
@llm_step (batteries-included, simple mode)
For prototypes where you don't need the full LLM API surface. Just return a prompt string — TraceChain calls the model for you.
from tracechain import llm_step

@llm_step(name="answer", model="gpt-4o-mini", provider="openai")
def answer(query: str) -> str:
    return f"Answer this question concisely: {query}"

result = answer("What is the capital of France?")
Supports streaming:
@llm_step(name="stream_answer", model="gpt-4o-mini", stream=True)
def stream_answer(query: str) -> str:
    return f"Answer: {query}"

for token in stream_answer("What is AI?"):
    print(token, end="", flush=True)
For production use, prefer observe_llm(). @llm_step does not support system prompts, conversation history, tool calling, or structured output.
Evaluations
from tracechain import workflow, evaluate_run

@workflow(name="qa_pipeline")
def qa_pipeline(question: str) -> str:
    answer = generate(question)
    evaluate_run(
        output=answer,
        reference=question,
        scores={"relevance": 0.9, "faithfulness": 0.85},
        passed=True,
    )
    return answer
OpenTelemetry
from tracechain import configure_otel
configure_otel("my-service") # console exporter
configure_otel("my-service", exporter="otlp") # Jaeger / Tempo / Grafana
# Bring your own TracerProvider:
from opentelemetry.sdk.trace import TracerProvider
my_provider = TracerProvider()
configure_otel("my-service", tracer_provider=my_provider)
Emits spans following the OpenTelemetry GenAI semantic conventions:
| Span | Key attributes |
|---|---|
| workflow.<name> | tracechain.workflow.name |
| step.<name> | tracechain.step.name, tracechain.step.type |
| llm.<name> | gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, tracechain.llm.is_stream, tracechain.llm.ttft_ms |
Spans are correctly nested (workflow → step → llm) via context.attach/detach.
Configuration
| Environment variable | Default | Description |
|---|---|---|
| TRACECHAIN_BACKEND_URL | http://localhost:8000 | Backend API URL |
| TRACECHAIN_ENABLED | true | Set false to disable all tracing |
| TRACECHAIN_TIMEOUT | 5000 | HTTP timeout in milliseconds |
| TRACECHAIN_MODE | http | http (backend) or local (SQLite) |
| TRACECHAIN_DB_PATH | ./tracechain.db | SQLite file path (local mode only) |
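As an illustration of how these variables and defaults fit together, the sketch below resolves them into a typed settings object. The variable names and defaults are taken from the table. The Settings class, load_settings, and the rule that only the literal value false disables tracing are assumptions, not TraceChain's own loader.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    backend_url: str
    enabled: bool
    timeout_ms: int
    mode: str
    db_path: str

    @property
    def timeout_s(self) -> float:
        """Timeout in seconds, the unit most HTTP clients expect."""
        return self.timeout_ms / 1000.0


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read TRACECHAIN_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        backend_url=env.get("TRACECHAIN_BACKEND_URL", "http://localhost:8000"),
        # Assumption: anything other than "false" (case-insensitive) leaves tracing on.
        enabled=env.get("TRACECHAIN_ENABLED", "true").strip().lower() != "false",
        timeout_ms=int(env.get("TRACECHAIN_TIMEOUT", "5000")),
        mode=env.get("TRACECHAIN_MODE", "http"),
        db_path=env.get("TRACECHAIN_DB_PATH", "./tracechain.db"),
    )
```

Passing a plain dict instead of os.environ makes the resolution easy to test, for example load_settings({"TRACECHAIN_MODE": "local"}).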
Silent-on-failure by design. Backend errors never raise exceptions or crash your application.
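A silent-on-failure export path typically wraps the send call, logs the error, and returns normally. The sketch below shows that shape with a hypothetical send_silently helper and a caller-supplied send function. It assumes nothing about TraceChain's internals.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("tracechain.sketch")  # hypothetical logger name


def send_silently(send: Callable[[dict], Any], payload: dict) -> bool:
    """Try to export one trace payload; report failure via the return value, never by raising."""
    try:
        send(payload)
        return True
    except Exception as exc:  # deliberately broad: tracing must not break the app
        logger.debug("trace export failed: %s", exc)
        return False


def broken_backend(payload: dict) -> None:
    """Stand-in for a backend that is down."""
    raise ConnectionError("backend unreachable")
```

The trade-off is that lost traces are easy to miss, so logging the failure (here at debug level) gives a way to notice when the backend is unreachable.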
License
MIT