Reliability-first orchestration framework for LLM workflows

tracechain

Observability for LLM workflows. Trace runs, steps, and model calls with full support for streaming, async, retries, tool calling, and multi-turn conversations.

from tracechain import workflow, step, observe_llm

@workflow(name="rag_pipeline")
def rag_pipeline(query: str) -> str:
    docs = retrieve(query)
    return generate(query, docs)

@step(name="retrieve", retries=2)
def retrieve(query: str) -> list[str]:
    return vector_db.search(query)

@step(name="generate")
def generate(query: str, docs: list[str]) -> str:
    with observe_llm("gpt_call", model="gpt-4o", prompt=query) as obs:
        resp = openai_client.chat.completions.create(
            model="gpt-4o",
            messages=[system_msg, *build_context(docs), user_msg(query)],
            tools=[citation_tool],          # tool calling — fully supported
        )
        obs.record(resp)                    # auto-extracts tokens and cost
    return resp.choices[0].message.content

Installation

pip install tracechain                        # core only
pip install 'tracechain[openai]'              # + OpenAI client
pip install 'tracechain[anthropic]'           # + Anthropic client
pip install 'tracechain[otel]'                # + OpenTelemetry export
pip install 'tracechain[all]'                 # everything

Requires Python ≥ 3.9. No mandatory dependencies beyond httpx and python-dotenv.


Zero-infrastructure quickstart (local mode)

No backend required. Data goes to a local SQLite file.

# From the shell:
TRACECHAIN_MODE=local python my_app.py

# Or in code:
from tracechain import TraceChainClient

client = TraceChainClient(mode="local", db_path="./traces.db")

Everything works identically in local mode — the schema, the decorator API, evaluations.
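
Because local mode writes to an ordinary SQLite file, it can be inspected with Python's standard library. The sketch below lists whatever tables exist in a database file; it deliberately assumes nothing about TraceChain's schema, and `list_tables` is a hypothetical helper, not part of the package.

```python
import sqlite3

def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in the SQLite file at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    # Each row is a 1-tuple holding the table name.
    return [name for (name,) in rows]
```

Calling `list_tables("./traces.db")` after a traced run should show which tables were created; note that `sqlite3.connect` creates an empty file if the path does not exist yet.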


Core concepts

@workflow — the outermost boundary

Wraps a function representing one end-to-end execution. Creates a run record with input, output, total cost, and total tokens.

from tracechain import workflow

@workflow(name="summarize")
def summarize(text: str) -> str:
    ...

# Async works identically:
@workflow(name="summarize")
async def summarize(text: str) -> str:
    ...
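
One common way for a single decorator to handle both sync and async functions is to branch on `inspect.iscoroutinefunction`. The sketch below is a generic illustration of that pattern, not TraceChain's implementation; the `timed` decorator and its `last_duration` attribute are hypothetical names.

```python
import asyncio
import functools
import inspect
import time

def timed(fn):
    """Hypothetical decorator: records wall-clock duration for sync or async callables."""
    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return await fn(*args, **kwargs)
            finally:
                # Stored on the wrapper so callers can read it afterwards.
                async_wrapper.last_duration = time.perf_counter() - start
        return async_wrapper

    @functools.wraps(fn)
    def sync_wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            sync_wrapper.last_duration = time.perf_counter() - start
    return sync_wrapper
```

The async branch returns a real coroutine function, so the decorated function can still be awaited or passed to `asyncio.run`.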

@step — named, timed, retriable substeps

Each step creates a child record under the current run. When retries is set, a failed step is retried with exponential backoff and jitter.

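import requests
from tracechain import step

SEARCH_URL = "https://example.com/search"   # placeholder; replace with your search endpoint

@step(name="fetch_docs", retries=3, retry_delay=0.5, retry_max_delay=10.0)
def fetch_docs(query: str) -> list[dict]:
    resp = requests.get(SEARCH_URL, params={"q": query})   # params= URL-encodes the query
    resp.raise_for_status()   # a failed request raises instead of returning an error body
    return resp.json()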
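
To make the retry parameters concrete, here is a minimal sketch of capped exponential backoff with full jitter. It assumes the delay doubles on each attempt and that jitter is drawn uniformly between zero and the cap; the source does not state the exact formula TraceChain uses, and `backoff_delay` is a hypothetical helper.

```python
import random

def backoff_delay(attempt: int, retry_delay: float = 0.5,
                  retry_max_delay: float = 10.0, rng=random) -> float:
    """Delay in seconds before retry number `attempt` (0-based)."""
    # Exponential growth (assumed factor of 2), capped at retry_max_delay.
    cap = min(retry_max_delay, retry_delay * (2 ** attempt))
    # Full jitter: pick a random delay between 0 and the cap.
    return rng.uniform(0.0, cap)
```

With `retry_delay=0.5` and `retry_max_delay=10.0`, the upper bound grows 0.5, 1, 2, 4, 8 and then stays at 10 seconds; the jitter spreads out retries from concurrent callers.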

observe_llm() — watch any LLM call

The primary API for tracing model calls. You own the call; TraceChain just observes. Supports any model, any client, any parameters.

from tracechain import observe_llm

with observe_llm("chat", model="gpt-4o", provider="openai", prompt=user_message) as obs:
    resp = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=conversation_history,       # multi-turn
        tools=tools,                         # function calling
        response_format={"type": "json_object"},
    )
    obs.record(resp)                         # auto-extracts from ChatCompletion

# Async:
async with observe_llm("chat", model="claude-3-5-sonnet", provider="anthropic") as obs:
    resp = await async_client.messages.create(...)
    obs.record(resp)                         # auto-extracts from Anthropic Message

# Custom / local LLM — pass values manually:
with observe_llm("ollama", model="llama3", provider="ollama") as obs:
    result = ollama.generate(model="llama3", prompt=prompt)
    obs.record(
        response=result["response"],
        input_tokens=result["prompt_eval_count"],
        output_tokens=result["eval_count"],
    )

Auto-extraction supports OpenAI ChatCompletion and Anthropic Message objects via duck-typing — no hard import of either library required.
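
Duck-typed extraction of this kind can be sketched as follows. OpenAI ChatCompletion objects report `usage.prompt_tokens` and `usage.completion_tokens`, while Anthropic Message objects report `usage.input_tokens` and `usage.output_tokens`; the `extract_usage` function is a hypothetical illustration, not TraceChain's code.

```python
from types import SimpleNamespace

def extract_usage(resp):
    """Return (input_tokens, output_tokens) from a response-like object, or (None, None)."""
    usage = getattr(resp, "usage", None)
    if usage is None:
        return None, None
    # Try the Anthropic attribute names first, then the OpenAI ones.
    for in_name, out_name in (("input_tokens", "output_tokens"),
                              ("prompt_tokens", "completion_tokens")):
        in_t = getattr(usage, in_name, None)
        out_t = getattr(usage, out_name, None)
        if in_t is not None and out_t is not None:
            return in_t, out_t
    return None, None

# Stand-in objects shaped like the two response types (no SDK import needed).
openai_like = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=12, completion_tokens=30))
anthropic_like = SimpleNamespace(usage=SimpleNamespace(input_tokens=8, output_tokens=21))
print(extract_usage(openai_like))     # (12, 30)
print(extract_usage(anthropic_like))  # (8, 21)
```

Because only attribute names are inspected, neither SDK has to be installed for the extraction code itself to load.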

Streaming

with observe_llm("stream", model="gpt-4o", is_stream=True) as obs:
    response_text = ""
    for chunk in openai_client.chat.completions.create(..., stream=True):
        token = chunk.choices[0].delta.content or ""
        if token:
            obs.on_chunk()          # records time-to-first-token on first call
            response_text += token
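    # in_t / out_t are not defined above: supply the token counts yourself,
    # e.g. from a tokenizer or from the provider's usage data
    obs.record(response=response_text, input_tokens=in_t, output_tokens=out_t)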
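
The time-to-first-token idea behind `on_chunk()` can be illustrated with a small standalone timer: start a clock when the request begins and record the elapsed time only on the first chunk. This is a sketch of the concept under that assumption; `FirstTokenTimer` is a hypothetical class and says nothing about how TraceChain stores the value.

```python
import time

class FirstTokenTimer:
    """Records milliseconds from construction to the first on_chunk() call."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()
        self.ttft_ms = None

    def on_chunk(self) -> None:
        # Only the first call records a value; later calls are no-ops.
        if self.ttft_ms is None:
            self.ttft_ms = (self._clock() - self._start) * 1000.0
```

Passing the clock in as a parameter keeps the timer testable with a fake clock, while the default `time.perf_counter` is a monotonic clock suited to measuring short intervals.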

@llm_step (batteries-included, simple mode)

For prototypes where you don't need the full LLM API surface. Just return a prompt string — TraceChain calls the model for you.

from tracechain import llm_step

@llm_step(name="answer", model="gpt-4o-mini", provider="openai")
def answer(query: str) -> str:
    return f"Answer this question concisely: {query}"

result = answer("What is the capital of France?")

Supports streaming:

@llm_step(name="stream_answer", model="gpt-4o-mini", stream=True)
def stream_answer(query: str) -> str:
    return f"Answer: {query}"

for token in stream_answer("What is AI?"):
    print(token, end="", flush=True)

For production use, prefer observe_llm(). @llm_step does not support system prompts, conversation history, tool calling, or structured output.


Evaluations

from tracechain import workflow, evaluate_run

@workflow(name="qa_pipeline")
def qa_pipeline(question: str) -> str:
    answer = generate(question)
    evaluate_run(
        output=answer,
        reference=question,
        scores={"relevance": 0.9, "faithfulness": 0.85},
        passed=True,
    )
    return answer
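
The example above hard-codes `passed=True`. One possible way to derive that flag from the scores is a simple threshold check, sketched below; the 0.8 threshold and the `all_scores_pass` helper are hypothetical and not part of TraceChain.

```python
def all_scores_pass(scores: dict, threshold: float = 0.8) -> bool:
    """True only if there is at least one score and every score meets the threshold."""
    return bool(scores) and all(value >= threshold for value in scores.values())
```

The result could then be passed as `passed=all_scores_pass(scores)` so the flag always agrees with the recorded scores.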

OpenTelemetry

from tracechain import configure_otel

configure_otel("my-service")                     # console exporter
configure_otel("my-service", exporter="otlp")    # Jaeger / Tempo / Grafana

# Bring your own TracerProvider:
from opentelemetry.sdk.trace import TracerProvider
configure_otel("my-service", tracer_provider=my_provider)

Emits spans following the OpenTelemetry GenAI semantic conventions:

Span              Key attributes
workflow.<name>   tracechain.workflow.name
step.<name>       tracechain.step.name, tracechain.step.type
llm.<name>        gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, tracechain.llm.is_stream, tracechain.llm.ttft_ms

Spans are correctly nested (workflow → step → llm) via context.attach/detach.


Configuration

Environment variable     Default                 Description
TRACECHAIN_BACKEND_URL   http://localhost:8000   Backend API URL
TRACECHAIN_ENABLED       true                    Set false to disable all tracing
TRACECHAIN_TIMEOUT       5000                    HTTP timeout in milliseconds
TRACECHAIN_MODE          http                    http (backend) or local (SQLite)
TRACECHAIN_DB_PATH       ./tracechain.db         SQLite file path (local mode only)
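
For reference, the variables and defaults in the table could be read like this. The sketch mirrors the documented names and defaults only; the `Settings` dataclass and `load_settings` function are hypothetical, and how TraceChain parses values such as `TRACECHAIN_ENABLED` internally is an assumption.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    backend_url: str
    enabled: bool
    timeout_ms: int
    mode: str
    db_path: str

def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ), using the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        backend_url=env.get("TRACECHAIN_BACKEND_URL", "http://localhost:8000"),
        # Assumption: anything other than "false" (case-insensitive) counts as enabled.
        enabled=env.get("TRACECHAIN_ENABLED", "true").strip().lower() != "false",
        timeout_ms=int(env.get("TRACECHAIN_TIMEOUT", "5000")),
        mode=env.get("TRACECHAIN_MODE", "http"),
        db_path=env.get("TRACECHAIN_DB_PATH", "./tracechain.db"),
    )
```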

Silent-on-failure by design. Backend errors never raise exceptions or crash your application.
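
A silent-on-failure policy is typically implemented by wrapping the reporting path so that exceptions are logged and swallowed rather than propagated. The following is a generic sketch of that pattern, not TraceChain's code; `never_raise` and `send_trace` are hypothetical names.

```python
import functools
import logging

logger = logging.getLogger("tracing_sketch")

def never_raise(fn):
    """Wrap fn so that any Exception is logged at debug level and None is returned."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:
            logger.debug("trace export failed", exc_info=True)
            return None
    return wrapper

@never_raise
def send_trace(payload: dict) -> bool:
    # Stand-in for a backend call that fails.
    raise ConnectionError("backend unreachable")

print(send_trace({"run": "demo"}))  # None
```

Catching `Exception` rather than `BaseException` leaves `KeyboardInterrupt` and `SystemExit` untouched, so the application can still be stopped normally.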


License

MIT
