OpenInference auto-instrumentation for baml-py
Project description
OpenInference BAML Instrumentation
Python auto-instrumentation library for BAML (baml-py).
The traces emitted by this instrumentation follow the OpenInference semantic conventions and are fully OpenTelemetry compatible. They can be sent to any OpenTelemetry collector, such as Arize Phoenix.
Installation
pip install openinference-instrumentation-baml
Quickstart
In this example we will instrument a BAML application and observe traces via Arize Phoenix.
Install packages.
pip install openinference-instrumentation-baml "baml-py>=0.200" arize-phoenix-otel
Assuming you have a BAML project with a generated Python client (e.g. my_app.baml_client), instrument it as follows:
from phoenix.otel import register
tracer_provider = register(
batch=True,
auto_instrument=True, # automatically discovers openinference-instrumentation-baml
)
# That's it! All BAML function calls will now emit traces.
Or, if you prefer manual setup:
from openinference.instrumentation.baml import BamlInstrumentor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
endpoint = "http://127.0.0.1:6006/v1/traces"
tracer_provider = trace_sdk.TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))
BamlInstrumentor().instrument(
tracer_provider=tracer_provider,
baml_client_module="my_app.baml_client",
)
Now run your application and observe the traces in Phoenix.
python your_file.py
How It Works
BAML generates a DoNotUseDirectlyCallManager class that all LLM function calls pass through. This instrumentor patches its call_function_async and call_function_sync methods to:
- Inject a per-call
Collectorto capture theFunctionLog - Extract trace data (model name, input/output messages, token counts, timing)
- Emit an OpenTelemetry span with OpenInference semantic conventions
Auto-Discovery
When using auto_instrument=True via phoenix.otel.register(), the instrumentor automatically scans loaded modules for a BAML generated client. This works as long as the baml_client module has been imported before register() is called.
If auto-discovery fails, pass the module explicitly:
BamlInstrumentor().instrument(
tracer_provider=tracer_provider,
baml_client_module="my_app.baml_client",
)
Captured Attributes
The following OpenInference attributes are populated on each span:
| Attribute | Source |
|---|---|
openinference.span.kind |
"LLM" |
llm.system |
"baml" |
llm.provider |
BAML client provider (e.g. "openai-generic") |
llm.model_name |
Model name from the HTTP request body |
llm.input_messages.* |
Parsed from the LLM request messages |
llm.output_messages.* |
Parsed from the LLM response choices |
llm.token_count.prompt |
Input token count |
llm.token_count.completion |
Output token count |
llm.token_count.total |
Sum of prompt + completion tokens |
llm.invocation_parameters |
Request parameters (temperature, max_tokens, etc.) |
input.value |
BAML function arguments (JSON) |
output.value |
LLM response content |
Limitations
Streaming calls are not instrumented. Only call_function_async and call_function_sync (non-streaming) are patched. For streaming calls (create_async_stream / create_sync_stream), the stream is consumed asynchronously by user code, so the instrumentor cannot reliably determine when the stream completes to capture the full response. Streaming support may be added in a future release.
Provider-specific attribute parsing. Input/output messages, model name, and invocation parameters are parsed from the raw HTTP request/response bodies, which vary by provider. The following providers are supported:
| Provider | llm.input_messages |
llm.output_messages |
llm.model_name |
llm.invocation_parameters |
Cache tokens |
|---|---|---|---|---|---|
openai, openai-generic, openrouter, ollama |
✓ | ✓ | ✓ | ✓ | cache_read (via BAML) |
anthropic |
✓ | ✓ | ✓ | ✓ | cache_read + cache_write |
For unsupported providers, a one-time warning is logged and these attributes are skipped. Token counts (prompt, completion, total, cache_read) are always extracted from BAML's provider-agnostic Usage object regardless of provider.
Disclaimer
This is not an official OpenInference library. It is a community-maintained extension and is provided as-is without warranty. The author is not responsible for any issues arising from its use.
The OpenInference project does not currently accept large-scale contributions, so this instrumentor is maintained separately. Contributions and feedback from the community are welcome. If the OpenInference team decides to build official BAML instrumentation in the future, users are encouraged to migrate to the official version.
More Info
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file openinference_instrumentation_baml-0.1.0-py3-none-any.whl.
File metadata
- Download URL: openinference_instrumentation_baml-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
04a9a032b338edd3dfc667ed4adb0b9a6c6264300ef22eff0ba03a0d048bc931
|
|
| MD5 |
4f6fce1a70fd5627ce7f09d1bf335751
|
|
| BLAKE2b-256 |
6d541be79aa6eb86ef3e138240d273ab1dc2ace3407e031b49db15d3e50a0d9c
|