LoongSuite BFCL v4 (Berkeley Function Call Leaderboard) instrumentation
LoongSuite BFCL v4 Instrumentation
LoongSuite Python instrumentation for the Berkeley Function Call Leaderboard v4 (bfcl-eval, package bfcl_eval).
Span Topology
    ENTRY  enter_ai_application_system          gen_ai.span.kind=ENTRY, op=enter
    └─ AGENT  invoke_agent {test_entry_id}      gen_ai.span.kind=AGENT, op=invoke_agent
       ├─ STEP  react step                      gen_ai.span.kind=STEP, op=react
       │  ├─ LLM   chat {model}                 (created by downstream vendor SDK probe)
       │  └─ TOOL  execute_tool {fn}            gen_ai.span.kind=TOOL, op=execute_tool
       └─ STEP  react step
          └─ ...
This instrumentation deliberately does not create LLM spans. They are emitted by the downstream vendor SDK probe (OpenAI / Anthropic / Google / DashScope / LiteLLM / etc.) so that token usage and request payloads stay in sync with the SDK that actually performed the request.
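The parent/child rules implied by the tree above can be written down as a small lookup table, which may be useful when sanity-checking exported traces. This is only a sketch: `ALLOWED_PARENT` and `validate_topology` are hypothetical names, and spans are modelled as plain dicts with assumed keys (`id`, `parent_id`, `kind`) rather than OpenTelemetry objects. Real traces may also contain intermediate spans (for example HTTP client spans) that this check does not account for.

```python
from typing import Optional

# Expected parent kind for each gen_ai.span.kind in the topology above.
# ENTRY is the root span, so it has no parent.
ALLOWED_PARENT: dict[str, Optional[str]] = {
    "ENTRY": None,
    "AGENT": "ENTRY",
    "STEP": "AGENT",
    "LLM": "STEP",   # emitted by the downstream vendor SDK probe
    "TOOL": "STEP",
}


def validate_topology(spans: list[dict]) -> list[str]:
    """Return human-readable violations; an empty list means the trace fits.

    Each span is a dict with the assumed keys "id", "parent_id" and "kind".
    """
    by_id = {span["id"]: span for span in spans}
    problems = []
    for span in spans:
        kind = span["kind"]
        if kind not in ALLOWED_PARENT:
            problems.append(f"{span['id']}: unknown kind {kind!r}")
            continue
        parent = by_id.get(span["parent_id"])
        parent_kind = parent["kind"] if parent else None
        expected = ALLOWED_PARENT[kind]
        if parent_kind != expected:
            problems.append(
                f"{span['id']}: {kind} under {parent_kind}, expected {expected}"
            )
    return problems
```

Keeping the rules in a single table means that a change to the topology only needs a one-line update to the check.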
Installation
pip install loongsuite-instrumentation-bfclv4
Usage
opentelemetry-instrument bfcl generate \
    --model gpt-4o-2024-11-20-FC \
    --test-category simple_python \
    --num-threads 2
Or programmatically:
from opentelemetry.instrumentation.bfclv4 import BFCLv4Instrumentor
BFCLv4Instrumentor().instrument()
# ... run BFCL ...
BFCLv4Instrumentor().uninstrument()
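If the BFCL run raises, the `uninstrument()` call in the snippet above is never reached. One way to guard against that is a small context manager. The helper below, `instrumented`, is a hypothetical name and not part of this package; it assumes only that the object passed in exposes `instrument()` and `uninstrument()`, as the usage above shows.

```python
from contextlib import contextmanager


@contextmanager
def instrumented(instrumentor):
    """Enable an instrumentor for the duration of a `with` block.

    `instrumentor` is any object exposing instrument() and uninstrument(),
    for example BFCLv4Instrumentor() from the snippet above.
    """
    instrumentor.instrument()
    try:
        yield instrumentor
    finally:
        # Runs even if the body raises, so the patches are always removed.
        instrumentor.uninstrument()


# Usage sketch (requires the package to be installed):
# from opentelemetry.instrumentation.bfclv4 import BFCLv4Instrumentor
# with instrumented(BFCLv4Instrumentor()):
#     ...  # run BFCL
```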
Compatibility With Downstream LLM SDK Probes
| Scenario | Recommended downstream probe |
|---|---|
| OpenAI / OpenAI Responses / OSS via vLLM / SGLang / DeepSeek (OpenAI-compatible) | opentelemetry-instrumentation-openai |
| Anthropic / Claude | loongsuite-instrumentation-claude-agent-sdk |
| Gemini / Google | loongsuite-instrumentation-google-adk |
| Qwen / DashScope | loongsuite-instrumentation-dashscope |
| LiteLLM | loongsuite-instrumentation-litellm |
OSS Provider Notes
For OSS handlers (vLLM / SGLang served via the OpenAI-compatible API), the
BFCL probe sets gen_ai.provider.name to vllm / sglang / oss and adds
bfcl.oss.backend for disambiguation. Downstream OpenAI probes will still
report gen_ai.provider.name=openai on the LLM span; this is expected.
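When analysing traces, the mismatch described above can be reconciled after export by preferring the BFCL span's provider whenever `bfcl.oss.backend` is present on it. The following is a post-processing sketch, not something the probe does: `effective_provider` is a hypothetical helper, and both arguments are assumed to be plain attribute dicts taken from an LLM span and from its enclosing BFCL span.

```python
def effective_provider(llm_attrs: dict, bfcl_attrs: dict) -> str:
    """Pick a provider label for an LLM span.

    If the enclosing BFCL span carries bfcl.oss.backend (set only for OSS
    handlers), trust the BFCL span's gen_ai.provider.name. Otherwise keep
    the value reported by the downstream probe on the LLM span itself.
    """
    if "bfcl.oss.backend" in bfcl_attrs:
        # Fall back to the backend value if the provider name is missing.
        return bfcl_attrs.get(
            "gen_ai.provider.name", bfcl_attrs["bfcl.oss.backend"]
        )
    return llm_attrs.get("gen_ai.provider.name", "unknown")
```

For a vLLM-served model this returns `vllm` even though the OpenAI probe labelled the LLM span `openai`; for a hosted OpenAI model, where `bfcl.oss.backend` is absent, the LLM span's own value is kept.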
Custom Attributes
| Attribute | Where | Description |
|---|---|---|
| gen_ai.framework = bfclv4 | ENTRY/AGENT/STEP/TOOL | Framework tag |
| bfcl.test_category | ENTRY/AGENT | Test category |
| bfcl.num_threads | ENTRY | Configured thread pool size |
| bfcl.test_case_count | ENTRY | Number of test cases |
| bfcl.run_ids | ENTRY | Whether the run targeted specific IDs |
| bfcl.test_entry_id | AGENT | Test entry id |
| bfcl.turn_idx | STEP | Multi-turn turn index (0-based) |
| bfcl.query_mode | STEP | FC or prompting |
| bfcl.oss.backend | AGENT/STEP | vllm / sglang / unknown (only OSS) |
| bfcl.tool.duration_is_estimated | TOOL | True (latency is averaged across batch) |
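Because `bfcl.tool.duration_is_estimated` marks TOOL spans whose latency was averaged across a batch, it may be worth keeping those durations apart from directly measured ones when computing statistics. The sketch below assumes spans have already been exported into plain dicts with an `attributes` mapping and a `duration_ms` field; both the shape and the name `split_tool_durations` are hypothetical.

```python
def split_tool_durations(spans: list[dict]) -> tuple[list[float], list[float]]:
    """Return (measured, estimated) durations for TOOL spans only.

    A span counts as estimated when bfcl.tool.duration_is_estimated is
    truthy; spans of any other kind are ignored.
    """
    measured: list[float] = []
    estimated: list[float] = []
    for span in spans:
        attrs = span.get("attributes", {})
        if attrs.get("gen_ai.span.kind") != "TOOL":
            continue
        is_estimated = attrs.get("bfcl.tool.duration_is_estimated")
        bucket = estimated if is_estimated else measured
        bucket.append(span["duration_ms"])
    return measured, estimated
```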
File details
Details for the file loongsuite_instrumentation_bfclv4-0.6.0-py3-none-any.whl.
File metadata
- Download URL: loongsuite_instrumentation_bfclv4-0.6.0-py3-none-any.whl
- Upload date:
- Size: 27.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e802ea328feb986a6c23aab94ded5411184ca560510894bb3b9246e91f9f2fac |
| MD5 | 287b879a835595e711e944789a5c8784 |
| BLAKE2b-256 | bc0e882c94d95617f999a49dc82c842b45f66fa85e9691bbf56ed230d9ce9d84 |