Langfuse Custom Tracer
Langfuse v4 tracing for Google Gemini and Anthropic Claude with automatic token counting and cost tracking
What is This?
A lightweight Python library that adds observability and cost tracking to your LLM applications using Langfuse.
- Automatic token counting for all supported LLM providers
- Cost calculation with real-time pricing
- Nested trace visualization in Langfuse
- Simple context manager API built on OpenTelemetry
- Zero setup - works with just API keys
Quick Start
1. Install
```bash
# Basic installation
pip install langfuse-custom-tracer

# With environment variable support
pip install langfuse-custom-tracer[env]

# With Gemini support
pip install langfuse-custom-tracer[gemini]

# With Anthropic support
pip install langfuse-custom-tracer[anthropic]

# Everything (all providers)
pip install langfuse-custom-tracer[all]
```
2. Get API Keys
- Langfuse: Sign up at cloud.langfuse.com
- Gemini: Get key from ai.google.dev (optional)
- Anthropic: Get key from console.anthropic.com (optional)
3. Set Environment Variables
Create a `.env` file:

```bash
# Langfuse (get from your dashboard)
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...

# Gemini API (optional)
GEMINI_API_KEY=...

# Anthropic API (optional)
ANTHROPIC_API_KEY=...
```
4. Use It (Gemini Example)
```python
import os

import google.generativeai as genai

from langfuse_custom_tracer import load_env, create_langfuse_client, GeminiTracer

# Load environment variables
load_env()

# Initialize
lf = create_langfuse_client(
    os.getenv("LANGFUSE_SECRET_KEY"),
    os.getenv("LANGFUSE_PUBLIC_KEY"),
)
tracer = GeminiTracer(lf)

# Configure Gemini
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel("gemini-2.0-flash")

# Use with tracing
with tracer.trace("invoice-processing", input={"file": "invoice.pdf"}) as span:
    with tracer.generation("extract-data", model="gemini-2.0-flash",
                           input="Extract name, amount, date") as gen:
        response = model.generate_content("Extract name, amount, date from invoice")
        usage = tracer.extract_usage(response, model="gemini-2.0-flash")
        gen.update(output=response.text, usage_details=usage)
    span.update(output="Extraction complete")

tracer.flush()  # Send to Langfuse
```
4b. Use It (Anthropic Example)
```python
import os

from anthropic import Anthropic

from langfuse_custom_tracer import load_env, create_langfuse_client, AnthropicTracer

# Load environment variables
load_env()

# Initialize
lf = create_langfuse_client(
    os.getenv("LANGFUSE_SECRET_KEY"),
    os.getenv("LANGFUSE_PUBLIC_KEY"),
)
tracer = AnthropicTracer(lf)

# Create Anthropic client
client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

# Use with tracing
with tracer.trace("invoice-processing", input={"file": "invoice.pdf"}) as span:
    with tracer.generation("extract-data", model="claude-3-5-sonnet-20241022",
                           input="Extract name, amount, date") as gen:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,  # the Messages API requires max_tokens
            messages=[{"role": "user", "content": "Extract name, amount, date from invoice"}],
        )
        usage = tracer.extract_usage(response, model="claude-3-5-sonnet-20241022")
        gen.update(output=response.content[0].text, usage_details=usage)
    span.update(output="Extraction complete")

tracer.flush()  # Send to Langfuse
```
What You'll See in Langfuse
Dashboard View

```
Trace: invoice-processing (ID: trace-123)
├── Duration: 2.3s
├── User: (none set)
├── Tags: [production, batch]
│
└── Generation: extract-data
    ├── Model: gemini-2.0-flash
    ├── Status: ✅ Success
    ├── Tokens: Input 156 | Output 89 | Total 245
    ├── Cost: $0.0000768
    │   ├── Input:  $0.0000234 (156 tokens @ $0.15/1M)
    │   ├── Output: $0.0000534 (89 tokens @ $0.60/1M)
    │   └── Total:  $0.0000768
    ├── Latency: 1.8s
    └── Output: "Name: John Doe, Amount: $500, Date: 2025-03-31"
```
Cost Aggregation
All calls are automatically aggregated on the dashboard:
- Total tokens: 245,300 across all traces
- Total cost: $0.18 for the day
- By model: Gemini 2.0 Flash: $0.15, Gemini 1.5 Pro: $0.03
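The same per-model aggregation can be reproduced locally from the dicts that `extract_usage` returns. A minimal sketch (`aggregate_usage` is an illustrative helper, not part of this library, and the numbers are made up):

```python
from collections import defaultdict

def aggregate_usage(records):
    """Sum tokens and cost per model from (model, usage-dict) pairs,
    where each usage dict has the extract_usage shape ("total", "totalCost")."""
    totals = defaultdict(lambda: {"tokens": 0, "cost": 0.0})
    for model, usage in records:
        totals[model]["tokens"] += usage["total"]
        totals[model]["cost"] += usage["totalCost"]
    return dict(totals)

records = [
    ("gemini-2.0-flash", {"total": 245, "totalCost": 0.0000768}),
    ("gemini-2.0-flash", {"total": 100, "totalCost": 0.0000300}),
    ("gemini-1.5-pro",   {"total": 500, "totalCost": 0.0010000}),
]
totals = aggregate_usage(records)
```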
Nested Traces
Langfuse automatically detects nesting via OpenTelemetry context:
```python
with tracer.trace("main-pipeline"):      # Parent span
    with tracer.trace("step-1"):         # Child span 1
        with tracer.generation(...):     # Grandchild span
            ...
    with tracer.trace("step-2"):         # Child span 2
        ...
```
Result in Langfuse: Clean hierarchical tree
Full API Reference
create_langfuse_client()
```python
lf = create_langfuse_client(
    secret_key="sk-lf-...",               # Required
    public_key="pk-lf-...",               # Required
    host="https://cloud.langfuse.com",    # Optional, defaults to EU
)
```

Hosts:
- EU: https://cloud.langfuse.com (default)
- US: https://us.cloud.langfuse.com
load_env()
Load environment variables from .env file:
```python
from langfuse_custom_tracer import load_env

# Load from .env in current directory
load_env()

# Load from custom file
load_env(".env.production")
```

Requires python-dotenv. Install with: `pip install langfuse-custom-tracer[env]`
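For reference, the core behavior of a `.env` loader can be approximated in a few lines of plain Python. This is an illustrative sketch, not python-dotenv or this library's implementation: it ignores quoting, `export` prefixes, and variable interpolation that python-dotenv handles.

```python
import os

def load_env_minimal(path: str = ".env") -> None:
    """Naive .env loader: KEY=VALUE lines, '#' comments, no quoting rules.

    Existing environment variables are not overwritten (setdefault).
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```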
BaseTracer.trace()
Create a root span (top-level trace):
```python
with tracer.trace(
    name="my-pipeline",
    input={"file": "data.csv"},
    metadata={"version": "1.0"},
    user_id="user-123",
    session_id="session-456",
    tags=["production", "batch"],
) as span:
    # Do work here
    span.update(output={"rows_processed": 1000})
```

Parameters:
- `name` (str): Span name
- `input` (any): Input data (shown in Langfuse)
- `metadata` (dict): Custom metadata
- `user_id` (str): User identifier
- `session_id` (str): Session identifier
- `tags` (list): String tags for filtering
BaseTracer.generation()
Create a generation span (LLM call):
```python
with tracer.generation(
    name="extract",
    model="gemini-2.0-flash",
    input="Extract data",
    metadata={"temperature": 0.7},
) as gen:
    response = model.generate_content("Extract data")
    usage = tracer.extract_usage(response, model="gemini-2.0-flash")
    gen.update(output=response.text, usage_details=usage)
```

Parameters:
- `name` (str): Generation name
- `model` (str): Model identifier
- `input` (any): Prompt/input
- `metadata` (dict): Custom metadata
GeminiTracer.extract_usage() / AnthropicTracer.extract_usage()
Extract token counts and calculate costs:
```python
# Gemini
usage = tracer.extract_usage(
    response,                  # Gemini response object
    model="gemini-2.0-flash",  # Model name for pricing
)

# Anthropic
usage = tracer.extract_usage(
    response,                            # Anthropic message object
    model="claude-3-5-sonnet-20241022",  # Model name for pricing
)

# Returns:
# {
#     "input": 156,             # Prompt tokens
#     "output": 89,             # Completion tokens
#     "total": 245,             # Total tokens
#     "unit": "TOKENS",
#     "inputCost": 0.0000234,   # Input cost in USD (156 tokens @ $0.15/1M)
#     "outputCost": 0.0000534,  # Output cost in USD (89 tokens @ $0.60/1M)
#     "totalCost": 0.0000768,   # Total cost in USD
#     "cachedTokens": 10,       # (optional) cached tokens (Gemini & Anthropic)
# }
```
BaseTracer.flush()
Send pending traces to Langfuse (blocking):
```python
tracer.flush()  # Wait for all events to be sent
```
Required for short-lived scripts. Long-running servers batch automatically.
Supported Models
Gemini ✅
All Google Gemini models with Q1 2026 pricing:
| Model | Input | Output | Cache |
|---|---|---|---|
| gemini-2.5-pro | $1.25/1M | $10.00/1M | $0.3125/1M |
| gemini-2.0-flash | $0.15/1M | $0.60/1M | $0.0375/1M |
| gemini-2.0-flash-lite | $0.075/1M | $0.30/1M | $0.01875/1M |
| gemini-1.5-pro | $1.25/1M | $5.00/1M | $0.3125/1M |
| gemini-1.5-flash | $0.075/1M | $0.30/1M | $0.01875/1M |
| gemini-1.5-flash-8b | $0.0375/1M | $0.15/1M | $0.01/1M |
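The arithmetic behind these rates is simply tokens divided by one million, times the per-million price. A sketch with rates hard-coded from the table above (`estimate_cost` is illustrative, not this library's API):

```python
GEMINI_PRICING = {  # USD per 1M tokens, taken from the table above
    "gemini-2.0-flash": {"input": 0.15, "output": 0.60},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> dict:
    """Compute USD cost for a call from token counts and per-1M rates."""
    rates = GEMINI_PRICING[model]
    input_cost = input_tokens / 1_000_000 * rates["input"]
    output_cost = output_tokens / 1_000_000 * rates["output"]
    return {
        "inputCost": input_cost,
        "outputCost": output_cost,
        "totalCost": input_cost + output_cost,
    }
```

For example, 156 input and 89 output tokens on gemini-2.0-flash come to $0.0000234 + $0.0000534 = $0.0000768.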
Anthropic Claude ✅
All Claude models with Q1 2026 pricing (with prompt caching support):
| Model | Input | Output | Cache Read | Cache Write |
|---|---|---|---|---|
| claude-3-5-sonnet-20241022 | $3.00/1M | $15.00/1M | $0.30/1M | $3.75/1M |
| claude-3-5-haiku-20241022 | $0.80/1M | $4.00/1M | $0.08/1M | $1.00/1M |
| claude-3-opus-20250219 | $15.00/1M | $75.00/1M | $1.50/1M | $18.75/1M |
| claude-3-sonnet-20250229 | $3.00/1M | $15.00/1M | $0.30/1M | $3.75/1M |
| claude-3-haiku-20250307 | $0.80/1M | $4.00/1M | $0.08/1M | $1.00/1M |
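Prompt caching changes the arithmetic: assuming cache reads and cache writes are billed at their own rates while remaining input tokens use the normal input rate (check Anthropic's pricing docs for the authoritative billing rules), the cost can be sketched as follows (`estimate_claude_cost` is illustrative, not part of this library):

```python
CLAUDE_PRICING = {  # USD per 1M tokens, taken from the table above
    "claude-3-5-haiku-20241022": {
        "input": 0.80, "output": 4.00, "cache_read": 0.08, "cache_write": 1.00,
    },
}

def estimate_claude_cost(model, input_tokens, output_tokens,
                         cache_read_tokens=0, cache_write_tokens=0):
    """Sum per-category costs: each token category is billed at its own
    per-1M rate (simplified billing model)."""
    r = CLAUDE_PRICING[model]
    per_m = lambda n, rate: n / 1_000_000 * rate
    return (per_m(input_tokens, r["input"])
            + per_m(output_tokens, r["output"])
            + per_m(cache_read_tokens, r["cache_read"])
            + per_m(cache_write_tokens, r["cache_write"]))
```

Cache reads at $0.08/1M are 10x cheaper than regular Haiku input tokens, which is why caching long, repeated prompts pays off quickly.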
Project Structure

```
langfuse-custom-tracer/
├── langfuse_custom_tracer/
│   ├── __init__.py               # Package exports
│   ├── client.py                 # Langfuse client setup
│   └── tracers/
│       ├── __init__.py
│       ├── base.py               # BaseTracer (abstract)
│       ├── gemini.py             # GeminiTracer (concrete, 20 tests)
│       └── anthropic.py          # AnthropicTracer (concrete, 40 tests)
├── tests/
│   ├── conftest.py               # Pytest fixtures
│   ├── test_base_tracer.py       # 15 tests
│   ├── test_gemini_tracer.py     # 20 tests
│   ├── test_anthropic_tracer.py  # 40 tests
│   └── test_client.py            # 12 tests
├── examples/
│   └── env_setup_example.py      # Usage example
├── SETUP.md                      # Setup guide
├── TESTING.md                    # Testing guide
└── pyproject.toml                # Package config
```
Testing
87 unit tests with 97% coverage:

```bash
# Run all tests
pytest

# Run with coverage report
pytest --cov

# Run specific test
pytest tests/test_gemini_tracer.py::TestGeminiTracer::test_extract_usage_basic -v

# Run Anthropic tests
pytest tests/test_anthropic_tracer.py -v
```

All tests pass ✅
Test Coverage Breakdown:
- BaseTracer: 15 tests, 100% coverage
- GeminiTracer: 20 tests, 100% coverage
- AnthropicTracer: 40 tests, 100% coverage
- Client: 12 tests, 81% coverage (uncovered error handling in optional deps)
- Total: 87 tests, 97% coverage
Security
- Never commit `.env` files - already listed in `.gitignore`
- API keys required - missing keys raise `ImportError`
- HTTPS only - all Langfuse communication is encrypted
- No keys in code - always use environment variables
Examples
Example 1: Gemini Extraction Task
```python
import os

import google.generativeai as genai

from langfuse_custom_tracer import create_langfuse_client, GeminiTracer

lf = create_langfuse_client(
    os.getenv("LANGFUSE_SECRET_KEY"),
    os.getenv("LANGFUSE_PUBLIC_KEY"),
)
tracer = GeminiTracer(lf)

genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel("gemini-2.0-flash")

# Simple extraction
with tracer.trace("email-analysis") as span:
    with tracer.generation("extract", model="gemini-2.0-flash",
                           input="Extract sender, subject, body") as gen:
        response = model.generate_content(
            "From the email below, extract sender, subject, body:\n..."
        )
        usage = tracer.extract_usage(response, model="gemini-2.0-flash")
        gen.update(output=response.text, usage_details=usage)

tracer.flush()
```
Example 1b: Anthropic Extraction Task
```python
import os

from anthropic import Anthropic

from langfuse_custom_tracer import create_langfuse_client, AnthropicTracer

lf = create_langfuse_client(
    os.getenv("LANGFUSE_SECRET_KEY"),
    os.getenv("LANGFUSE_PUBLIC_KEY"),
)
tracer = AnthropicTracer(lf)

client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

# Simple extraction with Claude
with tracer.trace("email-analysis") as span:
    with tracer.generation("extract", model="claude-3-5-sonnet-20241022",
                           input="Extract sender, subject, body") as gen:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,  # the Messages API requires max_tokens
            messages=[{
                "role": "user",
                "content": "From the email below, extract sender, subject, body:\n..."
            }],
        )
        usage = tracer.extract_usage(response, model="claude-3-5-sonnet-20241022")
        gen.update(output=response.content[0].text, usage_details=usage)

tracer.flush()
```
Example 2: Multi-Step Pipeline
```python
with tracer.trace("document-processing", user_id="user-123",
                  metadata={"doc_type": "invoice"}) as span:
    # Step 1: Extract text
    with tracer.trace("step-1-extract"):
        with tracer.generation("ocr", model="gemini-2.0-flash-lite"):
            text = model.generate_content("Extract text from image").text
            # ...

    # Step 2: Classify
    with tracer.trace("step-2-classify"):
        with tracer.generation("classify", model="gemini-2.0-flash"):
            classification = model.generate_content(f"Classify: {text}")
            # ...

    # Step 3: Extract fields
    with tracer.trace("step-3-extract-fields"):
        with tracer.generation("extract", model="gemini-2.0-flash"):
            fields = model.generate_content(f"Extract fields: {text}")
            # ...

tracer.flush()
```
In Langfuse you'll see:
- Total latency: sum of all steps
- Total cost: $0.0015
- Token breakdown by step
- Each step as a child span
Example 3: Error Handling
```python
with tracer.trace("risky-operation"):
    with tracer.generation("call", model="gemini-2.0-flash") as gen:
        try:
            response = model.generate_content("...")
            usage = tracer.extract_usage(response, model="gemini-2.0-flash")
            gen.update(output=response.text, usage_details=usage)
        except Exception as e:
            gen.update(status_code=500, error=str(e))
            raise

tracer.flush()
```
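Beyond recording the failure in the trace, transient LLM errors are often worth retrying before giving up. A generic retry-with-backoff wrapper that could surround the `generate_content` call (`call_with_retries` is an illustrative pattern, not part of this library):

```python
import time

def call_with_retries(fn, attempts: int = 3, base_delay: float = 0.0):
    """Call fn(), retrying on any exception with exponential backoff.

    Re-raises the last exception once all attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1x, 2x, 4x, ...
```

Wrapping the call (`call_with_retries(lambda: model.generate_content("..."))`) keeps the tracing code unchanged: a final failure still propagates into the `except` branch above.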
Documentation
- SETUP.md - Installation and configuration
- TESTING.md - Testing guide and running tests
- examples/env_setup_example.py - More examples
Contributing
This is an early-stage project. Contributions welcome!
Next features:
- Additional LLM providers (Ollama, Groq, Azure)
- Async support
- Batch operations
- Response filtering
License
MIT - See LICENSE file
Support
- Documentation: Read the docs
- Issues: Report bugs on GitHub
- Questions: Check TESTING.md for common issues
Built with ❤️ for the LLM community
Langfuse is open-source observability for LLM applications
File details
Details for the file langfuse_custom_tracer-1.0.1.tar.gz.
File metadata
- Download URL: langfuse_custom_tracer-1.0.1.tar.gz
- Upload date:
- Size: 30.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `9fc4071d9484b0f3bf2ee0f2849900969c0eae7dd598665f59f5ba9a7ce1b638` |
| MD5 | `702f27d25be6f47c424e19175f380b97` |
| BLAKE2b-256 | `3bac0b8319832252108455bd0e605d859815ee79eade3e33a1e5553a5d606732` |
File details
Details for the file langfuse_custom_tracer-1.0.1-py3-none-any.whl.
File metadata
- Download URL: langfuse_custom_tracer-1.0.1-py3-none-any.whl
- Upload date:
- Size: 20.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `d2de9ea1180d1c61be6076ad5b8816d47acbcdc70a3ff99ce18e8c005c35ca54` |
| MD5 | `6f7b73bfbd9fe3d1742bdf465f2c2a8c` |
| BLAKE2b-256 | `ce882d6c9effa522b167cee512778f0fbb193155718065b0142d86946ece95f7` |