
Langfuse Custom Tracer

Langfuse v4 tracing for Google Gemini, Ollama, Groq, Azure OpenAI, and Anthropic


🎯 What is This?

A lightweight Python library that adds observability and cost tracking to your LLM applications using Langfuse.

  • Automatic token counting for all supported LLM providers
  • Cost calculation with real-time pricing
  • Nested trace visualization in Langfuse
  • Simple context manager API built on OpenTelemetry
  • Zero setup - works with just API keys

🚀 Quick Start

1. Install

# Basic installation
pip install langfuse-custom-tracer

# With environment variable support
pip install langfuse-custom-tracer[env]

# With Gemini support
pip install langfuse-custom-tracer[gemini]

# Everything
pip install langfuse-custom-tracer[all]

2. Get API Keys

Sign up for Langfuse and copy your project's secret and public keys from the project settings. For Gemini, create an API key in Google AI Studio.

3. Set Environment Variables

Create a .env file:

# Langfuse (get from your dashboard)
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...

# Gemini API
GEMINI_API_KEY=...

4. Use It

import os
from langfuse_custom_tracer import load_env, create_langfuse_client, GeminiTracer
import google.generativeai as genai

# Load environment variables
load_env()

# Initialize
lf = create_langfuse_client(
    os.getenv("LANGFUSE_SECRET_KEY"),
    os.getenv("LANGFUSE_PUBLIC_KEY")
)
tracer = GeminiTracer(lf)

# Configure Gemini
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel("gemini-2.0-flash")

# Use with tracing
with tracer.trace("invoice-processing", input={"file": "invoice.pdf"}) as span:
    with tracer.generation("extract-data", model="gemini-2.0-flash",
                          input="Extract name, amount, date") as gen:
        response = model.generate_content("Extract name, amount, date from invoice")
        usage = tracer.extract_usage(response, model="gemini-2.0-flash")
        gen.update(output=response.text, usage=usage)
    span.update(output="Extraction complete")

tracer.flush()  # Send to Langfuse

📊 What You'll See in Langfuse

Dashboard View

📈 Trace: invoice-processing (ID: trace-123)
├─ ⏱ Duration: 2.3s
├─ 👤 User: (none set)
├─ 🏷️ Tags: [production, batch]
│
└─ 📝 Generation: extract-data
   ├─ Model: gemini-2.0-flash
   ├─ Status: ✅ Success
   ├─ Tokens: Input 156 | Output 89 | Total 245
   ├─ Cost: $0.0000768
   │  ├─ Input: $0.0000234 (156 tokens @ $0.15/1M)
   │  ├─ Output: $0.0000534 (89 tokens @ $0.60/1M)
   │  └─ Total: $0.0000768
   ├─ Latency: 1.8s
   └─ Output: "Name: John Doe, Amount: $500, Date: 2025-03-31"

Cost Aggregation

All calls are automatically aggregated on the dashboard:

  • Total tokens: 245,300 across all traces
  • Total cost: $0.18 for the day
  • By model: Gemini 2.0 Flash: $0.15, Gemini 1.5 Pro: $0.03

Nested Traces

Langfuse automatically detects nesting via OpenTelemetry context:

with tracer.trace("main-pipeline"):        # Parent span
    with tracer.trace("step-1"):           # Child span 1
        with tracer.generation(...):       # Grandchild span
            ...
    with tracer.trace("step-2"):           # Child span 2
        ...

Result in Langfuse: Clean hierarchical tree

🎮 Full API Reference

create_langfuse_client()

lf = create_langfuse_client(
    secret_key="sk-lf-...",                # Required
    public_key="pk-lf-...",                # Required
    host="https://cloud.langfuse.com"      # Optional, default EU
)

Hosts:

  • EU: https://cloud.langfuse.com (default)
  • US: https://us.cloud.langfuse.com

load_env()

Load environment variables from .env file:

from langfuse_custom_tracer import load_env

# Load from .env in current directory
load_env()

# Load from custom file
load_env(".env.production")

Requires python-dotenv. Install with: pip install langfuse-custom-tracer[env]

BaseTracer.trace()

Create a root span (top-level trace):

with tracer.trace(
    name="my-pipeline",
    input={"file": "data.csv"},
    metadata={"version": "1.0"},
    user_id="user-123",
    session_id="session-456",
    tags=["production", "batch"]
) as span:
    # Do work here
    span.update(output={"rows_processed": 1000})

Parameters:

  • name (str): Span name
  • input (any): Input data (shown in Langfuse)
  • metadata (dict): Custom metadata
  • user_id (str): User identifier
  • session_id (str): Session identifier
  • tags (list): String tags for filtering

BaseTracer.generation()

Create a generation span (LLM call):

with tracer.generation(
    name="extract",
    model="gemini-2.0-flash",
    input="Extract data",
    metadata={"temperature": 0.7}
) as gen:
    response = model.generate_content("Extract data")
    usage = tracer.extract_usage(response, model="gemini-2.0-flash")
    gen.update(output=response.text, usage=usage)

Parameters:

  • name (str): Generation name
  • model (str): Model identifier
  • input (any): Prompt/input
  • metadata (dict): Custom metadata

GeminiTracer.extract_usage()

Extract token counts and calculate costs:

usage = tracer.extract_usage(
    response,                           # Gemini response object
    model="gemini-2.0-flash"           # Model name for pricing
)

# Returns:
# {
#     "input": 156,              # Prompt tokens
#     "output": 89,              # Completion tokens
#     "total": 245,              # Total tokens
#     "unit": "TOKENS",
#     "inputCost": 0.0000234,   # Input cost in USD (156 tokens @ $0.15/1M)
#     "outputCost": 0.0000534,  # Output cost in USD (89 tokens @ $0.60/1M)
#     "totalCost": 0.0000768,   # Total cost in USD
#     "cachedTokens": 10         # (optional) cached tokens
# }
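Because the usage object is a plain dict, totals across a multi-call pipeline can be summed in ordinary Python. A minimal sketch, assuming dicts of the shape shown above (`aggregate_usage` is an illustrative helper, not part of the library API):

```python
def aggregate_usage(usages: list[dict]) -> dict:
    """Sum token counts and total cost over several extract_usage() results."""
    totals = {"input": 0, "output": 0, "total": 0, "totalCost": 0.0}
    for u in usages:
        totals["input"] += u["input"]
        totals["output"] += u["output"]
        totals["total"] += u["total"]
        totals["totalCost"] += u.get("totalCost", 0.0)  # cost may be absent
    return totals

# Two illustrative calls:
calls = [
    {"input": 156, "output": 89, "total": 245, "totalCost": 0.0000768},
    {"input": 300, "output": 120, "total": 420, "totalCost": 0.00012},
]
print(aggregate_usage(calls))
```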

BaseTracer.flush()

Send pending traces to Langfuse (blocking):

tracer.flush()  # Wait for all events to be sent

Required for short-lived scripts. Long-running servers batch automatically.
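In scripts with multiple exit paths, a try/finally around the pipeline guarantees flush() runs whether the work succeeds or raises. A sketch of the pattern, using a stub tracer (not the real class) so it is self-contained:

```python
class StubTracer:
    """Stand-in for the real tracer; only mimics flush()."""
    def __init__(self) -> None:
        self.flushed = False

    def flush(self) -> None:
        self.flushed = True

tracer = StubTracer()
try:
    result = "pipeline output"  # real traced LLM calls would go here
finally:
    tracer.flush()  # runs on success and on exceptions alike
```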

🔧 Supported Models

Gemini ✅

All Google Gemini models with Q1 2026 pricing:

| Model | Input | Output | Cache |
|-------|-------|--------|-------|
| gemini-2.5-pro | $1.25/1M | $10.00/1M | $0.3125/1M |
| gemini-2.0-flash | $0.15/1M | $0.60/1M | $0.0375/1M |
| gemini-2.0-flash-lite | $0.075/1M | $0.30/1M | $0.01875/1M |
| gemini-1.5-pro | $1.25/1M | $5.00/1M | $0.3125/1M |
| gemini-1.5-flash | $0.075/1M | $0.30/1M | $0.01875/1M |
| gemini-1.5-flash-8b | $0.0375/1M | $0.15/1M | $0.01/1M |
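These per-1M-token rates translate directly into the USD costs reported by extract_usage(). A minimal sketch of the arithmetic (the PRICES dict and estimate_cost helper are illustrative, not part of the library API):

```python
# USD per 1M tokens, mirroring two rows of the table above.
PRICES = {
    "gemini-2.0-flash": {"input": 0.15, "output": 0.60},
    "gemini-1.5-pro":   {"input": 1.25, "output": 5.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> dict:
    """Compute input/output/total cost in USD for a single call."""
    rates = PRICES[model]
    input_cost = input_tokens * rates["input"] / 1_000_000
    output_cost = output_tokens * rates["output"] / 1_000_000
    return {
        "inputCost": input_cost,
        "outputCost": output_cost,
        "totalCost": input_cost + output_cost,
    }

# 156 input + 89 output tokens on gemini-2.0-flash:
print(estimate_cost("gemini-2.0-flash", 156, 89))
```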

Coming Soon โณ

  • Ollama (local models)
  • Groq (fast inference)
  • Azure OpenAI (enterprise)
  • Anthropic Claude (frontier models)

๐Ÿ“ Project Structure

langfuse-custom-tracer/
├── langfuse_custom_tracer/
│   ├── __init__.py              # Package exports
│   ├── client.py                # Langfuse client setup
│   └── tracers/
│       ├── __init__.py
│       ├── base.py              # BaseTracer (abstract)
│       └── gemini.py            # GeminiTracer (concrete)
├── tests/
│   ├── conftest.py              # Pytest fixtures
│   ├── test_base_tracer.py      # 15 tests
│   ├── test_gemini_tracer.py    # 20 tests
│   └── test_client.py           # 12 tests
├── examples/
│   └── env_setup_example.py     # Usage example
├── SETUP.md                     # Setup guide
├── TESTING.md                   # Testing guide
└── pyproject.toml               # Package config

🧪 Testing

47 unit tests with 96% coverage:

# Run all tests
pytest

# Run with coverage report
pytest --cov

# Run specific test
pytest tests/test_gemini_tracer.py::TestGeminiTracer::test_extract_usage_basic -v

All tests pass ✅

๐Ÿ” Security

  • Never commit .env files - Already in .gitignore
  • API keys required - client creation raises an error if keys are missing
  • HTTPS only - All Langfuse communication encrypted
  • No keys in code - Always use environment variables
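Following the last point, it helps to fail fast with a clear message when a key is missing rather than let a later API call fail obscurely. A minimal sketch (the check_env helper is illustrative, not part of the library; variable names match the .env example above):

```python
import os

REQUIRED = ["LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY", "GEMINI_API_KEY"]

def check_env() -> None:
    """Raise with a clear message if any required key is unset or empty."""
    missing = [name for name in REQUIRED if not os.getenv(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```

Call check_env() right after load_env(), before creating any clients.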

📚 Examples

Example 1: Simple Extraction Task

from langfuse_custom_tracer import create_langfuse_client, GeminiTracer
import google.generativeai as genai
import os

lf = create_langfuse_client(
    os.getenv("LANGFUSE_SECRET_KEY"),
    os.getenv("LANGFUSE_PUBLIC_KEY")
)
tracer = GeminiTracer(lf)
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel("gemini-2.0-flash")

# Simple extraction
with tracer.trace("email-analysis") as span:
    with tracer.generation("extract", model="gemini-2.0-flash",
                          input="Extract sender, subject, body") as gen:
        response = model.generate_content(
            "From the email below, extract sender, subject, body:\n..."
        )
        usage = tracer.extract_usage(response, model="gemini-2.0-flash")
        gen.update(output=response.text, usage=usage)

tracer.flush()

Example 2: Multi-Step Pipeline

with tracer.trace("document-processing", user_id="user-123",
                 metadata={"doc_type": "invoice"}) as span:
    
    # Step 1: Extract text
    with tracer.trace("step-1-extract"):
        with tracer.generation("ocr", model="gemini-2.0-flash-lite"):
            text = model.generate_content("Extract text from image").text
            # ...
    
    # Step 2: Classify
    with tracer.trace("step-2-classify"):
        with tracer.generation("classify", model="gemini-2.0-flash"):
            classification = model.generate_content(f"Classify: {text}")
            # ...
    
    # Step 3: Extract fields
    with tracer.trace("step-3-extract-fields"):
        with tracer.generation("extract", model="gemini-2.0-flash"):
            fields = model.generate_content(f"Extract fields: {text}")
            # ...

tracer.flush()

In Langfuse you'll see:

  • Total latency: sum of all steps
  • Total cost: $0.0015
  • Token breakdown by step
  • Each step as a child span

Example 3: Error Handling

with tracer.trace("risky-operation"):
    with tracer.generation("call", model="gemini-2.0-flash") as gen:
        try:
            response = model.generate_content("...")
            usage = tracer.extract_usage(response, model="gemini-2.0-flash")
            gen.update(output=response.text, usage=usage)
        except Exception as e:
            gen.update(status_code=500, error=str(e))
            raise

tracer.flush()

📖 Documentation

๐Ÿค Contributing

This is an early-stage project. Contributions welcome!

Next features:

  • Additional LLM providers (Ollama, Groq, Azure, Anthropic)
  • Async support
  • Batch operations
  • Response filtering

๐Ÿ“ License

MIT - See LICENSE file

🙋 Support

  • Documentation: Read the docs
  • Issues: Report bugs on GitHub
  • Questions: Check TESTING.md for common issues

Built with โค๏ธ for the LLM community

Langfuse is open-source observability for LLM applications
