fasteval-langfuse
Langfuse integration for fasteval - evaluate production traces with fasteval's research-backed metrics.
Installation
pip install fasteval-core fasteval-langfuse
Quick Start
Evaluate Production Traces
Fetch traces from Langfuse and evaluate them with fasteval metrics:
from fasteval_langfuse import langfuse_traces
from fasteval_langfuse.sampling import RandomSamplingStrategy
import fasteval as fe

@fe.correctness(threshold=0.8)
@fe.hallucination(threshold=0.9)
@langfuse_traces(
    project="production",
    filter_tags=["customer-support"],
    time_range="last_24h",
    sampling=RandomSamplingStrategy(sample_size=200),
)
def test_production_traces(trace_id, input, output, context, metadata):
    # Evaluate the trace
    fe.score(output, input=input)

# Run with pytest - scores are automatically pushed to Langfuse:
#   pytest test_production.py -v
Sampling Strategies
Reduce evaluation costs with intelligent sampling:
from fasteval_langfuse.sampling import (
    RandomSamplingStrategy,
    StratifiedSamplingStrategy,
    ScoreBasedSamplingStrategy,
)

# Random sampling - 200 random traces
@langfuse_traces(
    project="prod",
    sampling=RandomSamplingStrategy(sample_size=200, seed=42),
)
def test_random_sample(trace_id, input, output, context, metadata):
    fe.score(output, input=input)

# Stratified sampling - even distribution across user types
@langfuse_traces(
    project="prod",
    sampling=StratifiedSamplingStrategy(
        strata_key="metadata.user_type",
        samples_per_stratum=30,
    ),
)
def test_across_segments(trace_id, input, output, context, metadata):
    fe.score(output, input=input)

# Score-based sampling - focus on failures
@langfuse_traces(
    project="prod",
    sampling=ScoreBasedSamplingStrategy(
        score_name="user_rating",
        low_score_threshold=3.0,
        low_score_rate=1.0,    # 100% of low ratings
        high_score_rate=0.05,  # 5% of high ratings
    ),
)
def test_failures(trace_id, input, output, context, metadata):
    fe.score(output, input=input)
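To make the rate semantics concrete, here is a minimal sketch of score-based selection in plain Python. The function name and the dict-based trace shape are assumptions for illustration only, not the library's actual API or implementation:

```python
import random

def score_based_sample(traces, score_name, low_threshold, low_rate, high_rate, seed=42):
    """Keep each low-scoring trace with probability low_rate and each
    high-scoring (or unscored) trace with probability high_rate."""
    rng = random.Random(seed)  # seeded for reproducible sampling
    kept = []
    for trace in traces:
        score = trace["scores"].get(score_name)
        is_low = score is not None and score < low_threshold
        rate = low_rate if is_low else high_rate
        if rng.random() < rate:
            kept.append(trace)
    return kept
```

With `low_rate=1.0`, every trace below the threshold is always kept (since `rng.random()` is strictly less than 1.0), which is what "100% of low ratings" means above.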
Built-in Sampling Strategies
- NoSamplingStrategy: Evaluate all matching traces (default)
- RandomSamplingStrategy: Unbiased random sampling
- StratifiedSamplingStrategy: Even distribution across groups
- ScoreBasedSamplingStrategy: Oversample low-scoring traces
- RecentFirstSamplingStrategy: Prioritize recent traces
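The stratified strategy's grouping behavior can be pictured as follows; this is a standalone sketch assuming a simple dict-based trace shape and dotted key paths, not the library's implementation:

```python
import random
from collections import defaultdict

def stratified_sample(traces, strata_key, samples_per_stratum, seed=42):
    """Group traces by the value at a dotted key path (e.g. "metadata.user_type")
    and draw the same number of traces from each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for trace in traces:
        value = trace
        for part in strata_key.split("."):
            value = value.get(part, {})  # walk the dotted path
        strata[str(value)].append(trace)
    sample = []
    for group in strata.values():
        k = min(samples_per_stratum, len(group))  # don't over-draw small groups
        sample.extend(rng.sample(group, k))
    return sample
```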
Dataset Integration
Evaluate against Langfuse datasets. All dataset columns are passed as parameters - declare what you need:
from fasteval_langfuse import langfuse_dataset

# Basic usage
@fe.correctness(threshold=0.8)
@langfuse_dataset(name="qa-golden-set", version="v2")
def test_qa_dataset(input, expected_output):
    response = my_agent(input)
    fe.score(response, expected_output, input=input)

# Using custom metadata fields
@fe.correctness(threshold=0.8)
@langfuse_dataset(name="qa-golden-set", version="v2")
def test_with_metadata(input, expected_output, difficulty, category):
    # difficulty and category come from item.metadata
    response = my_agent(input)
    fe.score(response, expected_output, input=input)

# Only what you need
@fe.correctness(threshold=0.8)
@langfuse_dataset(name="inputs-only")
def test_minimal(input):
    # Only declare input; other fields are ignored
    response = my_agent(input)
    fe.score(response, input=input)
Configuration
from fasteval_langfuse import configure_langfuse, LangfuseConfig

configure_langfuse(LangfuseConfig(
    public_key="pk-...",                # Or from LANGFUSE_PUBLIC_KEY env
    secret_key="sk-...",                # Or from LANGFUSE_SECRET_KEY env
    host="https://cloud.langfuse.com",  # Or self-hosted
    default_project="production",
    auto_push_scores=True,              # Push scores back automatically
    score_name_prefix="fasteval_",      # Prefix for score names
))
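As the inline comments note, the keys can come from the environment instead of code, which keeps credentials out of version control. A typical shell setup (`LANGFUSE_HOST` is the Langfuse SDK's standard host variable and an assumption here; the first two names come from the comments above):

```shell
export LANGFUSE_PUBLIC_KEY="pk-..."
export LANGFUSE_SECRET_KEY="sk-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"  # or your self-hosted URL
```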
RAG Evaluation with Context
The decorator automatically extracts context from trace metadata:
@fe.faithfulness(threshold=0.8)
@fe.contextual_precision(threshold=0.7)
@langfuse_traces(
    project="prod",
    filter_tags=["rag"],
)
def test_rag_quality(trace_id, input, output, context, metadata):
    # context is auto-extracted from metadata keys:
    # "context", "retrieved_docs", "documents", "retrieval_context"
    # Or manually extract if needed:
    if not context:
        context = metadata.get("custom_docs_key")
    fe.score(output, context=context, input=input)
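The auto-extraction described above amounts to checking a list of conventional keys in order. A self-contained sketch of that fallback logic, using exactly the keys listed (the function name is hypothetical):

```python
def extract_context(metadata):
    # Try each conventional key in order; return the first non-empty value
    for key in ("context", "retrieved_docs", "documents", "retrieval_context"):
        if metadata.get(key):
            return metadata[key]
    return None  # caller falls back to a manual lookup, as in the test above
```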
Benefits
- 💰 Cost Reduction: Reduce LLM evaluation costs by 90%+ with sampling
- ⚡ Faster Feedback: Evaluate in minutes vs hours
- 📊 Research-Backed Metrics: Use fasteval's validated evaluation metrics
- 🎯 Focus on Issues: Oversample failures with ScoreBasedSamplingStrategy
- ✅ Zero Instrumentation: Evaluate existing traces without code changes
- 🔄 Automatic Scoring: Evaluation results automatically sync to Langfuse
License
MIT