Production-ready LLM client with audit trail, deterministic caching, and provenance tracking
LLM Ledger
A production-ready Python SDK wrapping LiteLLM with deterministic caching, provenance tracking, and full audit trail.
Features
- ✅ Deterministic Caching - SHA256-based content hashing for reproducible results
- ✅ Provenance Tracking - Full metadata and source tracking
- ✅ Audit Trail - Complete ledger of all LLM interactions
- ✅ Token Accounting - Automatic token usage and cost estimation
- ✅ Multi-Provider - Supports all LiteLLM providers (Anthropic, OpenAI, Azure, etc.)
- ✅ Request Persistence - SQLite/PostgreSQL storage with full audit trail
- ✅ Retry Logic - Automatic retry with exponential backoff via LiteLLM
- ✅ Async Support - Full async/await support for high throughput
- ✅ Type Safe - Complete type hints with Pydantic models
Installation
pip install llm-ledger
Or install from source:
git clone <repo>
cd llm-ledger
pip install -e .
Quick Start
Basic Usage
from llm_ledger import LedgerClient
client = LedgerClient()
response = client.quick_complete("Explain quantum computing in simple terms")
print(response)
Fluent Builder Pattern
response = (
    client.completion()
    .model("claude-sonnet-4")
    .system("You are a helpful assistant")
    .user("Summarize the key points from this document...")
    .temperature(0.0)
    .max_tokens(4000)
    .with_metadata(
        workflow_id="document_processing",
        chunk_id="section_3"
    )
    .execute()
)
print(f"Response: {response.content}")
print(f"Tokens: {response.usage.total_tokens}")
print(f"Cost: ${response.cost_estimate:.4f}")
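A cost figure like the one printed above is typically derived from token counts and per-token prices. The following is a minimal sketch of that arithmetic, not the SDK's implementation: the `Usage` class, the `estimate_cost` helper, and the prices in `PRICES_PER_MILLION` are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Hypothetical stand-in for a response's token usage."""
    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


# Placeholder prices in USD per million tokens; NOT real provider rates.
PRICES_PER_MILLION = {"example-model": {"input": 3.00, "output": 15.00}}


def estimate_cost(model: str, usage: Usage) -> float:
    """Price input and output tokens separately, then sum."""
    price = PRICES_PER_MILLION[model]
    total = (usage.prompt_tokens * price["input"]
             + usage.completion_tokens * price["output"])
    return total / 1_000_000


usage = Usage(prompt_tokens=1200, completion_tokens=300)
print(usage.total_tokens)                               # 1500
print(f"${estimate_cost('example-model', usage):.4f}")  # $0.0081
```

Input and output tokens are priced separately here because providers commonly charge different rates for each.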
With Full Provenance
from llm_ledger import LLMRequest, ProvenanceMetadata
metadata = ProvenanceMetadata(
    workflow_id="data_processing",
    chunk_id="chunk_12",
    section_id="section_3.2",
    source_id="document_v2.json",
    character_range=(1500, 2300),
    tags={"stage": "extraction"}
)
request = LLMRequest(
    model="claude-sonnet-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Analyze this text..."}
    ],
    temperature=0.0,
    metadata=metadata
)
response = client.complete(request)
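One use of provenance fields such as `source_id` and `character_range` is to map a stored response back to the exact span of source text it was generated from. The sketch below illustrates that idea with a hypothetical `Provenance` dataclass. It assumes `character_range` is a half-open `(start, end)` pair of character offsets, which the SDK documentation above does not specify.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Provenance:
    """Hypothetical, simplified stand-in for ProvenanceMetadata."""
    source_id: str
    character_range: tuple[int, int]  # assumed half-open: [start, end)
    tags: dict[str, str] = field(default_factory=dict)

    def extract(self, source_text: str) -> str:
        """Return the slice of the source that this record points at."""
        start, end = self.character_range
        return source_text[start:end]


source_text = "0123456789" * 3
prov = Provenance(source_id="document_v2.json", character_range=(5, 12))
print(prov.extract(source_text))  # 5678901
```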
Async Usage
import asyncio
async def process_chunks():
    # create_request() and chunks are application-defined placeholders
    requests = [create_request(chunk) for chunk in chunks]
    responses = await asyncio.gather(*[
        client.complete_async(req) for req in requests
    ])
    return responses
responses = asyncio.run(process_chunks())
Configuration
Environment Variables
Create a .env file:
# API Keys
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
# Database
LLM_DATABASE_URL=postgresql://user:pass@localhost/llm_gateway
# Cache
LLM_CACHE_BACKEND=redis # or "memory"
REDIS_URL=redis://localhost:6379/0
LLM_CACHE_TTL=3600 # seconds, 0 = no expiration
# Features
LLM_ENABLE_CACHE=true
LLM_ENABLE_PERSISTENCE=true
# Defaults
LLM_DEFAULT_MODEL=claude-sonnet-4
LLM_DEFAULT_TEMPERATURE=0.0
LLM_DEFAULT_MAX_TOKENS=4000
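Environment variables always arrive as strings, so values such as `LLM_CACHE_TTL` and `LLM_ENABLE_CACHE` need converting to integers and booleans before use. Here is a minimal sketch of that parsing, assuming a hypothetical `Settings` dataclass and `load_settings` helper. The SDK's actual loader and defaults may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Treat common truthy spellings as True; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    cache_backend: str
    cache_ttl: int
    enable_cache: bool
    enable_persistence: bool
    default_model: str
    default_temperature: float
    default_max_tokens: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    # Fallback values are assumptions modelled on the sample .env file,
    # not the SDK's documented defaults.
    return Settings(
        cache_backend=env.get("LLM_CACHE_BACKEND", "memory"),
        cache_ttl=int(env.get("LLM_CACHE_TTL", "3600")),
        enable_cache=_as_bool(env.get("LLM_ENABLE_CACHE", "true")),
        enable_persistence=_as_bool(env.get("LLM_ENABLE_PERSISTENCE", "true")),
        default_model=env.get("LLM_DEFAULT_MODEL", "claude-sonnet-4"),
        default_temperature=float(env.get("LLM_DEFAULT_TEMPERATURE", "0.0")),
        default_max_tokens=int(env.get("LLM_DEFAULT_MAX_TOKENS", "4000")),
    )


settings = load_settings({"LLM_CACHE_BACKEND": "redis", "LLM_ENABLE_CACHE": "false"})
print(settings.cache_backend, settings.enable_cache, settings.cache_ttl)
# redis False 3600
```

Passing a plain dictionary instead of `os.environ` keeps the parsing logic testable without touching the real environment.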
Programmatic Configuration
from llm_ledger import LedgerClient
from llm_ledger.cache import RedisCache
client = LedgerClient(
    cache_backend=RedisCache("redis://localhost:6379/0"),
    database_url="postgresql://localhost/llm_gateway",
    enable_cache=True,
    enable_persistence=True,
    default_model="claude-sonnet-4"
)
Caching
The SDK uses deterministic SHA256-based caching:
# First call - executes against LLM
response1 = client.quick_complete("What is machine learning?", temperature=0.0)
print(response1.from_cache) # False
print(response1.latency_ms) # e.g., 1200ms
# Second call - returns from cache
response2 = client.quick_complete("What is machine learning?", temperature=0.0)
print(response2.from_cache) # True
print(response2.cache_key) # SHA256 hash
# Different parameters = cache miss
response3 = client.quick_complete("What is machine learning?", temperature=0.5)
print(response3.from_cache) # False
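The hit and miss behaviour above follows from how a deterministic cache key is built: every field that affects the output goes into one canonical serialization, which is then hashed. The sketch below shows one common way to do this with the standard library. The `cache_key` helper and its choice of fields are assumptions, not the SDK's actual key derivation.

```python
import hashlib
import json


def cache_key(model: str, messages: list[dict], **params) -> str:
    """Hypothetical helper: hash the request fields that affect the output."""
    payload = {"model": model, "messages": messages, "params": params}
    # sort_keys and fixed separators give one canonical serialization,
    # so equal requests always produce the same digest.
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


messages = [{"role": "user", "content": "What is machine learning?"}]
key_a = cache_key("claude-sonnet-4", messages, temperature=0.0)
key_b = cache_key("claude-sonnet-4", messages, temperature=0.0)
key_c = cache_key("claude-sonnet-4", messages, temperature=0.5)

print(key_a == key_b)  # True: identical requests share a key
print(key_a == key_c)  # False: a different temperature changes the key
print(len(key_a))      # 64 hex characters
```

Sorting keys matters because two dictionaries with the same contents in a different insertion order would otherwise serialize differently and miss the cache.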
Cache statistics:
stats = client.get_stats()
print(stats["cache_hit_rate"]) # e.g., 0.67
Persistence & Querying
All requests and responses are automatically persisted:
# Query by provenance metadata
requests = client.query_requests(
    workflow_id="data_processing",
    chunk_id="chunk_12"
)
# Retrieve specific request/response
request = client.get_request(request_id)
response = client.get_response(response_id)
# Get token usage summary
from datetime import datetime
usage = client.persistence.get_token_usage_summary(
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 12, 31)
)
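To make these queries concrete, the sketch below builds a tiny ledger table in an in-memory SQLite database and runs equivalent lookups. The `ledger` schema, the column names, and both helper functions are hypothetical. The SDK's real storage layout is not described here.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ledger (
        request_id   TEXT PRIMARY KEY,
        workflow_id  TEXT,
        chunk_id     TEXT,
        total_tokens INTEGER,
        created_at   TEXT  -- ISO 8601, so string order matches time order
    )
    """
)
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?, ?, ?, ?)",
    [
        ("req-1", "data_processing", "chunk_12", 1500, "2024-03-01T10:00:00"),
        ("req-2", "data_processing", "chunk_13", 900, "2024-06-15T09:30:00"),
        ("req-3", "other_workflow", "chunk_01", 400, "2025-01-02T08:00:00"),
    ],
)


def query_requests(workflow_id: str, chunk_id: str) -> list[str]:
    """Return request IDs matching both provenance fields."""
    rows = conn.execute(
        "SELECT request_id FROM ledger WHERE workflow_id = ? AND chunk_id = ? "
        "ORDER BY created_at",
        (workflow_id, chunk_id),
    )
    return [row[0] for row in rows]


def token_usage_summary(start_date: datetime, end_date: datetime) -> int:
    """Sum tokens for requests created within [start_date, end_date]."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total_tokens), 0) FROM ledger "
        "WHERE created_at >= ? AND created_at <= ?",
        (start_date.isoformat(), end_date.isoformat()),
    ).fetchone()
    return row[0]


print(query_requests("data_processing", "chunk_12"))  # ['req-1']
print(token_usage_summary(datetime(2024, 1, 1), datetime(2024, 12, 31)))  # 2400
```

In this sketch an `end_date` of `datetime(2024, 12, 31)` means midnight at the start of that day, so requests made later on December 31 fall outside the range.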
Advanced Features
Custom Retry Logic
LiteLLM handles retries automatically. To change the retry count or timeout for a call, pass them as arguments:
response = client.complete(
    request,
    num_retries=5,
    timeout=600  # 10 minutes
)
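For intuition about what a setting like `num_retries=5` implies, here is a generic sketch of retrying with exponential backoff and jitter. It is not LiteLLM's implementation: `backoff_delays` and `call_with_retry` are hypothetical helpers, and the base delay and cap are illustrative values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(num_retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Maximum delay before each retry: base * 2**attempt, capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(num_retries)]


def call_with_retry(
    fn: Callable[[], T],
    num_retries: int = 5,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception up to num_retries times."""
    delays = backoff_delays(num_retries, base)
    for attempt in range(num_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == num_retries:
                raise  # retries exhausted: surface the last error
            # Full jitter: wait a random time up to the scheduled delay.
            sleep(random.uniform(0, delays[attempt]))
    raise AssertionError("unreachable")


print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

The jitter spreads retries from many concurrent callers over time, so they do not all hit the provider again at the same instant.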
Batch Processing
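import asyncio
# await is only valid inside a coroutine, so wrap the batch in one
async def process_batch(texts):
    requests = [create_request(text) for text in texts]
    return await asyncio.gather(*[
        client.complete_async(req) for req in requests
    ])
responses = asyncio.run(process_batch(texts))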
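An unbounded `asyncio.gather` starts every request at once, which can trip provider rate limits on large batches. A common remedy is to cap concurrency with a semaphore, sketched below. `fake_complete` is a hypothetical stand-in for `client.complete_async`, and the limit of 2 is arbitrary.

```python
import asyncio


async def fake_complete(prompt: str) -> str:
    """Stand-in for client.complete_async; sleeps instead of calling an LLM."""
    await asyncio.sleep(0.01)
    return prompt.upper()


async def gather_bounded(prompts: list[str], limit: int = 3) -> list[str]:
    semaphore = asyncio.Semaphore(limit)

    async def run_one(prompt: str) -> str:
        async with semaphore:  # at most `limit` calls in flight at once
            return await fake_complete(prompt)

    # gather returns results in the same order as its inputs
    return await asyncio.gather(*(run_one(p) for p in prompts))


results = asyncio.run(gather_bounded(["a", "b", "c", "d", "e"], limit=2))
print(results)  # ['A', 'B', 'C', 'D', 'E']
```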
Cache Control
# Bypass cache
response = client.complete(request, use_cache=False)
# Clear entire cache
client.clear_cache()
# Invalidate specific request
client.cache.invalidate(request)
Production Deployment
With PostgreSQL and Redis
from llm_ledger import LedgerClient
from llm_ledger.cache import RedisCache
client = LedgerClient(
    cache_backend=RedisCache("redis://prod-redis:6379/0"),
    database_url="postgresql://user:pass@prod-db/llm_gateway",
    enable_cache=True,
    enable_persistence=True,
    cache_ttl=86400  # 24 hours
)
With Docker Compose
services:
  app:
    environment:
      - LLM_DATABASE_URL=postgresql://postgres:password@db/llm_gateway
      - REDIS_URL=redis://redis:6379/0
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
  db:
    image: postgres:15
  redis:
    image: redis:7
Testing
from llm_ledger.testing import MockLLMClient
# For unit tests
mock_client = MockLLMClient()
mock_client.add_response("test prompt", "test response")
response = mock_client.quick_complete("test prompt")
assert response == "test response"
License
Apache 2.0 License
Contributing
Contributions welcome! Please see CONTRIBUTING.md for guidelines.