Sediment
Mine behavioral invariants from LLM production logs. Auto-generate tests.
Quickstart · CLI · Invariant types · CI integration · Formats · API
Sediment reads your production logs, discovers what your LLM system actually does (not what you think it does), and turns those discoveries into runnable pytest tests and CI checks.
```bash
pip install sediment
sediment discover logs/prod.jsonl
```

```
Discovered 14 invariants from logs/prod.jsonl

  [structural]   output_never_empty        confidence=100%  support=2841
  [structural]   output_always_json        confidence=98%   support=2784
  [pattern]      no_email_in_output        confidence=100%  support=2841  ← PII guard
  [pattern]      no_credit_card_in_output  confidence=100%  support=2841  ← PII guard
  [statistical]  latency_p95_threshold     confidence=94%   support=2672  p95=1240ms
  [temporal]     output_length_drift       confidence=91%   support=2841
  [semantic]     semantic_consistency      confidence=87%   support=2841
  ...
```
What it does
| Step | Description |
|---|---|
| Ingest | Reads logs in any format — JSONL, CSV, Parquet, gzip, OpenAI, LangSmith, OTel, and more |
| Infer | Auto-detects format and field schema (input, output, latency, model, session, …) |
| Discover | Mines behavioral invariants across 7 miner types |
| Generate | Writes a pytest test file you can drop straight into CI |
| Track | Saves a baseline and alerts when production behavior drifts |
Install
```bash
pip install sediment                            # core — zero required dependencies
pip install "sediment[parquet]"                 # + Parquet / Arrow support
pip install "sediment[avro]"                    # + Avro support
pip install "sediment[cloud]"                   # + S3 / GCS / Azure Blob sources
pip install "sediment[openai]"                  # + OpenAI embedding backend
pip install "sediment[sentence-transformers]"   # + sentence-transformers backend
pip install "sediment[config]"                  # + .sediment.yml config file support
pip install "sediment[full]"                    # everything
```
Quickstart
Python API
```python
from sediment import LogAnalyzer

a = LogAnalyzer("logs/prod.jsonl")

# Inspect what was detected
print(a.summary())

# Discover invariants
invariants = a.discover(min_confidence=0.8)
for inv in invariants:
    print(inv)

# Generate a pytest test file
a.emit_tests("test_invariants.py", function_hint="call_llm")
# → Run with: pytest test_invariants.py -v

# Generate an interactive HTML report
a.report("report.html")
```
CLI
```bash
# Explore what's in your logs
sediment summary logs/prod.jsonl

# Discover and print invariants
sediment discover logs/prod.jsonl --min-confidence 0.8

# Save a baseline for future staleness checks
sediment save logs/prod.jsonl baseline.json

# Check for drift against a new batch of logs
sediment check-staleness logs/today.jsonl baseline.json

# Compare invariants between two log snapshots
sediment compare logs/v1.jsonl logs/v2.jsonl

# Generate an HTML report
sediment report logs/prod.jsonl -o report.html

# Scaffold a .sediment.yml config and first baseline
sediment init logs/prod.jsonl
```
Invariant types
Sediment runs 7 miner types in parallel. Each produces typed, confidence-annotated invariants.
Structural
What your outputs always look like:
- `output_never_empty` — output is always non-null, non-empty
- `output_length_range` — character length stays within observed bounds
- `output_always_json` — every output is valid JSON
- `output_json_keys_consistent` — JSON outputs always contain the same keys
- `output_type_consistent` — output type (str / list / dict) is stable
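As an illustration of what a structural check like `output_json_keys_consistent` verifies (this is a hypothetical sketch, not Sediment's actual implementation), an invariant of this kind holds when every JSON output parses to an object with one shared key set:

```python
import json

def json_keys_consistent(outputs):
    """Return (holds, keys): holds is True when every output is a JSON
    object and all objects share exactly the same key set."""
    key_sets = set()
    for out in outputs:
        try:
            obj = json.loads(out)
        except (TypeError, ValueError):
            return False, None  # at least one output is not valid JSON
        if not isinstance(obj, dict):
            return False, None  # JSON, but not an object
        key_sets.add(frozenset(obj))
    holds = len(key_sets) == 1
    return holds, (next(iter(key_sets)) if holds else None)

holds, keys = json_keys_consistent(['{"answer": "hi", "tokens": 3}',
                                    '{"answer": "yo", "tokens": 2}'])
```

The mined invariant would then record the shared key set and fail whenever a future output adds, drops, or renames a key.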
Statistical
Distributional properties of your system:
- `latency_p95_threshold` — p95 latency stays under threshold
- `cost_p95_threshold` — p95 cost per request stays under threshold
- `error_rate` — error rate at or below observed baseline
- `model_consistency` — a single model is used throughout
Pattern — PII & safety
What must never appear in outputs:
- `no_email_in_output` — no email addresses leaked 🔴 critical
- `no_phone_us_in_output` — no US phone numbers leaked 🔴 critical
- `no_ssn_in_output` — no Social Security numbers leaked 🔴 critical
- `no_credit_card_in_output` — no credit card numbers leaked (Luhn-validated) 🔴 critical
- `no_ipv4_in_output` — no IP addresses leaked 🔴 critical
PII detection uses validated regex — SSNs checked against SSA rules, credit cards validated with the Luhn algorithm, phone numbers validated against NANP rules.
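The Luhn check mentioned above is what separates a real card number from an arbitrary 16-digit string. A minimal standalone version (the standard algorithm, not Sediment's internal code) looks like this:

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right,
    subtract 9 from any doubled digit above 9, sum, check mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:  # shorter than any real card number
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:    # every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

Running the regex first and the checksum second keeps false positives low: most random digit runs that match a card-number pattern fail the checksum.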
Relational
Input → output relationships:
- `output_minimum_length` — outputs stay above a safe minimum length relative to input
- `refusal_rate` — model refuses or apologises within observed bounds
- `input_output_length_correlation` — longer inputs produce longer outputs (when expected)
Semantic
Meaning-level consistency:
- `semantic_consistency` — outputs remain semantically similar to baseline
- `semantic_outliers` — no outputs diverge more than 2σ from the centroid
- `near_duplicate_outputs` — outputs are not near-identical (flags stuck / looping models)
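The 2σ-from-centroid rule can be sketched on plain embedding vectors (an illustrative implementation, assuming embeddings are lists of floats; Sediment's embedder backends are covered later):

```python
import math

def outliers_2sigma(vectors):
    """Return indices of vectors whose distance to the centroid
    exceeds mean distance + 2 standard deviations."""
    dim = len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    dists = [math.dist(v, centroid) for v in vectors]
    mean = sum(dists) / len(dists)
    var = sum((d - mean) ** 2 for d in dists) / len(dists)
    cutoff = mean + 2 * math.sqrt(var)
    return [i for i, d in enumerate(dists) if d > cutoff]

# Ten near-identical outputs plus one that diverges sharply
idx = outliers_2sigma([[0.0, 0.0]] * 10 + [[5.0, 5.0]])
```

Note the usual caveat: a single extreme point inflates σ itself, so this check is most reliable when outliers are rare relative to the sample.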
Temporal
Drift over time:
- `output_length_drift` — output length distribution hasn't shifted
- `latency_drift` — latency distribution hasn't shifted
- `model_drift` — model hasn't silently changed
- `error_rate_drift` — error rate hasn't crept up
Session
Multi-turn conversation patterns:
- `session_turn_count_range` — turns per session stays within expected range
- `session_avg_turns` — average session length is stable
- `session_user_return_rate` — returning user rate is stable
Staleness tracking & CI
Save a baseline once
```bash
sediment save logs/prod.jsonl .sediment-baseline.json
```
Check daily in CI
```bash
sediment check-staleness logs/today.jsonl .sediment-baseline.json
# exits 1 if any invariants are violated
```
```
Staleness Report — checked 2024-03-15 09:00 UTC
Original discovery: 2024-03-01   source: logs/prod.jsonl

  ✓ Holds:     11/14
  ↓ Degraded:   2/14  (confidence dropped > 10pp)
  ✗ Violated:   1/14  (confidence dropped > 30pp)  ← CI fails here
  ? Missing:    0/14
```
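The bucketing logic behind a report like this follows directly from the thresholds shown (10pp and 30pp). A hypothetical sketch, not Sediment's internal code:

```python
def classify(baseline_conf, current_conf):
    """Map a confidence change to a staleness bucket.
    Confidences are fractions in [0, 1]; drops are measured
    in percentage points (pp)."""
    if current_conf is None:
        return "missing"    # invariant could not be re-evaluated
    drop = (baseline_conf - current_conf) * 100
    if drop > 30:
        return "violated"   # CI fails here
    if drop > 10:
        return "degraded"
    return "holds"
```

Counting the buckets across all baselined invariants yields the four report lines, and any `violated` entry drives the non-zero exit code.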
GitHub Actions
```yaml
# .github/workflows/sediment.yml
- name: Check invariant staleness
  run: sediment check-staleness ${{ env.LOG_SOURCE }} .sediment-baseline.json
```
pytest plugin
Collect *.sediment.json baselines as native pytest test items:
```bash
pytest --sediment-source=logs/today.jsonl
```
Each invariant becomes a separate test. Violated invariants fail; degraded ones warn.
Compare two releases
```bash
sediment compare logs/v1.jsonl logs/v2.jsonl
```

```
Sediment Compare: logs/v1.jsonl → logs/v2.jsonl
────────────────────────────────────────────────────────────
  New:       2 invariants appeared
  Removed:   0 invariants disappeared
  Improved:  3 confidence increased ≥5%
  Degraded:  1 confidence decreased ≥5%
  Stable:   10 no meaningful change

  ⚠️ DEGRADED  latency_p95_threshold  87% (-8%)

✅ No regressions detected.
```
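The five categories in the comparison are a set/threshold diff over the two invariant snapshots. A sketch of that bucketing, assuming invariants are represented as `{id: confidence}` maps (an illustration, not Sediment's actual data model):

```python
def compare_invariants(old, new, threshold=0.05):
    """Bucket invariant ids into new / removed / improved / degraded /
    stable, using a ±5% confidence-change threshold by default."""
    buckets = {"new": [], "removed": [], "improved": [], "degraded": [], "stable": []}
    for inv_id in sorted(set(old) | set(new)):
        if inv_id not in old:
            buckets["new"].append(inv_id)
        elif inv_id not in new:
            buckets["removed"].append(inv_id)
        elif new[inv_id] - old[inv_id] >= threshold:
            buckets["improved"].append(inv_id)
        elif old[inv_id] - new[inv_id] >= threshold:
            buckets["degraded"].append(inv_id)
        else:
            buckets["stable"].append(inv_id)
    return buckets
```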
Supported formats
| Format | Auto-detected | Notes |
|---|---|---|
| JSONL / NDJSON | ✅ | Streaming, nested field paths |
| JSON array | ✅ | [{…}, {…}] |
| CSV / TSV | ✅ | Any delimiter, quoted fields |
| logfmt | ✅ | key=value key="quoted value" |
| Apache / nginx | ✅ | Combined log format |
| Parquet | ✅ | Requires pyarrow |
| Avro | ✅ | Requires fastavro |
| gzip | ✅ | .jsonl.gz, .csv.gz, etc. |
| OpenAI API logs | ✅ | Auto-detected |
| LangSmith traces | ✅ | Auto-detected |
| LangFuse generations | ✅ | Auto-detected |
| OpenTelemetry GenAI | ✅ | Auto-detected |
| Helicone | ✅ | Auto-detected |
| W&B Weave | ✅ | Auto-detected |
| MLflow traces | ✅ | Auto-detected |
| Datadog LLM Obs | ✅ | Auto-detected |
| S3 / GCS / Azure Blob | ✅ | Requires sediment[cloud] |
| stdin | ✅ | sediment discover - |
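For intuition on how format auto-detection can work, here is a deliberately simplified sniffer based on the first non-blank line. This is an illustrative heuristic only; Sediment's real detector handles far more cases (and the binary formats above cannot be sniffed from text at all):

```python
import json

def sniff_format(first_line: str) -> str:
    """Guess a text log format from its first non-blank line."""
    stripped = first_line.strip()
    if stripped.startswith("["):
        return "json-array"           # [{…}, {…}]
    if stripped.startswith("{"):
        try:
            json.loads(stripped)      # one object per line
            return "jsonl"
        except ValueError:
            pass
    if "=" in stripped and "," not in stripped and "\t" not in stripped:
        return "logfmt"               # key=value key="quoted value"
    if "\t" in stripped:
        return "tsv"
    if "," in stripped:
        return "csv"
    return "unknown"
```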
Glob patterns, directories, and cloud URIs all work:
```python
LogAnalyzer("logs/*.jsonl.gz")
LogAnalyzer("logs/")
LogAnalyzer("s3://my-bucket/logs/*.jsonl")
LogAnalyzer("-")   # stdin
```
Sampling
For large log files:
```python
LogAnalyzer("huge.jsonl", sample=10_000, sampling_strategy="importance")
```
| Strategy | Description |
|---|---|
| `random` | Uniform random sample (default) |
| `stratified` | Preserves output-length distribution |
| `importance` | Oversamples rare / anomalous entries |
| `time_windowed` | Weights recent entries higher |
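To illustrate what a recency-weighted strategy like `time_windowed` might do (a sketch under stated assumptions: exponential position decay and the exponential-sort trick for weighted sampling without replacement; the names and parameters here are hypothetical, not Sediment's API):

```python
import random

def time_windowed_sample(entries, k, half_life=1000, seed=42):
    """Sample k entries without replacement, weighting recent entries
    higher: an entry half_life positions older gets half the weight."""
    n = len(entries)
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]  # newest ≈ 1.0
    rng = random.Random(seed)
    # Exponential-sort trick: key u**(1/w) favors high-weight entries
    ranked = sorted(range(n),
                    key=lambda i: rng.random() ** (1.0 / weights[i]),
                    reverse=True)
    return [entries[i] for i in sorted(ranked[:k])]
```

With 10,000 entries and a 1,000-entry half-life, the sample is dominated by the most recent few thousand entries while still occasionally reaching further back.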
Configuration
Create .sediment.yml in your project root (or run sediment init logs/prod.jsonl):
```yaml
# .sediment.yml
min_confidence: 0.8
min_support: 2
baseline: .sediment-baseline.json
types:
  - structural
  - statistical
  - pattern
  - relational
  - semantic
  - temporal
  - session
# sample: 10000
# sampling_strategy: random
report:
  format: html
  output: sediment_report.html
```
All CLI commands pick this up automatically.
Custom miners
Register your own miner function to discover domain-specific invariants:
```python
from sediment import LogAnalyzer
from sediment.discovery.base import InvariantResult

def apology_rate_miner(entries):
    count = sum(1 for e in entries if "sorry" in str(e.output).lower())
    rate = count / len(entries)
    return [InvariantResult(
        id="apology_rate",
        type="custom",
        description=f"Model apologises in {rate:.0%} of responses",
        confidence=1.0 - rate,
        support=count,
        total=len(entries),
        severity="warning" if rate > 0.1 else "info",
    )]

results = LogAnalyzer("logs.jsonl").register_miner(apology_rate_miner).discover()
```
Embedding backends
Used by the semantic miner. Swap for better accuracy:
```python
from sediment import LogAnalyzer
from sediment.embeddings.openai_emb import OpenAIEmbedder

a = LogAnalyzer("logs.jsonl")
results = a.discover(embedder=OpenAIEmbedder(api_key="sk-..."))
```
| Backend | Class | Quality | Install |
|---|---|---|---|
| TF-IDF | `TfidfEmbedder` | Basic | built-in |
| OpenAI `text-embedding-3-small` | `OpenAIEmbedder` | High | `sediment[openai]` |
| `all-MiniLM-L6-v2` | `SentenceTransformerEmbedder` | High | `sediment[sentence-transformers]` |
Schema evolution detection
Detects when field names change mid-stream — e.g. a deploy that renamed prompt → input:
```python
drifts = LogAnalyzer("logs.jsonl").check_schema_evolution()
for d in drifts:
    print(d)
# [SCHEMA DRIFT] input: 'prompt' → 'input' (around entry 5000, early=94% late=97%)
```
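The core signal behind this kind of detection is simple: a field's presence rate in the early part of the log versus the late part. A minimal sketch (illustrative only; Sediment additionally locates the change point and maps old names to new ones):

```python
def field_presence_shift(entries, field, split=None):
    """Compare how often `field` appears in the early vs late
    half of the log. A 1.0 → 0.0 shift suggests a rename."""
    split = split if split is not None else len(entries) // 2
    early, late = entries[:split], entries[split:]
    early_rate = sum(field in e for e in early) / max(len(early), 1)
    late_rate = sum(field in e for e in late) / max(len(late), 1)
    return early_rate, late_rate

# Simulate a deploy that renamed prompt → input halfway through
entries = [{"prompt": "hi"}] * 50 + [{"input": "hi"}] * 50
```

A field whose presence collapses while another field's presence rises by a matching amount is the classic rename signature.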
Jupyter
```python
a = LogAnalyzer("logs.jsonl")
a.show()   # renders interactive HTML report inline
```
API reference
```python
LogAnalyzer(
    source,                       # file, glob, directory, s3://, gs://, az://, or "-"
    schema=None,                  # override inferred schema
    sample=None,                  # max entries to load
    sampling_strategy="random",   # random | stratified | importance | time_windowed
    format_hint=None,             # skip auto-detection
)

# Exploration
.summary()               → Summary
.infer()                 → SchemaMap
.entries()               → Iterator[LogEntry]
async .async_entries()   → AsyncIterator[LogEntry]

# Discovery
.discover(
    min_confidence=0.8,
    min_support=2,
    types=None,      # list of miner type strings, or None for all
    dedup=True,
    embedder=None,
) → list[InvariantResult]

# Output
.emit_tests(output_path, min_confidence=0.8, function_hint="my_function")
.report(output_path, fmt="html", min_confidence=0.5)
.show(min_confidence=0.5)   # Jupyter inline display

# Staleness
.save_invariants(path, min_confidence=0.8)
.check_staleness(invariants_path)   → StalenessReport
.check_schema_evolution()           → list[SchemaDrift]

# Extension
.register_miner(fn) → LogAnalyzer   (chainable)
```
Development
```bash
git clone https://github.com/sediment-py/sediment
cd sediment
pip install -e ".[dev]"
pytest tests/ -v
```
210 tests · zero required dependencies · Python 3.9+
License
MIT © Sediment Contributors