Valgrind for Time-Series ML: automatically detect look-ahead bias in data science pipelines.
🕵️ Temporal Leaks: Valgrind for Time-Series ML
Look-ahead bias is the silent killer of quant strategies and forecasting models.
Your backtest shows 40% annual returns. You deploy. You lose money.
Somewhere in your feature pipeline, a rolling average peeked at tomorrow's prices.
temporal-leaks catches this automatically, before it costs you.
The Problem: Future Data in Your Past Features
In time-series machine learning, look-ahead bias (a form of data leakage, sometimes called future leakage) occurs when a feature computed for timestamp t inadvertently uses data from timestamps t+1, t+2, …, t+n.
This is devastatingly easy to introduce:
# BUG: center=True means the window is centred, so it looks forward AND backward
df["roll_mean"] = df["price"].rolling(window=5, center=True).mean()
# BUG: shift(-1) reads the NEXT row's value
df["next_return"] = df["return"].shift(-1)
# BUG: global z-score uses future data to compute mean/std
df["znorm"] = (df["price"] - df["price"].mean()) / df["price"].std()
None of these will raise an error.
Your tests will pass.
Your backtests will look amazing.
And then reality hits.
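For comparison, each of the three snippets above has a causal counterpart. The sketch below uses synthetic data. The helper name `causal_counterparts` and its output column names are made up for this illustration and are not part of temporal-leaks.

```python
import numpy as np
import pandas as pd


def causal_counterparts(df: pd.DataFrame) -> pd.DataFrame:
    """Causal versions of the three leaky features (hypothetical helper)."""
    out = df.copy()
    # Trailing window: row t averages rows t-4..t, never later rows.
    out["roll_mean"] = out["price"].rolling(window=5).mean()
    # shift(+1) reads the PREVIOUS row's value.
    out["prev_return"] = out["return"].shift(1)
    # Expanding z-score: mean/std at row t use only rows 0..t.
    history = out["price"].expanding(min_periods=2)
    out["znorm"] = (out["price"] - history.mean()) / history.std()
    return out


rng = np.random.default_rng(0)
prices = pd.Series(rng.normal(100, 5, size=50))
df = pd.DataFrame({"price": prices, "return": prices.diff() / prices.shift(1)})

full = causal_counterparts(df)
head = causal_counterparts(df.iloc[:30])

# Dropping the last 20 rows must not alter the features of the first 30.
pd.testing.assert_frame_equal(full.iloc[:30], head)
```

The final assertion is a hand-rolled version of the idea this package automates: features for early rows should be identical whether or not the later rows exist.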
How It Works: The Temporal Perturbation Test
Timeline:  ──────────────────────────────────────────────────▶
                        T (midpoint)
                             │
Past  ───────────────────────┤──────────────────────────────▶  Future
                             │
Step 1: Run pipeline on original data
        baseline_features = pipeline(df)
Step 2: MUTATE the future
        df_perturbed[t > T] = 💥 (noise / sign flip / NaN)
Step 3: Re-run pipeline on perturbed data
        perturbed_features = pipeline(df_perturbed)
Step 4: Compare features for PAST rows only (t ≤ T)
        If baseline_features[t≤T] ≠ perturbed_features[t≤T]
        then the past features DEPEND on future data → LEAK! 🚨
The key insight: if your past features are truly causal, mutating the future should not change them. If they change, future data crept in.
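The four steps can be sketched in plain pandas and NumPy. This is a minimal illustration and not the library's implementation. The helper `leaky_columns` is hypothetical, the split point T is assumed to be the median timestamp, the inputs are assumed to be float columns, and the pipeline is assumed to preserve row order and return numeric columns.

```python
import numpy as np
import pandas as pd


def leaky_columns(df, timestamp_col, pipeline_fn, atol=1e-8):
    """Return output columns whose past values change when the future is nullified.

    Hypothetical helper for illustration only.
    """
    cutoff = df[timestamp_col].median()          # T (assumption: median timestamp)
    past = (df[timestamp_col] <= cutoff).to_numpy()
    inputs = [c for c in df.columns if c != timestamp_col]

    baseline = pipeline_fn(df.copy())            # Step 1: original data

    perturbed_df = df.copy()
    perturbed_df.loc[~past, inputs] = np.nan     # Step 2: mutate rows with t > T
    perturbed = pipeline_fn(perturbed_df)        # Step 3: re-run the pipeline

    leaks = []
    for col in baseline.columns:                 # Step 4: compare past rows only
        a = baseline[col].to_numpy(dtype=float)[past]
        b = perturbed[col].to_numpy(dtype=float)[past]
        if not np.allclose(a, b, rtol=0.0, atol=atol, equal_nan=True):
            leaks.append(col)
    return leaks


df = pd.DataFrame({
    "ts": np.arange(100),
    "price": np.random.default_rng(0).normal(100, 5, size=100),
})


def features(frame):
    out = frame.copy()
    out["lag1"] = out["price"].shift(1)          # causal
    out["centred"] = out["price"].rolling(5, center=True, min_periods=1).mean()  # leaks
    return out


print(leaky_columns(df, "ts", features))
```

Only the centred rolling mean is flagged: the rows just before T average over rows after T, so nullifying the future changes them, while `lag1` never reads a later row.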
Installation
pip install temporal-leaks
Or from source:
git clone https://github.com/temporal-leaks/temporal-leaks
cd temporal-leaks
pip install -e ".[dev]"
Quick Start
import pandas as pd
import numpy as np
from temporal_leaks import TemporalAudit, TemporalLeakageError
# Build a sample time-series dataset
df = pd.DataFrame({
    "ts": np.arange(500),
    "price": np.random.default_rng(42).normal(100, 5, size=500),
})
# ─── ✅ CLEAN PIPELINE ───────────────────────────────────────────────
def causal_features(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Expanding window only looks at the past: safe
    out["expanding_mean"] = out["price"].expanding(min_periods=1).mean()
    # shift(+1) looks at the previous row: safe
    out["lag1"] = out["price"].shift(1)
    return out
auditor = TemporalAudit(mode="nullify", random_seed=42)
report = auditor.check(df, timestamp_col="ts", pipeline_fn=causal_features)
print(report)
# ✅ CLEAN - leakage_score=0.0000
# ─── ❌ LEAKING PIPELINE ─────────────────────────────────────────────
def leaking_features(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # center=True peeks at future rows: LEAKS!
    out["centred_roll"] = out["price"].rolling(11, center=True, min_periods=1).mean()
    return out
try:
    auditor.check(df, timestamp_col="ts", pipeline_fn=leaking_features)
except TemporalLeakageError as exc:
    print(exc)
# TemporalLeakageError: leakage_score=0.4812
# Breached columns (1):
# • [HIGH] column='centred_roll' effect_size=0.4812 ...
Decorator API
from temporal_leaks import temporal_audit
@temporal_audit(timestamp_col="ts", mode="noise", random_seed=42)
def build_features(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df["expanding_mean"] = df["price"].expanding(min_periods=1).mean()
    return df
# The audit runs automatically on every call.
# TemporalLeakageError is raised if leakage is detected.
result = build_features(df)
HTML Audit Reports
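# leakage_threshold=1.1 makes check() return the report instead of raising
auditor = TemporalAudit(mode="nullify", random_seed=42, leakage_threshold=1.1)
report = auditor.check(df, "ts", leaking_features)
# Write a standalone HTML report
with open("audit_report.html", "w", encoding="utf-8") as f:
    f.write(report.to_html())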
The HTML report includes:
- Leakage score with a visual progress bar
- Per-column severity badges (LOW / MEDIUM / HIGH / CRITICAL)
- Effect size, mean |Δ|, max |Δ|, % rows changed
- First timestamp where each leak was observed
- Provenance hints describing likely causes
API Reference
TemporalAudit
TemporalAudit(
    mode: Literal["noise", "sign_flip", "nullify"] = "noise",
    random_seed: int = 42,
    delta_threshold: float = 1e-8,
    leakage_threshold: float = 0.0,
    ignore_columns: list[str] | None = None,
)
| Parameter | Description |
|---|---|
| mode | Perturbation strategy: noise adds Gaussian noise, sign_flip multiplies by -1, nullify sets values to NaN |
| random_seed | Integer seed; results are fully deterministic and reproducible across runs |
| delta_threshold | Minimum cell-level change to count as "different" (suppresses float noise) |
| leakage_threshold | If leakage_score > leakage_threshold, TemporalLeakageError is raised. Since the score never exceeds 1.0, set this to 1.1 to always get the report back instead |
| ignore_columns | List of output columns to skip during comparison |
AuditReport
@dataclass
class AuditReport:
    leakage_score: float  # 0.0 = clean, 1.0 = fully compromised
    breached_columns: list[ColumnLeakMeta]
    clean_columns: list[str]
    perturbation_mode: str
    evaluation_time: Any
    random_seed: int
    provenance_hints: dict[str, str]
    def to_html(self) -> str: ...  # standalone HTML report
ColumnLeakMeta
@dataclass(frozen=True)
class ColumnLeakMeta:
    column_name: str
    first_leaky_timestamp: Any
    mean_absolute_delta: float
    max_delta: float
    pct_rows_changed: float
    effect_size: float  # normalised, 0 to 1
    severity: str  # LOW | MEDIUM | HIGH | CRITICAL
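To make the delta fields concrete, here is one way they could be computed for a single feature column from its baseline and perturbed values on past rows. This is a sketch under stated assumptions and not the library's code. The function `column_deltas` is hypothetical, `pct_rows_changed` is assumed to be a percentage, and a value that flips between NaN and a number is assumed to count as changed. `effect_size` is left out because the normalisation is not documented here.

```python
import numpy as np
import pandas as pd


def column_deltas(baseline: pd.Series, perturbed: pd.Series,
                  timestamps: pd.Series, delta_threshold: float = 1e-8) -> dict:
    """Summarise how one feature column changed on past rows (illustrative only)."""
    a = baseline.to_numpy(dtype=float)
    b = perturbed.to_numpy(dtype=float)
    nan_mismatch = np.isnan(a) != np.isnan(b)
    delta = np.abs(a - b)                        # NaN wherever either side is NaN
    # Differences at or below delta_threshold are treated as float noise.
    changed = nan_mismatch | (np.nan_to_num(delta, nan=0.0) > delta_threshold)
    if not changed.any():
        return {"changed": False}
    finite = delta[changed & ~np.isnan(delta)]
    return {
        "changed": True,
        "first_leaky_timestamp": timestamps.to_numpy()[changed][0],
        "mean_absolute_delta": float(finite.mean()) if finite.size else float("nan"),
        "max_delta": float(finite.max()) if finite.size else float("nan"),
        "pct_rows_changed": 100.0 * float(changed.mean()),
    }


ts = pd.Series([0, 1, 2, 3])
base = pd.Series([1.0, 2.0, 3.0, 4.0])
pert = pd.Series([1.0, 2.0, 3.5, 5.0])
summary = column_deltas(base, pert, ts)
print(summary)
```

In this toy input the last two rows differ by 0.5 and 1.0, so the first leaky timestamp is 2, the mean absolute delta is 0.75, the max delta is 1.0, and half of the rows changed.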
Severity Classification
| Severity | Effect Size |
|---|---|
| 🟦 LOW | effect_size < 0.15 |
| 🟨 MEDIUM | 0.15 ≤ effect_size < 0.40 |
| 🟧 HIGH | 0.40 ≤ effect_size < 0.75 |
| 🟥 CRITICAL | effect_size ≥ 0.75 |
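The bands in the table translate directly into a lookup. The function name `classify_severity` is an assumption made for this example. Only the thresholds come from the table.

```python
def classify_severity(effect_size: float) -> str:
    """Map an effect size in [0, 1] to the severity bands listed above."""
    if effect_size < 0.15:
        return "LOW"
    if effect_size < 0.40:
        return "MEDIUM"
    if effect_size < 0.75:
        return "HIGH"
    return "CRITICAL"


for value in (0.05, 0.15, 0.4812, 0.75):
    print(value, classify_severity(value))
```

The value 0.4812 falls in the HIGH band, which matches the `[HIGH]` badge shown for `centred_roll` in the Quick Start output.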
Perturbation Modes
| Mode | What it does to future rows |
|---|---|
| noise | Adds Gaussian noise: μ=0, σ=2×column_std |
| sign_flip | Multiplies all numeric values by −1 |
| nullify | Replaces all values with NaN / null |
Use nullify for the strictest test.
Use noise for pipelines that handle NaN gracefully (e.g., imputers).
Use sign_flip to test pipelines sensitive to sign changes (e.g., momentum factors).
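As a rough picture of what each mode does to the input frame, the sketch below applies the three perturbations with pandas and NumPy. It is not the library's implementation. The helper `perturb_future` is hypothetical, `column_std` is assumed to be the sample standard deviation of the whole column, and only numeric columns are touched, so non-numeric columns stay as they are even under nullify.

```python
import numpy as np
import pandas as pd


def perturb_future(df, timestamp_col, cutoff, mode="noise", random_seed=42):
    """Return a copy of df with numeric columns mutated on rows where t > cutoff."""
    out = df.copy()
    future = (out[timestamp_col] > cutoff).to_numpy()
    rng = np.random.default_rng(random_seed)
    numeric = out.select_dtypes(include="number").columns.drop(timestamp_col, errors="ignore")
    for col in numeric:
        values = out[col].to_numpy(dtype=float, copy=True)
        if mode == "noise":
            sigma = 2.0 * np.nanstd(values, ddof=1)   # sigma = 2 x column_std
            values[future] += rng.normal(0.0, sigma, size=int(future.sum()))
        elif mode == "sign_flip":
            values[future] = -values[future]
        elif mode == "nullify":
            values[future] = np.nan
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        out[col] = values
    return out


df = pd.DataFrame({"ts": np.arange(10), "price": np.arange(10, dtype=float)})
for mode in ("noise", "sign_flip", "nullify"):
    print(mode, perturb_future(df, "ts", cutoff=4, mode=mode)["price"].round(2).tolist())
```

In every mode the rows with t ≤ cutoff come back untouched, which is what lets the comparison step attribute any change in past features to the future rows.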
Polars Support
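import polars as pl
from temporal_leaks import TemporalAudit
# Illustrative pipeline (not part of the library): one causal lag feature
def my_polars_pipeline(df: pl.DataFrame) -> pl.DataFrame:
    return df.with_columns(pl.col("value").shift(1).alias("lag1"))
df = pl.DataFrame({"ts": range(200), "value": [float(i) for i in range(200)]})
auditor = TemporalAudit(mode="nullify", random_seed=42)
report = auditor.check(df, "ts", my_polars_pipeline)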
temporal-leaks handles Polars DataFrames transparently: pass them in, get results back in the same type.
Benchmarks
| Dataset | Rows | Columns | Backend | Mode | Time |
|---|---|---|---|---|---|
| Synthetic prices | 1,000,000 | 5 | Polars | nullify | ~1.1 s |
| Synthetic prices | 10,000,000 | 5 | Polars | nullify | ~3.2 s |
| Equity features | 500,000 | 20 | Pandas | noise | ~2.8 s |
Benchmarks were run on an Apple M2 Pro with 16 GB RAM. The Polars backend is strongly recommended for large frames.
Running Tests
# Install dev extras
pip install -e ".[dev]"
# Run the full suite
pytest tests/ -v
# With coverage
pytest tests/ --cov=temporal_leaks --cov-report=term-missing
Contributing
Pull requests are welcome. For major changes, please open an issue first.
- Fork the repo
- Create your feature branch: git checkout -b feat/my-feature
- Commit your changes: git commit -m 'feat: add my feature'
- Push and open a PR
Please make sure ruff check . and mypy temporal_leaks/ pass before submitting.
License
MIT © temporal-leaks contributors