Catch training ↔ serving feature skew before you ship to production

SkewSentry

Prevent ML model failures with automated feature parity validation

🚀 Why SkewSentry?

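SkewSentry runs your offline (training) and online (serving) feature pipelines on the same input rows and reports any feature whose values differ beyond a configured tolerance, so skew is caught before deployment rather than in production.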

💰 Prevent Costly ML Failures

70% of ML failures stem from training/serving skew
Months of silent degradation before detection
Lost revenue and customer trust from broken predictions

⚡ Production-Ready Validation

Pre-deployment detection - Catch issues in CI before they ship
Configurable tolerances - Handle expected differences intelligently
Multi-source support - Python functions, HTTP APIs, any feature pipeline
Rich reporting - HTML reports with detailed mismatch analysis

🔧 Developer-First Design

Zero configuration - Works out of the box with intelligent defaults
CI integration - Exit codes for automated validation gates
Multiple formats - Text, JSON, and HTML reports for different use cases

📦 Installation

Production

pip install skewsentry

Development

uv venv .venv
source .venv/bin/activate
uv pip install -e ".[dev]"

⚡ Quickstart

Basic Feature Parity Check

# Initialize spec from your data
skewsentry init features.yml --data validation.parquet --keys user_id timestamp

# Run parity check
skewsentry check \
  --spec features.yml \
  --offline training.pipeline:extract_features \
  --online serving.api:get_features \
  --data validation.parquet \
  --html report.html

# ✅ Exit 0: Features match within tolerance
# ❌ Exit 1: Parity violations detected (fails CI)
# 🚨 Exit 2: Configuration error

Realistic Example: E-commerce Features

# features.yml
version: 1
keys: ["user_id", "timestamp"]

features:
  - name: total_spend_7d
    dtype: float
    tolerance:
      abs: 0.01  # $0.01 absolute tolerance
      rel: 0.001  # 0.1% relative tolerance
      
  - name: order_count_30d
    dtype: int
    tolerance:
      abs: 1  # Allow 1 order difference

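import pandas as pd

def _rolling_features(df, closed):
    # Time-based windows need a datetime `timestamp` column, sorted within each user
    df = df.sort_values(["user_id", "timestamp"])
    amount = df.set_index("timestamp").groupby("user_id")["amount"]
    return df.assign(
        # Results come back grouped by user_id in sorted order, matching df row for row
        total_spend_7d=amount.rolling("7D", closed=closed).sum().to_numpy(),
        order_count_30d=amount.rolling("30D").count().to_numpy().astype(int),
    )

# Offline pipeline (training)
def extract_features(df):
    return _rolling_features(df, closed="right")  # pandas default for time-based windows

# Online pipeline (serving) - subtle difference
def get_features(df):
    # Different windowing! closed="both" also counts an event exactly 7 days old
    return _rolling_features(df, closed="both")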

SkewSentry catches the windowing difference:

❌ Feature parity violations detected:
  - total_spend_7d: mismatch_rate=0.1200 rows=5000 mean_abs_diff=0.0845
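
To make the effect of window closure concrete, here is a minimal pure-Python sketch. It is not SkewSentry code: `window_sum` and the sample events are hypothetical and only illustrate how including or excluding the event that sits exactly on the left boundary changes a 7-day sum.

```python
from datetime import datetime, timedelta

def window_sum(events, now, days, include_left):
    """Sum amounts in (now - days, now], or in [now - days, now] if include_left."""
    start = now - timedelta(days=days)
    total = 0.0
    for ts, amount in events:
        after_start = ts >= start if include_left else ts > start
        if after_start and ts <= now:
            total += amount
    return total

# Hypothetical purchase history for a single user
events = [
    (datetime(2025, 1, 1, 12, 0), 20.0),  # exactly 7 days before `now`
    (datetime(2025, 1, 5, 9, 30), 15.0),
    (datetime(2025, 1, 8, 12, 0), 5.0),   # the current event
]
now = datetime(2025, 1, 8, 12, 0)

left_excluded = window_sum(events, now, 7, include_left=False)
left_included = window_sum(events, now, 7, include_left=True)
print(left_excluded, left_included)  # 20.0 40.0
```

Only rows with an event exactly on the boundary are affected, which is why this kind of skew tends to show up as a partial mismatch rate rather than a total failure.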

🏗️ Feature Adapters

SkewSentry works with any feature pipeline through adapters:

Python Functions

# Direct Python function integration
from skewsentry.adapters import PythonFunctionAdapter

adapter = PythonFunctionAdapter("mymodule:extract_features")
features = adapter.get_features(input_data)

HTTP APIs

# REST API integration with automatic batching
from skewsentry.adapters import HTTPAdapter

adapter = HTTPAdapter("http://api.example.com/features", timeout=30.0)
features = adapter.get_features(input_data)

Usage

Command Line Interface

Initialize Feature Spec

skewsentry init features.yml \
  --data sample_data.parquet \
  --keys user_id timestamp

Run Parity Check

skewsentry check \
  --spec features.yml \
  --offline module.offline:build_features \
  --online module.online:get_features \
  --data validation.parquet \
  --sample 10000 \
  --seed 42 \
  --html artifacts/report.html \
  --json artifacts/results.json
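
The `--sample` and `--seed` flags suggest a reproducible random subset of the validation data. The sketch below shows that idea in plain Python; `sample_rows` is a hypothetical helper, and the assumption that SkewSentry samples this way is not confirmed by this page.

```python
import random

def sample_rows(rows, sample, seed):
    """Return up to `sample` rows chosen with a seeded RNG so reruns pick the same rows."""
    if sample >= len(rows):
        return list(rows)  # nothing to sample away
    rng = random.Random(seed)  # private RNG: does not disturb global random state
    return rng.sample(rows, sample)

rows = [{"user_id": i} for i in range(100)]
first = sample_rows(rows, 10, seed=42)
second = sample_rows(rows, 10, seed=42)
print(first == second)  # True: same seed, same subset
```

Fixing the seed in CI keeps a failing check reproducible: the same rows are compared on every rerun.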

Python API

from skewsentry import FeatureSpec
from skewsentry.adapters.python import PythonFunctionAdapter
from skewsentry.adapters.http import HTTPAdapter
from skewsentry.runner import run_check

# Define feature comparison rules
spec = FeatureSpec.from_yaml("features.yml")

# Set up adapters for your pipelines
offline_adapter = PythonFunctionAdapter("training.pipeline:extract_features")
online_adapter = HTTPAdapter("https://api.myservice.com/features")

# Run comparison
report = run_check(
    spec=spec,
    data="validation_data.parquet",  # or DataFrame
    offline=offline_adapter,
    online=online_adapter,
    sample=5000,
    seed=42,
    html_out="report.html",
    json_out="results.json"
)

# Check results
if report.ok:
    print("✅ All features match within tolerance")
else:
    print("❌ Feature parity violations detected:")
    print(report.to_text(max_rows=10))
    
    # Fail CI/CD pipeline
    raise SystemExit(1)

Feature Specification

SkewSentry uses YAML configuration to define feature comparison rules:

version: 1
keys: ["user_id", "timestamp"]  # Row alignment keys
null_policy: "same"              # "same" | "allow_both_null"

features:
  # Numeric features with tolerance
  - name: spend_7d
    dtype: float
    nullable: true
    tolerance:
      abs: 0.01      # Absolute tolerance (optional)
      rel: 0.001     # Relative tolerance (optional)
    window:
      lookback_days: 7
      timestamp_col: "timestamp"
      closed: "right"
      
  # Categorical features with validation
  - name: country
    dtype: category
    categories: ["US", "UK", "DE", "FR"]  # Expected values
    nullable: false
    
  # Integer features with range validation
  - name: age
    dtype: int
    nullable: false
    range: [0, 120]  # [min, max] bounds
    
  # String features (exact match)
  - name: user_segment
    dtype: string
    nullable: true
    
  # DateTime features (exact match)
  - name: last_login
    dtype: datetime
    nullable: true

Supported Data Types

Type	Comparison	Tolerance	Notes
`int`	Numeric	✅ abs/rel	Coerced to float for comparison
`float`	Numeric	✅ abs/rel	NaN handling per null_policy
`bool`	Exact	❌	True/False only
`string`	Exact	❌	Case sensitive
`category`	Exact + Unknown detection	❌	Validates against expected categories
`datetime`	Exact	❌	Timezone aware
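
As an illustration of the "Exact + Unknown detection" row, the following sketch lists values that fall outside the expected categories. `unknown_categories` is a hypothetical helper, not part of the SkewSentry API, and it assumes nulls are handled separately by the null policy.

```python
def unknown_categories(values, expected):
    """Return the sorted non-null values that are not in the expected categories."""
    allowed = set(expected)
    return sorted({v for v in values if v is not None and v not in allowed})

online_values = ["US", "UK", "uk", None, "ES", "DE"]
print(unknown_categories(online_values, ["US", "UK", "DE", "FR"]))  # ['ES', 'uk']
```

Because the comparison is case sensitive, `"uk"` is reported as unknown even though `"UK"` is expected.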

Tolerance Configuration

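Absolute Tolerance: |offline_value - online_value| ≤ abs_tolerance

Relative Tolerance: the same difference is bounded by rel_tolerance scaled by the magnitude of the values being compared

Either or both can be specified. If both are provided, the comparison passes if either tolerance is satisfied.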
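
The "passes if either tolerance is satisfied" rule can be sketched as follows. This is an illustration, not the library's implementation: `within_tolerance` is a hypothetical name, and measuring the relative difference against the offline value is an assumption.

```python
def within_tolerance(offline, online, abs_tol=None, rel_tol=None):
    """True if the pair passes the absolute OR the relative tolerance check."""
    diff = abs(offline - online)
    if abs_tol is None and rel_tol is None:
        return diff == 0  # no tolerance configured: require an exact match
    passes_abs = abs_tol is not None and diff <= abs_tol
    # Assumption: the relative check is scaled by the offline value
    passes_rel = rel_tol is not None and diff <= rel_tol * abs(offline)
    return passes_abs or passes_rel

# A 0.05 difference fails abs=0.01 but passes rel=0.001 on a value of 100
print(within_tolerance(100.00, 100.05, abs_tol=0.01, rel_tol=0.001))  # True
# The same 0.05 difference on a value of 1 fails both checks
print(within_tolerance(1.00, 1.05, abs_tol=0.01, rel_tol=0.001))      # False
```

Combining the two with OR lets a single spec cover both small values, where the absolute bound dominates, and large values, where the relative bound dominates.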

Adapters

SkewSentry supports multiple adapter types to connect with different feature pipeline architectures:

Python Function Adapter

For in-process Python functions:

from skewsentry.adapters.python import PythonFunctionAdapter

# Your feature function signature
def extract_features(df: pd.DataFrame) -> pd.DataFrame:
    """Extract features from input DataFrame.
    
    Args:
        df: Input DataFrame with raw data
        
    Returns:
        DataFrame with feature columns + key columns
    """
    return df[["user_id", "timestamp", "spend_7d", "country"]]

# Reference by module:function string
adapter = PythonFunctionAdapter("mypackage.features:extract_features")
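
A `module:function` reference like the one above is commonly resolved with `importlib`. The sketch below shows one way to do it; `resolve_callable` is a hypothetical helper and may differ from what `PythonFunctionAdapter` does internally.

```python
import importlib

def resolve_callable(ref):
    """Resolve a 'package.module:function' string to the function object."""
    module_name, sep, func_name = ref.partition(":")
    if not sep or not module_name or not func_name:
        raise ValueError(f"expected 'module:function', got {ref!r}")
    module = importlib.import_module(module_name)
    func = getattr(module, func_name)
    if not callable(func):
        raise TypeError(f"{ref!r} is not callable")
    return func

# Standard-library target used here so the sketch runs anywhere
sqrt = resolve_callable("math:sqrt")
print(sqrt(9))  # 3.0
```

The module must be importable from the directory where the check runs, so a reference such as `training.pipeline:extract_features` needs `training` on the Python path.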

HTTP Adapter

For REST API endpoints:

from skewsentry.adapters.http import HTTPAdapter

adapter = HTTPAdapter(
    url="https://features.myservice.com/batch",
    method="POST",
    headers={"Authorization": "Bearer token"},
    batch_size=1000,  # Records per request
    timeout=30.0,
    max_retries=3
)

Expected API Contract:

Request: JSON array of input records
Response: JSON array of feature records (same order)
Status: 200 for success, 4xx/5xx for errors
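
The contract above can be exercised without a live service. The following sketch batches records, checks the status code, and relies on the same-order guarantee; `fetch_in_batches` and `fake_post` are hypothetical stand-ins, not the `HTTPAdapter` implementation.

```python
import json

def fetch_in_batches(records, post, batch_size=1000):
    """Send records in batches and return features in the original row order."""
    features = []
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        status, body = post(json.dumps(batch))  # request: JSON array of input records
        if status != 200:
            raise RuntimeError(f"feature service returned HTTP {status}")
        rows = json.loads(body)                 # response: JSON array of feature records
        if len(rows) != len(batch):
            raise ValueError("response length does not match request length")
        features.extend(rows)
    return features

def fake_post(payload):
    """Stand-in for a real endpoint so the sketch runs offline."""
    rows = json.loads(payload)
    out = [{"user_id": r["user_id"], "spend_7d": r["amount"] * 2} for r in rows]
    return 200, json.dumps(out)

records = [{"user_id": i, "amount": float(i)} for i in range(5)]
features = fetch_in_batches(records, fake_post, batch_size=2)
print([f["user_id"] for f in features])  # [0, 1, 2, 3, 4]
```

The length check matters because an endpoint that silently drops records would otherwise shift every later row out of alignment.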

Reporting

SkewSentry generates multiple report formats for different use cases:

Text Report

# Console-friendly summary
print(report.to_text(max_rows=10))

OK: False
Missing rows — offline: 0, online: 3
Per-feature mismatch rates:
  - spend_7d: mismatch_rate=0.1200 rows=1000 mean_abs_diff=0.0845
  - country: mismatch_rate=0.0000 rows=1000 mean_abs_diff=None
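
The numbers in this report can be understood through a small sketch of key-based alignment. It is an approximation under stated assumptions: `align_and_compare` is a hypothetical function, only an absolute tolerance is applied, and `mean_abs_diff` is averaged over all aligned rows, which may not match SkewSentry's definition.

```python
def align_and_compare(offline_rows, online_rows, keys, feature, abs_tol=0.0):
    """Align rows by key, count rows missing on each side, compute mismatch stats."""
    def index(rows):
        return {tuple(r[k] for k in keys): r for r in rows}

    off, on = index(offline_rows), index(online_rows)
    shared = off.keys() & on.keys()
    diffs = [abs(off[k][feature] - on[k][feature]) for k in shared]
    mismatches = sum(d > abs_tol for d in diffs)
    return {
        "missing_in_offline": len(on.keys() - off.keys()),
        "missing_in_online": len(off.keys() - on.keys()),
        "rows": len(shared),
        "mismatch_rate": mismatches / len(shared) if shared else 0.0,
        "mean_abs_diff": sum(diffs) / len(diffs) if diffs else None,
    }

offline = [
    {"user_id": 1, "spend_7d": 10.0},
    {"user_id": 2, "spend_7d": 20.0},
    {"user_id": 3, "spend_7d": 30.0},
]
online = [
    {"user_id": 1, "spend_7d": 10.0},
    {"user_id": 2, "spend_7d": 20.5},
]
result = align_and_compare(offline, online, ["user_id"], "spend_7d", abs_tol=0.01)
print(result)
```

Here user 3 is missing online, and one of the two aligned rows differs by 0.5, giving a mismatch rate of 0.5.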

JSON Report

# Machine-readable results
report.to_json("results.json")

{
  "ok": false,
  "keys": ["user_id", "timestamp"],
  "missing_in_online": 3,
  "missing_in_offline": 0,
  "features": [
    {
      "name": "spend_7d",
      "mismatch_rate": 0.12,
      "num_rows": 1000,
      "mean_abs_diff": 0.0845,
      "unknown_categories": null
    }
  ],
  "failing_features": ["spend_7d"]
}
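
Because the JSON report is machine-readable, downstream tooling can summarize it. The sketch below parses the sample payload shown above; `summarize` is a hypothetical helper, and the field names are taken from that sample only.

```python
import json

def summarize(results):
    """Return one line per failing feature from a parsed results.json payload."""
    by_name = {f["name"]: f for f in results["features"]}
    lines = []
    for name in results["failing_features"]:
        stats = by_name[name]
        lines.append(
            f"{name}: {stats['mismatch_rate']:.1%} of {stats['num_rows']} rows differ"
        )
    return lines

results = json.loads("""
{
  "ok": false,
  "keys": ["user_id", "timestamp"],
  "missing_in_online": 3,
  "missing_in_offline": 0,
  "features": [
    {"name": "spend_7d", "mismatch_rate": 0.12, "num_rows": 1000,
     "mean_abs_diff": 0.0845, "unknown_categories": null}
  ],
  "failing_features": ["spend_7d"]
}
""")
print(summarize(results))  # ['spend_7d: 12.0% of 1000 rows differ']
```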

HTML Report

# Rich visual report for stakeholders
report.to_html("report.html")

Interactive HTML report includes:

Executive summary with pass/fail status
Per-feature mismatch statistics
Sample mismatched rows with differences highlighted
Missing row analysis
Feature distribution comparisons

CI Integration

GitHub Actions

name: Feature Parity Check
on: [push, pull_request]

jobs:
  parity-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      
      - name: Install dependencies
        run: |
          pip install -e ".[dev]"
          
      - name: Run feature parity check
        run: |
          skewsentry check \
            --spec features.yml \
            --offline training.pipeline:extract_features \
            --online serving.api:get_features \
            --data tests/fixtures/validation.parquet \
            --html artifacts/parity-report.html \
            --json artifacts/parity-results.json
            
      - name: Upload report artifacts
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: parity-reports
          path: artifacts/

Exit Codes

0: All features match within specified tolerances ✅
1: Feature parity violations detected ❌
2: Configuration error or runtime failure 🚨
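
A wrapper script can branch on these exit codes. The sketch below is one possible approach; `classify` and `run_gate` are hypothetical names, and treating any unlisted code as an error is an assumption made for safety.

```python
import subprocess
import sys

OUTCOMES = {0: "pass", 1: "parity_violation", 2: "error"}

def classify(returncode):
    """Map an exit code to an outcome; unknown codes are treated as errors."""
    return OUTCOMES.get(returncode, "error")

def run_gate(cmd):
    """Run a command and classify its exit code."""
    completed = subprocess.run(cmd)
    return classify(completed.returncode)

# Demonstrated with a stand-in command that exits 1, so the sketch runs
# without skewsentry installed; in CI, pass the real `skewsentry check ...` argv.
print(run_gate([sys.executable, "-c", "raise SystemExit(1)"]))  # parity_violation
```

Distinguishing 1 from 2 lets a pipeline report "features differ" separately from "the check itself could not run", while still blocking on both.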

Integration Patterns

Pre-deployment Gate:

# Block deployment if the parity check fails or cannot run (any non-zero exit code)
if ! skewsentry check --spec features.yml --offline offline:fn --online online:fn --data validation.parquet; then
  echo "❌ Feature parity check did not pass. Blocking deployment."
  exit 1
fi

Model Registry Integration:

# Validate features before model registration
report = run_check(spec, data, offline_adapter, online_adapter)
if report.ok:
    model_registry.register_model(model, features=spec.features)
else:
    raise ValueError(f"Feature parity check failed: {report.failing_features}")

Examples

Real-World Bug Caught by SkewSentry

This is the exact type of production bug SkewSentry prevents:

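import numpy as np

# Training pipeline (offline) - Spark/Python
def extract_features(df):
    # Rolling sum over the last 7 rows per user (one row per user per day assumed),
    # current row included, partial windows allowed
    spend_7d = (
        df.groupby("user_id")["amount"]
          .rolling(7, min_periods=1)
          .sum()
          .round(2)
          .reset_index(level=0, drop=True)  # drop user_id level so assign() aligns on df's index
    )
    return df.assign(spend_7d=spend_7d)

# Serving pipeline (online) - Java/Kafka Streams
# Translated to Python equivalent for illustration
def get_features(df):
    # Same window length, but the current row is excluded (closed="left")
    # and full windows are required (default min_periods)
    spend_7d = (
        df.groupby("user_id")["amount"]
          .rolling(7, closed="left")
          .sum()
          .reset_index(level=0, drop=True)
    )
    # Floor to cents; np.floor passes NaN through, unlike math.floor
    return df.assign(spend_7d=np.floor(spend_7d * 100) / 100)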

The Differences:

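Window boundaries: the offline window includes the current row and accepts partial windows (min_periods=1); the online window excludes the current row (closed="left")
Rounding logic: round(2) vs floor(x * 100) / 100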
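
The rounding difference alone is enough to produce one-cent mismatches. A minimal sketch with hypothetical amounts (`floor_cents` is an illustrative helper, not project code):

```python
import math

def floor_cents(x):
    """Truncate a non-negative amount down to whole cents."""
    return math.floor(x * 100) / 100

for amount in (19.999, 10.456, 7.25):
    print(amount, round(amount, 2), floor_cents(amount))
# 19.999 -> 20.0 vs 19.99
# 10.456 -> 10.46 vs 10.45
# 7.25   -> 7.25 vs 7.25
```

Values already at cent precision agree; anything with a third decimal that rounds up lands one cent apart, right at the edge of an `abs: 0.01` tolerance.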

The Impact: 12% of feature values differed by 0.01-0.15, causing model accuracy to drop from 94% to 89% in production.

The Solution: SkewSentry with tolerance: {abs: 0.01} caught this in CI:

❌ Feature parity violations detected:
  - spend_7d: mismatch_rate=0.1200 rows=5000 mean_abs_diff=0.0845

Complete Example

See examples/python/ for a runnable demonstration showing how SkewSentry catches windowing and rounding differences between offline and online pipelines.

Development

Setup

git clone https://github.com/your-org/skewsentry.git
cd skewsentry
uv venv .venv && source .venv/bin/activate
uv pip install -e ".[dev]"

Testing

# Run all tests
uv run pytest

# With coverage (enforces 85%+)
uv run pytest --cov=skewsentry --cov-fail-under=85

# Run specific test categories
uv run pytest -k test_spec              # Specification tests
uv run pytest -k test_adapter           # Adapter tests  
uv run pytest -m "e2e"                  # End-to-end integration tests

Project Architecture

skewsentry/
├── __init__.py                    # Package exports
├── spec.py                        # FeatureSpec Pydantic models
├── inputs.py                      # Data loading and sampling
├── adapters/                      # Pipeline adapters
│   ├── __init__.py
│   ├── base.py                    # FeatureAdapter protocol
│   ├── python.py                  # Python function adapter
│   └── http.py                    # HTTP/REST API adapter
├── align.py                       # Row alignment by keys
├── compare.py                     # Feature comparison logic
├── runner.py                      # Pipeline orchestration
├── report.py                      # Report generation
├── cli.py                         # Command-line interface
├── errors.py                      # Exception classes
└── utils.py                       # Logging utilities

Contributing

Issues: Report bugs or request features via GitHub Issues
Pull Requests: Fork, create feature branch, add tests, submit PR
Testing: All changes must include tests and maintain 85%+ coverage
Documentation: Update README and docstrings for new features

Roadmap

v0.2.0 - Enhanced Analysis

Statistical significance testing (KS-test, chi-square)
Feature drift detection over time
SQL adapter for database sources
Streaming data support

v0.3.0 - Scale & Performance

Spark/Dask backends for large datasets
Distributed comparison for high-volume pipelines
Advanced sampling strategies
Performance benchmarking suite

v0.4.0 - Production Features

Web dashboard for monitoring
Alert integrations (Slack, PagerDuty)
Model performance correlation analysis
Enterprise security features

License: MIT | Python: 3.9+ | Maintained by: Yasser El Haddar

Prevent ML model failures before they reach production. Start validating your feature pipelines today.

Version: 0.1.1 | Released: Aug 28, 2025
