DeepBridge: Model Validation Toolkit for Machine Learning

DeepBridge

⚠️ BREAKING CHANGES - DeepBridge v2.0

DeepBridge v2.0 has been refactored to focus on Model Validation.

Moved to separate packages:

  • Model Distillation → deepbridge-distillation
  • Synthetic Data Generation → deepbridge-synthetic

See Migration Guide for details.

DeepBridge is a Python library for validating machine learning models and analyzing their performance. It provides tools to manage experiments, run validation tests (robustness, uncertainty, resilience, and fairness), and generate performance reports.

Installation

You can install DeepBridge using pip:

pip install deepbridge

Or install from source:

git clone https://github.com/DeepBridge-Validation/DeepBridge.git
cd DeepBridge
pip install -e .

Key Features

  • Comprehensive Testing Framework

    • Robustness testing with perturbation analysis
    • Uncertainty quantification using conformal prediction
    • Resilience testing under distribution shifts
    • Hyperparameter importance analysis
    • Fairness testing and bias detection (NEW!)
      • 15 fairness metrics (pre-training and post-training)
      • Auto-detection of sensitive attributes
      • EEOC compliance verification (80% rule)
      • Threshold analysis for fairness optimization
      • Interactive HTML reports with visualizations
  • Model Validation

    • Experiment tracking and management
    • Comprehensive model performance analysis
    • Advanced metric tracking
    • Model versioning support
  • Model Distillation → Moved to deepbridge-distillation

    • Knowledge distillation across multiple model types
    • Automated distillation with hyperparameter optimization
    • Support for GBM, XGBoost, and neural networks
    • Performance optimization and model compression
  • Advanced Analytics & Reporting

    • Interactive HTML reports with Plotly visualizations
    • Static reports for documentation
    • Detailed performance metrics and analysis
    • Multi-model comparison capabilities
  • Synthetic Data Generation → Moved to deepbridge-synthetic

    • Gaussian Copula method
    • Privacy-preserving data synthesis
    • Quality metrics and validation
    • Standalone package (no dependencies on deepbridge)
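
As a rough illustration of the Gaussian Copula method listed above, the sketch below synthesizes two correlated numeric columns using only the Python standard library. It is not the deepbridge-synthetic implementation; the function names (`to_normal_scores`, `gaussian_copula_sample`) and the toy data are assumptions made for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def to_normal_scores(values):
    """Map each value to a standard-normal score via its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1)
        scores[idx] = STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def empirical_quantile(sorted_values, u):
    """Return the observed value at probability u (nearest-rank lookup)."""
    idx = min(int(u * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


def gaussian_copula_sample(col_x, col_y, n_samples, rng):
    """Draw synthetic (x, y) pairs that mimic the rank dependence of the inputs."""
    rho = pearson(to_normal_scores(col_x), to_normal_scores(col_y))
    sorted_x, sorted_y = sorted(col_x), sorted(col_y)
    samples = []
    for _ in range(n_samples):
        # Correlated standard normals with correlation rho
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + max(0.0, 1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        # Back to uniforms, then to observed values through the empirical quantiles
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        samples.append((empirical_quantile(sorted_x, u1),
                        empirical_quantile(sorted_y, u2)))
    return samples


# Toy data (illustrative only): spend depends linearly on income plus noise
rng = random.Random(0)
income = [rng.uniform(20, 100) for _ in range(500)]
spend = [0.5 * v + rng.gauss(0, 5) for v in income]
synthetic = gaussian_copula_sample(income, spend, 1000, rng)
```

Because synthetic values are looked up from the observed quantiles, every generated value stays within the range of the original column; a production implementation would typically fit smooth marginal distributions instead.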

Quick Start

Model Validation

from deepbridge.core.experiment import Experiment
from deepbridge.db_data import DBDataset

# Create dataset (df is assumed to be a pandas DataFrame that includes the target column)
dataset = DBDataset(
    data=df,
    target_column='target',
    features=['feature1', 'feature2', 'feature3']
)

# Create experiment
experiment = Experiment(
    name='model_validation',
    dataset=dataset,
    models={'my_model': trained_model}
)

# Run validation tests
robustness_results = experiment.run_test('robustness', config='medium')
uncertainty_results = experiment.run_test('uncertainty', config='medium')

# Generate comprehensive report
experiment.generate_report('robustness', output_dir='./reports')
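
The feature list describes the uncertainty test as using conformal prediction. As a minimal sketch of that idea (not DeepBridge's internal implementation), split conformal regression calibrates a prediction interval from held-out residuals. The stand-in model, the toy data, and the function name `conformal_halfwidth` are assumptions made for illustration.

```python
import math
import random


def conformal_halfwidth(residuals, alpha):
    """Return the split-conformal interval half-width for miscoverage alpha."""
    n = len(residuals)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha))
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(residuals)[k - 1]


def model(x):
    """Stand-in for an already fitted regressor (hypothetical)."""
    return 2.0 * x


rng = random.Random(42)


def draw(n):
    """Generate n (x, y) pairs with y = 2x + unit Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0, 1)) for x in xs]


calibration, test = draw(500), draw(2000)

# Absolute residuals on the calibration split act as nonconformity scores
scores = [abs(y - model(x)) for x, y in calibration]
q = conformal_halfwidth(scores, alpha=0.1)

# The interval for a new x is [model(x) - q, model(x) + q]
covered = sum(abs(y - model(x)) <= q for x, y in test)
coverage = covered / len(test)
print(f"half-width={q:.2f}, empirical coverage={coverage:.3f}")
```

With `alpha=0.1` the empirical coverage on fresh data should land close to 90%, which is the guarantee split conformal prediction offers when calibration and test data are exchangeable.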

Model Distillation

Note: Distillation has moved to deepbridge-distillation

pip install deepbridge-distillation

from deepbridge import DBDataset
from deepbridge_distillation import AutoDistiller

# Create dataset with predictions
dataset = DBDataset(
    data=df,
    target_column='target',
    features=features,
    prob_cols=['prob_class_0', 'prob_class_1']
)

# Run automated distillation
distiller = AutoDistiller(
    dataset=dataset,
    output_dir='results',
    test_size=0.2,
    n_trials=10
)
results = distiller.run(use_probabilities=True)
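
The `prob_cols` argument above points at teacher probabilities stored in the dataset. As a hedged sketch of what learning from such soft labels means (not the AutoDistiller implementation), the snippet below fits a one-feature logistic-regression student to a teacher's probabilities by gradient descent on the cross-entropy. The teacher function, the feature grid, and the hyperparameters are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def teacher_prob(x):
    """Hypothetical teacher: probability of class 1 for feature x."""
    return sigmoid(2.0 * x - 1.0)


# Feature grid and the teacher's soft labels (the role played by prob_class_1)
xs = [i / 10.0 for i in range(-30, 31)]
soft_labels = [teacher_prob(x) for x in xs]

# Student: sigmoid(w * x + b), trained to match the soft labels
w, b, lr = 0.0, 0.0, 0.5
for _ in range(5000):
    preds = [sigmoid(w * x + b) for x in xs]
    # Gradient of the mean cross-entropy between soft labels and predictions
    grad_w = sum((p - t) * x for p, t, x in zip(preds, soft_labels, xs)) / len(xs)
    grad_b = sum(p - t for p, t in zip(preds, soft_labels)) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b

# Largest disagreement between student and teacher probabilities on the grid
max_gap = max(abs(sigmoid(w * x + b) - t) for x, t in zip(xs, soft_labels))
print(f"student w={w:.3f}, b={b:.3f}, max probability gap={max_gap:.4f}")
```

Here the student has the same functional form as the teacher, so it can recover the teacher almost exactly; in practice the student is a smaller or simpler model and only approximates the teacher's probabilities.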

Fairness Testing

from deepbridge.core.experiment import Experiment
from deepbridge.db_data import DBDataset

# Create dataset (model already trained)
dataset = DBDataset(
    data=df,
    target_column='approved',
    model=trained_model
)

# Create experiment with protected attributes
experiment = Experiment(
    dataset=dataset,
    experiment_type="binary_classification",
    tests=["fairness"],
    protected_attributes=['gender', 'race', 'age_group']
)

# Run fairness tests
fairness_result = experiment.run_fairness_tests(config='full')

# Check results
print(f"Overall Fairness Score: {fairness_result.overall_fairness_score:.3f}")
print(f"Critical Issues: {len(fairness_result.critical_issues)}")
# Proxy check only: the EEOC 80% rule is formally defined on selection-rate ratios between groups
print(f"EEOC Compliant: {fairness_result.overall_fairness_score >= 0.80}")

# Generate interactive HTML report
fairness_result.save_html('fairness_report.html', model_name='My Model')
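
The 80% rule mentioned in the feature list is conventionally stated on selection rates: the rate of favorable outcomes for any group should be at least 80% of the rate for the most favored group. A minimal standard-library sketch of that check follows; the function name `disparate_impact` and the toy records are assumptions, and this is not how DeepBridge computes its overall fairness score.

```python
from collections import defaultdict


def disparate_impact(groups, outcomes):
    """Return each group's selection rate divided by the highest group rate."""
    totals, favorable = defaultdict(int), defaultdict(int)
    for group, outcome in zip(groups, outcomes):
        totals[group] += 1
        favorable[group] += int(outcome)
    rates = {g: favorable[g] / totals[g] for g in totals}
    best = max(rates.values())
    return {g: rate / best for g, rate in rates.items()}


# Toy predictions: 1 = approved, 0 = denied (illustrative data only)
gender = ["F"] * 10 + ["M"] * 10
approved = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0] + [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

ratios = disparate_impact(gender, approved)
for group, ratio in sorted(ratios.items()):
    status = "passes" if ratio >= 0.8 else "fails"
    print(f"{group}: impact ratio {ratio:.2f} ({status} the 80% rule)")
```

In this toy data group F is approved 30% of the time and group M 50% of the time, so F's impact ratio is 0.6 and falls below the 0.8 threshold. The sketch assumes at least one favorable outcome overall; otherwise the highest rate is zero and the ratio is undefined.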

Command-Line Interface

# Run model validation
deepbridge validate --dataset data.csv --model model.pkl --tests all

# Generate reports
deepbridge report --results ./results --output ./reports --format interactive

# Train distilled model (requires deepbridge-distillation)
deepbridge distill train gbm predictions.csv features.csv -s ./models

# Generate synthetic data (requires deepbridge-synthetic)
deepbridge synthetic generate --data original.csv --method gaussian_copula --samples 10000

Requirements

  • Python 3.10-3.12
  • Key Dependencies:
    • numpy >= 2.2.3
    • pandas >= 2.2.3
    • scikit-learn >= 1.6.1
    • xgboost >= 2.1.4
    • scipy >= 1.15.1
    • matplotlib >= 3.10.0
    • seaborn >= 0.13.2
    • plotly >= 6.0.0
    • optuna >= 4.2.1
    • jinja2 >= 3.1.5

Documentation

Full documentation is available at: DeepBridge Documentation

Contributing

We welcome contributions! Please see our contribution guidelines for details on how to submit pull requests, report issues, and contribute to the project.

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Open a Pull Request

Recent Updates

  • 2025-11-03: NEW Fairness Module - Complete fairness testing framework with 15 metrics, auto-detection of sensitive attributes, EEOC compliance checks, threshold analysis, and interactive HTML reports. Includes comprehensive documentation, tutorial, and examples.
  • 2025-07-02: Added comprehensive documentation including Implementation Guide, Testing Framework, Report Generation, and complete API Reference
  • 2025-05-15: Fixed static report chart URLs to properly use relative paths with ./ prefix for improved portability across different environments

Development Setup

# Clone the repository
git clone https://github.com/DeepBridge-Validation/DeepBridge.git
cd DeepBridge

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Running Tests

pytest tests/

License

MIT License

Citation

If you use DeepBridge in your research, please cite:

@software{deepbridge2025,
  title = {DeepBridge: Advanced Model Validation and Distillation Library},
  author = {Gustavo Haase and Paulo Dourado},
  year = {2025},
  url = {https://github.com/DeepBridge-Validation/DeepBridge}
}
