DeepBridge

DeepBridge is a Python library for machine learning model validation, distillation, and performance analysis. It provides tools to manage experiments, validate models, distill complex models into more efficient ones, and evaluate their performance in depth.

Installation

You can install DeepBridge using pip:

pip install deepbridge

Or install from source:

git clone https://github.com/DeepBridge-Validation/DeepBridge.git
cd DeepBridge
pip install -e .

Key Features

Experiment Framework

  • Modular Architecture: Component-based design with specialized managers
  • Comprehensive Testing: Robustness, uncertainty, and resilience evaluation
  • Feature Selection: Focus testing on specific features with the features_select parameter
  • Test Configuration Levels: Quick, medium, or full test suites via the suite parameter
  • Visualization System: Integrated visualization capabilities
  • Reporting Engine: Detailed HTML report generation

Model Validation

  • Multi-faceted Evaluation: Assess models across multiple dimensions
  • Alternative Model Comparison: Generate and compare different model types
  • Metrics Analysis: Comprehensive performance metrics
  • Visualization Tools: Interactive plots for model analysis

Model Distillation

  • Knowledge Distillation: Transfer knowledge from complex to simpler models
  • Surrogate Modeling: Create lightweight approximations of complex models
  • Hyperparameter Optimization: Automated tuning of student models
  • Distribution Matching: Ensure student models faithfully reproduce teacher distributions
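
As an illustration of the knowledge-distillation idea listed above, the sketch below softens a teacher's class probabilities with a temperature before a student would be trained on them. It is a minimal, standard-library example and not DeepBridge's implementation; the function name `soften` and the sample values are assumptions made for illustration.

```python
import math


def soften(probabilities, temperature=2.0):
    """Re-scale a probability vector with a temperature (hypothetical helper).

    Each probability is turned into a log-score, divided by the temperature,
    and passed through a softmax. A temperature above 1 flattens the
    distribution; a temperature of 1 leaves it unchanged.
    """
    eps = 1e-12  # guards against log(0)
    scaled = [math.log(max(p, eps)) / temperature for p in probabilities]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


teacher_probs = [0.9, 0.1]
soft_targets = soften(teacher_probs, temperature=2.0)
print(soft_targets)  # approximately [0.75, 0.25]
```

The softened targets keep the teacher's ranking of classes while exposing more of its relative uncertainty, which is the signal a simpler student model is meant to learn from.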

Advanced Analytics

  • Robustness Testing: Evaluate model stability under perturbations
  • Uncertainty Quantification: Assess model confidence and calibration
  • Resilience Analysis: Test models under adverse conditions
  • Hyperparameter Importance: Identify critical hyperparameters
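
To make "confidence and calibration" concrete, the sketch below computes expected calibration error (ECE), one common calibration metric. Whether DeepBridge reports this exact metric is not stated here, so treat it as a generic illustration; the function name and sample data are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin-size-weighted average of |accuracy - mean confidence| per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece


# Predicted confidence for the chosen class, and whether that choice was right.
confidences = [0.95, 0.9, 0.85, 0.65, 0.55]
correct = [1, 1, 0, 1, 0]
ece = expected_calibration_error(confidences, correct, n_bins=5)
print(round(ece, 2))  # 0.32
```

A value near 0 means the model's stated confidence matches how often it is right; larger values indicate over- or under-confidence.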

Architecture Overview

DeepBridge has been redesigned with a modular, component-based architecture:

┌─────────────────┐
│   Experiment    │
└───────┬─────────┘
        │
        ▼
┌─────────────────────────────────────────────────────────┐
│                                                         │
│  ┌───────────┐  ┌────────────┐  ┌──────────────┐        │
│  │DataManager│  │ModelManager│  │TestRunner    │        │
│  └───────────┘  └────────────┘  └──────────────┘        │
│                                                         │
│  ┌───────────┐  ┌─────────────┐  ┌────────────────┐     │
│  │ModelEval  │  │ReportGen    │  │VisualizationMgr│     │
│  └───────────┘  └─────────────┘  └────────────────┘     │
│                                                         │
└─────────────────────────────────────────────────────────┘

Key Components

  • Experiment: Central coordinator managing the experiment workflow
  • DataManager: Handles data preparation and splitting
  • ModelManager: Creates and manages models and distillation
  • TestRunner: Coordinates test execution across managers
  • ModelEvaluation: Calculates metrics and evaluates models
  • ReportGenerator: Creates HTML reports with results
  • VisualizationManager: Coordinates visualization creation
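
The coordinator pattern described above can be sketched as follows. This is a toy illustration of a central object delegating to specialized components, not DeepBridge's actual classes; `DataSplitter`, `Evaluator`, and `Coordinator` are hypothetical names that only loosely mirror DataManager, ModelEvaluation, and Experiment.

```python
import random
from dataclasses import dataclass, field


class DataSplitter:
    """Hypothetical stand-in for a data manager: shuffles and splits rows."""

    def __init__(self, test_size=0.2, seed=0):
        self.test_size = test_size
        self.seed = seed

    def split(self, rows):
        shuffled = list(rows)
        random.Random(self.seed).shuffle(shuffled)
        cut = int(len(shuffled) * (1 - self.test_size))
        return shuffled[:cut], shuffled[cut:]


class Evaluator:
    """Hypothetical stand-in for an evaluation component: computes accuracy."""

    def accuracy(self, model, rows):
        hits = sum(1 for features, label in rows if model(features) == label)
        return hits / len(rows)


@dataclass
class Coordinator:
    """Hypothetical central coordinator that delegates work to its components."""

    splitter: DataSplitter = field(default_factory=DataSplitter)
    evaluator: Evaluator = field(default_factory=Evaluator)

    def run(self, model, rows):
        train, test = self.splitter.split(rows)
        return {
            "train_accuracy": self.evaluator.accuracy(model, train),
            "test_accuracy": self.evaluator.accuracy(model, test),
        }


def threshold_model(features):
    """Toy model: predicts 1 when the single feature exceeds 5."""
    return int(features[0] > 5)


rows = [((x,), int(x > 5)) for x in range(10)]
results = Coordinator().run(threshold_model, rows)
print(results)
```

Keeping each responsibility in its own component means one piece, such as the splitting strategy, can be swapped without touching the coordinator.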

Quick Start

Comprehensive Experiment

from deepbridge.core.experiment import Experiment
from deepbridge.core.db_data import DBDataset

# Create dataset (`df` is assumed to be a pandas DataFrame and `features` a list of its feature column names)
dataset = DBDataset(
    data=df,
    target_column='target',
    features=features
)

# Initialize experiment with tests
experiment = Experiment(
    dataset=dataset,
    experiment_type='binary_classification',
    tests=['robustness', 'uncertainty'],
    features_select=['feature1', 'feature2', 'feature3'],  # Optional: Specify features to focus on
    suite='medium'  # Optional: Run tests immediately with this configuration
)

# Train a distilled model
experiment.fit(
    student_model_type='random_forest',
    distillation_method='knowledge_distillation',
    temperature=2.0
)

# If the suite parameter wasn't provided, run the tests manually:
# experiment.run_tests(config_name='medium')

# Plot the robustness comparison
robustness_plot = experiment.plot_robustness_comparison()

# Save comprehensive report
experiment.save_report('experiment_report.html')

Direct Model Distillation

from deepbridge.distillation.techniques.knowledge_distillation import KnowledgeDistillation

# Create distiller directly
distiller = KnowledgeDistillation(
    teacher_model=teacher_model,
    student_model_type='gbm',
    temperature=2.0,
    alpha=0.5
)

# Train the distilled model
distiller.fit(X_train, y_train)

# Make predictions
predictions = distiller.predict(X_test)
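
The `alpha` argument above is not explained in this README. In many knowledge-distillation formulations it balances a loss on the true labels against a loss on the teacher's soft targets, and the sketch below shows that blend. It is an assumption about the general technique, not a description of how `KnowledgeDistillation` uses `alpha`; in particular, which term `alpha` weights varies between libraries.

```python
import math


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between two probability vectors of equal length."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


def distillation_loss(student_probs, soft_targets, hard_label, alpha=0.5):
    """Blend the hard-label loss with the soft-target loss (hypothetical helper).

    ASSUMPTION: alpha weights the hard-label term and (1 - alpha) the
    soft-target term. Check the library's convention before relying on this.
    """
    one_hot = [1.0 if i == hard_label else 0.0 for i in range(len(student_probs))]
    hard_term = cross_entropy(one_hot, student_probs)
    soft_term = cross_entropy(soft_targets, student_probs)
    return alpha * hard_term + (1 - alpha) * soft_term


loss = distillation_loss(
    student_probs=[0.7, 0.3],
    soft_targets=[0.75, 0.25],  # e.g. temperature-softened teacher output
    hard_label=0,
    alpha=0.5,
)
print(loss)
```

With `alpha=1.0` the student learns only from the true labels; with `alpha=0.0` it learns only from the teacher.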

Automated Distillation

from deepbridge.auto_distiller import AutoDistiller
from deepbridge.core.db_data import DBDataset

# Create dataset with probabilities
dataset = DBDataset(
    data=df,
    target_column='target',
    features=features,
    prob_cols=['prob_class_0', 'prob_class_1']
)

# Run automated distillation
distiller = AutoDistiller(
    dataset=dataset,
    output_dir='results',
    test_size=0.2,
    n_trials=10
)
results = distiller.run(use_probabilities=True)

Command-Line Interface

# Create experiment
deepbridge validation create my_experiment --path ./experiments

# Train distilled model
deepbridge distill train gbm predictions.csv features.csv -s ./models

# Run robustness tests
deepbridge validation test robustness my_experiment --config medium

New Features in This Release

Simplified Experiment Configuration

The Experiment class now supports two new parameters to streamline your workflow:

  1. features_select: Focus your analysis on specific features of interest

    # Only test these specific features
    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness'],
        features_select=['feature_1', 'feature_2', 'feature_3']
    )
    
  2. suite: Automatically run tests at initialization with a specific configuration level

    # Tests run automatically with 'quick' configuration
    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness'],
        suite='quick'
    )
    # Results are immediately available in experiment.full_results
    
  3. Combined usage: Use both parameters together for maximum efficiency

    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness', 'uncertainty'],
        features_select=['feature_1', 'feature_2'],
        suite='medium'
    )
    # Now run the report generation directly
    experiment.full_results.save_report("report.html")
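
To illustrate what focusing a robustness test on selected features can mean, the sketch below perturbs one chosen feature at a time and records the resulting accuracy drop. It is a generic illustration under assumed names (`feature_robustness`, Gaussian noise, toy data), not the procedure DeepBridge runs when `features_select` is set.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(rows, labels)) / len(rows)


def feature_robustness(model, rows, labels, features_select, noise=0.5, seed=0):
    """Accuracy drop after adding Gaussian noise to each selected feature."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = {}
    for name in features_select:
        # Copy every row, perturbing only the feature under test.
        perturbed = [dict(row, **{name: row[name] + rng.gauss(0.0, noise)}) for row in rows]
        drops[name] = baseline - accuracy(model, perturbed, labels)
    return drops


def toy_model(row):
    """Toy model that depends on feature_1 only."""
    return int(row["feature_1"] >= 10)


rows = [{"feature_1": float(i), "feature_2": float(i % 3)} for i in range(20)]
labels = [int(row["feature_1"] >= 10) for row in rows]

drops = feature_robustness(toy_model, rows, labels, features_select=["feature_1", "feature_2"])
print(drops)
```

Because the toy model ignores `feature_2`, perturbing it causes no accuracy drop, while noise on `feature_1` can flip predictions near the decision threshold.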
    

Documentation

Full documentation is available in the DeepBridge Documentation.

The documentation includes:

  • API Reference
  • Architecture Guides
  • Tutorial Notebooks
  • Examples

Requirements

  • Python 3.8+
  • Key Dependencies:
    • numpy
    • pandas
    • scikit-learn
    • xgboost
    • scipy
    • plotly
    • optuna

Contributing

We welcome contributions! Please see our contribution guidelines for details on how to submit pull requests, report issues, and contribute to the project.

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Open a Pull Request

Development Setup

# Clone the repository
git clone https://github.com/DeepBridge-Validation/DeepBridge.git
cd DeepBridge

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Running Tests

pytest tests/

License

MIT License

Citation

If you use DeepBridge in your research, please cite:

@software{deepbridge2025,
  title = {DeepBridge: Advanced Model Validation and Distillation Library},
  author = {Gustavo Haase and Paulo Dourado},
  year = {2025},
  url = {https://github.com/DeepBridge-Validation/DeepBridge}
}

Contact
