DeepBridge

DeepBridge is a Python library for machine learning model validation, distillation, and performance analysis. It provides tools to manage experiments, validate models, distill complex models into smaller, more efficient ones, and evaluate their performance in depth.

Installation

You can install DeepBridge using pip:

pip install deepbridge

Or install from source:

git clone https://github.com/DeepBridge-Validation/DeepBridge.git
cd DeepBridge
pip install -e .

Key Features

Experiment Framework

  • Modular Architecture: Component-based design with specialized managers
  • Comprehensive Testing: Robustness, uncertainty, and resilience evaluation
  • Feature Selection: Focus testing on specific features with the features_select parameter
  • Test Configuration Levels: Quick, medium, or full test suites via the suite parameter
  • Visualization System: Integrated visualization capabilities
  • Reporting Engine: Detailed HTML report generation

Model Validation

  • Multi-faceted Evaluation: Assess models across multiple dimensions
  • Alternative Model Comparison: Generate and compare different model types
  • Metrics Analysis: Comprehensive performance metrics
  • Visualization Tools: Interactive plots for model analysis

Model Distillation

  • Knowledge Distillation: Transfer knowledge from complex to simpler models
  • Surrogate Modeling: Create lightweight approximations of complex models
  • Hyperparameter Optimization: Automated tuning of student models
  • Distribution Matching: Ensure student models faithfully reproduce teacher distributions

Advanced Analytics

  • Robustness Testing: Evaluate model stability under perturbations
  • Uncertainty Quantification: Assess model confidence and calibration
  • Resilience Analysis: Test models under adverse conditions
  • Hyperparameter Importance: Identify critical hyperparameters
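The idea behind robustness testing can be illustrated with a small, library-independent sketch. It is not DeepBridge's implementation: the helper names (accuracy, perturb, robustness_curve), the Gaussian noise model, and the toy classifier are all assumptions chosen for illustration. The sketch adds noise of increasing strength to every feature and records how accuracy changes.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) matches the label."""
    correct = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return correct / len(rows)


def perturb(rows, noise_std, rng):
    """Return a copy of rows with zero-mean Gaussian noise added to every feature."""
    return [[value + rng.gauss(0.0, noise_std) for value in row] for row in rows]


def robustness_curve(predict, rows, labels, noise_levels, seed=0):
    """Map each noise level to the accuracy measured on perturbed inputs."""
    rng = random.Random(seed)
    return {
        std: accuracy(predict, perturb(rows, std, rng), labels)
        for std in noise_levels
    }


# Toy stand-in for a trained classifier (hypothetical): predicts 1 when the
# feature sum is positive.
def toy_predict(row):
    return 1 if sum(row) > 0 else 0


data_rng = random.Random(42)
rows = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
labels = [toy_predict(row) for row in rows]  # the toy model is exact on clean data

curve = robustness_curve(toy_predict, rows, labels, noise_levels=[0.0, 0.1, 0.5])
for std, acc in curve.items():
    print(f"noise_std={std}: accuracy={acc:.3f}")
```

A model whose accuracy falls sharply at small noise levels is sensitive to perturbations. Restricting the perturbation to a subset of columns would be one way to focus the analysis on selected features.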

Architecture Overview

DeepBridge has been redesigned with a modular, component-based architecture:

┌─────────────────┐
│   Experiment    │
└───────┬─────────┘
        │
        ▼
┌─────────────────────────────────────────────────────────┐
│                                                         │
│  ┌───────────┐  ┌────────────┐  ┌──────────────┐        │
│  │DataManager│  │ModelManager│  │TestRunner    │        │
│  └───────────┘  └────────────┘  └──────────────┘        │
│                                                         │
│  ┌───────────┐  ┌─────────────┐  ┌────────────────┐     │
│  │ModelEval  │  │ReportGen    │  │VisualizationMgr│     │
│  └───────────┘  └─────────────┘  └────────────────┘     │
│                                                         │
└─────────────────────────────────────────────────────────┘

Key Components

  • Experiment: Central coordinator managing the experiment workflow
  • DataManager: Handles data preparation and splitting
  • ModelManager: Creates and manages models and distillation
  • TestRunner: Coordinates test execution across managers
  • ModelEvaluation: Calculates metrics and evaluates models
  • ReportGenerator: Creates HTML reports with results
  • VisualizationManager: Coordinates visualization creation
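The coordinator-and-managers pattern described above can be sketched in a few lines. The classes below (SketchDataManager, SketchTestRunner, SketchExperiment) are hypothetical stand-ins written for this illustration and do not reflect DeepBridge's actual class signatures. They only show how a central object can delegate data splitting and test execution to specialized components.

```python
import random


class SketchDataManager:
    """Hypothetical data component: splits rows into train and test sets."""

    def __init__(self, test_size=0.2, seed=0):
        self.test_size = test_size
        self.seed = seed

    def split(self, rows):
        shuffled = list(rows)
        random.Random(self.seed).shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * self.test_size))
        return shuffled[n_test:], shuffled[:n_test]


class SketchTestRunner:
    """Hypothetical test component: runs named checks on the test split."""

    def __init__(self, checks):
        self.checks = checks  # mapping: check name -> callable(test_rows)

    def run(self, names, test_rows):
        return {name: self.checks[name](test_rows) for name in names}


class SketchExperiment:
    """Coordinator: owns the components and delegates each step to one of them."""

    def __init__(self, rows, tests, data_manager, test_runner):
        self.tests = tests
        self.test_runner = test_runner
        self.train_rows, self.test_rows = data_manager.split(rows)

    def run_tests(self):
        return self.test_runner.run(self.tests, self.test_rows)


checks = {
    "row_count": len,
    "mean_value": lambda rows: sum(rows) / len(rows),
}
experiment = SketchExperiment(
    rows=range(10),
    tests=["row_count"],
    data_manager=SketchDataManager(test_size=0.2),
    test_runner=SketchTestRunner(checks),
)
results = experiment.run_tests()
print(results)
```

Because each component is passed in rather than created inside the coordinator, a component can be replaced or tested in isolation without touching the others.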

Quick Start

Comprehensive Experiment

from deepbridge.core.experiment import Experiment
from deepbridge.core.db_data import DBDataset

# Create dataset
# Assumes df is a pandas DataFrame and features is a list of its feature column names
dataset = DBDataset(
    data=df,
    target_column='target',
    features=features
)

# Initialize experiment with tests
experiment = Experiment(
    dataset=dataset,
    experiment_type='binary_classification',
    tests=['robustness', 'uncertainty'],
    features_select=['feature1', 'feature2', 'feature3'],  # Optional: Specify features to focus on
    suite='medium'  # Optional: Run tests immediately with this configuration
)

# Train a distilled model
experiment.fit(
    student_model_type='random_forest',
    distillation_method='knowledge_distillation',
    temperature=2.0
)

# If the suite parameter wasn't provided, run tests manually
# experiment.run_tests(config_name='medium')

# Plot the robustness comparison
robustness_plot = experiment.plot_robustness_comparison()

# Save comprehensive report
experiment.save_report('experiment_report.html')

Direct Model Distillation

from deepbridge.distillation.techniques.knowledge_distillation import KnowledgeDistillation

# Create distiller directly
distiller = KnowledgeDistillation(
    teacher_model=teacher_model,
    student_model_type='gbm',
    temperature=2.0,
    alpha=0.5
)

# Train the distilled model
distiller.fit(X_train, y_train)

# Make predictions
predictions = distiller.predict(X_test)
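
The temperature and alpha arguments above correspond to the two standard knobs of knowledge distillation. The following is a minimal sketch of the textbook formulation (Hinton et al., 2015), not DeepBridge's internal code. The function names are hypothetical, and it is an assumption that alpha weights the hard-label term; the library may define it the other way around.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature; higher temperatures flatten the output."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, true_class,
                      temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a softened teacher-matching term."""
    hard = -math.log(softmax(student_logits)[true_class])
    soft = kl_divergence(
        softmax(teacher_logits, temperature),
        softmax(student_logits, temperature),
    )
    # Assumption: alpha weights the hard-label term. The temperature ** 2 factor
    # keeps the soft term's gradient scale comparable across temperatures.
    return alpha * hard + (1.0 - alpha) * (temperature ** 2) * soft


teacher_logits = [4.0, 1.0, 0.0]
student_logits = [2.5, 1.5, 0.5]

print("teacher, T=1:", [round(p, 3) for p in softmax(teacher_logits, 1.0)])
print("teacher, T=2:", [round(p, 3) for p in softmax(teacher_logits, 2.0)])
print("loss:", round(distillation_loss(student_logits, teacher_logits, true_class=0), 4))
```

Raising the temperature spreads probability mass over the non-argmax classes, which exposes more of the teacher's relative class preferences to the student.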

Automated Distillation

from deepbridge.auto_distiller import AutoDistiller
from deepbridge.core.db_data import DBDataset

# Create dataset with probabilities
dataset = DBDataset(
    data=df,
    target_column='target',
    features=features,
    prob_cols=['prob_class_0', 'prob_class_1']
)

# Run automated distillation
distiller = AutoDistiller(
    dataset=dataset,
    output_dir='results',
    test_size=0.2,
    n_trials=10
)
results = distiller.run(use_probabilities=True)

Command-Line Interface

# Create experiment
deepbridge validation create my_experiment --path ./experiments

# Train distilled model
deepbridge distill train gbm predictions.csv features.csv -s ./models

# Run robustness tests
deepbridge validation test robustness my_experiment --config medium

New Features in This Release

Simplified Experiment Configuration

The Experiment class now supports two new parameters to streamline your workflow:

  1. features_select: Focus your analysis on specific features of interest

    # Only test these specific features
    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness'],
        features_select=['feature_1', 'feature_2', 'feature_3']
    )
    
  2. suite: Automatically run tests at initialization with a specific configuration level

    # Tests run automatically with 'quick' configuration
    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness'],
        suite='quick'
    )
    # Results are immediately available in experiment.full_results
    
  3. Combined usage: Use both parameters together for maximum efficiency

    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness', 'uncertainty'],
        features_select=['feature_1', 'feature_2'],
        suite='medium'
    )
    # Now run the report generation directly
    experiment.full_results.save_report("report.html")
    

Documentation

Full documentation is available at: DeepBridge Documentation

The documentation includes:

  • API Reference
  • Architecture Guides
  • Tutorial Notebooks
  • Examples

Requirements

  • Python 3.8+
  • Key Dependencies:
    • numpy
    • pandas
    • scikit-learn
    • xgboost
    • scipy
    • plotly
    • optuna

Contributing

We welcome contributions! Please see our contribution guidelines for details on how to submit pull requests, report issues, and contribute to the project.

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Open a Pull Request

Development Setup

# Clone the repository
git clone https://github.com/DeepBridge-Validation/DeepBridge.git
cd DeepBridge

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Running Tests

pytest tests/

License

MIT License

Citation

If you use DeepBridge in your research, please cite:

@software{deepbridge2025,
  title = {DeepBridge: Advanced Model Validation and Distillation Library},
  author = {Gustavo Haase and Paulo Dourado},
  year = {2025},
  url = {https://github.com/DeepBridge-Validation/DeepBridge}
}
