Skip to main content

No project description provided

Project description

DeepBridge

Documentation Status CI PyPI version

DeepBridge is a comprehensive Python library for advanced machine learning model validation, distillation, and performance analysis. It provides powerful tools to manage experiments, validate models, create more efficient model versions, and conduct in-depth performance evaluations.

Installation

You can install DeepBridge using pip:

pip install deepbridge

Or install from source:

git clone https://github.com/DeepBridge-Validation/DeepBridge.git
cd deepbridge
pip install -e .

Key Features

Experiment Framework

  • Modular Architecture: Component-based design with specialized managers
  • Comprehensive Testing: Robustness, uncertainty, and resilience evaluation
  • Feature Selection: Focus testing on specific features with the features_select parameter
  • Test Configuration Levels: Quick, medium, or full test suites via the suite parameter
  • Visualization System: Integrated visualization capabilities
  • Reporting Engine: Detailed HTML report generation

Model Validation

  • Multi-faceted Evaluation: Assess models across multiple dimensions
  • Alternative Model Comparison: Generate and compare different model types
  • Metrics Analysis: Comprehensive performance metrics
  • Visualization Tools: Interactive plots for model analysis

Model Distillation

  • Knowledge Distillation: Transfer knowledge from complex to simpler models
  • Surrogate Modeling: Create lightweight approximations of complex models
  • Hyperparameter Optimization: Automated tuning of student models
  • Distribution Matching: Ensure student models faithfully reproduce teacher distributions

Advanced Analytics

  • Robustness Testing: Evaluate model stability under perturbations
  • Uncertainty Quantification: Assess model confidence and calibration
  • Resilience Analysis: Test models under adverse conditions
  • Hyperparameter Importance: Identify critical hyperparameters

Architecture Overview

DeepBridge has been redesigned with a modular, component-based architecture:

┌─────────────────┐
│   Experiment    │
└───────┬─────────┘
        │
        ▼
┌─────────────────────────────────────────────────────────┐
│                                                         │
│  ┌───────────┐  ┌────────────┐  ┌──────────────┐        │
│  │DataManager│  │ModelManager│  │TestRunner    │        │
│  └───────────┘  └────────────┘  └──────────────┘        │
│                                                         │
│  ┌───────────┐  ┌─────────────┐  ┌────────────────┐     │
│  │ModelEval  │  │ReportGen    │  │VisualizationMgr│     │
│  └───────────┘  └─────────────┘  └────────────────┘     │
│                                                         │
└─────────────────────────────────────────────────────────┘

Key Components

  • Experiment: Central coordinator managing the experiment workflow
  • DataManager: Handles data preparation and splitting
  • ModelManager: Creates and manages models and distillation
  • TestRunner: Coordinates test execution across managers
  • ModelEvaluation: Calculates metrics and evaluates models
  • ReportGenerator: Creates HTML reports with results
  • VisualizationManager: Coordinates visualization creation

Quick Start

Comprehensive Experiment

from deepbridge.core.experiment import Experiment
from deepbridge.core.db_data import DBDataset

# Create dataset
dataset = DBDataset(
    data=df,
    target_column='target',
    features=features
)

# Initialize experiment with tests
experiment = Experiment(
    dataset=dataset,
    experiment_type='binary_classification',
    tests=['robustness', 'uncertainty'],
    features_select=['feature1', 'feature2', 'feature3'],  # Optional: Specify features to focus on
    suite='medium'  # Optional: Run tests immediately with this configuration
)

# Train a distilled model
experiment.fit(
    student_model_type='random_forest',
    distillation_method='knowledge_distillation',
    temperature=2.0
)

# If suite parameter wasn't provided, run tests manually
# experiment.run_tests(config_name='medium')
robustness_plot = experiment.plot_robustness_comparison()

# Save comprehensive report
experiment.save_report('experiment_report.html')

Direct Model Distillation

from deepbridge.distillation.techniques.knowledge_distillation import KnowledgeDistillation

# Create distiller directly
distiller = KnowledgeDistillation(
    teacher_model=teacher_model,
    student_model_type='gbm',
    temperature=2.0,
    alpha=0.5
)

# Train the distilled model
distiller.fit(X_train, y_train)

# Make predictions
predictions = distiller.predict(X_test)

Automated Distillation

from deepbridge.auto_distiller import AutoDistiller
from deepbridge.db_data import DBDataset

# Create dataset with probabilities
dataset = DBDataset(
    data=df,
    target_column='target',
    features=features,
    prob_cols=['prob_class_0', 'prob_class_1']
)

# Run automated distillation
distiller = AutoDistiller(
    dataset=dataset,
    output_dir='results',
    test_size=0.2,
    n_trials=10
)
results = distiller.run(use_probabilities=True)

Command-Line Interface

# Create experiment
deepbridge validation create my_experiment --path ./experiments

# Train distilled model
deepbridge distill train gbm predictions.csv features.csv -s ./models

# Run robustness tests
deepbridge validation test robustness my_experiment --config medium

New Features in This Release

Simplified Experiment Configuration

The Experiment class now supports two new parameters to streamline your workflow:

  1. features_select: Focus your analysis on specific features of interest

    # Only test these specific features
    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness'],
        features_select=['feature_1', 'feature_2', 'feature_3']
    )
    
  2. suite: Automatically run tests at initialization with a specific configuration level

    # Tests run automatically with 'quick' configuration
    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness'],
        suite='quick'
    )
    # Results are immediately available in experiment.full_results
    
  3. Combined usage: Use both parameters together for maximum efficiency

    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness', 'uncertainty'],
        features_select=['feature_1', 'feature_2'],
        suite='medium'
    )
    # Now run the report generation directly
    experiment.full_results.save_report("report.html")
    

Documentation

Full documentation available at: DeepBridge Documentation

The documentation includes:

  • API Reference
  • Architecture Guides
  • Tutorial Notebooks
  • Examples

Requirements

  • Python 3.8+
  • Key Dependencies:
    • numpy
    • pandas
    • scikit-learn
    • xgboost
    • scipy
    • plotly
    • optuna

Contributing

We welcome contributions! Please see our contribution guidelines for details on how to submit pull requests, report issues, and contribute to the project.

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Open a Pull Request

Development Setup

# Clone the repository
git clone https://github.com/DeepBridge-Validation/DeepBridge.git
cd deepbridge

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Running Tests

pytest tests/

License

MIT License

Citation

If you use DeepBridge in your research, please cite:

@software{deepbridge2025,
  title = {DeepBridge: Advanced Model Validation and Distillation Library},
  author = {Gustavo Haase, Paulo Dourado},
  year = {2025},
  url = {https://github.com/DeepBridge-Validation/DeepBridge}
}

Contact

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

deepbridge-0.1.31.tar.gz (764.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

deepbridge-0.1.31-py3-none-any.whl (825.6 kB view details)

Uploaded Python 3

File details

Details for the file deepbridge-0.1.31.tar.gz.

File metadata

  • Download URL: deepbridge-0.1.31.tar.gz
  • Upload date:
  • Size: 764.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.0.1 CPython/3.12.5 Linux/5.15.167.4-microsoft-standard-WSL2

File hashes

Hashes for deepbridge-0.1.31.tar.gz
Algorithm Hash digest
SHA256 ce52181d95f58af6fcac15bad779d4fd35e2ca223a16c5ff7ceb2c5dd3686b3f
MD5 ab1764d5ccffad4b181d4b9708c7cc26
BLAKE2b-256 c41233cb97bf1f04ae93343bc3ddd8c246343f912ddd0f94e626dde2ef023b71

See more details on using hashes here.

File details

Details for the file deepbridge-0.1.31-py3-none-any.whl.

File metadata

  • Download URL: deepbridge-0.1.31-py3-none-any.whl
  • Upload date:
  • Size: 825.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.0.1 CPython/3.12.5 Linux/5.15.167.4-microsoft-standard-WSL2

File hashes

Hashes for deepbridge-0.1.31-py3-none-any.whl
Algorithm Hash digest
SHA256 8e55079f61ba73d18a2aa30164f77ce86921618d29ef4d1711e53cb92f825c67
MD5 5cd3bcc8691c578caef11c432b3df7e4
BLAKE2b-256 6cccea53443ceb229d41792c2e000738843e0b517459401b8ffe37e63e89a5f8

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page