DeepBridge

DeepBridge is a Python library for machine learning model validation, distillation, and performance analysis. It provides tools to manage experiments, validate models, distill complex models into more efficient ones, and evaluate their performance in depth.

Installation

You can install DeepBridge using pip:

pip install deepbridge

Or install from source:

git clone https://github.com/DeepBridge-Validation/DeepBridge.git
cd DeepBridge
pip install -e .

Key Features

Experiment Framework

  • Modular Architecture: Component-based design with specialized managers
  • Comprehensive Testing: Robustness, uncertainty, and resilience evaluation
  • Feature Selection: Focus testing on specific features with the features_select parameter
  • Test Configuration Levels: Quick, medium, or full test suites via the suite parameter
  • Visualization System: Integrated visualization capabilities
  • Reporting Engine: Detailed HTML report generation
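
As a rough illustration of what a robustness test restricted to selected features involves, the sketch below perturbs only the chosen columns with Gaussian noise and compares accuracy before and after. It is plain Python and does not call DeepBridge; `perturb_rows`, `accuracy`, and `toy_model` are hypothetical names introduced here, and DeepBridge's actual perturbation scheme may differ.

```python
import random

def perturb_rows(rows, selected, noise_std, rng):
    """Return copies of rows with Gaussian noise added only to the selected features."""
    perturbed = []
    for row in rows:
        new_row = dict(row)
        for name in selected:
            new_row[name] = row[name] + rng.gauss(0.0, noise_std)
        perturbed.append(new_row)
    return perturbed

def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the given label."""
    hits = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return hits / len(rows)

# Hypothetical stand-in for a trained classifier: it only looks at feature1.
def toy_model(row):
    return 1 if row["feature1"] > 0.5 else 0

rng = random.Random(0)
rows = [{"feature1": rng.random(), "feature2": rng.random()} for _ in range(500)]
labels = [toy_model(row) for row in rows]  # labels the toy model reproduces exactly

baseline = accuracy(toy_model, rows, labels)
noisy_used = accuracy(toy_model, perturb_rows(rows, ["feature1"], 0.3, rng), labels)
noisy_unused = accuracy(toy_model, perturb_rows(rows, ["feature2"], 0.3, rng), labels)

print(f"baseline accuracy:  {baseline:.3f}")
print(f"feature1 perturbed: {noisy_used:.3f}")
print(f"feature2 perturbed: {noisy_unused:.3f}")
```

Perturbing a feature the model ignores leaves accuracy unchanged, while perturbing one it relies on lowers it. That per-feature signal is what a focused robustness test is after.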

Model Validation

  • Multi-faceted Evaluation: Assess models across multiple dimensions
  • Alternative Model Comparison: Generate and compare different model types
  • Metrics Analysis: Comprehensive performance metrics
  • Visualization Tools: Interactive plots for model analysis

Model Distillation

  • Knowledge Distillation: Transfer knowledge from complex to simpler models
  • Surrogate Modeling: Create lightweight approximations of complex models
  • Hyperparameter Optimization: Automated tuning of student models
  • Distribution Matching: Ensure student models faithfully reproduce teacher distributions

Advanced Analytics

  • Robustness Testing: Evaluate model stability under perturbations
  • Uncertainty Quantification: Assess model confidence and calibration
  • Resilience Analysis: Test models under adverse conditions
  • Hyperparameter Importance: Identify critical hyperparameters
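
To make "confidence and calibration" concrete, here is a minimal sketch of expected calibration error (ECE), one common calibration measure. It is plain Python and independent of DeepBridge; whether DeepBridge reports this exact metric is an assumption, and `expected_calibration_error` is a hypothetical helper name.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        acc = sum(hit for _, hit in members) / len(members)
        ece += (len(members) / total) * abs(mean_conf - acc)
    return ece

# Toy data: confidence in the predicted class, and whether the prediction was right.
confidences = [0.95, 0.95, 0.95, 0.95, 0.65, 0.65, 0.65, 0.65]
correct = [1, 1, 1, 0, 1, 0, 0, 0]

ece = expected_calibration_error(confidences, correct)
print(f"ECE = {ece:.3f}")
```

A well-calibrated model has an ECE near zero: among predictions made with 0.9 confidence, about 90% are correct. In the toy data both confidence groups are overconfident, so the score is clearly above zero.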

Architecture Overview

DeepBridge has been redesigned with a modular, component-based architecture:

┌─────────────────┐
│   Experiment    │
└───────┬─────────┘
        │
        ▼
┌─────────────────────────────────────────────────────────┐
│                                                         │
│  ┌───────────┐  ┌────────────┐  ┌──────────────┐        │
│  │DataManager│  │ModelManager│  │TestRunner    │        │
│  └───────────┘  └────────────┘  └──────────────┘        │
│                                                         │
│  ┌───────────┐  ┌─────────────┐  ┌────────────────┐     │
│  │ModelEval  │  │ReportGen    │  │VisualizationMgr│     │
│  └───────────┘  └─────────────┘  └────────────────┘     │
│                                                         │
└─────────────────────────────────────────────────────────┘

Key Components

  • Experiment: Central coordinator managing the experiment workflow
  • DataManager: Handles data preparation and splitting
  • ModelManager: Creates and manages models and distillation
  • TestRunner: Coordinates test execution across managers
  • ModelEvaluation: Calculates metrics and evaluates models
  • ReportGenerator: Creates HTML reports with results
  • VisualizationManager: Coordinates visualization creation
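
The coordinator-plus-managers layout described above can be sketched in a few lines. The classes below are toy stand-ins written for illustration (all `Toy*` names are hypothetical); they are not DeepBridge's classes and say nothing about its real method signatures.

```python
from dataclasses import dataclass, field

class ToyDataManager:
    """Splits rows into train and test portions."""
    def split(self, rows, test_size):
        cut = len(rows) - round(len(rows) * test_size)
        return rows[:cut], rows[cut:]

class ToyTestRunner:
    """Runs each requested test and collects the results by name."""
    def __init__(self, tests):
        self.tests = tests  # mapping: test name -> callable(test_rows) -> result

    def run(self, names, test_rows):
        return {name: self.tests[name](test_rows) for name in names}

@dataclass
class ToyExperiment:
    """Coordinator: owns the components and delegates each step to one of them."""
    rows: list
    test_names: list
    data_manager: ToyDataManager
    test_runner: ToyTestRunner
    results: dict = field(default_factory=dict)

    def run_tests(self, test_size=0.2):
        _, test_rows = self.data_manager.split(self.rows, test_size)
        self.results = self.test_runner.run(self.test_names, test_rows)
        return self.results

runner = ToyTestRunner({"row_count": len, "max_value": max})
experiment = ToyExperiment(
    rows=list(range(10)),
    test_names=["row_count", "max_value"],
    data_manager=ToyDataManager(),
    test_runner=runner,
)
print(experiment.run_tests(test_size=0.2))
```

The point of the pattern is that the coordinator holds no splitting or testing logic of its own, so each component can be replaced or tested in isolation.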

Quick Start

Comprehensive Experiment

from deepbridge.core.experiment import Experiment
from deepbridge.core.db_data import DBDataset

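# Create dataset (df: a pandas DataFrame that includes the target column;
# features: a list of feature column names)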
dataset = DBDataset(
    data=df,
    target_column='target',
    features=features
)

# Initialize experiment with tests
experiment = Experiment(
    dataset=dataset,
    experiment_type='binary_classification',
    tests=['robustness', 'uncertainty'],
    features_select=['feature1', 'feature2', 'feature3'],  # Optional: Specify features to focus on
    suite='medium'  # Optional: Run tests immediately with this configuration
)

# Train a distilled model
experiment.fit(
    student_model_type='random_forest',
    distillation_method='knowledge_distillation',
    temperature=2.0
)

# If suite parameter wasn't provided, run tests manually
# experiment.run_tests(config_name='medium')
robustness_plot = experiment.plot_robustness_comparison()

# Save comprehensive report
experiment.save_report('experiment_report.html')

Direct Model Distillation

from deepbridge.distillation.techniques.knowledge_distillation import KnowledgeDistillation

# Create distiller directly (teacher_model is an already-trained model)
distiller = KnowledgeDistillation(
    teacher_model=teacher_model,
    student_model_type='gbm',
    temperature=2.0,
    alpha=0.5
)

# Train the distilled model
distiller.fit(X_train, y_train)

# Make predictions
predictions = distiller.predict(X_test)
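
For readers unfamiliar with the `temperature` and `alpha` arguments, the sketch below shows what they conventionally mean in knowledge distillation: temperature softens the teacher's output distribution, and alpha weights the soft-target loss against the hard-label loss. This is plain Python under one common convention; how DeepBridge weights the two terms internally is an assumption, and some formulations additionally scale the soft term by the squared temperature, which is omitted here.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, predicted):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)

def distillation_loss(student_logits, teacher_logits, true_label, temperature, alpha):
    """alpha * soft-target loss + (1 - alpha) * hard-label loss."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    soft_loss = cross_entropy(soft_teacher, soft_student)

    hard_target = [1.0 if i == true_label else 0.0 for i in range(len(student_logits))]
    hard_loss = cross_entropy(hard_target, softmax(student_logits))
    return alpha * soft_loss + (1 - alpha) * hard_loss

teacher_logits = [4.0, 1.0]
student_logits = [2.0, 0.5]

sharp = softmax(teacher_logits, temperature=1.0)
soft = softmax(teacher_logits, temperature=2.0)
loss = distillation_loss(student_logits, teacher_logits, true_label=0,
                         temperature=2.0, alpha=0.5)

print("teacher probabilities at T=1:", sharp)
print("teacher probabilities at T=2:", soft)
print("combined loss:", loss)
```

At the higher temperature the teacher's top-class probability shrinks and the remaining classes gain mass, which exposes more of the teacher's relative preferences to the student.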

Automated Distillation

from deepbridge.auto_distiller import AutoDistiller
from deepbridge.core.db_data import DBDataset

# Create dataset with probabilities
dataset = DBDataset(
    data=df,
    target_column='target',
    features=features,
    prob_cols=['prob_class_0', 'prob_class_1']
)

# Run automated distillation
distiller = AutoDistiller(
    dataset=dataset,
    output_dir='results',
    test_size=0.2,
    n_trials=10
)
results = distiller.run(use_probabilities=True)
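
The columns named in `prob_cols` presumably hold the teacher's predicted class probabilities, so it can be worth checking that every row forms a valid distribution before building the dataset. The sketch below does this in plain Python on a list of dicts standing in for a DataFrame; `check_probability_rows` is a hypothetical helper and not part of DeepBridge.

```python
def check_probability_rows(rows, prob_cols, tolerance=1e-6):
    """Return indices of rows whose probability columns are not a valid distribution."""
    bad = []
    for index, row in enumerate(rows):
        probs = [row[col] for col in prob_cols]
        in_range = all(0.0 <= p <= 1.0 for p in probs)
        sums_to_one = abs(sum(probs) - 1.0) <= tolerance
        if not (in_range and sums_to_one):
            bad.append(index)
    return bad

rows = [
    {"prob_class_0": 0.9, "prob_class_1": 0.1},
    {"prob_class_0": 0.4, "prob_class_1": 0.4},   # does not sum to 1
    {"prob_class_0": 1.2, "prob_class_1": -0.2},  # values outside [0, 1]
]

bad_rows = check_probability_rows(rows, ["prob_class_0", "prob_class_1"])
print("invalid rows:", bad_rows)
```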

Command-Line Interface

# Create experiment
deepbridge validation create my_experiment --path ./experiments

# Train distilled model
deepbridge distill train gbm predictions.csv features.csv -s ./models

# Run robustness tests
deepbridge validation test robustness my_experiment --config medium

New Features in This Release

Simplified Experiment Configuration

The Experiment class now supports two new parameters, which can be used separately or together:

  1. features_select: Focus your analysis on specific features of interest

    # Only test these specific features
    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness'],
        features_select=['feature_1', 'feature_2', 'feature_3']
    )
    
  2. suite: Automatically run tests at initialization with a specific configuration level

    # Tests run automatically with 'quick' configuration
    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness'],
        suite='quick'
    )
    # Results are immediately available in experiment.full_results
    
  3. Combined usage: Use both parameters together for maximum efficiency

    experiment = Experiment(
        dataset=dataset,
        experiment_type='binary_classification',
        tests=['robustness', 'uncertainty'],
        features_select=['feature_1', 'feature_2'],
        suite='medium'
    )
    # Now run the report generation directly
    experiment.full_results.save_report("report.html")
    

Documentation

Full documentation available at: DeepBridge Documentation

The documentation includes:

  • API Reference
  • Architecture Guides
  • Tutorial Notebooks
  • Examples

Requirements

  • Python 3.8+
  • Key Dependencies:
    • numpy
    • pandas
    • scikit-learn
    • xgboost
    • scipy
    • plotly
    • optuna

Contributing

We welcome contributions! Please see our contribution guidelines for details on how to submit pull requests, report issues, and contribute to the project.

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Open a Pull Request

Development Setup

# Clone the repository
git clone https://github.com/DeepBridge-Validation/DeepBridge.git
cd DeepBridge

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Running Tests

pytest tests/

License

MIT License

Citation

If you use DeepBridge in your research, please cite:

@software{deepbridge2025,
  title = {DeepBridge: Advanced Model Validation and Distillation Library},
  author = {Gustavo Haase and Paulo Dourado},
  year = {2025},
  url = {https://github.com/DeepBridge-Validation/DeepBridge}
}

Contact
