Skip to main content

A comprehensive benchmarking framework for discrete choice models

Project description

Discrete Choice Model Benchmarking (DCMBench)

A comprehensive Python package for benchmarking, analyzing, and validating discrete choice models for transportation mode choice analysis. DCMBench provides a unified framework for comparing different model specifications, conducting sensitivity analysis, and visualizing results.

Installation

You can install DCMBench using pip:

pip install dcmbench

Key Features

Model Estimation and Benchmarking

  • Multiple Model Types: Support for Multinomial Logit (MNL), Nested Logit (NL), and Mixed Logit (ML) models
  • Standardized Metrics: Compare models using log-likelihood, rho-squared, prediction accuracy, and market share
  • Cross-validation: Evaluate model performance on training and testing datasets
  • Visualization: Generate comparative plots showing model performance across different metrics
from dcmbench.model_benchmarker import Benchmarker
from dcmbench.datasets import fetch_data

# Load dataset (automatically downloads if not in local cache)
data = fetch_data("swissmetro_dataset")

# Define models and run benchmark
benchmarker = Benchmarker()
benchmarker.register_model(models, "Model Name")
results = benchmarker.run_benchmark(data, choice_column="CHOICE")
benchmarker.print_comparison()

Advanced Analysis Capabilities

  • Sensitivity Analysis: Evaluate how model predictions change with variations in key variables

    • Simulate changes in travel times, costs, and other attributes
    • Generate plots showing the evolution of market shares under different scenarios
  • Individual-Level Parameters: For Mixed Logit models, calculate and visualize:

    • Individual-specific parameter distributions using Bayesian approaches
    • Value of Time (VOT) distributions across the population
    • Heterogeneity in preference structures
  • Model Calibration: Automatically calibrate Alternative Specific Constants (ASCs) to match observed market shares

Visualization and Reporting

  • Market Share Analysis: Compare predicted vs. observed mode shares
  • Performance Plots: Visualize how different models perform across multiple datasets
  • Parameter Distributions: Plot distributions of random parameters and derived metrics like VOT
  • Sensitivity Curves: Show how predicted mode shares change with variations in key variables

Supported Datasets

  • Swissmetro (swissmetro_dataset): Swiss inter-city travel mode choice
  • London Transport (ltds_dataset): London Travel Demand Survey with urban mode choices
  • ModeCanada (modecanada_dataset): Canadian inter-city travel dataset

Datasets are automatically downloaded from the dcmbench-datasets repository on first use and cached locally:

from dcmbench.datasets import fetch_data

# Use default cache location (~/.dcmbench/datasets)
data = fetch_data("swissmetro_dataset")

# Specify custom cache location
data = fetch_data("swissmetro_dataset", local_cache_dir="/path/to/cache")

# Get features and target separately
X, y = fetch_data("swissmetro_dataset", return_X_y=True)

Example Applications

Benchmarking Multiple Models

The package includes tools to benchmark multiple model types across different datasets:

# Run benchmark_all_models.py to compare models across datasets
python benchmark_all_models.py

This generates comparative visualizations showing how different model types perform across datasets, plotting metrics like choice accuracy and market share accuracy against model fit.

Sensitivity Analysis

Analyze how changes in key variables affect predicted mode shares:

# Run sensitivity analysis on ModeCanada models
python sensitivity_analysis.py

This creates plots showing the evolution of mode shares as you modify variables like:

  • Travel costs for different modes
  • Travel times
  • Service frequencies

Individual Parameter Analysis

For Mixed Logit models, analyze individual-level parameters and VOT:

# Generate individual parameter distributions for ModeCanada
python plot_individual_parameters_canada.py

This calculates individual-specific parameters using Bayesian conditioning and produces:

  • Distributions of time and cost parameters
  • Value of Time (VOT) distributions
  • Summary statistics for preference heterogeneity

Requirements

  • Python >=3.8
  • NumPy >=2.0.0
  • Pandas >=2.0.0
  • Biogeme >=3.2.14
  • Matplotlib >=3.0.0
  • Requests >=2.25.0
  • SciPy (for statistical functions)
  • Seaborn (for advanced visualizations)

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

We welcome contributions! Please see our contributing guidelines for details.

Adding New Datasets

To add a new dataset to the DCMBench ecosystem:

  1. Fork the dcmbench-datasets repository
  2. Add your dataset following the structure guidelines in the repository's CONTRIBUTING.md file
  3. Submit a pull request to the dcmbench-datasets repository
  4. Update the metadata.json file in the main DCMBench package to include your dataset information

This design allows the package to remain lightweight while providing access to a growing collection of transportation mode choice datasets.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dcmbench-0.1.2.tar.gz (112.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dcmbench-0.1.2-py3-none-any.whl (127.5 kB view details)

Uploaded Python 3

File details

Details for the file dcmbench-0.1.2.tar.gz.

File metadata

  • Download URL: dcmbench-0.1.2.tar.gz
  • Upload date:
  • Size: 112.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.15

File hashes

Hashes for dcmbench-0.1.2.tar.gz
Algorithm Hash digest
SHA256 64a117495eb6a265b9c53898cf8100c07b8423988eb2ef648c6a94ba3f5be6df
MD5 70c7bb6f93a7d3641e69ea7744231619
BLAKE2b-256 6085c08e972c2dd2f6463abcd1e4ef939a27dc6970339a05070fb1527555d999

See more details on using hashes here.

File details

Details for the file dcmbench-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: dcmbench-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 127.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.15

File hashes

Hashes for dcmbench-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 175e91b325a6df5b6af8f4dc8d281764a2791312f229f87a807509dab1dcc102
MD5 5a413fd24338699b83adf939a83a29f1
BLAKE2b-256 16b8bae189cbe342e5fb657162fba14e458d9337b5bfe108a84cd2b2bc757d4c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page