
Large-scale choice modeling through the lens of machine learning.



Choice-Learn is a Python package designed to help you formulate, estimate, and deploy discrete choice models, e.g., for assortment planning. The package provides ready-to-use datasets and models studied in the academic literature. It also offers a lower-level API if you wish to customize the specification of a choice model or formulate your own model from scratch. Choice-Learn handles large-scale choice data efficiently by limiting RAM usage.

Choice-Learn uses NumPy and pandas as data backend engines and TensorFlow for models.


:trident: Introduction - Discrete Choice Modelling

Discrete choice models aim to explain or predict choices over a set of alternatives. Well-known use cases include analyzing people's choice of means of transport or product purchases in stores.
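To make this concrete, a common model in this family is the multinomial logit: each alternative receives a utility, and the choice probabilities are a softmax over those utilities. The sketch below uses only the standard library. The utility values are hypothetical, and the function name `logit_probabilities` is ours for illustration; it is not part of Choice-Learn.

```python
import math


def logit_probabilities(utilities):
    """Return multinomial logit choice probabilities for a list of utilities."""
    # Subtracting the largest utility avoids overflow and leaves the result unchanged.
    shift = max(utilities)
    exp_utilities = [math.exp(u - shift) for u in utilities]
    total = sum(exp_utilities)
    return [e / total for e in exp_utilities]


# Hypothetical utilities for three transport modes (illustrative values only).
modes = ["car", "train", "bus"]
probabilities = logit_probabilities([1.0, 0.5, -0.2])
for mode, p in zip(modes, probabilities):
    print(f"{mode}: {p:.3f}")
```

The alternative with the highest utility is the most likely choice, but every alternative keeps a non-zero probability, which is what lets such models be fitted to observed choices by maximum likelihood.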

If you are new to choice modelling, you can check this resource. The notebooks in the Getting Started section can also help you understand choice modelling and, more importantly, apply it to your own use case.

:trident: What's in there?

Data

Model estimation

Auxiliary tools

  • Assortment & Pricing optimization algorithms [Example] [8]
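As an illustration of what assortment optimization means, the sketch below enumerates every non-empty assortment under a multinomial logit with a no-purchase option and keeps the one with the highest expected revenue. This is a brute-force illustration with hypothetical utilities and prices, not the package's algorithm, and it only scales to a handful of items since the number of assortments grows as 2^n. The function names are ours.

```python
import math
from itertools import combinations


def expected_revenue(assortment, utilities, prices):
    """Expected revenue of an assortment under a logit with a no-purchase option."""
    # The no-purchase option has utility 0, hence the constant 1.0 in the denominator.
    weights = {i: math.exp(utilities[i]) for i in assortment}
    denominator = 1.0 + sum(weights.values())
    return sum(prices[i] * w / denominator for i, w in weights.items())


def best_assortment(utilities, prices):
    """Enumerate every non-empty assortment and return the most profitable one."""
    items = range(len(utilities))
    candidates = (
        subset
        for size in range(1, len(utilities) + 1)
        for subset in combinations(items, size)
    )
    return max(candidates, key=lambda s: expected_revenue(s, utilities, prices))


# Hypothetical item utilities and prices (illustrative values only).
utilities = [1.0, 0.5, 0.0]
prices = [1.0, 2.0, 3.0]
chosen = best_assortment(utilities, prices)
print(chosen, round(expected_revenue(chosen, utilities, prices), 3))
```

Offering every item is not always best here: a cheap but attractive item can draw demand away from more profitable ones, which is why the problem calls for an optimization step.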

:trident: Getting Started

The following tutorials will help you get started with the package:

:trident: Installation

User installation

The easiest way is to install the package with pip, preferably in a virtual environment:

pip install choice-learn

Otherwise you can use the git repository to get the latest version:

git clone git@github.com:artefactory/choice-learn.git

Dependencies

For manual installation, Choice-Learn requires the following:

  • Python (>=3.9, <3.13)
  • NumPy (>=1.24)
  • pandas (>=1.5)

For modelling you need:

  • TensorFlow (>=2.14, <2.17)

:warning: Warning: If you are a Mac user with an M1 or M2 chip, importing TensorFlow might lead to Python crashing. In that case, use Anaconda to install TensorFlow with conda install -c apple tensorflow.
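If you are unsure whether this warning concerns your machine, a quick check with the standard library can help. This is a minimal sketch: the function name `is_apple_silicon` is ours, and it assumes a native ARM Python build (an interpreter running under Rosetta would likely report a different architecture).

```python
import platform


def is_apple_silicon():
    """Return True when running a native ARM Python on macOS."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"


if is_apple_silicon():
    print("Apple Silicon detected: consider installing TensorFlow through conda.")
else:
    print("No Apple Silicon detected: the warning above should not apply.")
```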

An optional requirement, used for coefficient analysis and L-BFGS optimization, is:

  • TensorFlow Probability (>=0.22)

Finally, for pricing or assortment optimization, you need either Gurobi or OR-Tools:

  • gurobipy (>=11.0)
  • ortools (>=9.6)
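For a manual installation, it can be handy to check which of the required packages are already present and recent enough. The sketch below is one possible way to do so with `importlib.metadata`; it only compares major and minor versions against the lower bounds listed above, ignores upper bounds for the packages, and the helper names are ours. Distribution names may differ on some platforms, in which case a package is reported as missing.

```python
import re
import sys
from importlib import metadata

# Lower bounds copied from the requirement lists above; upper bounds are not checked.
MINIMUM_VERSIONS = {
    "numpy": (1, 24),
    "pandas": (1, 5),
    "tensorflow": (2, 14),
}


def parse_version(text):
    """Turn a version string such as '2.15.0' into (2, 15)."""
    numbers = re.findall(r"\d+", text)
    major = int(numbers[0])
    minor = int(numbers[1]) if len(numbers) > 1 else 0
    return major, minor


def check_requirements(minimums):
    """Map each distribution name to 'ok', 'too old', or 'missing'."""
    status = {}
    for name, minimum in minimums.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            status[name] = "missing"
            continue
        status[name] = "ok" if installed >= minimum else "too old"
    return status


# Python itself must satisfy >=3.9, <3.13.
python_ok = (3, 9) <= sys.version_info[:2] < (3, 13)
print("python:", "ok" if python_ok else "unsupported version")
for name, state in check_requirements(MINIMUM_VERSIONS).items():
    print(f"{name}: {state}")
```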

               

:bulb: Tip: You can use the poetry.lock or requirements-complete.txt files with poetry or pip to install a fully predetermined and working environment.

:trident: Usage

Here is a short example of model parametrization to estimate a Conditional Logit on the ModeCanada dataset.

from choice_learn.data import ChoiceDataset
from choice_learn.datasets import load_modecanada
from choice_learn.models import ConditionalLogit

transport_df = load_modecanada(as_frame=True)
# Instantiation of a ChoiceDataset from a pandas.DataFrame
dataset = ChoiceDataset.from_single_long_df(df=transport_df,
                                            items_id_column="alt",
                                            choices_id_column="case",
                                            choices_column="choice",
                                            shared_features_columns=["income"],
                                            items_features_columns=["cost", "freq", "ovt", "ivt"],
                                            choice_format="one_zero")

# Initialization of the model
model = ConditionalLogit()

# Creation of the different weights:

# add_coefficients adds one coefficient for each item listed in items_indexes.
# intercept and income get a coefficient for every item except the first one, which needs to be zeroed;
# ivt gets one coefficient per item.
model.add_coefficients(feature_name="intercept",
                       items_indexes=[1, 2, 3])
model.add_coefficients(feature_name="income",
                       items_indexes=[1, 2, 3])
model.add_coefficients(feature_name="ivt",
                       items_indexes=[0, 1, 2, 3])

# add_shared_coefficient adds a single coefficient that is shared by all items listed in items_indexes.
# Here, the cost, freq and ovt coefficients are each shared between all items.
model.add_shared_coefficient(feature_name="cost",
                             items_indexes=[0, 1, 2, 3])
model.add_shared_coefficient(feature_name="freq",
                             items_indexes=[0, 1, 2, 3])
model.add_shared_coefficient(feature_name="ovt",
                             items_indexes=[0, 1, 2, 3])

history = model.fit(dataset, get_report=True)
print("The average neg-loglikelihood is:", model.evaluate(dataset).numpy())
print(model.report)
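To clarify what this specification amounts to, the sketch below spells out the utilities it implies for one choice situation and the corresponding negative log-likelihood. It reflects our reading of the comments in the example, not the package's internal code: all coefficient and feature values are hypothetical, and the function names are ours.

```python
import math

# Hypothetical coefficient values, only to show the structure of the specification.
intercept = {1: 0.3, 2: -0.1, 3: 0.2}  # one per item, item 0 zeroed
beta_income = {1: 0.01, 2: 0.02, 3: 0.015}  # one per item, item 0 zeroed
beta_ivt = {0: -0.01, 1: -0.02, 2: -0.015, 3: -0.01}  # one per item
beta_cost, beta_freq, beta_ovt = -0.03, 0.05, -0.02  # shared by all items


def utilities(income, items_features):
    """Compute the utility of each item for one choice situation."""
    values = []
    for i, features in enumerate(items_features):
        u = intercept.get(i, 0.0) + beta_income.get(i, 0.0) * income
        u += beta_ivt[i] * features["ivt"]
        u += beta_cost * features["cost"]
        u += beta_freq * features["freq"]
        u += beta_ovt * features["ovt"]
        values.append(u)
    return values


def negative_log_likelihood(income, items_features, chosen_index):
    """Negative log of the logit probability of the chosen item."""
    us = utilities(income, items_features)
    shift = max(us)  # log-sum-exp trick for numerical stability
    log_denominator = shift + math.log(sum(math.exp(u - shift) for u in us))
    return log_denominator - us[chosen_index]


# One hypothetical choice situation with four items (illustrative values only).
situation = [
    {"cost": 60.0, "freq": 4.0, "ovt": 50.0, "ivt": 60.0},
    {"cost": 30.0, "freq": 3.0, "ovt": 40.0, "ivt": 300.0},
    {"cost": 70.0, "freq": 0.0, "ovt": 0.0, "ivt": 250.0},
    {"cost": 50.0, "freq": 4.0, "ovt": 60.0, "ivt": 200.0},
]
print(negative_log_likelihood(income=45.0, items_features=situation, chosen_index=2))
```

Averaging this quantity over all choice situations gives a value comparable to the average negative log-likelihood printed in the example above.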

:trident: Documentation

Detailed documentation of this project is available here.
TensorFlow also has extensive documentation that can help you.

:trident: Contributing

You are welcome to contribute to the project! You can help in various ways:

  • raise issues
  • resolve issues already opened
  • develop new features
  • provide additional examples of use
  • fix typos, improve code quality
  • develop new tests

We recommend first opening an issue to discuss your ideas. More details are given here.

:trident: Citation

If you find this package or any of its features useful for your research, please cite us.

License

This software is released under the MIT license, with no limitation on usage, including for commercial applications.

Affiliations

Choice-Learn has been developed through a collaboration between researchers at the Artefact Research Center and the MICS laboratory at CentraleSupélec, Université Paris-Saclay.

   

           

:trident: References

Papers

[1] Representing Random Utility Choice Models with Neural Networks, Aouad, A.; Désir, A. (2022)
[2] The Acceptance of Model Innovation: The Case of Swissmetro, Bierlaire, M.; Axhausen, K. W.; Abay, G. (2001)
[3] Applications and Interpretation of Nested Logit Models of Intercity Mode Choice, Forinash, C. V.; Koppelman, F. S. (1993)
[4] The Demand for Local Telephone Service: A Fully Discrete Model of Residential Calling Patterns and Service Choices, Train, K. E.; McFadden, D. L.; Ben-Akiva, M. (1987)
[5] Estimation of Travel Choice Models with Randomly Distributed Values of Time, Ben-Akiva, M.; Bolduc, D.; Bradley, M. (1993)
[6] Personalize Expedia Hotel Searches - ICDM 2013, Ben Hamner, A.; Friedman, D.; SSA_Expedia. (2013)
[7] A Neural-embedded Discrete Choice Model: Learning Taste Representation with Strengthened Interpretability, Han, Y.; Camara Pereira, F.; Ben-Akiva, M.; Zegras, C. (2020)
[8] A branch-and-cut algorithm for the latent-class logit assortment problem, Méndez-Díaz, I.; Miranda-Bront, J. J.; Vulcano, G.; Zabala, P. (2014)
[9] Stated Preferences for Car Choice in Mixed MNL models for discrete response, McFadden, D.; Train, K. (2000)
[10] Modeling the Choice of Residential Location, McFadden, D. (1978)

Code and Repositories

Official models implementations:

[1] RUMnet
[7] TasteNet [Repo1] [Repo2]

Other choice modeling packages:

