
A Python tensor-based package for deep neural net assisted discrete choice modelling.


PyCMTensor


A tensor-based choice modelling Python package built on deep learning libraries. Read the documentation at https://pycmtensor.readthedocs.io

About PyCMTensor

PyCMTensor is a discrete choice model development platform designed with deep learning in mind, enabling users to write more complex models using neural networks. PyCMTensor is built on Aesara, a tensor library, and uses many features commonly found in deep learning packages such as TensorFlow and Keras. Aesara was chosen as the backend mathematical library because of its hackable, open-source nature. Users of Biogeme will be familiar with the syntax of PyCMTensor.

This package allows one to incorporate neural networks into discrete choice models, boosting the accuracy of model estimates while still producing all of the statistical analyses found in traditional choice modelling software.

PyCMTensor aims to provide developers and researchers with deep learning tools for econometric and travel behaviour modelling with reproducible and interpretable results.

PyCMTensor and Biogeme

PyCMTensor improves on Biogeme in situations where much more complex models are necessary, for example, when integrating neural networks into discrete choice models. PyCMTensor also includes the ability to estimate models using first-order stochastic gradient descent methods by default, such as Nesterov Accelerated Gradient, Adam, or RMSProp.
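
For instance, the optimizer is selected by passing it to the training call. The following is a minimal sketch using Adam; the model and database objects are assumed to have been set up as in the Usage section further down this page.

import pycmtensor as cmt
from pycmtensor.optimizers import Adam  # other first-order optimizers are imported from the same module

# `mymodel` and `db` are assumed to be a PyCMTensor model and Database, as in the Usage section below
model = cmt.train(model=mymodel, database=db, optimizer=Adam)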

Features

  • Estimate complex choice models with neural networks using deep learning algorithms
  • Combines traditional econometric models (e.g. Multinomial Logit) with deep learning models (e.g. ResNets)
  • Shares similar programming syntax with Biogeme, allowing easy transition between Biogeme and PyCMTensor models
  • Uses tensor-based mathematical operations and the advanced features of the Aesara library

Pre-install

To install PyCMTensor, you need Conda (the full Anaconda works fine, but Miniconda is recommended for a minimal installation). Ensure that Conda is using at least Python 3.9.

Once Conda is installed, install the required dependencies from conda by running the following command in your terminal:

Windows

conda install mkl-service conda-forge::cxx-compiler conda-forge::m2w64-toolchain -y

Linux

conda install mkl-service conda-forge::cxx-compiler

Mac OSX

conda install mkl-service Clang

Install PyCMTensor

Then, run this command in your terminal to download and install the latest release of PyCMTensor from PyPI:

pip install pycmtensor -U
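
To check that the installation worked, you can print the installed version from Python (assuming the package exposes a __version__ attribute, as most PyPI packages do):

import pycmtensor
print(pycmtensor.__version__)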

Optional: If you want the development version from the GitHub repository:

pip install git+https://github.com/mwong009/pycmtensor.git@develop -U

The development branch is the most recent update of PyCMTensor. If you want the stable branch (master), remove @develop from the end of the .git URL.
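
That is, to install the stable branch directly from the repository:

pip install git+https://github.com/mwong009/pycmtensor.git -U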

Usage

PyCMTensor uses syntax very similar to Biogeme, so Biogeme users should find the model specification familiar.

Start an interactive session (IPython or Jupyter Notebook) and import:

import pycmtensor as cmt

Several submodules should also be imported:

from pycmtensor.expressions import Beta # Beta class for model parameters
from pycmtensor.models import MNLogit   # model library
from pycmtensor.optimizers import Adam  # Optimizers
from pycmtensor.results import Results  # for generating results

For a full list of submodules and their descriptions, refer to the API Reference.

Development

To set up PyCMTensor in a local development environment, create a virtual environment and install the project requirements. Follow the instructions to install Conda (Miniconda), then create a new virtual environment from the provided environment_<your OS>.yml file.

For example, on Windows:

conda env create -f environment_windows.yml

Next, activate the virtual environment and install poetry via pip.

conda activate pycmtensor-dev
pip install poetry

Lastly, install the project and development dependencies:

poetry install -E dev

The virtual environment needs to be activated, and commits should be made from within the virtual environment.
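
With the environment activated, the test suite can be run through Poetry. This is a minimal sketch that assumes the development dependencies include pytest:

poetry run pytest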

Simple example: Swissmetro dataset

Using the swissmetro dataset from Biogeme, we define a simple MNL model.

Note: The following is a replication of the results from Biogeme using the Adam optimization algorithm.

  1. Import the dataset and perform some data cleaning

    import pandas as pd  # pandas is needed to read the dataset

    swissmetro = pd.read_csv("data/swissmetro.dat", sep="\t")
    db = cmt.Database(name="swissmetro", pandasDatabase=swissmetro, choiceVar="CHOICE")
    globals().update(db.variables)
    # Removing some observations
    db.data.drop(db.data[db.data["CHOICE"] == 0].index, inplace=True)
    db.data["CHOICE"] -= 1  # set the first choice index to 0
    db.choices = [0, 1, 2]
    db.autoscale(
        variables=["TRAIN_CO", "TRAIN_TT", "CAR_CO", "CAR_TT", "SM_CO", "SM_TT"],
        default=100.0,
        verbose=False,
    )
    

    cmt.Database() loads the dataset and defines tensor variables automatically.

  2. Initialize the model parameters and specify the utility functions and availability conditions

    b_cost = Beta("b_cost", 0.0, None, None, 0)
    b_time = Beta("b_time", 0.0, None, None, 0)
    asc_train = Beta("asc_train", 0.0, None, None, 0)
    asc_car = Beta("asc_car", 0.0, None, None, 0)
    asc_sm = Beta("asc_sm", 0.0, None, None, 1)

    U_1 = b_cost * db["TRAIN_CO"] + b_time * db["TRAIN_TT"] + asc_train
    U_2 = b_cost * db["SM_CO"] + b_time * db["SM_TT"] + asc_sm
    U_3 = b_cost * db["CAR_CO"] + b_time * db["CAR_TT"] + asc_car

    # Collect the utilities and availability conditions used by MNLogit in the next step.
    # The list form and the *_AV column names are assumed from the Biogeme swissmetro data.
    U = [U_1, U_2, U_3]
    AV = [db["TRAIN_AV"], db["SM_AV"], db["CAR_AV"]]
    
  3. Define the Multinomial Logit model

    mymodel = MNLogit(u=U, av=AV, database=db, name="Multinomial Logit")
    mymodel.add_params(locals()) # load Betas into the model
    
  4. (optional) Define the model hyperparameters

    mymodel.config["patience"] = 9000
    mymodel.config["max_epoch"] = 500
    mymodel.config["base_lr"] = 0.0012
    mymodel.config["max_lr"] = 0.002
    mymodel.config["learning_scheduler"] = "ConstantLR"
    
  5. Call the training function and save the trained model

    model = cmt.train(model=mymodel, database=db, optimizer=Adam)  # we use the Adam Optimizer
    
  6. Generate the statistics and correlation matrices

    results = Results(model, db, prnt=False)
    print(results)
    results.generate_beta_statistics()
    results.print_beta_statistics()
    results.print_correlation_matrix()
    

    Sample output:

     Python 3.10.4 | packaged by conda-forge | (main, Mar 30 2022, 08:38:02) [MSC v.1916 64 bit (AMD64)]
     [2022-08-12 18:51:40] INFO: Building model...
     [2022-08-12 18:51:52] INFO: Training model...
     [2022-08-12 18:51:55] INFO: Maximum iterations reached. Terminating...
     [2022-08-12 18:51:55] INFO: Optimization complete with accuracy of 61.937%.
     [2022-08-12 18:51:55] INFO: Max log likelihood reached @ epoch 57.
    
     Results
     ------
     Model: Multinomial Logit
     Build time: 00:00:12
     Estimation time: 00:00:03
     Estimation rate: 3400.838 iter/s
     Seed value: 7577
     Number of Beta parameters: 4
     Sample size: 10719
     Excluded data: None
     Init loglikelihood: -11093.627
     Final loglikelihood: -9165.567
     Final loglikelihood reached at: epoch 57
     Likelihood ratio test: 3856.120
     Accuracy: 61.937%
     Rho square: 0.174
     Rho bar square: 0.173
     Akaike Information Criterion: 18339.13
     Bayesian Information Criterion: 18368.25
     Final gradient norm: 0.121
    
     Model statistics:
                   Value   Std err     t-test   p-value Rob. Std err Rob. t-test Rob. p-value
     asc_car    0.013287  0.030614   0.434002  0.664287     0.159125    0.083498     0.933456
     asc_train -0.537674  0.037544 -14.321085       0.0     0.014821  -36.278684          0.0
     b_cost     0.021882  0.002227   9.824814       0.0     0.005462     4.00618     0.000062
     b_time    -0.604866  0.035116 -17.224787       0.0     0.514255   -1.176199     0.239515
    
     Correlation matrix:
                  b_cost    b_time  asc_train   asc_car
     b_cost     1.000000 -0.092697   0.171935  0.269662
     b_time    -0.092697  1.000000  -0.710780 -0.596636
     asc_train  0.171935 -0.710780   1.000000  0.603376
     asc_car    0.269662 -0.596636   0.603376  1.000000
    
  7. Plot the training performance and accuracy

  8. Compute the elasticities (an illustrative sketch follows after step 9)

  9. Visualize the computation graph

    import aesara.d3viz as d3v
    from aesara import printing
    printing.pydotprint(mymodel.cost, "graph.png")
    
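
The elasticity computation in step 8 is not reproduced here. As an illustration of what it rests on, the following standalone numpy sketch computes the standard MNL direct point elasticity, E = beta_k * x_ik * (1 - P_i), using the estimated time coefficient from the results above and hypothetical utility and attribute values:

import numpy as np

b_time = -0.605                        # estimated time coefficient (from the results above)
V = np.array([-1.2, -0.8, -1.5])       # hypothetical systematic utilities for one observation
x_time = np.array([1.12, 0.63, 1.17])  # hypothetical (scaled) travel times per alternative

P = np.exp(V) / np.exp(V).sum()          # multinomial logit choice probabilities
elasticity = b_time * x_time * (1 - P)   # direct elasticity of each P_i w.r.t. its own travel time
print(elasticity)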


Credits

PyCMTensor was inspired by Biogeme and aims to provide deep learning modelling tools for transport modellers and researchers.

This package template was generated with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

