A package that implements Marginal Distribution Models (MDMs)
MDM Py
This package is a Python implementation of Marginal Distribution Models (MDMs), which can be used in Discrete Choice Modelling.
Documentation
Documentation is kindly hosted by Read The Docs.
Install
This package is available on PyPI, so it can be installed with pip:

    pip install mdmpy
How to use
Simplest Case
Gradient Descent
In the simplest case, we use the Multinomial Logit (MNL) model, which is the default. Assuming numpy, scipy and pandas are installed, we generate choice data from a random utility model:

    from string import ascii_uppercase as letters

    import pandas as pd
    import scipy.stats as stats
    import numpy as np

    NUM_INDIV = 57
    NUM_CHOICES = 3
    NUM_ATTR = 4

    np.random.seed(2019)
    # One column of attributes per (individual, alternative) pair
    X = np.random.random((NUM_ATTR, NUM_INDIV * NUM_CHOICES))
    true_beta = np.random.random(NUM_ATTR)

    # Deterministic utilities, one row per individual
    V = np.dot(true_beta.T, X)
    V = np.reshape(V, (NUM_INDIV, NUM_CHOICES))

    # Gumbel noise gives the random part of the utility
    eps = stats.gumbel_r.rvs(size=NUM_INDIV * NUM_CHOICES)
    eps = np.reshape(eps, (NUM_INDIV, NUM_CHOICES))
    U = V + eps
    highest_util = np.argmax(U, 1)

    # Long-format dataframe: one row per (individual, alternative) pair
    df = pd.DataFrame(X.T)
    df['choice'] = [1 if idx == x else 0 for idx in highest_util for x in range(NUM_CHOICES)]
    df['individual'] = [indiv for indiv in range(NUM_INDIV) for _ in range(NUM_CHOICES)]
    df['altvar'] = [altlvl for _ in range(NUM_INDIV) for altlvl in letters[:NUM_CHOICES]]

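Because the noise terms are independent standard Gumbel draws, the choice probabilities implied by this random utility model have the closed-form MNL (softmax) expression. The sketch below is not part of mdmpy: `mnl_probabilities` is a hypothetical helper, and the utilities in `V_demo` are made up for illustration. It compares the closed form against simulated choice frequencies.

```python
import numpy as np

def mnl_probabilities(V):
    """Row-wise softmax: P(i chooses k) = exp(V_ik) / sum_j exp(V_ij)."""
    shifted = V - V.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    exp_v = np.exp(shifted)
    return exp_v / exp_v.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
V_demo = np.array([[0.5, 1.0, 0.0]])  # made-up utilities for a single individual
probs = mnl_probabilities(V_demo)

# Simulate many Gumbel draws and count how often each alternative has the highest utility
n_draws = 100_000
draws = rng.gumbel(size=(n_draws, 3))
simulated = np.bincount(np.argmax(V_demo + draws, axis=1), minlength=3) / n_draws

print(probs.round(3))      # closed-form MNL probabilities
print(simulated.round(3))  # simulated frequencies, which should be close to the above
```

The two printed vectors should agree up to simulation noise, which is why Gumbel noise is the natural choice for generating MNL test data.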
Here we treat df as a dataframe that has simply been given to us. Rather than having the code infer how many individuals, choices and coefficients (attributes) there are, we pass this information to the class directly. To perform gradient descent, call the grad_desc method on an MDM object built from the df above:

    import numpy as np
    import mdmpy

    # In a typical case one would load df before this line
    mdm = mdmpy.MDM(df, 4, 3, [0, 1, 2, 3])
    np.random.seed(4)
    init_beta = np.random.random(4)
    grad_beta = mdm.grad_desc(init_beta)
    print(grad_beta)
    # expected output [0.30238122 0.07955214 0.86779824 0.50951981]

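To give a feel for what a gradient-based fit of the MNL model involves, here is a minimal plain-NumPy sketch. It is not the implementation behind grad_desc; the function name `mnl_neg_loglik_and_grad`, the fixed step size, the iteration count and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def mnl_neg_loglik_and_grad(beta, X, y):
    """Mean negative log-likelihood of an MNL model and its gradient.

    X has shape (n_indiv, n_choices, n_attr); y holds the index of the chosen alternative.
    """
    n = X.shape[0]
    V = X @ beta                                  # utilities, shape (n_indiv, n_choices)
    V = V - V.max(axis=1, keepdims=True)          # stabilise the exponentials
    P = np.exp(V)
    P /= P.sum(axis=1, keepdims=True)             # MNL choice probabilities
    nll = -np.log(P[np.arange(n), y]).mean()
    Y = np.zeros_like(P)
    Y[np.arange(n), y] = 1.0                      # one-hot encoding of observed choices
    grad = np.einsum("nk,nka->a", P - Y, X) / n   # sum_k (P_k - Y_k) x_k, averaged over individuals
    return nll, grad

# Synthetic data from a random utility model with Gumbel noise
rng = np.random.default_rng(2019)
X = rng.random((500, 3, 4))
true_beta = rng.random(4)
y = (X @ true_beta + rng.gumbel(size=(500, 3))).argmax(axis=1)

beta = np.zeros(4)
start_nll, _ = mnl_neg_loglik_and_grad(beta, X, y)
for _ in range(2000):
    nll, grad = mnl_neg_loglik_and_grad(beta, X, y)
    beta -= 1.0 * grad                            # fixed step size, chosen by hand for this data

print(beta, nll)
```

The MNL log-likelihood is concave in beta, so a small enough fixed step converges to the maximum likelihood estimate. With only a few hundred individuals, that estimate can still differ noticeably from true_beta.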
Solver
The MDM class also acts as a wrapper that adds the necessary pyomo variables and sets to model the problem, but it requires an external solver. IPOPT, an interior point solver, is recommended. Assuming IPOPT is installed and df is the dataframe from above:

    import mdmpy

    ipopt_exec_path = "/path/to/ipopt"  # Replace with proper path
    mdm = mdmpy.MDM(df, 4, 3, [0, 1, 2, 3])
    mdm.model_init()
    mdm.model_solve("ipopt", ipopt_exec_path)
    print([mdm.m.beta[idx].value for idx in mdm.m.beta])
    # expected output [0.30238834989235025, 0.07953888508425154, 0.8678050334295714, 0.5095096796373667]

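If you are unsure where the solver lives, the standard library can look it up. The sketch below assumes the executable is named `ipopt` and is on your PATH; `find_solver` is a hypothetical helper, not part of mdmpy.

```python
import shutil

def find_solver(name="ipopt"):
    """Return the absolute path of the solver executable, or None if it is not on PATH."""
    return shutil.which(name)

ipopt_exec_path = find_solver()
if ipopt_exec_path is None:
    print("ipopt not found on PATH; install it or set the path manually")
else:
    print("Using IPOPT at", ipopt_exec_path)
```

The resulting path can then be passed as the second argument of model_solve, as in the example above.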
Todo
- Add documentation and more meaningful comments.
- Add more type hints, especially those involving Python builtins.
- Add tests.
- Put pandas into extras_require of setup.py, and remove the dependency.
  - Input of the MDM class will become a NumPy array rather than a dataframe.
  - Dataframe conversion will be turned into a utility function, likely using a try-except block for imports.
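One possible shape for the planned conversion utility is sketched below. This is purely illustrative and does not reflect any existing mdmpy function: `to_numpy_input` is a hypothetical name, and treating pandas as optional through a try-except import is an assumption based on the Todo item above.

```python
import numpy as np

try:
    import pandas as pd
except ImportError:  # pandas is treated as an optional dependency in this sketch
    pd = None

def to_numpy_input(data):
    """Convert a dataframe (if pandas is available) or any array-like to a NumPy array."""
    if pd is not None and isinstance(data, pd.DataFrame):
        return data.to_numpy()
    return np.asarray(data)

print(to_numpy_input([[1, 2], [3, 4]]).shape)
```

With this pattern the core class only ever sees NumPy arrays, and users without pandas installed can still pass lists or arrays.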
