Lightweight Implementation of Athey et al. (2021)'s MC-NNM estimator

lightweight-mcnnm

lightweight-mcnnm is a Python package that provides a lightweight and performant implementation of the Matrix Completion with Nuclear Norm Minimization (MC-NNM) estimator for causal inference in panel data settings.

What is lightweight-mcnnm

lightweight-mcnnm implements the MC-NNM estimator exactly as described in "Matrix Completion Methods for Causal Panel Data Models" by Susan Athey, Mohsen Bayati, Nikolay Doudchenko, Guido Imbens, and Khashayar Khosravi (2021). This estimator provides a powerful tool for estimating causal effects in panel data settings, particularly when dealing with complex treatment patterns and potential confounders.

The implementation focuses on performance and minimal dependencies, making it suitable for use in various environments, including GPUs and cloud clusters.
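To give a feel for the core idea, here is a minimal JAX sketch of nuclear-norm-regularized matrix completion by repeatedly soft-thresholding singular values. It is an illustration only, not the code used inside lightweight-mcnnm: the function names (shrink_singular_values, soft_impute), the fixed iteration count, and the raw threshold lam are assumptions made for this example, and fixed effects, covariates, and the paper's threshold scaling are all omitted.

```python
import jax.numpy as jnp


def shrink_singular_values(A, threshold):
    """Soft-threshold the singular values of A by `threshold`."""
    U, s, Vt = jnp.linalg.svd(A, full_matrices=False)
    s_shrunk = jnp.maximum(s - threshold, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt


def soft_impute(Y, observed, lam, n_iter=100):
    """Estimate a low-rank matrix L from the observed entries of Y.

    observed: 1.0 where Y is observed (untreated), 0.0 where it must be imputed.
    lam: hypothetical regularization strength used directly as the threshold.
    """
    L = jnp.zeros_like(Y)
    for _ in range(n_iter):
        # Keep observed outcomes, fill the remaining cells with the current estimate.
        filled = observed * Y + (1.0 - observed) * L
        L = shrink_singular_values(filled, lam)
    return L


# Toy rank-1 panel with one cell treated as unobserved.
Y = jnp.outer(jnp.arange(1.0, 7.0), jnp.arange(1.0, 6.0))
observed = jnp.ones_like(Y).at[5, 4].set(0.0)
L_hat = soft_impute(Y, observed, lam=0.5)
```

Larger values of lam shrink more singular values to zero and therefore yield a lower-rank estimate; this is the trade-off that the package's lambda_L selection addresses.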

Features

  • Lightweight implementation with minimal dependencies
  • Utilizes JAX for improved performance and GPU compatibility
  • Faithful to the original MC-NNM algorithm as described in Athey et al. (2021)
  • Suitable for large-scale panel data analysis
  • Supports various treatment assignment mechanisms
  • Includes unit-specific, time-specific, and unit-time specific covariates
  • Offers flexible validation methods for parameter selection

Comparison to Other Implementations

lightweight-mcnnm is designed to be lightweight and easy to use, with a focus on performance and minimal dependencies. The other two main implementations of the MC-NNM estimator are CausalTensor and fect. Both packages implement MC-NNM as part of a broader set of causal inference methods. Both implement covariates and cross-validation differently from this package. For a detailed comparison, see this notebook:

Installation

Requirements

lightweight-mcnnm is compatible with Python 3.10 or later and depends on JAX and NumPy. CUDA-compatible versions of JAX are not currently supported directly by lightweight-mcnnm, but you can use JAX with CUDA support by installing it separately.

Installing from PyPI

The simplest way to install lightweight-mcnnm and its dependencies is from PyPI using pip:

pip install lightweight-mcnnm

To upgrade lightweight-mcnnm to the latest version, use:

pip install --upgrade lightweight-mcnnm

JIT Compilation

By default, this package uses JAX's JIT compilation for better performance in typical use cases. If you want to disable JIT compilation, you can add the following line at the top of your script:

import jax
jax.config.update('jax_disable_jit', True)

Note that disabling JIT changes performance, usually for the worse. I have found leaving JIT enabled to be the best option for most use cases. One case where disabling JIT may be sensible is calling estimate() many times on datasets of different sizes, because every change in the shape of the input data triggers recompilation.
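The recompilation behaviour can be observed with plain JAX, independently of lightweight-mcnnm. In the sketch below, column_means and trace_log are hypothetical names used only for illustration. A Python side effect inside a jitted function runs only while JAX traces it, so the log records one entry per distinct input shape rather than one per call.

```python
import jax
import jax.numpy as jnp

trace_log = []  # filled only while JAX traces the function


@jax.jit
def column_means(Y):
    # This append is a Python side effect: it runs at trace time,
    # not on calls that reuse an already compiled version.
    trace_log.append(Y.shape)
    return jnp.mean(Y, axis=0)


column_means(jnp.ones((50, 10)))   # new shape: traced and compiled
column_means(jnp.zeros((50, 10)))  # same shape and dtype: compiled version reused
column_means(jnp.ones((80, 10)))   # new shape: traced and compiled again

print(trace_log)
```

If your workflow loops over many panels with different dimensions, each new shape pays the compilation cost once, which is the situation in which disabling JIT can be worth testing.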

Documentation

The full documentation for lightweight-mcnnm is available at: https://mcnnm.readthedocs.io/en/latest/

Using lightweight-mcnnm

  1. A comprehensive example is available here:
  2. A simple example of how to use lightweight-mcnnm:
import jax.numpy as jnp
from lightweight_mcnnm import estimate, generate_data

Y, W, X, Z, V, true_params = generate_data(
    nobs=50,
    nperiods=10,
    unit_fe=True,
    time_fe=True,
    X_cov=True,
    Z_cov=True,
    V_cov=True,
    seed=2024,
    noise_scale=0.1,
    assignment_mechanism="staggered",
    treatment_probability=0.1,
)

# Run estimation
results = estimate(
    Y=Y,
    Mask=W,
    X=X,
    Z=Z,
    V=V,
    Omega=None,
    use_unit_fe=True,
    use_time_fe=True,
    lambda_L=None,
    lambda_H=None,
    validation_method='cv',
    K=3,
    n_lambda=30,
)

print(f"\nTrue effect: {true_params['treatment_effect']}, Estimated effect: {results.tau:.3f}")
print(f"Chosen lambda_L: {results.lambda_L:.4f}, lambda_H: {results.lambda_H:.4f}")
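
To clarify what an estimated effect of this kind measures, the sketch below averages the gap between observed outcomes and estimated untreated outcomes over the treated cells. It assumes that W marks treated cells with 1 and that a matrix of estimated untreated outcomes is available; average_effect_on_treated and Y0_hat are hypothetical names for this illustration and are not part of the lightweight-mcnnm API.

```python
import jax.numpy as jnp


def average_effect_on_treated(Y, W, Y0_hat):
    """Mean of Y - Y0_hat over the cells where W == 1."""
    treated = W == 1
    # Zero out untreated cells so they do not contribute to the sum.
    gaps = jnp.where(treated, Y - Y0_hat, 0.0)
    return jnp.sum(gaps) / jnp.sum(treated)


# Toy 2x2 panel: only the last cell is treated.
Y = jnp.array([[1.0, 2.0], [3.0, 5.0]])
W = jnp.array([[0, 0], [0, 1]])
Y0_hat = jnp.array([[1.0, 2.0], [3.0, 4.0]])  # assumed untreated outcomes

tau_hat = average_effect_on_treated(Y, W, Y0_hat)
```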

For more detailed usage instructions and examples, please refer to the documentation.

Development

Setting up the development environment

This project uses Poetry for dependency management. To set up your development environment:

  1. Ensure you have Poetry installed. If not, install it by following the instructions on the official Poetry website.

  2. Clone the repository:

    git clone https://github.com/tobias-schnabel/mcnnm.git
    cd mcnnm
    
  3. Install the project dependencies:

    poetry install
    

    This command creates a virtual environment and installs all the necessary dependencies.

  4. Activate the virtual environment:

    poetry shell
    

Now you're ready to start developing!

Testing and building the package

  1. Running tests: use the following command:

    poetry run pytest
    
  2. Coverage: to generate a coverage report, run the following command:

    poetry run coverage report
    

    This will generate a coverage report showing the percentage of code covered by the tests.

  3. Building the package: run the following command:

    poetry build
    

    This will create both wheel and source distributions in the dist/ directory.

Development Workflow

Pre-commit Hooks

This project uses pre-commit hooks to ensure code quality and consistency. Pre-commit hooks are scripts that run automatically every time you commit changes to your version control system. They help catch common issues before they get into the codebase. To set up:

  1. Install pre-commit:
    poetry add pre-commit
    
  2. Install the hooks:
    poetry run pre-commit install
    
  3. Run the hooks on all files (recommended for the first setup):
    poetry run pre-commit run --all-files
    

The configuration for the pre-commit hooks can be found in the .pre-commit-config.yaml file. The following hooks are configured:

  • Trailing whitespace removal: Ensures no trailing whitespace is left in the code.
  • End-of-file fixer: Ensures files end with a newline.
  • YAML check: Validates YAML files.
  • Flake8: Checks code against the Python style guide.
  • Black: Ensures consistent code formatting.
  • Bandit: Checks for common security issues in Python code.
  • Mypy: Performs static type checking.

Branch Protection

To maintain the integrity of the main branch, branch protection rules are enforced. These rules ensure that all changes to the main branch go through a review process and pass all required checks.

Protected Branch Rules

  1. Require pull request reviews before merging: At least one approval from an administrator is required.
  2. Require status checks to pass before merging: All CI checks must be successful before merging.

References

This implementation is based on the method described in: Athey, S., Bayati, M., Doudchenko, N., Imbens, G., & Khosravi, K. (2021). Matrix Completion Methods for Causal Panel Data Models. Journal of the American Statistical Association, 116(536), 1716-1730.

Acknowledgements

This project was inspired by and draws upon ideas from CausalTensor and fect. I am grateful for their contributions to the field of causal inference.

Citing lightweight-mcnnm

If you use lightweight-mcnnm in your research, please cite both the software and the original paper describing the method:

For the software: Schnabel, T. (2024). lightweight-mcnnm: A Python package for Matrix Completion with Nuclear Norm Minimization. https://github.com/tobias-schnabel/mcnnm

For the method: Athey, S., Bayati, M., Doudchenko, N., Imbens, G., & Khosravi, K. (2021). Matrix Completion Methods for Causal Panel Data Models. Journal of the American Statistical Association, 116(536), 1716-1730.

BibTeX entries:

@software{schnabel2024lightweightmcnnm,
  author = {Schnabel, Tobias},
  title = {lightweight-mcnnm: A Python package for Matrix Completion with Nuclear Norm Minimization},
  year = {2024},
  url = {https://github.com/tobias-schnabel/mcnnm}
}

@article{athey2021matrix,
  title = {Matrix completion methods for causal panel data models},
  author = {Athey, Susan and Bayati, Mohsen and Doudchenko, Nikolay and Imbens, Guido and Khosravi, Khashayar},
  journal = {Journal of the American Statistical Association},
  volume = {116},
  number = {536},
  pages = {1716--1730},
  year = {2021},
  publisher = {Taylor & Francis}
}

License

lightweight-mcnnm is released under the GNU General Public License v3.0. See the LICENSE file for more details.

Changelog, Contributing, and Templates

  1. For a detailed changelog of each release, please see the GitHub Releases page.
  2. Please refer to CONTRIBUTING.md for guidelines on how to contribute to this project.
  3. For reporting issues, please use the template provided in ISSUE_TEMPLATE.md.
  4. For submitting pull requests, please use the template provided in PULL_REQUEST_TEMPLATE.md.
