
PyMarkup

A Python toolkit for estimating firm-level markups using production function-based marginal cost recovery.

Installation

git clone https://github.com/immortalsRDJ/PyMarkup
cd PyMarkup
uv sync

For WRDS data downloads, add the wrds extra:

uv sync --extra wrds

Quick Start

Option 1: Command Line (Recommended)

# 1. Set up config file
cp config.example.yaml config.yaml
# Edit config.yaml with your API keys and settings

# 2. Run the full pipeline
uv run pymarkup run-all --config config.yaml

# Or skip data download if you already have the data
uv run pymarkup run-all --config config.yaml --skip-download

Option 2: Python Script

from PyMarkup import MarkupPipeline, PipelineConfig, EstimatorConfig

config = PipelineConfig(
    compustat_path="Input/DLEU/Compustat_annual.csv",
    macro_vars_path="Input/DLEU/macro_vars_new.xlsx",
    estimator=EstimatorConfig(method="wooldridge_iv"),
)

pipeline = MarkupPipeline(config)
results = pipeline.run()
results.save(output_dir="Output/", format="csv")

Command Line Reference

Full Pipeline

# Run everything (download + estimate + figures)
uv run pymarkup run-all --config config.yaml

# Skip all downloads (use existing data)
uv run pymarkup run-all --config config.yaml --skip-download

# Skip only Compustat download (no WRDS credentials needed)
uv run pymarkup run-all --config config.yaml --skip-compustat

# Skip figure generation
uv run pymarkup run-all --config config.yaml --no-figures

# Verbose output for debugging
uv run pymarkup run-all --config config.yaml -v

Individual Commands

# Download data only
uv run pymarkup download ppi                        # PPI (no credentials needed)
uv run pymarkup download cpi --config config.yaml   # CPI (needs FRED API key)
uv run pymarkup download all --config config.yaml   # All datasets

# Run estimation only (requires existing data)
uv run pymarkup estimate --config config.yaml

# Validate input data
uv run pymarkup validate Input/DLEU/Compustat_annual.csv

# Check version
uv run pymarkup version

Configuration

Setting Up Credentials

  1. Copy the example config file:

    cp config.example.yaml config.yaml
    
  2. Edit config.yaml with your credentials:

    fred_api_key: "your-fred-api-key"
    wrds_username: "your-wrds-username"
    

Alternatively, set environment variables: FRED_API_KEY, WRDS_USERNAME
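
In a POSIX shell that looks like the following (the variable names mirror the config keys above; the values are placeholders, not real credentials):

```shell
# Placeholders only; substitute your actual credentials
export FRED_API_KEY="your-fred-api-key"
export WRDS_USERNAME="your-wrds-username"
```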

Data Requirements

| Data Source | Credentials | How to Get |
|---|---|---|
| Compustat (WRDS) | WRDS account | Register at WRDS |
| CPI (FRED) | FRED API key | Free at FRED |
| PPI (BLS) | None | Public data from BLS |
| Macro variables | N/A | Included in repo: Input/DLEU/macro_vars_new.xlsx |
| NAICS descriptions | N/A | Included in repo: Input/Other/NAICS_2D_Description.xlsx |
| DEU observations | N/A | Optional: Original DLEU paper firm-year sample (see below) |

Pipeline Overview

Download -> Data Preparation -> Elasticity Estimation -> Markup Calculation -> Figures & Decomposition

1. Data Download

Downloads raw data from external sources:

from PyMarkup.data import download_compustat, download_cpi, download_ppi, load_config

config = load_config("config.yaml")
download_ppi(config)        # No credentials needed
download_cpi(config)        # Requires FRED API key
download_compustat(config)  # Requires WRDS credentials

Data Sources:

  • PPI: Bureau of Labor Statistics Producer Price Index data from https://download.bls.gov/pub/time.series/pc/
  • CPI: Federal Reserve Economic Data (FRED) Consumer Price Index
  • Compustat: WRDS Compustat Fundamentals Annual/Quarterly

2. Data Preparation

Cleans and prepares the Compustat panel:

  • Deduplicates firm-year observations
  • Extracts NAICS industry codes
  • Deflates monetary values by GDP
  • Computes market shares
  • Trims outliers
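
The deduplication, NAICS extraction, and market-share steps can be sketched with pandas. The column names below (gvkey, fyear, naics, sale) are illustrative, not PyMarkup's internal schema:

```python
import pandas as pd

# Toy Compustat-style panel; column names are illustrative only
df = pd.DataFrame({
    "gvkey": [1, 1, 1, 2],
    "fyear": [2000, 2000, 2001, 2001],   # gvkey 1 has a duplicate 2000 row
    "naics": ["336111", "336111", "336111", "511210"],
    "sale":  [100.0, 100.0, 120.0, 80.0],
})

# 1. Deduplicate firm-year observations
df = df.drop_duplicates(subset=["gvkey", "fyear"])

# 2. Extract the 2-digit NAICS industry code
df["naics2"] = df["naics"].str[:2]

# 3. Within-year market shares from (deflated) sales
df["mshare"] = df["sale"] / df.groupby("fyear")["sale"].transform("sum")
```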

3. Elasticity Estimation

Estimates output elasticity of variable inputs (θ) at the industry-year level:

| Method | Class | Use Case |
|---|---|---|
| Wooldridge IV | WooldridgeIVEstimator | Main method, addresses endogeneity via IV/2SLS |
| Cost Share | CostShareEstimator | Fast baseline, no regression needed |
| ACF | ACFEstimator | Robustness, two-stage GMM with control function |

from PyMarkup.estimators import WooldridgeIVEstimator

estimator = WooldridgeIVEstimator(specification="spec2")
elasticities = estimator.estimate_elasticities(panel_data)

SG&A Configuration

All three estimators support including SG&A (Selling, General & Administrative expenses) as a third input in the production function:

| Estimator | Parameter | Options | Default |
|---|---|---|---|
| Wooldridge IV | specification | "spec1" (COGS+K), "spec2" (COGS+K+SG&A) | "spec2" |
| Cost Share | include_sga | True, False | False |
| ACF | include_sga | True, False | False |

from PyMarkup.estimators import ACFEstimator, CostShareEstimator, WooldridgeIVEstimator

# Wooldridge IV: use spec2 for 3-input (COGS + Capital + SG&A)
iv_est = WooldridgeIVEstimator(specification="spec2")

# Cost Share: include SG&A in cost share calculation
cs_est = CostShareEstimator(include_sga=True)

# ACF: include SG&A as third input
acf_est = ACFEstimator(include_sga=True)

Via pipeline config:

from PyMarkup import PipelineConfig, EstimatorConfig

config = PipelineConfig(
    compustat_path="Input/DLEU/Compustat_annual.csv",
    macro_vars_path="Input/DLEU/macro_vars_new.xlsx",
    estimator=EstimatorConfig(
        method="all",
        iv_specification="spec2",    # Wooldridge IV with SG&A
        cs_include_sga=True,         # Cost Share with SG&A
        acf_include_sga=True,        # ACF with SG&A
    ),
)

Aggregation Weights

When aggregating firm-level markups to industry or economy level, you can choose the weighting scheme:

| Weight Type | Formula | Use Case |
|---|---|---|
| "revenue" (default) | firm_revenue / total_revenue | Standard approach, larger firms weighted more |
| "cost" | firm_cogs / total_cogs | Weight by production scale |

from PyMarkup.core.markup_calculation import aggregate_markups

# Revenue-weighted aggregation (default)
agg = aggregate_markups(
    firm_markups, by="year", method="weighted_mean",
    weight_type="revenue", panel_data=panel_data
)

# Cost-weighted aggregation
agg = aggregate_markups(
    firm_markups, by="year", method="weighted_mean",
    weight_type="cost", panel_data=panel_data
)

Via pipeline config:

config = PipelineConfig(
    ...
    aggregation_weight="revenue",  # or "cost"
)

DEU Sample Filtering

To replicate the original De Loecker, Eeckhout, and Unger (2020) paper results, you can filter the Compustat data to only include the firm-year observations from the original study:

# config.yaml
use_deu_sample: true
deu_observations_path: "Input/DLEU/DEU_observations.dta"

Or via Python:

config = PipelineConfig(
    compustat_path="Input/DLEU/Compustat_annual.csv",
    macro_vars_path="Input/DLEU/macro_vars_new.xlsx",
    use_deu_sample=True,
    deu_observations_path="Input/DLEU/DEU_observations.dta",
    ...
)

When enabled, the pipeline performs an inner merge on gvkey and year to filter to the original DLEU sample (approximately 242,000 firm-year observations from 1955-2016).
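
The filtering step amounts to a pandas inner merge. The frames below are toy stand-ins for the real Compustat and DEU files:

```python
import pandas as pd

# Toy stand-ins for Compustat_annual.csv and DEU_observations.dta
compustat = pd.DataFrame({"gvkey": [1, 2, 3], "year": [1980, 1980, 1980],
                          "sale": [10.0, 20.0, 30.0]})
deu_obs = pd.DataFrame({"gvkey": [1, 3], "year": [1980, 1980]})

# Inner merge on gvkey and year keeps only firm-years in the original DLEU sample
filtered = compustat.merge(deu_obs, on=["gvkey", "year"], how="inner")
print(filtered["gvkey"].tolist())  # -> [1, 3]
```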

4. Markup Calculation

Computes firm-level markups using the De Loecker & Warzynski formula:

markup = θ / cost_share
where cost_share = COGS / Revenue
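
As a quick numeric check of the formula (illustrative numbers only):

```python
# De Loecker & Warzynski: markup = theta / cost_share, cost_share = COGS / Revenue
def markup(theta: float, cogs: float, revenue: float) -> float:
    cost_share = cogs / revenue
    return theta / cost_share

# theta = 0.85 with COGS 70 on revenue 100 gives cost share 0.7 and markup ~1.214
print(round(markup(0.85, 70.0, 100.0), 3))  # -> 1.214
```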

5. Figures

| Figure | Function | Description |
|---|---|---|
| Aggregate Markup | plot_aggregate_markup() | Time series of aggregate markups |
| PPI vs Markup | plot_markup_vs_ppi() | Scatter plot with weighted OLS regression |

6. Decomposition

Dynamic Olley-Pakes decomposition of aggregate markup changes (DLEU 2020). The decomposition runs automatically in the pipeline for Wooldridge IV and Cost Share methods.

Decomposes markup growth into three components:

| Component | Description |
|---|---|
| Within | Markup changes within continuing firms |
| Reallocation | Market share shifts toward high/low-markup firms |
| Net Entry | Difference between entering and exiting firms |

The three components sum to the total change in the aggregate (benchmark) markup: Within + Reallocation + Net Entry = ΔMarkup.
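
The adding-up identity can be verified on a toy two-period panel. This is a generic Melitz-Polanec-style dynamic Olley-Pakes sketch, not PyMarkup's implementation; all numbers are made up:

```python
import numpy as np

# Toy two-period panel: firm -> (market share, markup). All numbers are made up.
t1 = {"A": (0.5, 1.2), "B": (0.3, 1.4), "X": (0.2, 1.1)}  # X exits after period 1
t2 = {"A": (0.4, 1.3), "B": (0.4, 1.5), "E": (0.2, 1.6)}  # E enters in period 2

survivors = ["A", "B"]
s1 = np.array([t1[f][0] for f in survivors]); m1 = np.array([t1[f][1] for f in survivors])
s2 = np.array([t2[f][0] for f in survivors]); m2 = np.array([t2[f][1] for f in survivors])
w1, w2 = s1 / s1.sum(), s2 / s2.sum()   # shares renormalized within survivors

phi1 = sum(s * m for s, m in t1.values())   # aggregate markup, period 1
phi2 = sum(s * m for s, m in t2.values())   # aggregate markup, period 2

# Olley-Pakes split for survivors: weighted mean = unweighted mean + covariance
within = m2.mean() - m1.mean()
realloc = (w2 - w2.mean()) @ (m2 - m2.mean()) - (w1 - w1.mean()) @ (m1 - m1.mean())

# Net entry measured relative to the survivor aggregate in each period
phi_s1, phi_s2 = w1 @ m1, w2 @ m2
net_entry = t2["E"][0] * (t2["E"][1] - phi_s2) + t1["X"][0] * (phi_s1 - t1["X"][1])

total = phi2 - phi1
print(round(within, 3), round(realloc, 3), round(net_entry, 3), round(total, 3))
# -> 0.1 0.025 0.075 0.2  (within + reallocation + net entry = total)
```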

Output files:

| File | Description |
|---|---|
| Output/intermediate/decomposition_wooldridge_iv.csv | IV decomposition results |
| Output/intermediate/decomposition_cost_share.csv | Cost Share decomposition results |
| Output/figures/Decomposition - Wooldridge IV (YYYY-YYYY).pdf | IV decomposition figure |
| Output/figures/Decomposition - Cost Share (YYYY-YYYY).pdf | Cost Share decomposition figure |

Standalone usage:

from PyMarkup.decomposition import OlleyPakesDecomposition, plot_decomposition

op = OlleyPakesDecomposition(
    firm_var="gvkey",
    time_var="year",
    markup_var="markup",
    weight_var="sale_D",
)
decomp_results = op.decompose(firm_markups)

# Plot with cumulative markup levels (DLEU Figure IV style)
# All lines start at the same baseline and show counterfactual paths:
# "What would markup be if only this component operated?"
plot_decomposition(
    decomp_results,
    cumulative=True,
    base_markup=1.21,  # Base period aggregate markup (e.g., 1980 value)
    save_path="Output/decomposition.pdf",
)

Project Structure

src/PyMarkup/
├── core/              # Data preparation, markup calculation, figures
├── data/              # Data downloaders and loaders
├── estimators/        # WooldridgeIV, CostShare, ACF estimators
├── pipeline/          # MarkupPipeline orchestrator, config
├── decomposition/     # Dynamic Olley-Pakes decomposition
├── io/                # I/O schemas (Pydantic)
└── cli/               # CLI commands

Input/                 # Raw data (not version controlled)
Intermediate/          # Generated datasets, theta estimates
Output/                # Figures and tables

License

MIT License
