Python library for index number calculations
PyIndexNum
A high-performance Python library for calculating economic index numbers using Polars. Designed for statisticians and economists working with price and quantity indices.
Features
- High Performance: Built on Polars for efficient data processing of large datasets
- Comprehensive Index Methods: Support for bilateral and multilateral price/quantity indices
- Data Preparation Tools: Built-in utilities for data standardization and temporal aggregation
- Panel Data Handling: Robust methods for dealing with unbalanced panels through removal or imputation
- Extension Methods: Support for index splicing and rolling window calculations
- Type Safety: Full type annotations for better IDE support and code reliability
Installation
Using pip
pip install pyindexnum
Using uv
uv add pyindexnum
From source
git clone https://github.com/paluigi/PyIndexNum.git
cd PyIndexNum
uv sync
Quick Start
Here's the typical workflow for calculating economic indices:
import polars as pl
import pyindexnum as pin
# Load your price data
df = pl.read_csv("price_data.csv")
# 1. Standardize column names
df_std = pin.standardize_columns(df, date_col="date", price_col="price", id_col="product_id", quantity_col="quantity")
# 2. Aggregate to desired time frequency
df_agg = pin.aggregate_time(df_std, freq="1mo", agg_type="arithmetic")
# 3. Handle unbalanced panels (optional)
df_balanced = pin.remove_unbalanced(df_agg)
# or
df_imputed = pin.carry_forward_imputation(df_agg, ["aggregated_price", "aggregated_quantity"])
# 4. Calculate bilateral indices (two periods)
laspeyres_idx = pin.laspeyres(df_balanced)
fisher_idx = pin.fisher(df_balanced)
# 5. Calculate multilateral indices (multiple periods)
geks_fisher_idx = pin.geks_fisher(df_agg)
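# 6. Apply extension methods (optional)
# geks_fisher_idx1 and geks_fisher_idx2 stand for multilateral indices
# computed on two overlapping time windows (they are not defined above)
extended_idx = pin.movement_splice(geks_fisher_idx1, geks_fisher_idx2)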
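Step 2 of the workflow collapses several observations per product into one value per period. The sketch below shows, in plain Python, what two common aggregation rules compute for monthly data: an unweighted arithmetic mean price and a quantity-weighted mean (the unit value). The helper `aggregate_monthly` and the sample rows are hypothetical, and whether the `agg_type` options of `pin.aggregate_time` match these formulas exactly is an assumption, not something this README confirms.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-level rows: (date, product_id, price, quantity)
rows = [
    (date(2023, 1, 5), "A", 100.0, 4.0),
    (date(2023, 1, 20), "A", 110.0, 6.0),
    (date(2023, 2, 3), "A", 105.0, 12.0),
]


def aggregate_monthly(rows):
    """Collapse observations to one record per (month start, product)."""
    acc = defaultdict(
        lambda: {"price_sum": 0.0, "expenditure": 0.0, "quantity": 0.0, "n": 0}
    )
    for day, product_id, price, quantity in rows:
        key = (date(day.year, day.month, 1), product_id)
        acc[key]["price_sum"] += price
        acc[key]["expenditure"] += price * quantity
        acc[key]["quantity"] += quantity
        acc[key]["n"] += 1
    return {
        key: {
            # Unweighted mean of the observed prices
            "arithmetic": a["price_sum"] / a["n"],
            # Quantity-weighted mean price: total expenditure / total quantity
            "unit_value": a["expenditure"] / a["quantity"],
            "quantity": a["quantity"],
        }
        for key, a in acc.items()
    }


monthly = aggregate_monthly(rows)
```

The two rules differ whenever more units are sold at one price than at another, which is why the choice of aggregation matters before any index formula is applied.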
Supported Index Methods
Bilateral Indices (Two-Period Comparisons)
| Index | Formula | Use Case |
|---|---|---|
| Jevons | Geometric mean of price relatives | Unweighted geometric average |
| Carli | Arithmetic mean of price relatives | Unweighted arithmetic average |
| Dutot | Ratio of arithmetic means of prices | Simple price average comparison |
| Laspeyres | Weighted by base period quantities | Fixed basket approach |
| Paasche | Weighted by current period quantities | Current basket approach |
| Fisher | Geometric mean of Laspeyres and Paasche | Ideal index (satisfies the time reversal and factor reversal tests) |
| Törnqvist | Weighted geometric mean with average expenditure shares | Symmetric treatment |
| Walsh | Fixed basket of geometric-mean quantities (base and current period) | Alternative symmetric approach |
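The formulas in the table can be written out directly. The following is a minimal plain-Python sketch of the textbook definitions, intended only to make the table concrete; it is not PyIndexNum's implementation, and the function signatures (lists of matched prices and quantities for a base period 0 and a current period 1) are assumptions made for illustration.

```python
import math


def jevons(p0, p1):
    """Unweighted geometric mean of price relatives."""
    return math.exp(sum(math.log(b / a) for a, b in zip(p0, p1)) / len(p0))


def carli(p0, p1):
    """Unweighted arithmetic mean of price relatives."""
    return sum(b / a for a, b in zip(p0, p1)) / len(p0)


def dutot(p0, p1):
    """Ratio of arithmetic mean prices (the item count cancels)."""
    return sum(p1) / sum(p0)


def laspeyres(p0, p1, q0):
    """Cost of the base period basket at current vs. base prices."""
    return sum(b * q for b, q in zip(p1, q0)) / sum(a * q for a, q in zip(p0, q0))


def paasche(p0, p1, q1):
    """Cost of the current period basket at current vs. base prices."""
    return sum(b * q for b, q in zip(p1, q1)) / sum(a * q for a, q in zip(p0, q1))


def fisher(p0, p1, q0, q1):
    """Geometric mean of Laspeyres and Paasche."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


def tornqvist(p0, p1, q0, q1):
    """Geometric mean of price relatives weighted by average expenditure shares."""
    e0 = [a * q for a, q in zip(p0, q0)]
    e1 = [b * q for b, q in zip(p1, q1)]
    shares = [0.5 * (x / sum(e0) + y / sum(e1)) for x, y in zip(e0, e1)]
    return math.exp(sum(w * math.log(b / a) for w, a, b in zip(shares, p0, p1)))


def walsh(p0, p1, q0, q1):
    """Fixed basket index using geometric-mean quantities."""
    qw = [math.sqrt(x * y) for x, y in zip(q0, q1)]
    return sum(b * q for b, q in zip(p1, qw)) / sum(a * q for a, q in zip(p0, qw))


# Hypothetical two-product example
p0, p1 = [100.0, 200.0], [110.0, 190.0]
q0, q1 = [10.0, 5.0], [9.0, 6.0]
fisher_value = fisher(p0, p1, q0, q1)
```

In this example buyers shift toward the product whose price fell, so the Fisher value lies between the Paasche and Laspeyres values.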
Multilateral Indices (Multi-Period Comparisons)
| Index | Method | Description |
|---|---|---|
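| GEKS-Fisher | Geometric mean of bilateral Fisher indices over all link periods | Transitive index; widely used multilateral method |
| GEKS-Törnqvist | Geometric mean of bilateral Törnqvist indices over all link periods | Transitive alternative, also known as CCDI |
| Geary-Khamis | System of simultaneous equations | Additive approach based on quantity-weighted reference prices |
| Time Product Dummy | Regression of log prices on time and product dummies | Econometric approach; the index comes from the time dummy coefficients |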
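To illustrate the GEKS idea, the sketch below computes a GEKS-Fisher index in plain Python: the index between periods s and t is the geometric mean, over every link period l in the window, of Fisher(s, l) × Fisher(l, t). The function names and the nested-list data layout (`prices[t][i]` for period t and product i, on a balanced panel) are assumptions for illustration and do not reflect how `pin.geks_fisher` is implemented or called.

```python
import math


def fisher(p0, p1, q0, q1):
    """Bilateral Fisher price index between two periods."""
    lasp = sum(b * q for b, q in zip(p1, q0)) / sum(a * q for a, q in zip(p0, q0))
    paas = sum(b * q for b, q in zip(p1, q1)) / sum(a * q for a, q in zip(p0, q1))
    return math.sqrt(lasp * paas)


def geks_fisher_pair(prices, quantities, s, t):
    """GEKS-Fisher index of period t relative to period s."""
    n_periods = len(prices)
    log_sum = 0.0
    for link in range(n_periods):
        via_link = fisher(prices[s], prices[link], quantities[s], quantities[link]) * fisher(
            prices[link], prices[t], quantities[link], quantities[t]
        )
        log_sum += math.log(via_link)
    # Geometric mean over all link periods
    return math.exp(log_sum / n_periods)


def geks_fisher_series(prices, quantities):
    """Index levels for every period, with period 0 as the base."""
    return [geks_fisher_pair(prices, quantities, 0, t) for t in range(len(prices))]


# Hypothetical balanced panel: 3 periods, 2 products
prices = [[100.0, 200.0], [110.0, 190.0], [120.0, 210.0]]
quantities = [[10.0, 5.0], [9.0, 6.0], [8.0, 5.5]]
levels = geks_fisher_series(prices, quantities)
```

Unlike a chain of bilateral Fisher links, the result is transitive: comparing period 0 with period 2 directly gives the same value as going through period 1.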
Extension Methods
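- Movement Splice: Link each new window to the existing series using its latest period-on-period movement
- Window Splice: Link each new window on the first period of the overlap, so the movement over the full window is used
- Half Splice: Link on the period at the midpoint of the overlapping windows
- Mean Splice: Geometric mean of the links obtained from every overlapping period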
- Fixed Base Rolling Window: Rolling window with fixed base
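The splicing variants above differ only in which overlapping period is used to link a new window to the existing series. The plain-Python sketch below shows that idea under stated assumptions: `published` is the existing list of index levels, `new_window` holds the levels from the latest window whose last element is the new period, and all earlier elements overlap the end of `published`. The helper names are hypothetical, the link is made on the published series, and the signatures of `pin.movement_splice` and `pin.window_splice` may well differ.

```python
import math


def splice_on(published, new_window, link):
    """Extend the published series by linking at overlap position `link`.

    Position 0 is the oldest overlapping period and
    len(new_window) - 2 is the most recent one.
    """
    overlap = len(new_window) - 1
    if overlap < 1 or len(published) < overlap:
        raise ValueError("new_window must overlap the published series")
    published_overlap = published[-overlap:]
    return published_overlap[link] * new_window[-1] / new_window[link]


def movement_splice_level(published, new_window):
    """Link on the most recent overlapping period."""
    return splice_on(published, new_window, len(new_window) - 2)


def window_splice_level(published, new_window):
    """Link on the oldest overlapping period."""
    return splice_on(published, new_window, 0)


def half_splice_level(published, new_window):
    """Link on the period near the middle of the overlap."""
    return splice_on(published, new_window, (len(new_window) - 1) // 2)


def mean_splice_level(published, new_window):
    """Geometric mean of the levels obtained from every possible link."""
    overlap = len(new_window) - 1
    logs = [math.log(splice_on(published, new_window, k)) for k in range(overlap)]
    return math.exp(sum(logs) / overlap)


# Hypothetical series: four published periods, then a new four-period window
published = [1.0, 1.02, 1.05, 1.07]
new_window = [1.0, 1.03, 1.05, 1.08]
next_level = movement_splice_level(published, new_window)
```

When the new window implies exactly the same movements as the published series over the overlap, all four variants return the same level; they diverge only when the window revises past movements.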
Data Requirements
Your data should contain:
- Date column: Date or datetime values
- Price column: Numeric price observations
- Product ID column: Unique identifier for each product/variety
- Quantity column: Numeric quantities (required for weighted indices)
Example data structure:
┌────────────┬────────────┬───────┬──────────┐
│ date       ┆ product_id ┆ price ┆ quantity │
│ ---        ┆ ---        ┆ ---   ┆ ---      │
│ date       ┆ str        ┆ f64   ┆ f64      │
╞════════════╪════════════╪═══════╪══════════╡
│ 2023-01-01 ┆ A          ┆ 100.0 ┆ 10.0     │
│ 2023-01-01 ┆ B          ┆ 200.0 ┆ 5.0      │
│ 2023-02-01 ┆ A          ┆ 105.0 ┆ 12.0     │
│ 2023-02-01 ┆ B          ┆ 210.0 ┆ 4.5      │
└────────────┴────────────┴───────┴──────────┘
API Overview
Data Preparation
# Standardize column names and types
df_std = pin.standardize_columns(df, date_col="date", price_col="price", id_col="id")
# Aggregate time series data
df_agg = pin.aggregate_time(df_std, freq="1mo", agg_type="weighted_arithmetic")
# Handle unbalanced panels
df_balanced = pin.remove_unbalanced(df_agg)
df_imputed = pin.carry_forward_imputation(df_agg, ["price", "quantity"])
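The two ways of handling an unbalanced panel can be sketched in plain Python. Removal keeps only products observed in every period, while carry-forward imputation fills a gap with the product's last observed value. The dictionary layout and helper names below are hypothetical and only illustrate the concepts; they are not the internals of `pin.remove_unbalanced` or `pin.carry_forward_imputation`.

```python
# Hypothetical panel: (period, product_id) -> price; product B is missing in 2023-02
panel = {
    ("2023-01", "A"): 100.0,
    ("2023-01", "B"): 200.0,
    ("2023-02", "A"): 105.0,
    ("2023-03", "A"): 107.0,
    ("2023-03", "B"): 212.0,
}


def drop_incomplete_products(panel):
    """Keep only products that are observed in every period."""
    periods = {period for period, _ in panel}
    products = {product for _, product in panel}
    complete = {
        product
        for product in products
        if all((period, product) in panel for period in periods)
    }
    return {key: value for key, value in panel.items() if key[1] in complete}


def carry_forward(panel):
    """Fill each missing observation with the product's last observed value."""
    periods = sorted({period for period, _ in panel})  # ISO strings sort by time
    products = sorted({product for _, product in panel})
    filled = dict(panel)
    for product in products:
        last_seen = None
        for period in periods:
            if (period, product) in filled:
                last_seen = filled[(period, product)]
            elif last_seen is not None:
                # Gaps before the first observation are left unfilled
                filled[(period, product)] = last_seen
    return filled


balanced = drop_incomplete_products(panel)
imputed = carry_forward(panel)
```

Removal shrinks the sample but uses only observed prices; imputation keeps every product at the cost of assuming no price change during the gap.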
Index Calculation
# Bilateral indices
jevons = pin.jevons(df)
laspeyres = pin.laspeyres(df)
fisher = pin.fisher(df)
# Multilateral indices
geks = pin.geks_fisher(df)
gk = pin.geary_khamis(df)
Extensions
# Splicing methods
movement_spliced = pin.movement_splice(multilateral_index1, multilateral_index2)
window_spliced = pin.window_splice(multilateral_index1, multilateral_index2)
Documentation
Full documentation is available at https://pyindexnum.readthedocs.io/
Contributing
PyIndexNum is an open-source project and welcomes contributions! See our contributing guide for details.
Development Setup
# Clone and set up
git clone https://github.com/paluigi/PyIndexNum.git
cd PyIndexNum
uv sync --dev
# Run tests
uv run pytest
# Build documentation
cd docs && make html
Areas for Contribution
- New index methods and formulations
- Performance optimizations
- Additional data validation
- Enhanced documentation and examples
- Bug fixes and improvements
Citation
If you use PyIndexNum in your research, please cite:
@software{pyindexnum,
title = {PyIndexNum: A Python Library for Economic Index Numbers},
author = {Palumbo, Luigi and Yu, Mengting},
url = {https://github.com/paluigi/PyIndexNum},
version = {0.1.2},
}
License
PyIndexNum is licensed under the MIT License. See LICENSE for details.
Related Projects
- Polars: The high-performance DataFrame library that powers PyIndexNum
Built with ❤️ for the economic statistics community