A Python implementation of the Anderson (2008) inverse-covariance weighted index, validated against Stata's swindex package.

Maintainers

joshash

Project description

Inverse-Covariance Weighted Index for Python

A Python implementation of the Inverse-Covariance Weighted (ICW) Index introduced by Anderson (2008) and implemented in Stata's swindex by Schwab et al. (2020). I validated this implementation against Stata's swindex, and it produces effectively identical results.

Quick Start

import numpy as np
import pandas as pd
from icw import icw_index  # or just copy-paste this function

# Example using numpy arrays
x1 = np.random.rand(100)
x2 = np.random.rand(100)
index = icw_index([x1, x2])

# Example using Pandas dataframes 
df = pd.DataFrame({'var1': np.random.rand(100),
                   'var2': np.random.rand(100),
                   'treat': np.random.randint(0, 2, size=100)})

# Full sample normalization, no reference group. Entire index is distributed M=0, SD=1
df['icw'] = icw_index([df['var1'].values, df['var2'].values])

# User-specified reference group normalization. Control group is distributed M=0, SD=1 
# and treatment group is in effect size units relative to control group.
ref_mask = (df['treat'] == 0).values
df['icw_control_reference'] = icw_index([df['var1'].values, df['var2'].values],
                                        reference_mask=ref_mask)

What is the ICW Index?

TL;DR: The ICW index is a weighted average of variables, with weights derived from the inverse of the variables' covariance matrix.

Anderson (2008) proposed combining multiple outcomes into a single index, using the inverse of their covariance matrix to set the weights. Why would you do this? First, people use indices all the time to avoid a multiple comparisons problem. But usually you would take a simple average of the component variables so that each counts equally. This can be sub-optimal if a bunch of the variables are highly correlated with each other, because you may want to up-weight the ones that provide unique information. The ICW index does this: it down-weights highly correlated outcomes and up-weights less correlated ones. Among weighted averages whose weights sum to one, using the inverse covariance matrix to set the weights also minimizes the variance of the resulting index.
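To make the down-weighting concrete, here is a small sketch with simulated data. The variable names (`a`, `b`, `c`) and the noise level are made up for illustration, and this is plain NumPy rather than a call into this package. Two indicators share a common factor and a third is independent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Two indicators driven by the same common factor, plus one independent indicator.
common = rng.normal(size=n)
a = common + 0.3 * rng.normal(size=n)
b = common + 0.3 * rng.normal(size=n)
c = rng.normal(size=n)

# Standardize each column so the covariance matrix is a correlation matrix.
X = np.column_stack([a, b, c])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Weights: row sums of the inverse covariance matrix, rescaled to sum to one.
sigma_inv = np.linalg.inv(np.cov(Z, rowvar=False))
raw_weights = sigma_inv.sum(axis=1)
weights = raw_weights / raw_weights.sum()

for name, w in zip("abc", weights):
    print(f"{name}: {w:.3f}")
```

With a correlation of r between `a` and `b` and no correlation with `c`, the unscaled weights are 1/(1+r) for each correlated indicator and 1 for the independent one, so `c` should end up with roughly twice the weight of `a` or `b` here.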

Implementation Details

The implementation follows the procedure explained by Schwab et al. (2020). I'll quote their steps here for clarity...

We can calculate the standardized weighted index $\tilde{s}$ for each observation $i$ as follows:

1. Select $k$ indicators relevant for outcome $j$.
2. Adjust sign: For all $k$ indicators, ensure the positive direction always indicates a "better outcome".
3. Normalize indicators: Demean all $k$ indicators by subtracting the mean of the indicator in the reference group (the full sample is the default reference group). Then, convert them to effect sizes, $\tilde{y}_k$, by dividing each indicator by its reference group standard deviation.
4. Construct weights: Create weights using $\Sigma^{-1}$, the inverse of the covariance matrix of the normalized indicators. Specifically, set the weight $\tilde{w}_k$ on each indicator equal to the sum of its row entries in $\Sigma^{-1}$. With this rule, highly correlated indicators are assigned small or offsetting weights, while less correlated outcomes receive larger weights.
5. Construct index: Calculate the weighted average of $\tilde{y}_k$ for observation $i$. Formally, the weighted average $\overline{s}_i$ is calculated using $\overline{s}_i = (\mathbf{1}'\Sigma^{-1}\mathbf{1})^{-1}(\mathbf{1}'\Sigma^{-1}\tilde{y}_i)$, where $\mathbf{1}$ is a column vector of 1s and $\tilde{y}_i$ is a column vector of all outcomes for observation $i$. This is an efficient GLS estimator.
6. Normalize index: Demean index $\overline{s}_i$ by subtracting the mean of the index in the reference group, and convert it to effect sizes by dividing it by its reference group standard deviation. This normalization results in an index distributed with mean zero and standard deviation one in the reference group.
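The normalization, weighting, and index steps above can be sketched in a few lines of NumPy. This is an illustration, not the package's source: `icw_sketch` is a hypothetical name, the covariance matrix is estimated on all observations, and standard deviations use `ddof=1`. Both of those choices are assumptions on my part.

```python
import numpy as np

def icw_sketch(columns, reference_mask=None):
    """Illustrative ICW index following the quoted steps (hypothetical helper)."""
    Y = np.column_stack([np.asarray(c, dtype=float) for c in columns])
    if reference_mask is None:
        reference_mask = np.ones(Y.shape[0], dtype=bool)
    ref = np.asarray(reference_mask, dtype=bool)

    # Normalize indicators: effect sizes relative to the reference group.
    Z = (Y - Y[ref].mean(axis=0)) / Y[ref].std(axis=0, ddof=1)

    # Construct weights: row sums of the inverse covariance matrix.
    # Assumption: the covariance is estimated on the full sample.
    sigma = np.atleast_2d(np.cov(Z, rowvar=False))
    w = np.linalg.inv(sigma).sum(axis=1)

    # Construct index: (1' S^-1 1)^-1 (1' S^-1 y_i) for every observation i.
    s_bar = Z @ w / w.sum()

    # Normalize index: mean 0 and SD 1 within the reference group.
    return (s_bar - s_bar[ref].mean()) / s_bar[ref].std(ddof=1)

# Usage with simulated data and a control-group reference.
rng = np.random.default_rng(42)
x1, x2 = rng.normal(size=200), rng.normal(size=200)
treat = rng.integers(0, 2, size=200)
idx = icw_sketch([x1, x2], reference_mask=(treat == 0))
print(idx[treat == 0].mean(), idx[treat == 0].std(ddof=1))
```

The final print should show a mean near zero and a standard deviation near one for the control group, which is what the last step guarantees by construction.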

Validation

I validated this implementation against Stata's swindex (version 14) using 100 synthetic datasets:

Datasets: 100 datasets with 5 variables each
Sample sizes: 500-2000 observations per dataset
Total observations: 122,444
Variables: Standard normal distribution, no missing data

Results

Results are identical (within a floating point tolerance) to Stata's swindex implementation. Here are the two options I tested.

Default settings (full sample as reference group)

Correlation: 0.999999999999996
Differences > 1e-06: 0
Max absolute difference: 3.08e-07
Median absolute difference: 3.01e-08
Mean absolute difference: 3.88e-08

User-specified reference group (using the control group as reference)

Correlation: 1.000000000000000
Differences > 1e-06: 0
Max absolute difference: 3.31e-07
Median absolute difference: 2.94e-08
Mean absolute difference: 3.77e-08
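For anyone who wants to run a similar comparison on their own data, the summary statistics above can be computed along these lines. `compare_indices` is a hypothetical helper, not part of the package, and the second vector below is just a stand-in made by rounding to single precision, not real Stata output.

```python
import numpy as np

def compare_indices(py_index, stata_index, tol=1e-6):
    """Summarize agreement between two index vectors of equal length."""
    py_index = np.asarray(py_index, dtype=float)
    stata_index = np.asarray(stata_index, dtype=float)
    abs_diff = np.abs(py_index - stata_index)
    return {
        "correlation": float(np.corrcoef(py_index, stata_index)[0, 1]),
        "n_diff_gt_tol": int((abs_diff > tol).sum()),
        "max_abs_diff": float(abs_diff.max()),
        "median_abs_diff": float(np.median(abs_diff)),
        "mean_abs_diff": float(abs_diff.mean()),
    }

# Stand-in data: the second vector is the first rounded to float32 and back.
rng = np.random.default_rng(1)
py_index = rng.normal(size=1000)
stand_in = py_index.astype(np.float32).astype(np.float64)

summary = compare_indices(py_index, stand_in)
for key, value in summary.items():
    print(f"{key}: {value}")
```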

Limitations

This implementation is simpler than swindex and has the following restrictions:

No missing data: Input arrays must not contain NaN values
User handles sign orientation: Assumes input data is already oriented so higher values indicate better outcomes
Report bugs: I imagine I missed some edge cases. Feel free to report bugs.
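Given these restrictions, one way to prepare data before building the index is to drop incomplete rows and flip the sign of any "lower is better" indicators. The sketch below uses a hypothetical helper (`prepare_indicators`) and made-up column names. Complete-case deletion is just one simple option and is not meant to reproduce how swindex treats missing data.

```python
import numpy as np
import pandas as pd

def prepare_indicators(df, columns, reverse=()):
    """Keep complete cases and flip the sign of reverse-coded columns."""
    out = df.dropna(subset=list(columns)).copy()
    for col in reverse:
        out[col] = -out[col]
    return out

df = pd.DataFrame({
    "income": [10.0, 12.0, np.nan, 9.0],
    "debt": [3.0, 1.0, 2.0, 4.0],  # lower is better, so it gets flipped
})

clean = prepare_indicators(df, ["income", "debt"], reverse=["debt"])

# These arrays contain no NaN values and are all oriented "higher is better".
arrays = [clean[c].to_numpy() for c in ["income", "debt"]]
print(clean)
```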

System I Ran Tests On

I ran the tests with Python 3.13, the packages in dev_requirements.txt, macOS, and Stata 19.5.

References

Schwab, B., Janzen, S., Magnan, N. P., & Thompson, W. M. (2020). Constructing a summary index using the standardized inverse-covariance weighted average of indicators. The Stata Journal, 20(4), 952–964.
Anderson, M. L. (2008). Multiple Inference and Gender Differences in the Effects of Early Intervention: A Reevaluation of the Abecedarian, Perry Preschool, and Early Training Projects. Journal of the American Statistical Association, 103(484), 1481–1495.

Citation

If you use this implementation in your work, please cite:

@misc{icw_index,
  author = {Joshua Ashkinaze},
  title = {Inverse-Covariance Weighted Index for Python},
  year = {2025},
  url = {https://github.com/josh-ashkinaze/inverse-covariance-weighted-index}
}

Issues

Please open an issue if you find any bugs or edge cases.

ToDos

Add option for user-specified reference group as in Schwab et al. (2020) [DONE]
Add handling for missing data as in Schwab et al. (2020)

Release history

Version 0.1.0 (this version), released Nov 30, 2025

Download files

Source Distribution

icw_index-0.1.0.tar.gz (4.9 kB)

Uploaded Nov 30, 2025 Source

Built Distribution

icw_index-0.1.0-py3-none-any.whl (5.3 kB)

Uploaded Nov 30, 2025 Python 3

File details

Details for the file icw_index-0.1.0.tar.gz.

File metadata

Download URL: icw_index-0.1.0.tar.gz
Upload date: Nov 30, 2025
Size: 4.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for icw_index-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`689f8dc2fec38f6bfe5c6b528bb09e7ac12594c47ce1eafd52ccf539a9438144`
MD5	`113db2c7bb8fc2f6b42127d43e91c5b4`
BLAKE2b-256	`4a43b80b7069396a6c10346993a7dfad5771d435924aaacaafac11bfc88cec0d`

Provenance

The following attestation bundles were made for icw_index-0.1.0.tar.gz:

Publisher: publish.yml on josh-ashkinaze/icw-index

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: icw_index-0.1.0.tar.gz
- Subject digest: 689f8dc2fec38f6bfe5c6b528bb09e7ac12594c47ce1eafd52ccf539a9438144
- Sigstore transparency entry: 731938904
- Sigstore integration time: Nov 30, 2025
Source repository:
- Permalink: josh-ashkinaze/icw-index@976041f08d7032b65b2d02ebfe845a4a55297c86
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/josh-ashkinaze
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@976041f08d7032b65b2d02ebfe845a4a55297c86
- Trigger Event: release

File details

Details for the file icw_index-0.1.0-py3-none-any.whl.

File metadata

Download URL: icw_index-0.1.0-py3-none-any.whl
Upload date: Nov 30, 2025
Size: 5.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for icw_index-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5a4ada519a1c835fa33aabc95f59f5d0108664cc1a0feb8ae90e36349664bacb`
MD5	`98862376ed8c97a5b1860b9e5914a3c2`
BLAKE2b-256	`f66997121a9b2aaf07dd7f4ec63240e2e3282f50161550bcee207546b64acce5`

Provenance

The following attestation bundles were made for icw_index-0.1.0-py3-none-any.whl:

Publisher: publish.yml on josh-ashkinaze/icw-index

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: icw_index-0.1.0-py3-none-any.whl
- Subject digest: 5a4ada519a1c835fa33aabc95f59f5d0108664cc1a0feb8ae90e36349664bacb
- Sigstore transparency entry: 731938905
- Sigstore integration time: Nov 30, 2025
Source repository:
- Permalink: josh-ashkinaze/icw-index@976041f08d7032b65b2d02ebfe845a4a55297c86
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/josh-ashkinaze
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@976041f08d7032b65b2d02ebfe845a4a55297c86
- Trigger Event: release
