Package to calculate economic complexity and associated variables

Economic Complexity and Product Complexity

By the Growth Lab at Harvard's Center for International Development

This package is part of Harvard Growth Lab’s portfolio of software packages, digital products and interactive data visualizations. To browse our entire portfolio, please visit growthlab.app. To learn more about our research, please visit Harvard Growth Lab’s home page.

About

Python package to calculate economic complexity indices.

A Stata implementation of the economic complexity index is available at: https://github.com/cid-harvard/ecomplexity

Explore complexity and associated data using Harvard CID's Atlas tool: http://atlas.cid.harvard.edu

Tutorial

Installation: to install the latest stable version, run: pip install ecomplexity

To install the development version (untested and possibly buggy) directly from GitHub, run: pip install git+https://github.com/cid-harvard/py-ecomplexity@develop

Usage:

import pandas as pd  # needed for pd.read_csv below

from ecomplexity import ecomplexity
from ecomplexity import proximity

# Import trade data from CID Atlas
data_url = "https://intl-atlas-downloads.s3.amazonaws.com/country_hsproduct2digit_year.csv.zip"
data = pd.read_csv(data_url, compression="zip", low_memory=False)
data = data[['year','location_code','hs_product_code','export_value']]

# Calculate complexity
trade_cols = {'time':'year', 'loc':'location_code', 'prod':'hs_product_code', 'val':'export_value'}
cdata = ecomplexity(data, trade_cols)

# Calculate proximity matrix
prox_df = proximity(data, trade_cols)

Arguments:

data: pandas dataframe containing production / trade data.
    Including variables indicating time, location, product and value
cols_input: dict of column names for time, location, product and value.
    Example: {'time':'year', 'loc':'origin', 'prod':'hs92', 'val':'export_val'}
presence_test: str for test used for presence of industry in location.
    One of "rca" (default), "rpop", or "manual".
    Determines which values are used for M_cp calculations.
    If "manual", M_cp is taken as given from the "value" column in data
val_errors_flag: {'coerce','ignore','raise'}. Passed to pd.to_numeric
    *default* coerce.
rca_mcp_threshold: numeric indicating RCA threshold beyond which mcp is 1.
    *default* 1.
rpop_mcp_threshold: numeric indicating RPOP threshold beyond which mcp is 1.
    *default* 1. Only used if presence_test is not "rca".
pop: pandas df, with time, location and corresponding population, in that order.
    Not required if presence_test is "rca", which is the default.
continuous: Used to calculate product proximities, indicates whether
    to consider correlation of every product pair (True) or product
    co-occurrence (False). *default* False.
asymmetric: Used to calculate product proximities, indicates whether
    to generate asymmetric proximity matrix (True) or symmetric (False).
    *default* False.
proximity_edgelist: pandas df with cols 'prod1', 'prod2', 'proximity'.
    If None (default), proximity values are calculated from data.
knn: Number of nearest neighbors from proximity matrix to use to calculate
    density. Will use entire proximity matrix if None.
    *default* None.
check_logsupermodularity: If True (default), check log-supermodularity.
    If int, use roughly that many samples to check log-supermodularity.
report_logsupermodularity: If True, print a warning about the result of the
    log-supermodularity check. If False (default), don't.
verbose: If True, print the year being processed.
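
To make the presence_test and rca_mcp_threshold arguments concrete, here is a minimal sketch of the standard RCA (Balassa index) calculation followed by thresholding into M_cp. It is an illustration under stated assumptions, not the package's own code: the helper name rca_presence, the output column names 'rca' and 'mcp', the toy data, and the choice of >= at the threshold are all assumptions made for this example.

```python
import pandas as pd


def rca_presence(df, cols, threshold=1.0):
    """Hypothetical helper: add 'rca' and 'mcp' columns to a copy of df.

    RCA_cp = (X_cp / X_c) / (X_p / X), computed separately for each time period.
    """
    t, loc, prod, val = cols["time"], cols["loc"], cols["prod"], cols["val"]
    out = df.copy()
    # Mirror val_errors_flag='coerce': unparseable values become NaN, then 0 here.
    out[val] = pd.to_numeric(out[val], errors="coerce").fillna(0)
    loc_total = out.groupby([t, loc])[val].transform("sum")    # X_c
    prod_total = out.groupby([t, prod])[val].transform("sum")  # X_p
    world_total = out.groupby(t)[val].transform("sum")         # X
    out["rca"] = (out[val] / loc_total) / (prod_total / world_total)
    # Assumption: presence means RCA at or above the threshold.
    out["mcp"] = (out["rca"] >= threshold).astype(int)
    return out


# Toy data using the column mapping from the cols_input example above.
toy = pd.DataFrame(
    {
        "year": [2020, 2020, 2020, 2020],
        "origin": ["A", "A", "B", "B"],
        "hs92": ["x", "y", "x", "y"],
        "export_val": [90, 10, 10, 90],
    }
)
toy_cols = {"time": "year", "loc": "origin", "prod": "hs92", "val": "export_val"}
result = rca_presence(toy, toy_cols)
print(result[["origin", "hs92", "rca", "mcp"]])
```

In this toy case location A specializes in product x and location B in product y, so only those two pairs are marked present.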

FAQ

  • Why are ECI and PCI both normalized using ECI's mean and std. dev.?
    • This normalization preserves the property that ECI = (mean of PCI of products for which MCP=1)
  • What is log-supermodularity?
    • Refer to ecomplexity/log_supermodularity.py for a brief explanation. More at Schetter, U. (2019). A Structural Ranking of Economic Complexity (SSRN Scholarly Paper 3485842). https://doi.org/10.2139/ssrn.3485842.
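
The normalization answer above can be checked numerically. The sketch below assumes, as that answer implies, that a location's raw ECI is the average raw PCI of the products with M_cp = 1; the matrix and PCI values are made-up toy numbers, and the variable names are hypothetical. Because each location's average uses weights that sum to 1, applying the same shift and scale to both ECI and PCI keeps the averaging identity intact, while normalizing PCI by its own mean and standard deviation breaks it.

```python
import numpy as np

# Hypothetical toy inputs: 4 locations x 3 products, and arbitrary raw PCI values.
mcp = np.array(
    [
        [1, 0, 0],
        [1, 1, 0],
        [0, 1, 1],
        [1, 1, 1],
    ],
    dtype=float,
)
pci_raw = np.array([0.3, -1.2, 2.0])

diversity = mcp.sum(axis=1)  # number of products present in each location


def mean_pci_of_present(pci):
    """Average PCI over the products with M_cp = 1, per location."""
    return mcp @ pci / diversity


# Assumption: raw ECI is defined as the mean raw PCI of present products.
eci_raw = mean_pci_of_present(pci_raw)

# Shared normalization: both indices use ECI's mean and std. dev.
mu, sigma = eci_raw.mean(), eci_raw.std()
eci = (eci_raw - mu) / sigma
pci = (pci_raw - mu) / sigma
shared_ok = np.allclose(eci, mean_pci_of_present(pci))

# Separate normalization: PCI uses its own mean and std. dev. instead.
pci_own = (pci_raw - pci_raw.mean()) / pci_raw.std()
separate_ok = np.allclose(eci, mean_pci_of_present(pci_own))

print("identity holds with shared normalization:", shared_ok)
print("identity holds with separate normalization:", separate_ok)
```

The choice of population versus sample standard deviation does not matter for this property, as long as the same constants are applied to both indices.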

Download files

Source Distribution

ecomplexity-0.5.3.tar.gz (15.4 kB)

Uploaded Source

Built Distribution

ecomplexity-0.5.3-py3-none-any.whl (17.4 kB)

Uploaded Python 3

File details

Details for the file ecomplexity-0.5.3.tar.gz.

File metadata

  • Download URL: ecomplexity-0.5.3.tar.gz
  • Upload date:
  • Size: 15.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.14

File hashes

Hashes for ecomplexity-0.5.3.tar.gz
Algorithm    Hash digest
SHA256       f9eaed2bf58495de4f8648e05c452edc1dc8ced300b1d99856a19dd38c88b06c
MD5          ad19916f9231a2903665f9f680484681
BLAKE2b-256  5b5e5efc2dfef2c6fc0e993ad00c02951a11be9b01ab5f32aae0180bf1c38d28

File details

Details for the file ecomplexity-0.5.3-py3-none-any.whl.

File metadata

  • Download URL: ecomplexity-0.5.3-py3-none-any.whl
  • Upload date:
  • Size: 17.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.14

File hashes

Hashes for ecomplexity-0.5.3-py3-none-any.whl
Algorithm    Hash digest
SHA256       60581b61bd7e4e32174f344a36918c833d83ab5a4892427dd1f8d034a3dedd54
MD5          08dbc6fef690413f3f281f53bf312bf1
BLAKE2b-256  e5abb35b670e7ea69fcfd6f5e015edbb47e0e003fa712c54f3481a21ef87aead
