Python package for downloading economic data from the International Monetary Fund JSON RESTful API endpoint.

Project description

imfp

imfp, by Christopher C. Smith, is a Python package for downloading data from the International Monetary Fund's RESTful JSON API.

Installation

To install the latest release of imfp from PyPI, use:

pip install --upgrade imfp

To load the library, use import:

import imfp

Usage

Suggested packages

imfp returns data as a pandas data frame, so you will want the pandas package for viewing and manipulating the results. I also recommend numpy for numerical work and matplotlib for making plots. These packages can be installed using pip and loaded using import:

import pandas
import matplotlib
import numpy as np

Setting a Unique Application Name with imf_app_name

imfp.imf_app_name() allows users to set a custom application name to be used when making API calls to the IMF API. The IMF API has an application-based rate limit of 50 requests per second, with the application identified by the "user_agent" variable in the request header.

This could prove problematic if the imfp library became too popular and too many users tried to make simultaneous API requests using the default app name. By setting a custom application name, users can avoid hitting rate limits and being blocked by the API. imfp.imf_app_name() sets the application name by changing the IMF_APP_NAME variable in the environment. If this variable doesn't exist, imfp.imf_app_name() will create it.

To set a custom application name, simply call the imfp.imf_app_name() function with your desired application name as an argument:

# Set custom app name as an environment variable
imfp.imf_app_name("my_custom_app_name")

The function will throw an error if the provided name is missing, None, not a string, or longer than 255 characters. If the provided name is "imfr" (the default) or an empty string, the function will issue a warning recommending the use of a unique app name to avoid hitting rate limits.
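The checks described above can be sketched in a few lines. The function below is a hypothetical stand-in, not imfp's actual implementation: the name set_app_name, the exception types, and the warning text are assumptions made for illustration, and the default name is taken from the description above. It uses only the standard library.

```python
import os
import warnings

# Default name as stated in the text above (an assumption, not read from imfp)
DEFAULT_APP_NAME = "imfr"


def set_app_name(name=DEFAULT_APP_NAME):
    """Hypothetical illustration of the validation rules; not imfp's code."""
    # Reject missing or non-string names
    if name is None or not isinstance(name, str):
        raise TypeError("Application name must be a string.")
    # Reject names longer than 255 characters
    if len(name) > 255:
        raise ValueError("Application name must be 255 characters or fewer.")
    # Warn when the name is the shared default or empty
    if name in (DEFAULT_APP_NAME, ""):
        warnings.warn("Consider a unique application name to avoid rate limits.")
    # Store the name in the environment, creating the variable if needed
    os.environ["IMF_APP_NAME"] = name


set_app_name("my_custom_app_name")
print(os.environ["IMF_APP_NAME"])
```

After calling imfp.imf_app_name() itself, the same os.environ lookup is a quick way to confirm which name is in effect.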

Fetching an Index of Databases with the imf_databases Function

The imfp package introduces four core functions: imfp.imf_databases, imfp.imf_parameters, imfp.imf_parameter_defs, and imfp.imf_dataset. The function for downloading datasets is imfp.imf_dataset, but you will need the other functions to determine what arguments to supply to imfp.imf_dataset. For instance, all calls to imfp.imf_dataset require a database_id. This is because the IMF serves many different databases through its API, and the API needs to know which of these many databases you're requesting data from. To obtain a list of databases, use imfp.imf_databases, like so:

# Fetch the list of databases available through the IMF API
databases = imfp.imf_databases()
databases.head()
   database_id  description
0  BOP_2017M06  Balance of Payments (BOP), 2017 M06
1  BOP_2020M3   Balance of Payments (BOP), 2020 M03
2  BOP_2017M11  Balance of Payments (BOP), 2017 M11
3  DOT_2020Q1   Direction of Trade Statistics (DOTS), 2020 Q1
4  GFSMAB2016   Government Finance Statistics Yearbook (GFSY 2...

This function returns the IMF’s listing of 259 databases available through the API. (In reality, 7 of the listed databases are defunct and not actually available: FAS_2015, GFS01, FM202010, APDREO202010, AFRREO202010, WHDREO202010, BOPAGG_2020.)
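Since the defunct entries are known by id, they can be filtered out before browsing the list. The sketch below uses a small stand-in data frame rather than a live API call; the descriptions marked as placeholders are not real IMF descriptions.

```python
import pandas as pd

# Stand-in for the frame returned by imfp.imf_databases()
databases = pd.DataFrame({
    "database_id": ["BOP_2017M06", "FAS_2015", "PCPS", "GFS01"],
    "description": [
        "Balance of Payments (BOP), 2017 M06",
        "(placeholder description)",
        "Primary Commodity Price System (PCPS)",
        "(placeholder description)",
    ],
})

# Ids listed above as defunct
defunct_ids = [
    "FAS_2015", "GFS01", "FM202010", "APDREO202010",
    "AFRREO202010", "WHDREO202010", "BOPAGG_2020",
]

# Keep only rows whose database_id is not in the defunct list
available = databases[~databases["database_id"].isin(defunct_ids)]
available = available.reset_index(drop=True)
print(available)
```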

To explore the database list, you can view subsets of the data frame by row label with databases.loc (with the default index, labels match row numbers):

# View a subset consisting of rows 5 through 9
databases.loc[5:9]
   database_id    description
5  BOP_2019M12    Balance of Payments (BOP), 2019 M12
6  GFSYFALCS2014  Government Finance Statistics Yearbook (GFSY 2...
7  GFSE2016       Government Finance Statistics Yearbook (GFSY 2...
8  FM201510       Fiscal Monitor (FM) October 2015
9  GFSIBS2016     Government Finance Statistics Yearbook (GFSY 2...
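One detail worth keeping in mind: .loc selects by index label and includes both endpoints, while .iloc selects by position and excludes the end. The sketch below shows the difference on a stand-in frame with hypothetical ids.

```python
import pandas as pd

# Stand-in for the databases frame; the ids are hypothetical placeholders
databases = pd.DataFrame({"database_id": [f"DB_{i}" for i in range(12)]})

# Label-based: rows labelled 5 through 9, both endpoints included
by_label = databases.loc[5:9]

# Position-based: positions 5 up to (but not including) 10
by_position = databases.iloc[5:10]

# With the default index the two selections are identical
print(by_label.equals(by_position))

# After filtering, the original labels are kept, so labels and
# positions no longer line up
even_rows = databases[databases.index % 2 == 0]
print(even_rows.index.tolist())
```

If you filter the data frame first, either call reset_index(drop=True) or switch to .iloc when you want rows by position.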

Or, if you already know which database you want, you can fetch the corresponding code by searching for a string match using str.contains and subsetting the data frame for matching rows. For instance, here’s how to search for the Primary Commodity Price System:

databases[databases['description'].str.contains("Commodity")]
     database_id  description
238  PCTOT        Commodity Terms of Trade
241  PCPS         Primary Commodity Price System (PCPS)
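By default, str.contains is case-sensitive and treats the search string as a regular expression. When you are unsure of capitalization, or the search text contains characters such as parentheses, a case-insensitive literal search may be more forgiving. A minimal sketch on a stand-in frame:

```python
import pandas as pd

# Stand-in for a few rows of the databases frame
databases = pd.DataFrame({
    "database_id": ["PCTOT", "PCPS", "DOT_2020Q1"],
    "description": [
        "Commodity Terms of Trade",
        "Primary Commodity Price System (PCPS)",
        "Direction of Trade Statistics (DOTS), 2020 Q1",
    ],
})

# case=False ignores capitalization; regex=False treats the text literally;
# na=False treats missing descriptions as non-matches
mask = databases["description"].str.contains(
    "commodity price", case=False, regex=False, na=False
)
matches = databases[mask]
print(matches)
```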

Fetching a List of Parameters and Input Codes with imf_parameters and imf_parameter_defs

Once you have a database_id, it’s possible to make a call to imfp.imf_dataset to fetch the entire database: imfp.imf_dataset(database_id). However, while this will succeed for some small databases, it will fail for many of the larger ones. And even when it succeeds, fetching an entire database can take a long time. You’re much better off supplying additional filter parameters to reduce the size of your request.

Requests to databases available through the IMF API are complicated by the fact that each database uses a different set of parameters when making a request. (At last count, there were 43 unique parameters used in making API requests from the various databases!) You also have to have the list of valid input codes for each parameter. The imfp.imf_parameters function solves this problem. Use the function to obtain the full list of parameters and valid input codes for a given database:

# Fetch list of valid parameters and input codes for commodity price database
params = imfp.imf_parameters("PCPS")

The imfp.imf_parameters function returns a dictionary of data frames. Each dictionary key name corresponds to a parameter used in making requests from the database:

# Get key names from the params object
params.keys()
dict_keys(['freq', 'ref_area', 'commodity', 'unit_measure'])

In the event that a parameter name is not self-explanatory, the imfp.imf_parameter_defs function can be used to fetch short text descriptions of each parameter:

# Fetch and display parameter text descriptions for the commodity price database
imfp.imf_parameter_defs("PCPS")
   parameter     description
0  freq          Frequency
1  ref_area      Geographical Areas
2  commodity     Indicator
3  unit_measure  Unit

Each value in the dictionary is a data frame containing a column of valid input codes that can be used with the named parameter, and a column of text descriptions of what each code represents.

To access the data frame containing valid values for each parameter, subset the params dict by the parameter name:

# View the data frame of valid input codes for the frequency parameter
params['freq']
   input_code  description
0  A           Annual
1  M           Monthly
2  Q           Quarterly
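Looking up input codes by description is a step you will repeat for each parameter, so it can be convenient to wrap it in a small helper. The function codes_matching below is a hypothetical convenience, not part of imfp, and the params dict is a stand-in built from the frequency table shown above.

```python
import pandas as pd

# Stand-in for the dict returned by imfp.imf_parameters("PCPS")
params = {
    "freq": pd.DataFrame({
        "input_code": ["A", "M", "Q"],
        "description": ["Annual", "Monthly", "Quarterly"],
    })
}


def codes_matching(params, parameter, text):
    """Return input codes whose description contains `text` (hypothetical helper)."""
    frame = params[parameter]
    mask = frame["description"].str.contains(text, case=False, regex=False)
    return frame.loc[mask, "input_code"].tolist()


print(codes_matching(params, "freq", "annual"))
```

The helper returns a plain list, so it is worth checking that the list is not empty before using it in a request.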

Supplying Parameter Arguments to imf_dataset: A Tale of Two Workflows

There are two ways to supply parameters to imfp.imf_dataset: by supplying list arguments or by supplying a modified parameters dict. The list arguments workflow will be more intuitive for most users, but the dict argument workflow requires a little less code.

The List Arguments Workflow

To supply list arguments, just find the codes you want and supply them to imfp.imf_dataset using the parameter name as the argument name. The example below shows how to request 2000–2015 annual coal prices from the Primary Commodity Price System database:

# Fetch the 'freq' input code for annual frequency
selected_freq = list(
    params['freq']['input_code'][params['freq']['description'].str.contains("Annual")]
)

# Fetch the 'commodity' input code for coal
selected_commodity = list(
    params['commodity']['input_code'][params['commodity']['description'].str.contains("Coal")]
)

# Fetch the 'unit_measure' input code for index
selected_unit_measure = list(
    params['unit_measure']['input_code'][params['unit_measure']['description'].str.contains("Index")]
)

# Request data from the API
df = imfp.imf_dataset(database_id = "PCPS",
         freq = selected_freq, commodity = selected_commodity,
         unit_measure = selected_unit_measure,
         start_year = 2000, end_year = 2015)

# Display the first few entries in the retrieved data frame
df.head()
   freq  ref_area  commodity  unit_measure  unit_mult  time_format  time_period  obs_value
0  A     W00       PCOAL      IX            0          P1Y          2000         39.3510230293202
1  A     W00       PCOAL      IX            0          P1Y          2001         49.3378587284039
2  A     W00       PCOAL      IX            0          P1Y          2002         39.4949091648006
3  A     W00       PCOAL      IX            0          P1Y          2003         43.2878876950788
4  A     W00       PCOAL      IX            0          P1Y          2004         82.9185858052862

The Parameters Argument Workflow

To supply a parameters dict, modify each data frame in the params dict to retain only the rows you want, and then supply the modified dict to imfp.imf_dataset as its parameters argument. Here is how to make the same request for annual coal price data using a parameters dict:

# Fetch the 'freq' input code for annual frequency
params['freq'] = params['freq'][params['freq']['description'].str.contains("Annual")]

# Fetch the 'commodity' input code for coal
params['commodity'] = params['commodity'][params['commodity']['description'].str.contains("Coal")]

# Fetch the 'unit_measure' input code for index
params['unit_measure'] = params['unit_measure'][params['unit_measure']['description'].str.contains("Index")]

# Request data from the API
df = imfp.imf_dataset(database_id = "PCPS",
         parameters = params,
         start_year = 2000, end_year = 2015)

# Display the first few entries in the retrieved data frame
df.head()
   freq  ref_area  commodity  unit_measure  unit_mult  time_format  time_period  obs_value
0  A     W00       PCOAL      IX            0          P1Y          2000         39.3510230293202
1  A     W00       PCOAL      IX            0          P1Y          2001         49.3378587284039
2  A     W00       PCOAL      IX            0          P1Y          2002         39.4949091648006
3  A     W00       PCOAL      IX            0          P1Y          2003         43.2878876950788
4  A     W00       PCOAL      IX            0          P1Y          2004         82.9185858052862

Working with the Returned Data Frame

Note that all columns in the returned data frame contain strings, so to plot the series we will need to convert them to valid numeric or date formats.
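One way to do the conversion is with pandas' own helpers. The sketch below works on a stand-in frame that copies a few values from the output above; the "%Y" format string assumes annual data, and other frequencies would need a different format.

```python
import pandas as pd

# Stand-in for a few columns of the frame returned by imfp.imf_dataset
df = pd.DataFrame({
    "time_period": ["2000", "2001", "2002"],
    "unit_mult": ["0", "0", "0"],
    "obs_value": ["39.3510230293202", "49.3378587284039", "39.4949091648006"],
})

# Convert the value and multiplier columns from strings to numbers
df["obs_value"] = pd.to_numeric(df["obs_value"])
df["unit_mult"] = pd.to_numeric(df["unit_mult"])

# Convert annual periods to timestamps (January 1 of each year)
df["time_period"] = pd.to_datetime(df["time_period"], format="%Y")

print(df.dtypes)
```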

Also note that the returned data frame has mysterious-looking codes as values in some columns.

Codes in the time_format column are ISO 8601 duration codes. In this case, “P1Y” means “periods of 1 year.” The unit_mult column gives the power of ten by which to multiply obs_value, that is, the number of zeroes to add to the value. For instance, if values are in millions, the unit multiplier will be 6; if in billions, it will be 9.
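The unit multiplier rule amounts to multiplying each observation by ten raised to the power of unit_mult. A minimal sketch, using a hypothetical helper named scaled_value that accepts the string values the API returns:

```python
def scaled_value(obs_value, unit_mult):
    """Apply the unit multiplier: value * 10 ** unit_mult (hypothetical helper)."""
    return float(obs_value) * 10 ** int(unit_mult)


# A multiplier of 0 leaves the value unchanged
print(scaled_value("39.3510230293202", "0"))

# A multiplier of 6 means the value is expressed in millions
print(scaled_value("1.5", "6"))
```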

The meanings of the other codes are stored in our params object and can be fetched with a join. For instance, to fetch the meaning of the ref_area code “W00”, we can perform a left join with the params['ref_area'] data frame and then use drop and rename to replace ref_area with the parameter description:

# Join df with params['ref_area'] to fetch code description
df = df.merge(params['ref_area'], left_on='ref_area',right_on='input_code',how='left')

# Drop redundant columns and rename description column
df = df.drop(columns=['ref_area','input_code']).rename(columns={"description":"ref_area"})

# View first few rows in the modified data frame
df.head()
   freq  commodity  unit_measure  unit_mult  time_format  time_period  obs_value         ref_area
0  A     PCOAL      IX            0          P1Y          2000         39.3510230293202  All Countries, excluding the IO
1  A     PCOAL      IX            0          P1Y          2001         49.3378587284039  All Countries, excluding the IO
2  A     PCOAL      IX            0          P1Y          2002         39.4949091648006  All Countries, excluding the IO
3  A     PCOAL      IX            0          P1Y          2003         43.2878876950788  All Countries, excluding the IO
4  A     PCOAL      IX            0          P1Y          2004         82.9185858052862  All Countries, excluding the IO

Alternatively, we can simply replace the code in our data series with the corresponding description in params. Here, we replace each unit_measure code with the corresponding description in params['unit_measure']:

# Replace each unique unit_measure code in df with corresponding description
# in params['unit_measure']
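for code in np.unique(df['unit_measure']):
    # Look up the description for this code; .iloc[0] takes the first match by position
    description = params['unit_measure'].loc[
        params['unit_measure']['input_code'] == code, 'description'
    ].iloc[0]
    # Assign with .loc to avoid chained-assignment problems
    df.loc[df['unit_measure'] == code, 'unit_measure'] = description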

# Display the first few entries in the modified data frame
df.head()
   freq  commodity  unit_measure  unit_mult  time_format  time_period  obs_value         ref_area
0  A     PCOAL      Index         0          P1Y          2000         39.3510230293202  All Countries, excluding the IO
1  A     PCOAL      Index         0          P1Y          2001         49.3378587284039  All Countries, excluding the IO
2  A     PCOAL      Index         0          P1Y          2002         39.4949091648006  All Countries, excluding the IO
3  A     PCOAL      Index         0          P1Y          2003         43.2878876950788  All Countries, excluding the IO
4  A     PCOAL      Index         0          P1Y          2004         82.9185858052862  All Countries, excluding the IO
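The loop above can also be written as a single vectorized lookup with Series.map. This is an alternative sketch, not the approach used in the original example. It runs on stand-in frames, and the code "XX" is a hypothetical value included to show what happens to codes with no description.

```python
import pandas as pd

# Stand-in for the unit_measure column of the retrieved data
df = pd.DataFrame({"unit_measure": ["IX", "IX", "XX"]})

# Stand-in for params['unit_measure']
unit_measure = pd.DataFrame({
    "input_code": ["IX"],
    "description": ["Index"],
})

# Build a code -> description lookup indexed by input code
code_to_description = unit_measure.set_index("input_code")["description"]

# Map every code at once; codes without a description keep their original value
df["unit_measure"] = (
    df["unit_measure"].map(code_to_description).fillna(df["unit_measure"])
)
print(df)
```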
