Skip to main content

Pythonic access to UNESCO data

Project description

unesco_reader

PyPI PyPI - Python Version Documentation Status codecov Black

Pythonic access to UNESCO data

unesco_reader is a Python package providing a simple interface to access UNESCO data. UNESCO does not currently provide an API to access its data, particularly the widely used UNESCO Institute for Statistics (UIS) data. Users must download the data from the UIS bulk download services as a zipped file, and then extract the data from the zip file. This requires several manual steps, and some of the datasets are too large to be read easily with a standard spreadsheet program and must be read programmatically. UNESCO provides some guidance on how to do this in their python tutorial.

With unesco_reader, users don't need to worry about downloading the data, extracting it from the zip file, and following the python tutorial - this is all taken care of. This package handles accessing the data directly from the UNESCO website, and provides a simple interface to explore the data.

Note:

This package is currently in development, and only supports UIS datasets. It contains basic functionality to extract and interact with the data, and will be expanded to include more analytical functionality in the future. All feedback, suggestions, and contributions are welcome!

Installation

$ pip install unesco-reader

Usage

To access UIS data, import the uis module from unesco_reader

from unesco_reader import uis

You can see available datasets or retrieve information for a particular dataset. To see all available datasets from UIS, run the following function:

uis.available_datasets()

The output will be a list of available dataset codes ['SDG', 'OPRI', 'SCI', 'SDG11', 'DEM'].

Optionally you can return available datasets as names, and see available datasets that belong to a particular category.

uis.available_datasets(as_names=True, category='education')

To see details about a particular dataset, call the dataset_info() function passing in either the dataset code or name.

uis.dataset_info('SDG')

Information about the dataset will be printed:

----------------  -----------------------------------------------
dataset_name      SDG Global and Thematic Indicators
dataset_code      SDG
dataset_category  education
----------------  -----------------------------------------------

To extract and explore the data in a particular dataset, use the UIS class. A UIS object allows a user to extract the data, either from directly from UIS bulk download services, or from a zipped file downloaded locally, and explore and analyze the data easily.

To use, first create an instance of UIS, passing either the dataset code or name. Here we create an object for the "SDG" dataset.

sdg = uis.UIS("SDG")

Once instantiated, you can retrieve relevant information about the dataset

sdg.dataset_name # SDG Global and Thematic Indicators
sdg.dataset_code # SDG
sdg.dataset_category # education
sdg.link # https://apimgmtstzgjpfeq2u763lag.blob.core.windows.net/uisdatastore/SDG.zip

To access and start exploring the data, you need to load the data to the object using the load_data method. This will download the data from the UNESCO website, clean it, and format it as a pandas DataFrame stored in the object. Optionally, if you have downloaded the zipped file locally, you can pass the path to the file.

sdg = UIS("SDG")
sdg.load_data()

Once the data is loaded, you can access it using the get_data method.

df = sdg.get_data()
print(df)

The result will be a pandas DataFrame with the data. Here is a sample what the data looks like:

INDICATOR_ID INDICATOR_NAME COUNTRY_ID COUNTRY_NAME YEAR VALUE
ADMI.ENDOFLOWERSEC.MAT Administration of a nationally-representative... ABW Aruba 2014 0.0
ADMI.ENDOFLOWERSEC.MAT Administration of a nationally-representative... ABW Aruba 2015 0.0
ADMI.ENDOFLOWERSEC.MAT Administration of a nationally-representative... ABW Aruba 2016 0.0
ADMI.ENDOFLOWERSEC.MAT Administration of a nationally-representative... ABW Aruba 2017 0.0
ADMI.ENDOFLOWERSEC.MAT Administration of a nationally-representative... ABW Aruba 2018 0.0

In the get_data you can specify whether you want to return country or regional (if available) data, and whether to include metadata in the dataframe.

Several other tools are available to explore the data. Please see the documentation for more details.

Contributing

All contributions are welcome! If you find a bug, or have a suggestion for a new feature, or an improvement on the documentation please open an issue. Since this project is under current development, please check open issues and make sure the issue has not been raised already.

A detailed overview of the contribution process can be found here. By contributing to this project, you agree to abide by its terms.

License

unesco_reader was created by Luca Picci. It is licensed under the terms of the MIT license.

Credits

unesco_reader was created with cookiecutter and the py-pkgs-cookiecutter template.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

unesco_reader-0.3.0.tar.gz (11.9 kB view hashes)

Uploaded Source

Built Distribution

unesco_reader-0.3.0-py3-none-any.whl (10.8 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page