Python library for downloading UNICEF indicators via SDMX API
unicefData - Python Package
Python component of the trilingual unicefData library for downloading UNICEF SDG indicators via SDMX API
This is the Python implementation of the unicefData package. For other implementations, see the links below.
Other languages: R | Stata | Main README
Installation
pip install unicefdata
For development:
git clone https://github.com/unicef-drp/unicefData.git
cd unicefData/python
pip install -e ".[dev]"
Verify Installation
import unicefdata
print(unicefdata.__version__)  # Prints the installed version, e.g. 2.1.0
What's New in 2.1.0
- 🧪 Cross-language test suite: 14 shared fixture tests validating structural consistency across Python, R, and Stata
- 📚 YAML schema documentation: Comprehensive format reference for all 7 YAML file types
- 🗑️ Enhanced cache management: 5-layer cache clearing with optional reload, 30-day staleness threshold
- 🔍 Improved 404 errors: All not-found errors now include tried dataflows in error messages
- ✅ Version alignment: All sub-modules now match package version, dynamic User-Agent strings
- 🧹 Removed hardcoded paths: All path resolution is now dynamic
See CHANGELOG.md for complete details.
Quick Start
Search for Indicators
from unicefdata import search_indicators, list_categories
# Search by keyword
search_indicators("mortality")
search_indicators("stunting")
# List all categories
list_categories()
# Search within a category
search_indicators(category="CME")
search_indicators("rate", category="CME")
Download Data
from unicefdata import unicefData
# Fetch under-5 mortality (dataflow auto-detected)
df = unicefData(
indicator="CME_MRY0T4",
countries=["ALB", "USA", "BRA"],
year="2015:2023" # Range, single year, or list
)
print(df.head())
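The `year` argument accepts a range string, a single year, or a list. As a rough illustration of what those three forms mean, the sketch below expands each into an explicit list of years. `normalize_years` is a hypothetical helper written for this example; it is not part of unicefdata, and the package's own parsing may behave differently.

```python
from typing import Iterable, List, Union

YearSpec = Union[int, str, Iterable[int]]


def normalize_years(year: YearSpec) -> List[int]:
    """Expand a year specification into a sorted list of integer years.

    Hypothetical helper for illustration only; not part of unicefdata.
    """
    if isinstance(year, int):
        # Single year, e.g. 2020
        return [year]
    if isinstance(year, str):
        if ":" in year:
            # Inclusive range, e.g. "2015:2023"
            start, end = (int(part) for part in year.split(":", 1))
            if start > end:
                raise ValueError(f"start year {start} is after end year {end}")
            return list(range(start, end + 1))
        # Single year given as a string, e.g. "2020"
        return [int(year)]
    # Any other iterable of years: de-duplicate and sort
    return sorted({int(y) for y in year})
```

For example, `normalize_years("2015:2018")` yields `[2015, 2016, 2017, 2018]`.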
View Dataflow Schema
from unicefdata import dataflow_schema, print_dataflow_schema
schema = dataflow_schema("CME")
print_dataflow_schema(schema)
Post-Production Options
Output Formats
# Long format (default)
df = unicefData(indicator="CME_MRY0T4", format="long")
# Wide format - years as columns
df = unicefData(indicator="CME_MRY0T4", format="wide")
# Wide indicators - indicators as columns
df = unicefData(
indicator=["CME_MRY0T4", "NT_ANT_HAZ_NE2_MOD"],
format="wide_indicators"
)
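To make the difference between the long and wide layouts concrete, here is a minimal pandas sketch that reshapes a toy long-format table so that years become columns. The column names (`iso3`, `indicator`, `period`, `value`) and the numbers are assumptions made up for this example; the columns actually returned by `unicefData()` may differ.

```python
import pandas as pd

# Toy long-format data: one row per country, indicator, and year.
# Column names and values are invented for illustration.
long_df = pd.DataFrame({
    "iso3": ["ALB", "ALB", "BRA", "BRA"],
    "indicator": ["CME_MRY0T4"] * 4,
    "period": [2020, 2021, 2020, 2021],
    "value": [10.0, 9.0, 20.0, 19.0],
})

# Wide layout: one row per country and indicator, one column per year.
wide_df = (
    long_df.pivot(index=["iso3", "indicator"], columns="period", values="value")
    .reset_index()
)
wide_df.columns.name = None  # drop the leftover "period" axis label

print(wide_df)
```

The same `pivot` call with `columns="indicator"` and `index=["iso3", "period"]` would give an indicators-as-columns layout.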
Latest Value Per Country
df = unicefData(indicator="CME_MRY0T4", latest=True)
Most Recent Values (MRV)
df = unicefData(indicator="CME_MRY0T4", mrv=3)
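Conceptually, `latest=True` and `mrv=N` both keep the most recent observations per country. The pandas sketch below shows one way such a filter could be written on a long-format table. It is an illustration under assumed column names (`iso3`, `period`, `value`) with made-up values, not the package's actual implementation.

```python
import pandas as pd

# Toy long-format data; column names and values are invented for illustration.
df = pd.DataFrame({
    "iso3": ["ALB", "ALB", "ALB", "BRA", "BRA"],
    "period": [2019, 2020, 2021, 2018, 2020],
    "value": [3.0, 2.0, 1.0, 6.0, 5.0],
})


def most_recent(frame: pd.DataFrame, n: int = 1) -> pd.DataFrame:
    """Keep the n most recent non-missing rows per country.

    With n=1 this mirrors a "latest value per country" filter.
    Hypothetical helper; not part of unicefdata.
    """
    ordered = frame.dropna(subset=["value"]).sort_values(["iso3", "period"])
    return ordered.groupby("iso3").tail(n).reset_index(drop=True)


latest = most_recent(df, 1)  # one row per country
mrv2 = most_recent(df, 2)    # up to two rows per country
```

Note that the latest year can differ between countries (2021 for ALB and 2020 for BRA in the toy data), which is the usual reason to prefer this filter over selecting a single fixed year.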
Circa (Nearest Year)
df = unicefData(indicator="NT_ANT_HAZ_NE2", year=2015, circa=True)
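The idea behind `circa=True` is to pick, for each country, the observation closest to the requested year when no exact match exists. A minimal pandas sketch of that selection follows. The column names, the toy values, and the tie-breaking rule (prefer the earlier year) are assumptions for this example and may not match the package's behaviour.

```python
import pandas as pd

# Toy data with no observation exactly in 2015; values are invented.
df = pd.DataFrame({
    "iso3": ["ALB", "ALB", "BRA", "BRA"],
    "period": [2012, 2017, 2014, 2019],
    "value": [1.0, 2.0, 3.0, 4.0],
})


def nearest_year(frame: pd.DataFrame, target: int) -> pd.DataFrame:
    """Return, per country, the row whose period is closest to target.

    Ties are broken in favour of the earlier period (an assumption).
    Hypothetical helper; not part of unicefdata.
    """
    distance = (frame["period"] - target).abs()
    ranked = frame.assign(distance=distance).sort_values(
        ["iso3", "distance", "period"]
    )
    nearest = ranked.groupby("iso3").head(1)
    return nearest.drop(columns="distance").reset_index(drop=True)


circa_2015 = nearest_year(df, 2015)
```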
Add Metadata
df = unicefData(
indicator="CME_MRY0T4",
add_metadata=["region", "income_group", "continent"]
)
Combining Options
df = unicefData(
indicator=["CME_MRY0T4", "NT_ANT_HAZ_NE2_MOD"],
format="wide_indicators",
latest=True,
add_metadata=["region", "income_group"],
dropna=True
)
API Reference
unicefData()
Main function for fetching UNICEF indicator data.
| Parameter | Type | Default | Description |
|---|---|---|---|
| indicator | str/list | required | Indicator code(s) |
| dataflow | str | auto-detect | SDMX dataflow ID |
| countries | list | None (all) | ISO3 country codes |
| year | int/str/list | None (all) | Year(s) |
| circa | bool | False | Find closest year |
| sex | str | "_T" | Sex filter |
| max_retries | int | 3 | Retry attempts |
Post-Production Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| format | str | "long" | "long", "wide", "wide_indicators" |
| latest | bool | False | Keep only latest per country |
| mrv | int | None | Keep N most recent values |
| add_metadata | list | None | Metadata to add |
| dropna | bool | False | Remove missing values |
| simplify | bool | False | Keep only essential columns |
UNICEFSDMXClient (Advanced)
from unicefdata import UNICEFSDMXClient
client = UNICEFSDMXClient()
# Fetch single indicator
df = client.fetch_indicator(
"CME_MRY0T4",
countries=["ALB", "USA"],
start_year=2015,
end_year=2023
)
# Fetch multiple indicators
df = client.fetch_multiple_indicators(
["CME_MRY0T4", "NT_ANT_HAZ_NE2_MOD"],
countries=["ALB", "USA"],
combine=True
)
Other Functions
| Function | Description |
|---|---|
| search_indicators(query, category, limit) | Search indicators |
| list_categories() | List all categories |
| list_dataflows() | List available dataflows |
| dataflow_schema(dataflow) | Get dataflow schema |
| clear_cache() | Clear all 5 cache layers |
Time Period Handling
Monthly periods are converted to decimal years:
| Original | Decimal | Calculation |
|---|---|---|
| 2020 | 2020.0 | Integer year |
| 2020-01 | 2020.0833 | 2020 + 1/12 |
| 2020-06 | 2020.5000 | 2020 + 6/12 |
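The conversion in the table can be sketched in a few lines. `period_to_decimal_year` is a hypothetical helper that applies the `year + month/12` rule shown above; it is not taken from the package source and only handles the `YYYY` and `YYYY-MM` forms.

```python
def period_to_decimal_year(period: str) -> float:
    """Convert "YYYY" or "YYYY-MM" to a decimal year using year + month/12.

    Hypothetical helper for illustration; not part of unicefdata.
    """
    parts = period.split("-")
    year = int(parts[0])
    if len(parts) == 1:
        # Plain annual period, e.g. "2020" -> 2020.0
        return float(year)
    month = int(parts[1])
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in period {period!r}")
    return year + month / 12


for p in ("2020", "2020-01", "2020-06"):
    print(p, round(period_to_decimal_year(p), 4))
```

One consequence of this rule is that December maps to the start of the next integer year (`2020-12` becomes `2021.0`), which is worth keeping in mind when filtering on decimal years.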
Common Indicators
Child Mortality (SDG 3.2)
- CME_MRM0 - Neonatal mortality rate
- CME_MRY0T4 - Under-5 mortality rate
Nutrition (SDG 2.2)
- NT_ANT_HAZ_NE2_MOD - Stunting prevalence
- NT_ANT_WHZ_NE2 - Wasting prevalence
Immunization (SDG 3.b)
- IM_DTP3 - DTP3 coverage
- IM_MCV1 - Measles coverage
WASH (SDG 6)
- WS_PPL_W-SM - Safely managed water
- WS_PPL_S-SM - Safely managed sanitation
Child Protection
- PT_CHLD_Y0T4_REG - Birth registration
- PT_F_20-24_MRD_U18_TND - Child marriage
Error Handling
from unicefdata import unicefData
from unicefdata import SDMXNotFoundError, SDMXBadRequestError, SDMXTimeoutError
try:
    df = unicefData(indicator="INVALID_CODE")
except SDMXNotFoundError as e:
    # Raised when no dataflow contains the requested indicator
    print(f"Indicator not found: {e}")
except SDMXBadRequestError as e:
    print(f"Invalid request: {e}")
except SDMXTimeoutError as e:
    print(f"Request timed out: {e}")
Configurable Timeout
from unicefdata import UNICEFSDMXClient
# Set custom timeout (default: 60s)
client = UNICEFSDMXClient(timeout=120)
Troubleshooting
Connection Errors
# Increase retry attempts
df = unicefData(indicator="CME_MRY0T4", max_retries=5)
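If transient failures persist even with a higher `max_retries`, a caller-side retry with exponential backoff is one option. The sketch below is a generic pattern, not the package's internal retry logic. `call_with_retries` and its parameters are hypothetical, and it catches the built-in `ConnectionError` and `TimeoutError` by default; to use it with unicefdata you would pass the relevant exception classes via `retry_on`.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying up to max_retries times with exponential backoff.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, and so on.
    Hypothetical helper; not part of unicefdata.
    """
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            attempt += 1
            if attempt > max_retries:
                # Retries exhausted: re-raise the last error
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

A call might look like `call_with_retries(lambda: unicefData(indicator="CME_MRY0T4"), retry_on=(SDMXTimeoutError,))`, assuming both names have been imported from `unicefdata`.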
Stale Cache
from unicefdata import clear_cache
clear_cache() # Clears all 5 cache layers
Examples
See examples/ folder:
- 00_quick_start.py - Basic usage
- 01_indicator_discovery.py - Finding indicators
- 02_sdg_indicators.py - SDG queries
- 03_data_formats.py - Output formats
- 04_metadata_options.py - Metadata enrichment
- 05_advanced_features.py - Advanced options
Version History
See CHANGELOG.md for complete changelog.
Dependencies
- pandas
- requests
- pyyaml
Acknowledgments
This package was developed at the UNICEF Data and Analytics Section. The author gratefully acknowledges the collaboration of Lucas Rodrigues, Yang Liu, and Karen Avanesian, whose technical contributions and feedback were instrumental in the development of this Python package.
Special thanks to Yves Jaques, Alberto Sibileau, and Daniele Olivotti for designing and maintaining the UNICEF SDMX data warehouse infrastructure that makes this package possible.
The author also acknowledges the UNICEF database managers and technical teams who ensure data quality, as well as the country office staff and National Statistical Offices whose data collection efforts make this work possible.
Development of this package was supported by UNICEF institutional funding for data infrastructure and statistical capacity building. The author also acknowledges UNICEF colleagues who provided testing and feedback during development, as well as the broader open-source Python community.
Development was assisted by AI coding tools (GitHub Copilot, Claude). All code has been reviewed, tested, and validated by the package maintainers.
Disclaimer
This package is provided for research and analytical purposes.
The unicefData package provides programmatic access to UNICEF's public data warehouse. While the author is affiliated with UNICEF, this package is not an official UNICEF product and the statements in this documentation are the views of the author and do not necessarily reflect the policies or views of UNICEF.
Data accessed through this package comes from the UNICEF Data Warehouse. Users should verify critical data points against official UNICEF publications at data.unicef.org.
This software is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or UNICEF be liable for any claim, damages or other liability arising from the use of this software.
The designations employed and the presentation of material in this package do not imply the expression of any opinion whatsoever on the part of UNICEF concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries.
Data Citation and Provenance
Important Note on Data Vintages
Official statistics are subject to revisions as new information becomes available and estimation methodologies improve. UNICEF indicators are regularly updated based on new surveys, censuses, and improved modeling techniques. Historical values may be revised retroactively to reflect better information or methodological improvements.
For reproducible research and proper data attribution, users should:
- Document the indicator code - Specify the exact SDMX indicator code(s) used (e.g., CME_MRY0T4)
- Record the download date - Note when data was accessed (e.g., "Data downloaded: 2026-02-09")
- Cite the data source - Reference both the package and the UNICEF Data Warehouse
- Archive your dataset - Save a copy of the exact data used in your analysis
Example citation for data used in research:
Under-5 mortality data (indicator: CME_MRY0T4) accessed from UNICEF Data Warehouse via unicefData Python package (v2.1.0) on 2026-02-09. Data available at: https://sdmx.data.unicef.org/
This practice ensures that others can verify your results and understand any differences that may arise from data updates. For official UNICEF statistics in publications, always cross-reference with the current version at data.unicef.org.
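The citation pattern above can be generated programmatically so that the indicator code, package version, and access date are always recorded together. This is a minimal sketch; `build_citation` is a hypothetical helper, not a function provided by the package.

```python
from datetime import date


def build_citation(
    description: str,
    indicator: str,
    package_version: str,
    accessed: date,
) -> str:
    """Format a data citation following the example in this README.

    Hypothetical helper for illustration; not part of unicefdata.
    """
    return (
        f"{description} (indicator: {indicator}) accessed from UNICEF Data "
        f"Warehouse via unicefData Python package (v{package_version}) "
        f"on {accessed.isoformat()}. "
        "Data available at: https://sdmx.data.unicef.org/"
    )


citation = build_citation(
    "Under-5 mortality data", "CME_MRY0T4", "2.1.0", date(2026, 2, 9)
)
print(citation)
```

In practice the version could come from `unicefdata.__version__` and the date from `date.today()`, and the string can be stored next to the archived dataset.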
License
MIT License - See LICENSE
Author
Joao Pedro Azevedo
Chief Statistician, UNICEF Data and Analytics Section
Email: jpazevedo@unicef.org
Website: jpazvd.github.io
Contributing
See CONTRIBUTING.md for detailed guidelines.