Python client to help you work with WorldPop data for any region on earth
WorldPopPy README
A Python client for downloading, merging, and processing WorldPop raster data.
WorldPopPy provides a programmatic interface to the WorldPop open data archive.
WorldPop offers global, gridded datasets on population dynamics, night-light emissions, topography, and much more. These datasets are typically distributed as individual files per country. WorldPopPy abstracts the process of data discovery, retrieval, and preprocessing. Users query data by Area of Interest (AOI). The library automatically identifies the necessary country rasters, downloads them, and merges them into a unified dataset.
(See the Example Gallery below for a visual overview of the library's capabilities.)
Key Features
- Fetch data for any region by passing GeoDataFrames, country codes, or bounding boxes.
- Easy handling of time-series through integration with xarray.
- Built-in optimisations to help you handle massive country rasters.
- Parallel data downloads with automatic retry logic, local caching, and dry-run support.
- Searchable data manifest, allowing you to quickly find WorldPop products of interest.
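To make the download-related feature concrete, here is a minimal standard-library sketch of parallel fetching with retries and a local file cache. It is an illustration of the general technique only, not WorldPopPy's implementation: `fetch_with_retry`, `fetch_all`, `flaky_fetch`, and the `example.org` URLs are all hypothetical names invented for this sketch.

```python
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fetch_with_retry(fetch, url, cache_dir, retries=3, delay=0.0):
    """Return the cached file for `url`, fetching it (with retries) if missing."""
    target = Path(cache_dir) / url.rsplit("/", 1)[-1]
    if target.exists():  # cache hit: skip the download entirely
        return target
    for attempt in range(1, retries + 1):
        try:
            target.write_bytes(fetch(url))
            return target
        except OSError:
            if attempt == retries:
                raise  # give up after the final attempt
            time.sleep(delay)  # wait before trying again


def fetch_all(fetch, urls, cache_dir, workers=4):
    """Fetch several URLs concurrently, preserving the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda u: fetch_with_retry(fetch, u, cache_dir), urls))


# Hypothetical stand-in for a real HTTP call: fails once per URL, then succeeds.
calls = {}


def flaky_fetch(url):
    calls[url] = calls.get(url, 0) + 1
    if calls[url] == 1:
        raise OSError("simulated network error")
    return b"raster bytes for " + url.encode()


cache = tempfile.mkdtemp()
urls = ["https://example.org/a.tif", "https://example.org/b.tif"]
paths = fetch_all(flaky_fetch, urls, cache)
print([p.name for p in paths])
```

Calling `fetch_all` a second time with the same `cache` directory returns the same paths without invoking `flaky_fetch` again, which is the behaviour a local cache is meant to provide.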
Installation
pip install worldpoppy
Quickstart
Example 1: Merging Population Rasters for Several Countries
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from worldpoppy import wp_raster, clean_axes, plot_country_borders
# Fetch & Merge Data
# `wp_raster` returns an xarray.DataArray ready for analysis and plotting.
countries = ['THA', 'KHM', 'LAO', 'VNM']
pop_data = wp_raster(
    product_name='pop_g2_1km_r25a',  # Low-res. pop. estimates (Global 2 series)
    aoi=countries, years=2024
)
# Plot (Log-scale)
# We use fillna(0) to represent areas without population and +1 to avoid log(0).
(pop_data.fillna(0) + 1).plot(norm=LogNorm(), cmap='inferno', size=6)
plot_country_borders(countries, edgecolor='white', linewidth=0.5)
clean_axes(title=f"Lower Mekong Region (2024):\n{pop_data.sum() / 1e6:.1f}M People")
plt.show()
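The `+ 1` offset in the plotting step matters because a logarithmic scale is undefined at zero. The following standard-library illustration uses made-up pixel values, not real WorldPop data, to show the effect:

```python
import math

# Made-up pixel values (people per cell); 0 stands for an unpopulated cell.
pixels = [0.0, 9.0, 99.0, 999.0]

# log10(0) is undefined, so Python raises ValueError for the raw zero.
try:
    math.log10(pixels[0])
    zero_is_valid = True
except ValueError:
    zero_is_valid = False

# Shifting every value by 1 maps 0 to log10(1) = 0 and keeps the ordering intact.
shifted_logs = [math.log10(p + 1) for p in pixels]

print(zero_is_valid)
print(shifted_logs)
```

The shift distorts small values slightly, but for population counts spanning several orders of magnitude the visual effect is negligible.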
Example 2: Built-in Support for Time-series
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from worldpoppy import wp_raster, bbox_from_location, clean_axes
# Fetch Two Years of Night-light Data for Sihanoukville (Cambodia)
ntl_data = wp_raster(
    product_name="ntl_viirs_g2",
    aoi=bbox_from_location("Preah Sihanouk", width_km=100),
    years=['first', 'last']  # Request first & last available year
)
# Plot: Xarray can create a facet grid by year
p = (ntl_data + 1).plot(
    col="year", figsize=(10, 5),
    cmap="inferno", vmax=50, norm=LogNorm(),
    add_colorbar=False  # Remove since radiance units are not intuitive
)
p.fig.suptitle('Night-light Growth in Sihanoukville', fontsize=12, fontweight='bold')
p.fig.subplots_adjust(top=0.875)
clean_axes(p)
plt.show()
Finding Data
Use show_supported_data_products for a quick overview of what is supported by WorldPopPy:
from worldpoppy import show_supported_data_products
# Print data products related to "population" from the Global 2 series
show_supported_data_products(keywords=["population", "global2"])
# Print static (single-year) data products available for Brazil
show_supported_data_products(static_only=True, iso3_codes="BRA")
Alternatively, you can get the library's full data manifest as a pandas DataFrame:
from worldpoppy import wp_manifest
mdf = wp_manifest()
mdf.head()
Documentation
- API Reference: https://worldpoppy.readthedocs.io/
- Examples: See the examples/ folder in this repository.
Example Gallery
1. Visualising Night Lights
Quickly fetch, merge, and reproject night-light data for North and South Korea.
2. Analysing Population Growth
Visualise 10-year population change along the coast of West Africa.
3. Automatic Memory Optimisation
Handle large source rasters (2GB+) efficiently via automatic spatial subsetting.
4. Manual Memory Optimisation
Easily clip country geometries and lazy-load rasters with Dask.
Utilities
WorldPopPy includes helper functions to manage the local cache and download bandwidth.
1. Managing the Cache
Downloaded rasters are cached locally by default. You can change the location by setting the WORLDPOPPY_CACHE_DIR environment variable.
from worldpoppy import purge_cache, get_cache_dir
# Print the cache directory
print(get_cache_dir())
# Check local cache size
purge_cache(dry_run=True)
# Delete all cached files
purge_cache(dry_run=False)
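The cache location can also be redirected from within Python, as in the minimal sketch below. The source does not say when the library reads WORLDPOPPY_CACHE_DIR, so the sketch assumes the safest order: set the variable before importing worldpoppy. The directory name `worldpoppy_cache_demo` is a hypothetical placeholder.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical cache location; replace with a path on a disk with enough space.
cache_dir = Path(tempfile.gettempdir()) / "worldpoppy_cache_demo"
cache_dir.mkdir(parents=True, exist_ok=True)

# Assumption: set the variable before importing worldpoppy, in case the
# library resolves the cache location at import time.
os.environ["WORLDPOPPY_CACHE_DIR"] = str(cache_dir)

print(os.environ["WORLDPOPPY_CACHE_DIR"])
```

If the variable was picked up, a subsequent call to `get_cache_dir()` should report the new location.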
2. Download Dry Run
To estimate the size of a request before downloading, use the download_dry_run flag:
from worldpoppy import wp_raster
# Prints a summary of files to be downloaded without fetching them
wp_raster(
    product_name='pop_g1',
    aoi=['CAN', 'USA'],
    years='all',
    download_dry_run=True
)
Data Usage & Attribution
WorldPopPy is a client for accessing data; it does not host or own the data. Please note the following points regarding data provenance and citation:
- Curated "Product Names": To simplify data discovery, this library organises WorldPop's thousands of raw files into curated "Data Products" with a consistent naming scheme (e.g., pop_g1_alt or pop_g2_alt). These product names are specific to WorldPopPy.
- Know Your Data: While this library makes downloading and pre-processing easy, we strongly encourage you to understand what you are downloading. WorldPop datasets are often the result of complex modelling. Always check the summary_url provided in the manifest for details and further notes.
from worldpoppy import wp_manifest
# Select country entries for one "product" using its curated WorldPopPy alias
mdf = wp_manifest(product_name='pop_g2_alt', iso3_codes='AFG')
# Inspect the raw metadata for one raster file (sourced from the WorldPop API)
row = mdf.iloc[0]
print(f"Source File Name: {row.dataset_name}")
print(f"Official Dataset Title: {row.api_entry_title}")
print(f"Official Data Category: {row.api_series_category}")
print(f"Dataset Summary: {row.summary_url}") # Read this before using data!
print("-----")
# > The internal fields below are for data discovery in WorldPopPy
print(f"Library Product Name: {row.product_name}")
print(f"Multi-year Product? {row.multi_year}")
print(f"Library Product Notes: {row.product_notes}")
- Cite the Source: If you use this data, please cite its original creators (WorldPop). The scientific credit belongs to them. Note that the recommended citation style can differ between datasets, so be sure to check the summary_url for details.
Acknowledgements
WorldPopPy is inspired by the World Bank's BlackMarblePy package, which provided the blueprint for this library's download module and informed the API design.
Licence
This project is licensed under the Mozilla Public License. See LICENSE.txt for details.