Library to help you work with `WorldPop` data for any region on earth
WorldPopPy
WorldPopPy is a Python package that helps you work with data from the WorldPop project. WorldPop offers global, gridded geo-datasets on population dynamics, land-cover features, night-light emissions, and several other attributes of human and natural geography. This package streamlines the process of downloading, combining, and cleaning WorldPop data for different geographic regions and years.
Key Features
- Fetch data for any part of the world by passing GeoDataFrames, country codes, or bounding boxes.
- Easy handling of annual time-series through integration with xarray.
- Parallel data downloads with retry mechanism and ability to preview estimated download sizes (dry run).
- Auto-updating manifest file so you stay up-to-date with WorldPop’s latest available datasets.
Installation
WorldPopPy is available on PyPI and can be installed using pip:
pip install worldpoppy
Quickstart
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from worldpoppy import wp_raster, clean_axis
# Fetch night-light data for the Korean Peninsula.
# Data is returned as an `xarray.DataArray` ready for analysis and plotting
viirs_data = wp_raster(
    product_name='viirs_100m',  # name of WorldPop's night-light product
    aoi=['PRK', 'KOR'],  # three-letter country codes for North and South Korea
    years=2015,
    masked=True,  # mask missing values with NaN (instead of WorldPop's default fill value)
)
# Downsample the data to speed up plotting
lowres = viirs_data.coarsen(x=5, y=5, boundary='trim').mean()
# Plot
lowres.plot(vmin=0.1, cmap='inferno', norm=LogNorm())
clean_axis(title='Night Lights (2015)\nKorean Peninsula')
plt.show()
More detailed example
Below, we visualise population growth in a patch of West Africa from 2000 to 2020. The geographic area of interest is selected with a helper function that can convert a location name into a bounding box. The example below also shows you how to re-project WorldPop data into a different Coordinate Reference System (CRS).
import matplotlib.pyplot as plt
import numpy as np
from worldpoppy import (
    wp_raster, clean_axis, bbox_from_location,
    plot_country_borders, plot_location_markers,
)
# Define the area of interest
# Note: `bbox_from_location` runs a `Nominatim` query under the hood
aoi_box = bbox_from_location('Accra', width_km=500) # returns (min_lon, min_lat, max_lon, max_lat)
# Define the target CRS (optional)
aeqa_africa = "ESRI:102022" # an Albers Equal Area projection optimised for Africa
# Fetch the population data
pop_data = wp_raster(
    product_name='ppp',  # name of the WorldPop product (here: # of people per raster cell)
    aoi=aoi_box,  # you could also pass a GeoDataFrame or official country codes
    years=[2000, 2020],  # the years of interest (for annual WorldPop products only)
    masked=True,  # mask missing values with NaN (instead of WorldPop's default fill value)
    to_crs=aeqa_africa  # if None is provided, CRS of the source data will be kept (EPSG:4326)
)
# Compute population changes on downsampled data
lowres = pop_data.coarsen(x=10, y=10, year=1, boundary='trim').reduce(np.sum) # will propagate NaNs
pop_change = lowres.sel(year=2020) - lowres.sel(year=2000)
# Plot
pop_change.plot(cmap='coolwarm', vmax=1_000, cbar_kwargs=dict(shrink=0.85))
clean_axis(title='Estimated population change (2000 to 2020)')
# Add visual references
plot_country_borders(['GHA', 'TOG', 'BEN'], edgecolor='white', to_crs=aeqa_africa)
plot_location_markers(['Accra', 'Kumasi', 'Lomé'], to_crs=aeqa_africa)
plt.show()
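The bounding box in the example above is a plain (min_lon, min_lat, max_lon, max_lat) tuple, so it can also be built by hand when a geocoding query is not wanted. The sketch below approximates a square box around a centre point using a spherical-Earth rule of thumb (roughly 111.32 km per degree of latitude). The helper name `bbox_from_center` and its parameters are hypothetical and not part of WorldPopPy, and the sketch is not meant to reproduce how `bbox_from_location` works internally.

```python
import math

# Approximate length of one degree of latitude (assumption: spherical Earth)
KM_PER_DEGREE = 111.32


def bbox_from_center(lon, lat, width_km):
    """Approximate a square bounding box of `width_km` around (lon, lat).

    Hypothetical helper, not part of WorldPopPy. Returns a tuple in
    (min_lon, min_lat, max_lon, max_lat) order.
    """
    half_lat = (width_km / 2) / KM_PER_DEGREE
    # Degrees of longitude shrink with the cosine of the latitude
    half_lon = half_lat / math.cos(math.radians(lat))
    return (lon - half_lon, lat - half_lat, lon + half_lon, lat + half_lat)


# Rough coordinates for Accra (assumed values, for illustration only)
accra_box = bbox_from_center(lon=-0.19, lat=5.60, width_km=500)
```

The resulting tuple could then be passed as `aoi=accra_box`. The approximation degrades towards the poles, and no wrapping is applied at the antimeridian.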
Further details
Data dimensions
Calling wp_raster() will always
return an xarray.DataArray. The array dimensions, however, depend on the user query. If you request data for more
than one year, the returned array will include a year dimension in addition to the raster data's two spatial dimensions
(x and y). By contrast, the year dimension will be omitted if you request data for a single year only, or if the
WorldPop product in question is static anyway (e.g., when requesting elevation data).
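Because the presence of the year dimension depends on the query, downstream code may want to check for it before selecting by year. A minimal sketch follows. `has_year_dim` is a hypothetical helper, and the stand-in objects only mimic the `dims` attribute of an `xarray.DataArray` so that the snippet runs without downloading anything. The ordering of dimensions in the stand-ins is an assumption.

```python
from types import SimpleNamespace


def has_year_dim(arr):
    """Return True if `arr` (e.g. an xarray.DataArray) has a `year` dimension."""
    return "year" in arr.dims


# Stand-ins mimicking only the `.dims` attribute of a DataArray.
# The dimension order shown here is an assumption for illustration.
multi_year = SimpleNamespace(dims=("year", "y", "x"))  # e.g. years=[2000, 2020]
single_year = SimpleNamespace(dims=("y", "x"))  # e.g. years=2015 or a static product

for label, arr in [("multi-year", multi_year), ("single-year", single_year)]:
    if has_year_dim(arr):
        print(f"{label}: select a year first, e.g. arr.sel(year=...)")
    else:
        print(f"{label}: already a 2D raster")
```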
Managing the local cache
By default, downloaded source data from WorldPop will be cached on disk for re-use. To disable caching, set cache_downloads=False
when calling wp_raster(). The default cache directory is ~/.cache/worldpoppy. This can be changed by pointing the WORLDPOPPY_CACHE_DIR
environment variable to the desired location, as shown here.
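One way to set this variable from within Python is sketched below. Setting it before `worldpoppy` is imported is a cautious assumption, since the text does not say when the variable is read. The directory name is a placeholder.

```python
import os
from pathlib import Path

# Placeholder location; any writable directory should do
custom_cache = Path.home() / "data" / "worldpoppy_cache"

# Set the variable before importing worldpoppy
# (assumption: the package may read it at import time)
os.environ["WORLDPOPPY_CACHE_DIR"] = str(custom_cache)

# import worldpoppy  # import only after the variable is set
```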
Use the following function to delete all cached data or simply check the local cache size:
from worldpoppy import purge_cache
purge_cache(dry_run=True)
# dry run will only print a cache summary and not delete any files
Download dry runs
Before you request data for large geographic areas and/or many years, you may want to check download requirements first.
Setting download_dry_run=True will check download requirements and print a summary:
from worldpoppy import wp_raster
_ = wp_raster(
    product_name='ppp',
    aoi='CAN USA MEX'.split(),
    years='all',  # query all available years for the specified product
    download_dry_run=True  # do not actually download anything and merely print a summary
)
# Note that `wp_raster` will return `None` in this case
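A dry run pairs naturally with a confirmation step before the real download. The sketch below relies only on what is stated above: that `download_dry_run=True` prints a summary and returns `None`. The helper `fetch_after_preview` and its `confirm` callback are hypothetical, and passing `download_dry_run=False` explicitly is assumed to be equivalent to omitting the argument. The fetch function is passed in as an argument so the pattern can be tried without network access.

```python
def fetch_after_preview(fetch, confirm, **query):
    """Run a dry run first, then download only if `confirm()` returns True.

    `fetch` is expected to behave like `wp_raster`; `confirm` is any
    zero-argument callable returning a bool. Both names are hypothetical.
    """
    fetch(download_dry_run=True, **query)  # prints a summary, returns None
    if not confirm():
        return None
    return fetch(download_dry_run=False, **query)


# Intended usage (not executed here, since it needs worldpoppy and network access):
# from worldpoppy import wp_raster
# data = fetch_after_preview(
#     wp_raster,
#     confirm=lambda: input("Proceed with download? [y/N] ").lower() == "y",
#     product_name='ppp',
#     aoi=['CAN', 'USA', 'MEX'],
#     years='all',
# )
```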
Selecting data with a GeoDataFrame
Passing a GeoDataFrame as the area of interest is straightforward, as shown in this example.
The WorldPop data manifest
Use the wp_manifest function
to load and optionally filter the manifest file listing all available WorldPop datasets:
from worldpoppy import wp_manifest
full_manifest = wp_manifest() # returns a `pandas.DataFrame`
full_manifest.head(2)
The local manifest file is auto-updated by comparing it against a remote version hosted on WorldPop servers.
If needed, the remote manifest is downloaded and cleaned for local use. Note that the remote WorldPop manifest sometimes
lists datasets that are not actually available for download. Requesting such datasets will trigger a DownloadError.
Downloads only?
If you are only interested in asynchronous country-data downloads from WorldPop, without any other functionality,
use the WorldPopDownloader class:
from worldpoppy import WorldPopDownloader
raster_fpaths = WorldPopDownloader().download(
    product_name='srtm_slope_100m',  # topographic slope
    iso3_codes=['LIE'],  # Liechtenstein
)
Acknowledgements
The implementation of WorldPopPy draws on the World Bank's BlackMarblePy package, which gives users easy access to night-light data from NASA's Black Marble project.
Feedback
If you would like to give feedback, encounter issues, or want to suggest improvements, please open an issue. Since this package is developed and tested on Linux, issues encountered on other platforms may take longer to address.
License
This project is licensed under the Mozilla Public License. See LICENSE.txt for details.