A fast implementation of querying for NUTS regions by location.
FastPyNUTS
A fast implementation for querying the NUTS (Nomenclature of Territorial Units for Statistics) dataset by location, particularly useful for large-scale applications.
Figure: Eurostat
Features
- fast querying of NUTS regions (~0.3ms/query)
- find all NUTS regions of a point or query user-defined NUTS-levels (0-3)
- use your own custom NUTS dataset (other CRS, enriched metadata, etc.)
Installation
pip install fastpynuts
FastPyNUTS requires numpy, shapely, treelib and rtree.
Usage
Initialization and finding NUTS regions
The NUTSfinder class is the main tool to determine the NUTS regions of a point. It can be initialized from a local file containing the NUTS regions, or via automatic download from Eurostat.
from fastpynuts import NUTSfinder
# construct from local file
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson")
# retrieve data automatically (the file is downloaded to '.data', or read from there if it already exists)
nf = NUTSfinder.from_web(scale=1, year=2021, epsg=4326)
# find NUTS regions
point = (11.57, 48.13)
regions = nf.find(*point) # find all regions
regions3 = nf.find_level(*point, 3) # only find NUTS-3 regions
Assessing the results
The NUTS regions are returned as an ordered list of NUTSregion objects.
>>> regions
[NUTS0: DE, NUTS1: DE2, NUTS2: DE21, NUTS3: DE212]
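The IDs in this output follow the hierarchical NUTS code scheme: a two-letter country code followed by one additional character per level. As a minimal sketch of that relationship, the helpers below derive the level and the chain of parent IDs from an ID string alone. The function names nuts_level and nuts_ancestry are hypothetical and are not part of FastPyNUTS.

```python
def nuts_level(nuts_id: str) -> int:
    """Return the NUTS level (0-3) implied by the length of a NUTS ID."""
    # two-letter country code, then one extra character per level
    level = len(nuts_id) - 2
    if not 0 <= level <= 3:
        raise ValueError(f"not a valid NUTS ID: {nuts_id!r}")
    return level


def nuts_ancestry(nuts_id: str) -> list:
    """Return the IDs of all regions from NUTS-0 down to the given ID."""
    return [nuts_id[: 2 + lvl] for lvl in range(nuts_level(nuts_id) + 1)]


print(nuts_ancestry("DE212"))  # ['DE', 'DE2', 'DE21', 'DE212']
```

This only manipulates strings; it does not check that a region with the given ID exists in the dataset.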
Each region object holds information about
- its ID and NUTS level
>>> region = regions[0]
>>> region.id
DE
>>> region.level
0
- its geometry (a shapely Polygon or MultiPolygon) and the corresponding bounding box
>>> region.geom
<MULTIPOLYGON (((10.454 47.556, 10.44 47.525, 10.441 47.514, 10.432 47.504, ...>
>>> region.bbox
(5.867697, 47.270114, 15.04116, 55.058165)
- further fields from the NUTS dataset and the original input feature in GeoJSON format
>>> region.properties
{
"NUTS_ID": "DE",
"LEVL_CODE": 0,
"CNTR_CODE": "DE",
"NAME_LATN": "Deutschland",
"NUTS_NAME": "Deutschland",
"MOUNT_TYPE": 0,
"URBN_TYPE": 0,
"COAST_TYPE": 0,
"FID": "DE"
}
>>> region.feature
{
'type': 'Feature',
'geometry': {
'type': 'MultiPolygon',
'coordinates': [
[
[
[10.454439, 47.555797],
...
]
]
],
},
'properties': {
"NUTS_ID": "DE",
...
}
}
Advanced Usage
# apply a buffer to the input regions to catch points on the boundary (for further info on the buffering, see the documentation)
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson", buffer_geoms=1e-5)
# only load certain levels of regions (here levels 2 and 3)
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson", min_level=2, max_level=3)
# if the point to be queried is guaranteed to lie within a NUTS region, setting valid_point to True may speed up the runtime
regions = nf.find(*point, valid_point=True)
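For large-scale use, a common pattern is to look up many points and group them by region. The sketch below shows one way to do that with the find_level method and the id attribute shown above. The helper group_points_by_region is hypothetical, and it assumes that find_level returns an empty list when a point lies outside all regions; that behaviour should be checked against the documentation.

```python
from collections import defaultdict


def group_points_by_region(finder, points, level=3):
    """Map NUTS IDs at `level` to the points that fall inside them.

    `finder` is any object with a `find_level(x, y, level)` method that
    returns a list of objects with an `id` attribute, such as a NUTSfinder.
    Points without a match are collected under the key None
    (assumption: no match yields an empty list).
    """
    groups = defaultdict(list)
    for x, y in points:
        regions = finder.find_level(x, y, level)
        if not regions:
            groups[None].append((x, y))
        for region in regions:
            groups[region.id].append((x, y))
    return dict(groups)


# Usage with a finder built as in the examples above:
# nf = NUTSfinder.from_web(scale=1, year=2021, epsg=4326)
# groups = group_points_by_region(nf, [(11.57, 48.13), (13.40, 52.52)])
```

Because the helper only relies on find_level and id, it can be exercised with a small stub object in unit tests without loading the full dataset.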
Runtime Comparison
FastPyNUTS is optimized for query speed and result correctness, at the cost of a longer initialization time.
An R-tree-based approach proved to be the fastest option. Compared to other packages like nuts-finder, a large performance boost can be achieved.
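To illustrate why a spatial index helps, the sketch below shows the general two-stage idea behind such lookups: a cheap bounding-box filter narrows down the candidates, and an exact point-in-polygon test runs only on those. This is not the FastPyNUTS implementation. The R-tree is replaced by a plain linear scan over bounding boxes, the exact test is a simple even-odd ray casting routine instead of shapely, and all names and the toy data are hypothetical.

```python
def in_bbox(x, y, bbox):
    """Check a point against a (minx, miny, maxx, maxy) bounding box."""
    minx, miny, maxx, maxy = bbox
    return minx <= x <= maxx and miny <= y <= maxy


def in_ring(x, y, ring):
    """Even-odd ray casting test against a ring given as a list of (x, y) vertices."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_regions(x, y, regions):
    """regions: list of (region_id, bbox, ring) tuples (toy layout, an assumption)."""
    candidates = [r for r in regions if in_bbox(x, y, r[1])]  # cheap filter
    return [r[0] for r in candidates if in_ring(x, y, r[2])]  # exact test


# toy data: a triangle and a square, not real NUTS geometries
toy_regions = [
    ("A", (0, 0, 4, 4), [(0, 0), (4, 0), (0, 4)]),
    ("B", (3, 3, 6, 6), [(3, 3), (6, 3), (6, 6), (3, 6)]),
]
print(find_regions(1, 1, toy_regions))      # ['A']
print(find_regions(3.5, 3.5, toy_regions))  # ['B']
```

In the second query the point lies inside both bounding boxes but only inside the square, which is why the exact test is still needed after the filter. An R-tree speeds up the first stage by avoiding the scan over every bounding box.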
Tips:
- if interested only in certain levels (0-3) of the NUTS dataset, initialize the NUTSfinder using its min_level and max_level arguments
- if it's known beforehand that the queried point lies within the interior of a NUTS region, use find(valid_point=True)
For a full runtime analysis, see benchmark.ipynb