
A fast implementation of querying for NUTS regions by location.


FastPyNUTS

A fast implementation for querying the NUTS (Nomenclature of Territorial Units for Statistics) dataset by location, particularly useful for large-scale applications.

Figure: NUTS levels (source: Eurostat)

Features

  • fast querying of NUTS regions (~0.3ms/query)
  • find all NUTS regions of a point or query user-defined NUTS-levels (0-3)
  • use your own custom NUTS dataset (other CRS, enriched metadata, etc.)

Installation

pip install fastpynuts

FastPyNUTS requires numpy, shapely, treelib and rtree.

Usage

Initialization and finding NUTS regions

The NUTSfinder class is the main tool to determine the NUTS regions of a point. It can be initialized from a local file containing the NUTS regions, or via automatic download from Eurostat.

from fastpynuts import NUTSfinder

# construct from local file
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson")

# retrieve data automatically (the file is downloaded to '.data', or read from there if it already exists)
nf = NUTSfinder.from_web(scale=1, year=2021, epsg=4326)


# find NUTS regions
point = (11.57, 48.13)
regions = nf.find(*point)                   # find all regions
regions3 = nf.find_level(*point, 3)         # only find NUTS-3 regions

Assessing the results

The NUTS regions will be returned as an ordered list of NUTSregion objects.

>>> regions
[NUTS0: DE, NUTS1: DE2, NUTS2: DE21, NUTS3: DE212]
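
The ordering mirrors the hierarchical layout of NUTS codes: a two-letter country code, extended by one character per level. A minimal sketch, independent of FastPyNUTS (the helper name nuts_ancestors is made up for illustration), derives the codes of all enclosing regions from a single ID:

```python
def nuts_ancestors(nuts_id: str) -> list[str]:
    """Return the NUTS codes from level 0 down to the level of ``nuts_id``.

    Assumes the standard code layout: two-letter country code plus one
    character per level, so the level equals ``len(nuts_id) - 2``.
    """
    if not 2 <= len(nuts_id) <= 5:
        raise ValueError(f"not a NUTS level 0-3 code: {nuts_id!r}")
    return [nuts_id[:n] for n in range(2, len(nuts_id) + 1)]


print(nuts_ancestors("DE212"))  # ['DE', 'DE2', 'DE21', 'DE212']
```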

Each region object holds information about

  • its ID and NUTS level
>>> region = regions[0]
>>> region.id
DE
>>> region.level
0
  • its geometry (a shapely Polygon or MultiPolygon) and the corresponding bounding box
>>> region.geom
<MULTIPOLYGON (((10.454 47.556, 10.44 47.525, 10.441 47.514, 10.432 47.504, ...>
>>> region.bbox
(5.867697, 47.270114, 15.04116, 55.058165)
  • further fields from the NUTS dataset and the original input feature in GeoJSON format
>>> region.properties
{
    "NUTS_ID": "DE",
    "LEVL_CODE": 0,
    "CNTR_CODE": "DE",
    "NAME_LATN": "Deutschland",
    "NUTS_NAME": "Deutschland",
    "MOUNT_TYPE": 0,
    "URBN_TYPE": 0,
    "COAST_TYPE": 0,
    "FID": "DE"
}
>>> region.feature
{
    'type': 'Feature',
    'geometry': {
        'type': 'MultiPolygon',
        'coordinates': [
            [
                [
                    [10.454439, 47.555797],
                    ...
                ]
            ]
        ],
    },
    'properties': {
        "NUTS_ID": "DE",
        ...
    }
}

Advanced Usage

# apply a buffer to the input regions to catch points on the boundary (for further info on the buffering, see the documentation)
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson", buffer_geoms=1e-5)

# only load certain levels of regions (here levels 2 and 3)
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson", min_level=2, max_level=3)


# if the point to be queried is guaranteed to lie within a NUTS region, setting valid_point to True may reduce the runtime
regions = nf.find(*point, valid_point=True)

Runtime Comparison

FastPyNUTS is optimized for query speed and result correctness, at the cost of a longer initialization time.

An R-tree-based approach proved to be the fastest option:

Figure: Benchmark for scale 1.

Compared to other packages such as nuts-finder, FastPyNUTS achieves a large performance boost.
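
The idea behind a spatial-index lookup can be sketched in plain Python: cheap bounding-box checks narrow down the candidates, and only those get the exact point-in-polygon test. This is not the FastPyNUTS implementation. A linear scan over the boxes stands in for the R-tree, polygons are simple rings without holes, and all names are illustrative.

```python
def bbox(ring):
    """Bounding box (minx, miny, maxx, maxy) of a list of (x, y) vertices."""
    xs, ys = zip(*ring)
    return min(xs), min(ys), max(xs), max(ys)


def in_bbox(box, x, y):
    minx, miny, maxx, maxy = box
    return minx <= x <= maxx and miny <= y <= maxy


def in_ring(ring, x, y):
    """Exact test via ray casting: toggle on each edge crossed to the right."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_regions(regions, x, y):
    """Return the names of all regions containing (x, y).

    ``regions`` is a list of (name, ring, box) tuples with precomputed boxes.
    A real R-tree would answer the box query without scanning every entry.
    """
    candidates = [r for r in regions if in_bbox(r[2], x, y)]
    return [name for name, ring, _ in candidates if in_ring(ring, x, y)]


# toy data: a square and a triangle whose bounding boxes overlap
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
triangle = [(1, 1), (4, 1), (4, 4)]
regions = [(name, ring, bbox(ring))
           for name, ring in [("square", square), ("triangle", triangle)]]

print(find_regions(regions, 0.5, 0.5))  # ['square']
print(find_regions(regions, 1.5, 1.8))  # ['square']
print(find_regions(regions, 3.5, 2.0))  # ['triangle']
```

The second query falls inside both bounding boxes, but the exact test rejects the triangle. This is why a box hit alone is not a result.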

Tips:

  • if interested only in certain levels (0-3) of the NUTS dataset, initialize the NUTSfinder using its min_level and max_level arguments
  • if it's known beforehand that the queried point lies within the interior of a NUTS region, use find(valid_point=True)

For a full runtime analysis, see benchmark.ipynb.
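
For a rough per-query figure on your own machine, a small timing helper is enough. The sketch below is not taken from benchmark.ipynb; mean_query_ms is a hypothetical helper that times any callable taking (x, y), such as nf.find, and the coordinate ranges are illustrative values only.

```python
import random
import time


def mean_query_ms(query, points):
    """Return the mean wall-clock time per call of ``query(x, y)`` in ms."""
    start = time.perf_counter()
    for x, y in points:
        query(x, y)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(points)


# random lon/lat pairs in a rough box around Europe (illustrative values)
rng = random.Random(0)
points = [(rng.uniform(-10.0, 30.0), rng.uniform(35.0, 70.0))
          for _ in range(10_000)]

# stand-in query so the sketch runs on its own; replace with nf.find
print(f"{mean_query_ms(lambda x, y: None, points):.4f} ms/query")
```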
