A fast implementation of querying for NUTS regions by location.

FastPyNUTS

A fast implementation for querying the NUTS (Nomenclature of Territorial Units for Statistics) dataset by location, particularly useful for large-scale applications.

Figure: NUTS levels (source: Eurostat)

Features

  • fast querying of NUTS regions (~0.3 ms/query)
  • find all NUTS regions containing a point, or restrict the results to user-defined NUTS levels (0-3)
  • use your own custom NUTS dataset (other CRS, enriched metadata, etc.)

Installation

pip install fastpynuts

FastPyNUTS requires numpy, shapely, treelib, and rtree.

Usage

Initialization and finding NUTS regions

The NUTSfinder class is the main tool to determine the NUTS regions of a point. It can be initialized from a local file containing the NUTS regions, or via automatic download from Eurostat.

from fastpynuts import NUTSfinder

# construct from local file
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson")

# retrieve data automatically (the file is downloaded to '.data', or read from there if it already exists)
nf = NUTSfinder.from_web(scale=1, year=2021, epsg=4326)


# find NUTS regions
point = (11.57, 48.13)
regions = nf.find(*point)                   # find all regions via a point

bbox = (11.57, 48.13, 11.62, 49.)           # lon_min, lat_min, lon_max, lat_max
regions = nf.find_bbox(*bbox)               # find all regions via a bbox

geom = {
    "type": "Polygon",
    "coordinates": [
        [
            [11.595733032762524, 48.11837184946995],
            [11.631858436052113, 48.14289890153063],
            [11.627498473585405, 48.16409081247133],
            [11.595733032762524, 48.11837184946995]
        ]
    ]
}
# find all regions via a GeoJSON geometry
# (also supports shapely geometries and all objects that can be converted into one)
regions = nf.find_geom(geom)


# filter for regions of specific levels
level3 = nf.filter_levels(regions, 3)
level2or3 = nf.filter_levels(regions, 2, 3)

Assessing the results

The NUTS regions will be returned as an ordered list of NUTSregion objects.

>>> regions
[NUTS0: DE, NUTS1: DE2, NUTS2: DE21, NUTS3: DE212]

Each region object holds information about

  • its ID and NUTS level
>>> region = regions[0]
>>> region.id
DE
>>> region.level
0
  • its geometry (a shapely Polygon or MultiPolygon) and the corresponding bounding box
>>> region.geom
<MULTIPOLYGON (((10.454 47.556, 10.44 47.525, 10.441 47.514, 10.432 47.504, ...>
>>> region.bbox
(5.867697, 47.270114, 15.04116, 55.058165)
  • further fields from the NUTS dataset and the original input feature in GeoJSON format
>>> region.properties
{
    "NUTS_ID": "DE",
    "LEVL_CODE": 0,
    "CNTR_CODE": "DE",
    "NAME_LATN": "Deutschland",
    "NUTS_NAME": "Deutschland",
    "MOUNT_TYPE": 0,
    "URBN_TYPE": 0,
    "COAST_TYPE": 0,
    "FID": "DE"
}
>>> region.feature
{
    'type': 'Feature',
    'geometry': {
        'type': 'MultiPolygon',
        'coordinates': [
            [
                [
                    [10.454439, 47.555797],
                    ...
                ]
            ]
        ],
    },
    'properties': {
        "NUTS_ID": "DE",
        ...
    }
}
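
The attributes shown above can be combined to summarize a query result. The following is a minimal sketch, not part of FastPyNUTS: `Region` is a hypothetical stand-in that only mimics the documented `id`, `level` and `properties` attributes, and the helper names `ids_by_level` and `latin_names` are made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the region objects returned by NUTSfinder.find();
# only the attributes documented above (id, level, properties) are mimicked.
Region = namedtuple("Region", ["id", "level", "properties"])


def ids_by_level(regions):
    """Map each NUTS level to the ID of the region found at that level."""
    return {region.level: region.id for region in regions}


def latin_names(regions):
    """Collect NAME_LATN from each region, skipping regions that lack it."""
    return [r.properties["NAME_LATN"] for r in regions if "NAME_LATN" in r.properties]


# Toy result list shaped like the example output above.
regions = [
    Region("DE", 0, {"NUTS_ID": "DE", "NAME_LATN": "Deutschland"}),
    Region("DE2", 1, {"NUTS_ID": "DE2"}),
    Region("DE21", 2, {"NUTS_ID": "DE21"}),
    Region("DE212", 3, {"NUTS_ID": "DE212"}),
]

print(ids_by_level(regions))
print(latin_names(regions))
```

With real results, the same helpers should work on the list returned by `nf.find(...)`, assuming the region objects expose the attributes as documented above.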

Advanced Usage

# apply a buffer to the input regions to catch points on the boundary (for further info on the buffering, see the documentation)
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson", buffer_geoms=1e-5)

# only load certain levels of regions (here levels 2 and 3)
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson", min_level=2, max_level=3)


# if the point to be queried is guaranteed to lie within a NUTS region, setting valid_point to True may speed up the runtime
regions = nf.find(*point, valid_point=True)
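
For large-scale applications, a common pattern is to label many points with their most detailed region. The sketch below assumes only that the finder has a `find(lon, lat)` method returning objects with `id` and `level` attributes, and that a point outside all regions yields an empty list (an assumption, not confirmed here). `StubFinder` and `deepest_region_ids` are hypothetical names used so the example runs without the NUTS dataset.

```python
from collections import namedtuple

# Hypothetical stand-in for a region object (id and level only).
Region = namedtuple("Region", ["id", "level"])


class StubFinder:
    """Toy replacement for NUTSfinder: points with lon >= 0 are 'inside'."""

    def find(self, lon, lat):
        if lon < 0:
            return []  # assumed behaviour for points outside all regions
        return [Region("XX", 0), Region("XX1", 1), Region("XX11", 2), Region("XX111", 3)]


def deepest_region_ids(finder, points):
    """Return, per point, the ID of the region with the highest level, or None."""
    result = []
    for lon, lat in points:
        regions = finder.find(lon, lat)
        if regions:
            result.append(max(regions, key=lambda r: r.level).id)
        else:
            result.append(None)
    return result


points = [(11.57, 48.13), (-30.0, 40.0)]
print(deepest_region_ids(StubFinder(), points))
```

Passing an actual `NUTSfinder` instance instead of `StubFinder()` should work the same way under the assumptions stated above.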

Runtime Comparison

FastPyNUTS is optimized for query speed and result correctness, at the cost of a longer initialization time.

An R-tree-based approach proved to be the fastest option:
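
To illustrate why a spatial index helps, here is a simplified sketch of the two-stage idea behind such an approach: a cheap bounding-box test rules out most regions before an exact point-in-polygon test is run. This is not the FastPyNUTS implementation. It scans the bounding boxes linearly instead of using an R-tree, uses a plain ray-casting test instead of shapely, and all names and the toy regions are made up for illustration.

```python
def bbox_of(ring):
    """Axis-aligned bounding box (x_min, y_min, x_max, y_max) of a ring."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def in_bbox(bbox, x, y):
    """Cheap test: is the point inside the bounding box?"""
    x_min, y_min, x_max, y_max = bbox
    return x_min <= x <= x_max and y_min <= y <= y_max


def in_ring(ring, x, y):
    """Exact test: ray casting for a simple polygon ring without holes."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Toy regions: a square "A" and a triangle "B" whose bounding boxes overlap.
REGIONS = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(1, 0), (4, 0), (4, 3)],
}
BBOXES = {region_id: bbox_of(ring) for region_id, ring in REGIONS.items()}


def find_ids(x, y):
    """Bounding-box prefilter first, exact polygon test only for candidates."""
    candidates = [rid for rid, bbox in BBOXES.items() if in_bbox(bbox, x, y)]
    return [rid for rid in candidates if in_ring(REGIONS[rid], x, y)]


print(find_ids(1.5, 1.5))  # inside both bounding boxes, but only inside "A"
print(find_ids(3.0, 0.5))
```

An R-tree replaces the linear scan over `BBOXES` with a tree lookup, so the number of candidates examined per query stays small even with many regions.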

Figure: Benchmark for scale 1.

Compared to other packages such as nuts-finder, a large performance boost can be achieved.

Tips:

  • if interested only in certain levels (0-3) of the NUTS dataset, initialize the NUTSfinder using its min_level and max_level arguments
  • if it's known beforehand that the queried point lies within the interior of a NUTS region, use find(valid_point=True)

For a full runtime analysis, see benchmark.ipynb
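
To check query times on your own data, a small timing helper can be enough. The following is a rough sketch, not the method used in benchmark.ipynb: `mean_query_ms` is a hypothetical helper that times any callable taking `(lon, lat)`, and the dummy query only stands in for `nf.find`.

```python
import time


def mean_query_ms(query, points, repeats=3):
    """Return the mean time per query in milliseconds (best of `repeats` runs)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for lon, lat in points:
            query(lon, lat)
        best = min(best, time.perf_counter() - start)
    return best / len(points) * 1000.0


def dummy_query(lon, lat):
    """Placeholder for nf.find; does no real work."""
    return []


points = [(11.57, 48.13)] * 1000
print(f"{mean_query_ms(dummy_query, points):.4f} ms/query")
```

With an initialized finder, `mean_query_ms(nf.find, points)` would give a comparable per-query figure. Results depend on hardware, dataset scale, and the loaded levels.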
