An open-source tool for reading OpenStreetMap PBF files using DuckDB

Logo generated using the DALL·E 3 model with this prompt: A logo for a python library with White background, high quality, 8k. Cute duck and globe with cartography elements. Library for reading OpenStreetMap data using DuckDB.

QuackOSM

What is QuackOSM 🦆?

  • Scalable reader for OpenStreetMap Protocolbuffer Binary Format (PBF) files.
  • Built on top of DuckDB[^1] with its Spatial[^2] extension.
  • Saves files in the GeoParquet[^3] file format for easier integration with modern cloud stacks.
  • Uses multithreading, unlike GDAL, which works in a single thread only.
  • Can filter data by geometry without clipping the file with ogr2ogr beforehand.
  • Can filter data by OSM tags.
  • Caches intermediate results to avoid repeating computations.
  • Can be used as a Python module or through a beautiful CLI based on Typer[^4].

[^1]: DuckDB Website
[^2]: DuckDB Spatial extension repository
[^3]: GeoParquet data format
[^4]: Typer docs
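
To make the idea of tag-based filtering concrete, here is a minimal pure-Python sketch. It is not QuackOSM's implementation, and the filter shape (`True`, a single value, or a list of values per key) as well as the `matches_tags` helper are assumptions made for illustration. The sample features are taken from the Monaco outputs shown in the Usage section.

```python
from typing import Dict, List, Union

# Hypothetical filter shape, for illustration only. A tag key maps to:
#   True          -> the key must be present, with any value
#   a string      -> the value must equal that string
#   a list of str -> the value must be one of the listed strings
TagsFilter = Dict[str, Union[bool, str, List[str]]]


def matches_tags(tags: Dict[str, str], tags_filter: TagsFilter) -> bool:
    """Return True if the feature's tags satisfy at least one filter entry."""
    for key, expected in tags_filter.items():
        if key not in tags:
            continue
        if expected is True:
            return True
        if isinstance(expected, str):
            if tags[key] == expected:
                return True
        elif expected is not False and tags[key] in expected:
            return True
    return False


# Three features with their complete tag sets, as printed in the usage examples.
features = {
    "node/10005045289": {"shop": "bakery"},
    "way/986864693": {"natural": "bare_rock"},
    "way/986864694": {"barrier": "wall"},
}

selected = [
    feature_id
    for feature_id, tags in features.items()
    if matches_tags(tags, {"natural": "bare_rock", "shop": True})
]
print(selected)  # ['node/10005045289', 'way/986864693']
```

A feature is kept as soon as one entry matches, so the filter behaves like a logical OR across keys.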

Installing

As a pure Python module

pip install quackosm

With the beautiful CLI

pip install "quackosm[cli]"

Required Python version?

QuackOSM supports Python >= 3.9

Dependencies

Required:

  • duckdb (>=0.9.2)
  • pyarrow (>=13.0.0)
  • geoarrow-pyarrow (>=0.1.1)
  • geopandas
  • shapely
  • typeguard

Optional:

  • typer[all] (click, colorama, rich, shellingham)
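
A quick way to see which of the required packages are importable in the current environment is sketched below, using only the standard library. The import names in `REQUIRED_MODULES` are assumptions derived from the distribution names above (for example, `geoarrow-pyarrow` is assumed to be importable under the `geoarrow` namespace), and `missing_modules` is a hypothetical helper, not part of QuackOSM.

```python
import importlib.util
from typing import Iterable, List

# Assumed top-level import names for the required distributions listed above.
REQUIRED_MODULES = ["duckdb", "pyarrow", "geoarrow", "geopandas", "shapely", "typeguard"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules.
            found = False
        if not found:
            missing.append(name)
    return missing


# An empty list means every assumed module name was found.
print(missing_modules(REQUIRED_MODULES))
```

This only checks that the modules can be located. It does not verify the minimum versions listed above.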

Usage

Load data as a GeoDataFrame

>>> import quackosm as qosm
>>> qosm.get_features_gdf(monaco_pbf_path)
                                              tags                      geometry
feature_id
node/10005045289                {'shop': 'bakery'}      POINT (7.42245 43.73105)
node/10020887517  {'leisure': 'swimming_pool', ...      POINT (7.41316 43.73384)
node/10021298117  {'leisure': 'swimming_pool', ...      POINT (7.42777 43.74277)
node/10021298717  {'leisure': 'swimming_pool', ...      POINT (7.42630 43.74097)
node/10025656383  {'ferry': 'yes', 'name': 'Qua...      POINT (7.42550 43.73690)
...                                            ...                           ...
way/990669427     {'amenity': 'shelter', 'shelt...  POLYGON ((7.41461 43.7338...
way/990669428     {'highway': 'secondary', 'jun...  LINESTRING (7.41366 43.73...
way/990669429     {'highway': 'secondary', 'jun...  LINESTRING (7.41376 43.73...
way/990848785     {'addr:city': 'Monaco', 'addr...  POLYGON ((7.41426 43.7339...
way/993121275      {'building': 'yes', 'name': ...  POLYGON ((7.43214 43.7481...

[7906 rows x 2 columns]
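
The `feature_id` index values above combine the OSM element type and its numeric identifier, separated by a slash. A minimal sketch for splitting them apart follows. `split_feature_id` is a hypothetical helper, and the `relation` prefix is an assumption, since only `node/` and `way/` identifiers appear in the output above.

```python
from typing import Tuple

# "node" and "way" appear in the output above; "relation" is assumed.
OSM_ELEMENT_TYPES = ("node", "way", "relation")


def split_feature_id(feature_id: str) -> Tuple[str, int]:
    """Split an identifier such as 'way/990669427' into ('way', 990669427)."""
    element_type, _, raw_id = feature_id.partition("/")
    if element_type not in OSM_ELEMENT_TYPES or not raw_id.isdecimal():
        raise ValueError(f"Unexpected feature_id: {feature_id!r}")
    return element_type, int(raw_id)


print(split_feature_id("node/10005045289"))  # ('node', 10005045289)
print(split_feature_id("way/990669427"))     # ('way', 990669427)
```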

Just convert PBF to GeoParquet

>>> import quackosm as qosm
>>> gpq_path = qosm.convert_pbf_to_gpq(monaco_pbf_path)
>>> gpq_path.as_posix()
'files/monaco_nofilter_noclip_compact.geoparquet'
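
The conversion returns the path of a result file, which fits the caching behaviour listed among the features. The general pattern of file-based caching can be sketched as follows. This is a generic illustration under the assumption that the result path uniquely identifies the inputs. It is not QuackOSM's actual caching code, and `cached_result` is a hypothetical helper.

```python
from pathlib import Path
from typing import Callable


def cached_result(result_path: Path, compute: Callable[[Path], None]) -> Path:
    """Run `compute` only when `result_path` does not exist yet.

    `compute` is expected to write its output to the path it receives.
    """
    if not result_path.exists():
        result_path.parent.mkdir(parents=True, exist_ok=True)
        compute(result_path)
    return result_path
```

Calling `cached_result` twice with the same path runs the expensive step once. The second call returns immediately because the file already exists.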

Inspect the file with DuckDB

>>> import duckdb
>>> duckdb.load_extension('spatial')
>>> duckdb.read_parquet(str(gpq_path)).project(
...     "* REPLACE (ST_GeomFromWKB(geometry) AS geometry)"
... ).order("feature_id")
┌──────────────────┬──────────────────────┬──────────────────────────────────────────────┐
│    feature_id    │         tags         │                   geometry                   │
│     varchar      │ map(varchar, varch…  │                   geometry                   │
├──────────────────┼──────────────────────┼──────────────────────────────────────────────┤
│ node/10005045289 │ {shop=bakery}        │ POINT (7.4224498 43.7310532)                 │
│ node/10020887517 │ {leisure=swimming_…  │ POINT (7.4131561 43.7338391)                 │
│ node/10021298117 │ {leisure=swimming_…  │ POINT (7.4277743 43.7427669)                 │
│ node/10021298717 │ {leisure=swimming_…  │ POINT (7.4263029 43.7409734)                 │
│ node/10025656383 │ {ferry=yes, name=Q…  │ POINT (7.4254971 43.7369002)                 │
│ node/10025656390 │ {amenity=restauran…  │ POINT (7.4269287 43.7368818)                 │
│ node/10025656391 │ {name=Capitainerie…  │ POINT (7.4272127 43.7359593)                 │
│ node/10025656392 │ {name=Direction de…  │ POINT (7.4270392 43.7365262)                 │
│ node/10025656393 │ {name=IQOS, openin…  │ POINT (7.4275175 43.7373195)                 │
│ node/10025656394 │ {artist_name=Anna …  │ POINT (7.4293446 43.737448)                  │
│       ·          │          ·           │              ·                               │
│       ·          │          ·           │              ·                               │
│       ·          │          ·           │              ·                               │
│ way/986864693    │ {natural=bare_rock}  │ POLYGON ((7.4340482 43.745598, 7.4340263 4…  │
│ way/986864694    │ {barrier=wall}       │ LINESTRING (7.4327547 43.7445382, 7.432808…  │
│ way/986864695    │ {natural=bare_rock}  │ POLYGON ((7.4332994 43.7449315, 7.4332912 …  │
│ way/986864696    │ {barrier=wall}       │ LINESTRING (7.4356006 43.7464325, 7.435574…  │
│ way/986864697    │ {natural=bare_rock}  │ POLYGON ((7.4362767 43.74697, 7.4362983 43…  │
│ way/990669427    │ {amenity=shelter, …  │ POLYGON ((7.4146087 43.733883, 7.4146192 4…  │
│ way/990669428    │ {highway=secondary…  │ LINESTRING (7.4136598 43.7334433, 7.413640…  │
│ way/990669429    │ {highway=secondary…  │ LINESTRING (7.4137621 43.7334251, 7.413746…  │
│ way/990848785    │ {addr:city=Monaco,…  │ POLYGON ((7.4142551 43.7339622, 7.4143113 …  │
│ way/993121275    │ {building=yes, nam…  │ POLYGON ((7.4321416 43.7481309, 7.4321638 …  │
├──────────────────┴──────────────────────┴──────────────────────────────────────────────┤
│ 7906 rows (20 shown)                                                         3 columns │
└────────────────────────────────────────────────────────────────────────────────────────┘

Use as CLI

$ quackosm monaco.osm.pbf
⠹ [   1/18] Filtering nodes • 0:00:00
⠧ [   2/18] Filtering ways • 0:00:00
⠴ [   3/18] Filtering relations • 0:00:00
⠹ [   4/18] Loading required ways • 0:00:00
⠼ [   5/18] Loading required nodes • 0:00:00
⠙ [   6/18] Saving nodes with geometries • 0:00:00
⠙ [   7/18] Saving filtered nodes with structs • 0:00:00
⠋ [   8/18] Grouping required ways • 0:00:00
  [   9/18] Saving required ways with linestrings 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/1 • 0:00:00 • 0:00:00
⠹ [  10/18] Saving ways with geometries • 0:00:00
⠸ [  11/18] Saving valid relation parts • 0:00:00
⠋ [12.1/18] Saving relation inner parts - valid geometries • 0:00:00
⠋ [12.2/18] Saving relation inner parts - invalid geometries • 0:00:00
⠋ [13.1/18] Saving relation outer parts - valid geometries • 0:00:00
⠋ [13.2/18] Saving relation outer parts - invalid geometries • 0:00:00
⠋ [  14/18] Saving relation outer parts with holes • 0:00:00
⠋ [  15/18] Saving relation outer parts without holes • 0:00:00
⠙ [  16/18] Saving relation with geometries • 0:00:00
⠹ [17.1/18] Saving valid features • 0:00:00
⠋ [  18/18] Saving final geoparquet file • 0:00:00
files/monaco_nofilter_noclip_compact.geoparquet
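
If the CLI needs to be driven from a script, it can be wrapped with the standard `subprocess` module. The sketch below uses only the invocation shown above (`quackosm <file>`). `build_command` and `run_quackosm` are hypothetical helpers, and treating the last line of standard output as the result path is an assumption based on the transcript above.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(pbf_path: str) -> List[str]:
    """Build the argument list for the invocation shown above."""
    return ["quackosm", pbf_path]


def run_quackosm(pbf_path: str) -> Optional[str]:
    """Run the CLI and return the last line of its standard output.

    Returns None when the `quackosm` executable is not on PATH.
    Assumption: the final stdout line is the path of the result file.
    """
    if shutil.which("quackosm") is None:
        return None
    completed = subprocess.run(
        build_command(pbf_path),
        check=True,           # raise CalledProcessError on a non-zero exit code
        capture_output=True,
        text=True,
    )
    lines = completed.stdout.strip().splitlines()
    return lines[-1] if lines else None
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.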

The full API reference and more examples are available in the docs.
