An open-source tool for reading OpenStreetMap PBF files using DuckDB
QuackOSM
What is QuackOSM 🦆?
- Scalable reader for OpenStreetMap ProtoBuffer (pbf) files.
- Built on top of DuckDB[^1] with its Spatial[^2] extension.
- Saves files in the GeoParquet[^3] file format for easier integration with modern cloud stacks.
- Utilizes multithreading, unlike GDAL, which works in a single thread only.
- Can filter data based on geometry without the need for ogr2ogr clipping before the operation.
- Can filter data based on OSM tags.
- Utilizes caching to avoid repeated computations.
- Can be used as a Python module as well as a beautiful CLI based on Typer[^4].

[^1]: DuckDB Website
[^2]: DuckDB Spatial extension repository
[^3]: GeoParquet data format
[^4]: Typer docs
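To make the tag-filtering point concrete, here is a minimal pure-Python sketch of what matching a feature against a tag filter can look like. The filter shape (a dict mapping an OSM key to `True`, a single value, or a list of values) and the helper name `matches_tags_filter` are assumptions made for illustration. This is not QuackOSM's implementation, so check the docs for the exact filter format it accepts.

```python
from typing import Mapping, Sequence, Union

# Hypothetical filter shape: OSM key -> True (any value), one value, or a list of values.
TagsFilter = Mapping[str, Union[bool, str, Sequence[str]]]


def matches_tags_filter(tags: Mapping[str, str], tags_filter: TagsFilter) -> bool:
    """Return True if the feature's tags satisfy at least one filter entry."""
    for key, wanted in tags_filter.items():
        if key not in tags:
            continue
        if wanted is True:
            return True  # key is present, any value is accepted
        if isinstance(wanted, str):
            if tags[key] == wanted:
                return True
        elif wanted is not False and tags[key] in wanted:
            return True
    return False


# Tags taken from the sample outputs shown in this README.
features = {
    "node/10005045289": {"shop": "bakery"},
    "node/10020887517": {"leisure": "swimming_pool"},
    "way/986864694": {"barrier": "wall"},
}
tags_filter = {"shop": True, "leisure": ["swimming_pool", "park"]}
selected = [fid for fid, tags in features.items() if matches_tags_filter(tags, tags_filter)]
print(selected)
```

With this filter, the bakery and the swimming pool are kept and the wall is dropped.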
Installing
As a pure Python module
pip install quackosm
With beautiful CLI
pip install "quackosm[cli]"
Required Python version?
QuackOSM supports Python >= 3.9
Dependencies
Required:
- duckdb (>=0.9.2)
- pyarrow (>=13.0.0)
- geoarrow-pyarrow (>=0.1.1)
- geopandas
- shapely
- typeguard
Optional:
- typer[all] (click, colorama, rich, shellingham)
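If you want to verify an environment against the requirements above before importing the library, a small check along these lines may help. It is a sketch using only the standard library. The helper names `parse_version` and `check_environment` are made up for this example, and the version parsing is deliberately simple (it ignores pre-release suffixes).

```python
import re
import sys
from importlib import metadata

# Minimum versions copied from the list above; packages without a bound map to None.
REQUIRED = {
    "duckdb": "0.9.2",
    "pyarrow": "13.0.0",
    "geoarrow-pyarrow": "0.1.1",
    "geopandas": None,
    "shapely": None,
    "typeguard": None,
}


def parse_version(version: str) -> tuple:
    """Turn '0.9.2' into (0, 9, 2), ignoring suffixes such as '.dev1' or 'rc1'."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    parts = [int(part) for part in match.group().split(".")] if match else []
    return tuple(parts + [0] * (3 - len(parts)))  # pad '1.0' to (1, 0, 0)


def check_environment(required=REQUIRED) -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 9):
        problems.append("Python >= 3.9 is required")
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if minimum and parse_version(installed) < parse_version(minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


for problem in check_environment():
    print(problem)
```

An empty output means the interpreter and all required packages meet the listed minimums.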
Usage
Load data as a GeoDataFrame
>>> import quackosm as qosm
>>> qosm.get_features_gdf(monaco_pbf_path)
                                              tags                      geometry
feature_id
node/10005045289                {'shop': 'bakery'}      POINT (7.42245 43.73105)
node/10020887517  {'leisure': 'swimming_pool', ...      POINT (7.41316 43.73384)
node/10021298117  {'leisure': 'swimming_pool', ...      POINT (7.42777 43.74277)
node/10021298717  {'leisure': 'swimming_pool', ...      POINT (7.42630 43.74097)
node/10025656383  {'ferry': 'yes', 'name': 'Qua...      POINT (7.42550 43.73690)
...                                            ...                           ...
way/990669427     {'amenity': 'shelter', 'shelt...  POLYGON ((7.41461 43.7338...
way/990669428     {'highway': 'secondary', 'jun...  LINESTRING (7.41366 43.73...
way/990669429     {'highway': 'secondary', 'jun...  LINESTRING (7.41376 43.73...
way/990848785     {'addr:city': 'Monaco', 'addr...  POLYGON ((7.41426 43.7339...
way/993121275      {'building': 'yes', 'name': ...  POLYGON ((7.43214 43.7481...

[7906 rows x 2 columns]
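The example above assumes that `monaco_pbf_path` already points at a local PBF file. One way to obtain such a file is sketched below with the standard library only. The Geofabrik URL and the helper name `download_pbf` are assumptions for this example and are not part of QuackOSM.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen

# Assumed source: Geofabrik publishes regional extracts; any other PBF URL works the same way.
MONACO_URL = "https://download.geofabrik.de/europe/monaco-latest.osm.pbf"


def download_pbf(url: str, destination: Path) -> Path:
    """Download `url` to `destination` unless the file is already there."""
    destination = Path(destination)
    if destination.exists():
        return destination
    destination.parent.mkdir(parents=True, exist_ok=True)
    partial = destination.with_name(destination.name + ".part")
    with urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(destination)  # rename only after a complete download
    return destination
```

With this helper, `monaco_pbf_path = download_pbf(MONACO_URL, Path("files/monaco.osm.pbf"))` would give a path that can be passed to the functions shown in this section.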
Just convert PBF to GeoParquet
>>> import quackosm as qosm
>>> gpq_path = qosm.convert_pbf_to_gpq(monaco_pbf_path)
>>> gpq_path.as_posix()
'files/monaco_nofilter_noclip_compact.geoparquet'
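The result path above appears to encode the input name together with the options used (no tag filter, no clipping geometry), which is one common way to make caching work: the same inputs always map to the same file. The sketch below illustrates that idea only. The naming scheme, the hashing, and the helper names `result_path` and `needs_conversion` are guesses made for this example, not QuackOSM internals.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def _digest(value: str) -> str:
    """Short, stable fingerprint of an option value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]


def result_path(
    pbf_path: Path,
    tags_filter: Optional[dict] = None,
    geometry_wkt: Optional[str] = None,
    working_directory: Path = Path("files"),
) -> Path:
    """Build a deterministic output path from the input name and the options."""
    stem = Path(pbf_path).name.split(".")[0]  # 'monaco.osm.pbf' -> 'monaco'
    filter_part = (
        "nofilter" if tags_filter is None else _digest(json.dumps(tags_filter, sort_keys=True))
    )
    clip_part = "noclip" if geometry_wkt is None else _digest(geometry_wkt)
    return Path(working_directory) / f"{stem}_{filter_part}_{clip_part}_compact.geoparquet"


def needs_conversion(path: Path) -> bool:
    """A cached result can be reused whenever the expected file already exists."""
    return not Path(path).exists()


print(result_path(Path("monaco.osm.pbf")).as_posix())
# files/monaco_nofilter_noclip_compact.geoparquet
```

Because the path depends only on the inputs, a second call with the same arguments can skip the work as soon as the file is found on disk.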
Inspect the file with duckdb
>>> import duckdb
>>> duckdb.load_extension('spatial')
>>> duckdb.read_parquet(str(gpq_path)).project(
... "* REPLACE (ST_GeomFromWKB(geometry) AS geometry)"
... ).order("feature_id")
┌──────────────────┬──────────────────────┬──────────────────────────────────────────────┐
│    feature_id    │         tags         │                   geometry                   │
│     varchar      │ map(varchar, varch…  │                   geometry                   │
├──────────────────┼──────────────────────┼──────────────────────────────────────────────┤
│ node/10005045289 │ {shop=bakery}        │ POINT (7.4224498 43.7310532)                 │
│ node/10020887517 │ {leisure=swimming_…  │ POINT (7.4131561 43.7338391)                 │
│ node/10021298117 │ {leisure=swimming_…  │ POINT (7.4277743 43.7427669)                 │
│ node/10021298717 │ {leisure=swimming_…  │ POINT (7.4263029 43.7409734)                 │
│ node/10025656383 │ {ferry=yes, name=Q…  │ POINT (7.4254971 43.7369002)                 │
│ node/10025656390 │ {amenity=restauran…  │ POINT (7.4269287 43.7368818)                 │
│ node/10025656391 │ {name=Capitainerie…  │ POINT (7.4272127 43.7359593)                 │
│ node/10025656392 │ {name=Direction de…  │ POINT (7.4270392 43.7365262)                 │
│ node/10025656393 │ {name=IQOS, openin…  │ POINT (7.4275175 43.7373195)                 │
│ node/10025656394 │ {artist_name=Anna …  │ POINT (7.4293446 43.737448)                  │
│        ·         │          ·           │                      ·                       │
│        ·         │          ·           │                      ·                       │
│        ·         │          ·           │                      ·                       │
│ way/986864693    │ {natural=bare_rock}  │ POLYGON ((7.4340482 43.745598, 7.4340263 4…  │
│ way/986864694    │ {barrier=wall}       │ LINESTRING (7.4327547 43.7445382, 7.432808…  │
│ way/986864695    │ {natural=bare_rock}  │ POLYGON ((7.4332994 43.7449315, 7.4332912 …  │
│ way/986864696    │ {barrier=wall}       │ LINESTRING (7.4356006 43.7464325, 7.435574…  │
│ way/986864697    │ {natural=bare_rock}  │ POLYGON ((7.4362767 43.74697, 7.4362983 43…  │
│ way/990669427    │ {amenity=shelter, …  │ POLYGON ((7.4146087 43.733883, 7.4146192 4…  │
│ way/990669428    │ {highway=secondary…  │ LINESTRING (7.4136598 43.7334433, 7.413640…  │
│ way/990669429    │ {highway=secondary…  │ LINESTRING (7.4137621 43.7334251, 7.413746…  │
│ way/990848785    │ {addr:city=Monaco,…  │ POLYGON ((7.4142551 43.7339622, 7.4143113 …  │
│ way/993121275    │ {building=yes, nam…  │ POLYGON ((7.4321416 43.7481309, 7.4321638 …  │
├──────────────────┴──────────────────────┴──────────────────────────────────────────────┤
│ 7906 rows (20 shown)                                                         3 columns │
└────────────────────────────────────────────────────────────────────────────────────────┘
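Both outputs above identify features with a `feature_id` made of an element type and a numeric OSM id. A small helper like the following can split the two parts, for example to count features per element type. This is a sketch: `parse_feature_id` is a name made up here, and the `relation` prefix is an assumption, since only `node/` and `way/` ids appear in the samples.

```python
from collections import Counter
from typing import NamedTuple


class OsmFeatureId(NamedTuple):
    osm_type: str  # 'node', 'way' or (assumed) 'relation'
    osm_id: int


def parse_feature_id(feature_id: str) -> OsmFeatureId:
    """Split 'way/990669427' into its element type and numeric id."""
    osm_type, _, raw_id = feature_id.partition("/")
    if osm_type not in {"node", "way", "relation"} or not raw_id.isdigit():
        raise ValueError(f"unexpected feature_id: {feature_id!r}")
    return OsmFeatureId(osm_type, int(raw_id))


# A few ids copied from the table above.
sample = ["node/10005045289", "node/10020887517", "way/986864693", "way/993121275"]
print(Counter(parse_feature_id(fid).osm_type for fid in sample))
```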
Use as CLI
$ quackosm monaco.osm.pbf
[   1/18] Filtering nodes • 0:00:00
[   2/18] Filtering ways • 0:00:00
[   3/18] Filtering relations • 0:00:00
[   4/18] Loading required ways • 0:00:00
[   5/18] Loading required nodes • 0:00:00
[   6/18] Saving nodes with geometries • 0:00:00
[   7/18] Saving filtered nodes with structs • 0:00:00
[   8/18] Grouping required ways • 0:00:00
[   9/18] Saving required ways with linestrings 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/1 • 0:00:00 • 0:00:00
[  10/18] Saving ways with geometries • 0:00:00
[  11/18] Saving valid relation parts • 0:00:00
[12.1/18] Saving relation inner parts - valid geometries • 0:00:00
[12.2/18] Saving relation inner parts - invalid geometries • 0:00:00
[13.1/18] Saving relation outer parts - valid geometries • 0:00:00
[13.2/18] Saving relation outer parts - invalid geometries • 0:00:00
[  14/18] Saving relation outer parts with holes • 0:00:00
[  15/18] Saving relation outer parts without holes • 0:00:00
[  16/18] Saving relation with geometries • 0:00:00
[17.1/18] Saving valid features • 0:00:00
[  18/18] Saving final geoparquet file • 0:00:00
files/monaco_nofilter_noclip_compact.geoparquet
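The CLI can also be driven from a script. The sketch below calls the `quackosm` executable with the same single argument as above and picks up the result path from the last output line, as seen in the run shown here. The helper names are hypothetical, and whether progress output and the final path both arrive on standard output when the command is piped is an assumption worth verifying locally.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional


def extract_result_path(stdout: str) -> Optional[Path]:
    """Return the last non-empty output line, which the run above shows to be the result path."""
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    return Path(lines[-1]) if lines else None


def run_quackosm(pbf_path: str) -> Optional[Path]:
    """Call the `quackosm` executable, if installed, and return the path it reports."""
    if shutil.which("quackosm") is None:
        return None  # the CLI extra is not installed in this environment
    completed = subprocess.run(
        ["quackosm", pbf_path], check=True, capture_output=True, text=True
    )
    return extract_result_path(completed.stdout)
```

Calling `run_quackosm("monaco.osm.pbf")` returns `None` when the executable is missing, so the caller can fall back to the Python API.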
You can find full API + more examples in the docs.