A Python tool to parse OSM data from Protobuf format into GeoDataFrame.

Project description

Pyrosm

Pyrosm is a Python library for reading OpenStreetMap data from protobuf files (*.osm.pbf) into Geopandas GeoDataFrames. It makes it easy to extract datasets such as road networks from OpenStreetMap pbf dumps (support for buildings and points of interest is in progress).

The library has been developed with performance in mind. It is written mainly in Cython (Python with C-like performance), which probably makes it faster than other Python alternatives for parsing OpenStreetMap data. Pyrosm is built on top of another Cython library, Pyrobuf, a faster alternative to Google's Protobuf library: it provides a 2-4x boost in performance for deserializing protocol buffer messages compared to Google's version with the C++ backend. Google's Protocol Buffers is a commonly used and efficient method for serializing and compressing structured data, and OpenStreetMap contributors also use it to distribute OSM data in PBF format (Protocolbuffer Binary Format).

Pyrosm is easy to use and provides a user interface somewhat similar to that of OSMnx, another popular Python library, for parsing different datasets from an OpenStreetMap pbf dump, including road networks, buildings and points of interest. The main difference between pyrosm and OSMnx is that OSMnx reads the data over the internet using the Overpass API, whereas pyrosm reads it from local OSM data dumps that can be downloaded e.g. from Geofabrik's website. Reading locally is much faster, which makes it possible to parse e.g. the street network of a whole country in a matter of minutes instead of hours (however, see caveats).

Current features

  • read street networks (separately for driving, cycling, walking and all-combined)
  • filter data based on bounding box

Road map

  • add parsing of building information
  • add parsing of points of interest (POIs)
  • add more tests

Install

Pyrosm is distributed via PyPI and can be installed with pip:

$ pip install pyrosm

How to use?

Using pyrosm is straightforward. To read the drivable street network from an OpenStreetMap protobuf file (the package includes a small test protobuf file):

from pyrosm import OSM
from pyrosm import get_path
fp = get_path("test_pbf")
# Initialize the OSM parser object
osm = OSM(fp)

# Read all drivable roads
drive_net = osm.get_network(network_type="driving")
drive_net.head()
...
  access bridge  ...        id                                           geometry
0   None   None  ...   4732994  LINESTRING (26.94310 60.52580, 26.94295 60.525...
1   None   None  ...   5184588  LINESTRING (26.94778 60.52231, 26.94717 60.522...
2   None    yes  ...   5184589  LINESTRING (26.94891 60.52181, 26.94778 60.52231)
3   None   None  ...   5184590  LINESTRING (26.94310 60.52580, 26.94452 60.525...
4   None   None  ...  22731285  LINESTRING (26.93072 60.52252, 26.93094 60.522...

[5 rows x 14 columns]

For further information on how to use the tool, you can use good old help:

help(osm.get_network)

...

Help on method get_network in module pyrosm.pyrosm:

get_network(network_type='walking') method of pyrosm.pyrosm.OSM instance
    Reads data from OSM file and parses street networks
    for walking, driving, and cycling.
    
    Parameters
    ----------
    
    network_type : str
        What kind of network to parse. Possible values are: 'walking' | 'cycling' | 'driving' | 'all'.
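
The four documented network_type values can also be read in one pass. The sketch below is only an illustration: read_networks is a hypothetical helper, not part of pyrosm, and it assumes nothing more than an object with the get_network(network_type=...) method shown in the help text above.

```python
# Documented values of the `network_type` parameter (from the help text above).
NETWORK_TYPES = ("walking", "cycling", "driving", "all")


def read_networks(osm, network_types=NETWORK_TYPES):
    """Return a dict mapping each network type to osm.get_network()'s result.

    `read_networks` is a hypothetical convenience helper, not a pyrosm function.
    `osm` is any object exposing get_network(network_type=...), for example
    the OSM instance created in the usage example.
    """
    unknown = [t for t in network_types if t not in NETWORK_TYPES]
    if unknown:
        raise ValueError(f"Unsupported network type(s): {unknown}")
    return {t: osm.get_network(network_type=t) for t in network_types}
```

With the parser object from the usage example, read_networks(osm)["cycling"] would then hold the cycling network.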

Examples

For further usage examples (for now), take a look at the tests.

Performance

Proper benchmarking results are on their way, but to give some idea: reading all drivable roads in the Helsinki Region (approx. 85,000 roads) takes approximately 10 seconds on a laptop with 16 GB of memory, an SSD drive, and an Intel Core i5-8250U CPU at 1.6 GHz. The result looks something like:

[Figure: Helsinki_driving_net]
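
To get a comparable figure on your own machine, a small timing wrapper is enough. This is a minimal sketch using only the standard library; timed is a hypothetical helper name, and the commented usage assumes the osm object from the usage example.

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Example with the parser from the usage section (requires pyrosm and a pbf file):
# drive_net, seconds = timed(osm.get_network, network_type="driving")
# print(f"Read {len(drive_net)} roads in {seconds:.1f} s")
```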

Caveats

Filtering large files by bounding box

Although pyrosm can filter even large data files by bounding box, doing so can slow down reading significantly (1.5-3x longer) because of the lookups needed when parsing the data. This might not be an issue with smaller files (up to ~100MB), but with larger data dumps reading can take longer than necessary.
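
To see where the extra lookups come from, consider that OSM ways store only node ids, not coordinates. The following is a simplified pure-Python illustration, not pyrosm's actual implementation; filter_by_bbox and its data layout are assumptions made for the example.

```python
def filter_by_bbox(nodes, ways, bbox):
    """Keep nodes inside bbox and ways that reference at least one kept node.

    nodes: dict of node id -> (lon, lat)
    ways:  dict of way id -> list of node ids
    bbox:  (minx, miny, maxx, maxy) in the same coordinates as the nodes
    """
    minx, miny, maxx, maxy = bbox
    kept_nodes = {
        nid: (lon, lat)
        for nid, (lon, lat) in nodes.items()
        if minx <= lon <= maxx and miny <= lat <= maxy
    }
    # Ways carry only node ids, so every reference has to be looked up
    # against the kept nodes; this is the extra work the filter adds.
    kept_ways = {
        wid: refs
        for wid, refs in ways.items()
        if any(ref in kept_nodes for ref in refs)
    }
    return kept_nodes, kept_ways
```

The number of lookups grows with the number of node references in the file, which is why the cost is more noticeable on large dumps.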

Hence, the recommended approach with large data files is to first cut the protobuf file down to a smaller subset by bounding box with Osmosis, a dedicated open source Java tool that is available for all operating systems. Detailed installation instructions are here, and instructions on how to filter data by bounding box are here.
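
As a rough sketch of that pre-filtering step, the Osmosis call can be assembled from Python. The helper name osmosis_bbox_command, the file names and the coordinates below are all hypothetical; the command uses Osmosis's --read-pbf, --bounding-box and --write-pbf tasks, and running it assumes Osmosis is installed and on your PATH.

```python
import subprocess


def osmosis_bbox_command(src, dst, left, bottom, right, top, osmosis="osmosis"):
    """Build the Osmosis argument list for cutting `src` to a bounding box."""
    return [
        osmosis,
        "--read-pbf", f"file={src}",
        "--bounding-box",
        f"left={left}", f"bottom={bottom}", f"right={right}", f"top={top}",
        "--write-pbf", f"file={dst}",
    ]


# Hypothetical input/output files and bounding box, for illustration only.
cmd = osmosis_bbox_command(
    "finland-latest.osm.pbf", "helsinki.osm.pbf", 24.5, 60.0, 25.5, 60.5
)
# Uncomment to run (requires Osmosis to be installed and on PATH):
# subprocess.run(cmd, check=True)
```

The smaller output file can then be passed to OSM(...) as in the usage example.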

Download files

Source Distribution

pyrosm-0.1.5.tar.gz (1.1 MB)
