geofeather

A faster file-based format for geometries with geopandas.

This project uses the very fast feather file format to store geometry data (points, lines, polygons) for interoperability with geopandas.

Introductory post.

Why does this exist?

This project exists because reading and writing standard spatial formats (e.g., shapefile) in geopandas is slow. I was working with millions of geometries in multiple processing steps, and needed a fast way to read and write intermediate files.

In our benchmarks, writes are about 5-6x faster than writing a GeoDataFrame to shapefile via .to_file().

Reads are about 2x faster than the geopandas read_file() function.

How does it work?

The feather format works brilliantly for standard pandas data frames. To leverage it, we convert the geometry data from shapely objects into Well-Known Binary (WKB) format, and then store that column as raw bytes.
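
To make the byte layout concrete, here is a minimal sketch of what WKB looks like for a single 2D point, written with only the Python standard library. It is an illustration rather than geofeather's own code: in practice shapely produces and parses WKB for every geometry type, and the helper names point_to_wkb and point_from_wkb are hypothetical.

```python
import struct

# WKB layout for a 2D point (little-endian):
#   1 byte   byte-order flag (1 = little-endian)
#   4 bytes  geometry type   (1 = Point)
#   8 bytes  x as float64
#   8 bytes  y as float64
POINT_FORMAT = "<BIdd"
WKB_POINT = 1


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB bytes (hypothetical helper)."""
    return struct.pack(POINT_FORMAT, 1, WKB_POINT, x, y)


def point_from_wkb(data):
    """Decode little-endian WKB bytes for a 2D point back into (x, y)."""
    byte_order, geom_type, x, y = struct.unpack(POINT_FORMAT, data)
    if byte_order != 1 or geom_type != WKB_POINT:
        raise ValueError("only little-endian 2D points are handled in this sketch")
    return x, y


wkb = point_to_wkb(-122.5, 45.25)
print(len(wkb))             # 21 bytes: 1 + 4 + 8 + 8
print(point_from_wkb(wkb))  # (-122.5, 45.25)
```

Because each geometry becomes a plain bytes value, the geometry column can be stored like any other binary column.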

We store the coordinate reference system as JSON in a sidecar .crs file.
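
The sidecar idea can be sketched with the standard library alone. This is not geofeather's implementation: the helper names are hypothetical, the sidecar path is assumed to be the data path with .crs appended, and the CRS value is only an illustrative placeholder.

```python
import json
import os
import tempfile


def write_crs_sidecar(feather_path, crs):
    """Write the CRS as JSON next to the data file (hypothetical helper).

    Assumption: the sidecar is named by appending '.crs' to the data path.
    """
    sidecar_path = feather_path + ".crs"
    with open(sidecar_path, "w") as f:
        json.dump(crs, f)
    return sidecar_path


def read_crs_sidecar(feather_path):
    """Read the CRS back; return None when no sidecar exists."""
    sidecar_path = feather_path + ".crs"
    if not os.path.exists(sidecar_path):
        return None
    with open(sidecar_path) as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test.feather")
    # Illustrative CRS value; the real contents depend on the GeoDataFrame.
    write_crs_sidecar(path, {"init": "epsg:4326"})
    crs = read_crs_sidecar(path)
    missing = read_crs_sidecar(os.path.join(tmpdir, "other.feather"))

print(crs)
```

Keeping the CRS in a separate small file means the data file itself stays an ordinary feather file, but the two files need to be moved or copied together.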

Installation

Available on PyPI at: https://pypi.org/project/geofeather/

pip install geofeather

Usage

Write

Given an existing GeoDataFrame my_gdf, pass this into to_geofeather:

from geofeather import to_geofeather
to_geofeather(my_gdf, 'test.feather')

Read

from geofeather import from_geofeather
my_gdf = from_geofeather('test.feather')

Indexes

Right now, indexes are not supported in feather files. In order to get around this, simply reset your index before calling to_geofeather.
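
A minimal sketch of the workaround, using plain pandas (a GeoDataFrame inherits reset_index and set_index from the pandas DataFrame). The column and index names here are hypothetical, and the write and read steps are left out so the example runs without geofeather installed.

```python
import pandas as pd

# A small frame with a meaningful, non-default index.
df = pd.DataFrame(
    {"name": ["a", "b", "c"]},
    index=pd.Index([10, 20, 30], name="site_id"),
)

# reset_index() moves the index into a regular column so it is stored with
# the rest of the data; the frame gets a default 0..n-1 index.
flat = df.reset_index()

# After reading the file back, the original index can be restored.
restored = flat.set_index("site_id")

print(list(flat.columns))
```

If the index carries no information, reset_index(drop=True) discards it instead of keeping it as a column.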

Changes

0.2.0

  • allow reading a subset of columns from a feather file
  • store geometry in 'geometry' column instead of 'wkb' column (simplification to avoid renaming columns)

0.1.0

  • Initial release

Credits

Everything that makes this fast is due to the hard work of contributors to pyarrow, geopandas, and shapely.
