Fast file-based format for geometries with Geopandas
A faster file-based format for geometries with geopandas.
This project capitalizes on the very fast feather file format to store geometry (points, lines, polygons) data for interoperability with geopandas.
Why does this exist?
This project exists because reading and writing standard spatial formats (e.g., shapefile) in
geopandas is slow. I was working with millions of geometries in multiple processing steps, and needed a fast way to read and write intermediate files.
In our benchmarks, we see about 5-6x faster file writes than writing from geopandas to shapefile via .to_file() on a GeoDataFrame.
We see about 2x faster reads compared to reading the same data with geopandas.
How does it work?
The feather format works brilliantly for standard pandas data frames. To leverage it, we convert the geometry data from shapely objects into Well-Known Binary (WKB) format, and then store that column as raw bytes.
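To make the "raw bytes" idea concrete, here is a minimal sketch of what WKB looks like for a single 2D point, built by hand with the standard library. This is an illustration only: the helper names point_to_wkb and wkb_to_point are hypothetical and are not part of this project, which relies on shapely for the conversion of all geometry types.

```python
import struct

def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    # "<"  little-endian, no padding
    # "B"  byte-order flag (1 = little-endian)
    # "I"  geometry type (1 = Point)
    # "dd" x and y as 64-bit floats
    return struct.pack("<BIdd", 1, 1, x, y)

def wkb_to_point(data):
    """Decode the bytes produced by point_to_wkb back into (x, y)."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", data)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("this sketch only handles little-endian 2D points")
    return x, y

wkb = point_to_wkb(-122.5, 45.5)
restored = wkb_to_point(wkb)
```

Because each geometry becomes an opaque bytes value, the geometry column can be written like any other binary column in a data frame.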
We store the coordinate reference system in JSON format in a sidecar file.
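The sidecar approach can be sketched with the standard library alone. The example below is a hedged illustration, not this project's actual code: the function names, the ".crs" suffix, and the example CRS dictionary are all assumptions chosen for the sketch.

```python
import json
import os
import tempfile

def write_crs_sidecar(data_path, crs):
    """Write the CRS as JSON next to the data file and return the sidecar path."""
    sidecar_path = data_path + ".crs"  # assumed naming, for illustration only
    with open(sidecar_path, "w") as f:
        json.dump(crs, f)
    return sidecar_path

def read_crs_sidecar(data_path):
    """Return the stored CRS, or None if no sidecar file exists."""
    sidecar_path = data_path + ".crs"
    if not os.path.exists(sidecar_path):
        return None
    with open(sidecar_path) as f:
        return json.load(f)

with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "test.feather")
    write_crs_sidecar(data_path, {"init": "epsg:4326"})  # example CRS value
    restored_crs = read_crs_sidecar(data_path)
    missing_crs = read_crs_sidecar(os.path.join(tmp, "other.feather"))
```

Keeping the CRS in a separate file leaves the feather file itself a plain table; the trade-off is that the two files must be moved or copied together.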
Available on PyPI at: https://pypi.org/project/geofeather/
pip install geofeather
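Given an existing GeoDataFrame my_gdf, pass this into to_geofeather:
to_geofeather(my_gdf, 'test.feather')
To read the file back into a GeoDataFrame: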
my_gdf = from_geofeather('test.feather')
Right now, indexes are not supported in feather files. To work around this, reset your index (for example with my_gdf.reset_index()) before calling to_geofeather.
- allow reading a subset of columns from a feather file
- store geometry in 'geometry' column instead of 'wkb' column (simplification to avoid renaming columns)
- Initial release
Everything that makes this fast is due to the hard work of contributors to the libraries this project builds on: feather, pandas, shapely, and geopandas.