Easy geospatial data processing.

meridian

Performant geospatial data processing in idiomatic Python.

Meridian lets you treat a geospatial dataset like any other Python data structure, while backing it with a spatial index for fast spatial queries. All data is stored in tuple-like objects, which keeps memory usage low.

Note: this library is still in alpha. The API and functionality will change often and without notice.

Usage

When shouldn't I use Meridian?

Meridian is not meant to replace a database system, so it is not particularly optimized or ergonomic for operations like finding specific records, although a filter makes this fairly easy. Also, if your data is highly mutable (for example, you want to modify records in place), you should probably look elsewhere.

When should I use Meridian?

Meridian shines when you have some reference dataset that you want to compare to an input dataset or single record.

Meridian expects you to have a decent understanding of the data you want to work with. It requires you to define an annotated model class that lists the dataset attributes you want to use. You do this by subclassing the meridian.Record object:

import meridian

class County(meridian.Record):
    name: str
    fips: str

Supposing you had a shapefile with county geometry and the fields above, you could create a Dataset of County records like so:

counties = County.load_from("path/to/counties.shp")
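To illustrate how an annotated model class can drive record construction in general, here is a minimal standard-library sketch. It is not meridian's implementation: the Record base class, field_names, and from_properties below are hypothetical stand-ins, and the sample values are only illustrative.

```python
from typing import Any, Dict, Tuple, get_type_hints


class Record:
    """Hypothetical stand-in for a model base class (NOT meridian.Record)."""

    @classmethod
    def field_names(cls) -> Tuple[str, ...]:
        # Class-level annotations tell us which attributes the model declares.
        return tuple(get_type_hints(cls))

    @classmethod
    def from_properties(cls, properties: Dict[str, Any]) -> "Record":
        # Keep only the declared fields and coerce each to its annotated type.
        hints = get_type_hints(cls)
        missing = [name for name in hints if name not in properties]
        if missing:
            raise KeyError(f"missing fields: {missing}")
        record = cls()
        for name, field_type in hints.items():
            setattr(record, name, field_type(properties[name]))
        return record


class County(Record):
    name: str
    fips: str


# Extra keys in the source data ("area") are ignored; fips is coerced to str.
county = County.from_properties({"name": "Windsor", "fips": 50027, "area": 2530.0})
print(County.field_names())
print(county.name, county.fips)
```

The point of the sketch is that the annotations alone are enough to decide which attributes of a source feature are kept, which is why the model class needs to list them up front.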

Meridian depends on the Fiona library to open most data files, which requires GDAL/OGR. Wheels are available for many platforms, but not all.

Creating a Dataset will immediately load the data into memory and create a spatial index which will be used for all queries. A Dataset behaves like other Python containers in many ways: it is iterable, supports len(), and so on.

import meridian

from shapely import geometry


class County(meridian.Record):
    name: str
    fips: str

counties = County.load_from("path/to/counties.shp")

# Find out how many records you have
print(len(counties))

poi = geometry.shape({
    'type': 'Point',
    'coordinates': [-72.319261, 43.648956]
})

# Check if your poi intersects with the dataset
print(counties.intersects(poi)) # True

# See how many records intersect
print(counties.count(poi)) # 1

# Find the n nearest records to the query geometry
print(counties.nearest(poi, 3))

# The dataset itself is iterable.
for county in counties:
    print(county.name)

# iterate through all records in the dataset which bbox-intersect with poi
# Dataset.intersection returns a tuple of Records.
for county in counties.intersection(poi):
    print(county.name)

Please note that spatial methods check only for a bounding-box intersection; you must confirm that the objects returned actually intersect with your input.
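The difference between a bounding-box hit and a true intersection can be seen in a small standard-library sketch. The helper names bbox_intersects and point_in_polygon are hypothetical and written only for this illustration; they are not part of meridian or shapely.

```python
from typing import List, Tuple

Bounds = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bbox_intersects(a: Bounds, b: Bounds) -> bool:
    """Return True if two axis-aligned bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def point_in_polygon(x: float, y: float, ring: List[Tuple[float, float]]) -> bool:
    """Exact point-in-polygon test using ray casting (simple rings only)."""
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        # Does a horizontal ray from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# A right triangle fills only half of its bounding box.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
triangle_bounds = (0.0, 0.0, 4.0, 4.0)

# This point lies in the empty half of the box.
px, py = 3.0, 3.0
point_bounds = (px, py, px, py)

bbox_hit = bbox_intersects(triangle_bounds, point_bounds)
exact_hit = point_in_polygon(px, py, triangle)
print(bbox_hit, exact_hit)  # True False
```

With meridian itself, the same two-step pattern would presumably combine calls shown in this document: iterate over counties.intersection(poi) and keep only the records for which poi.intersects(county) is true.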

All of the spatial query methods on a Dataset require only that the query object has a bounds property which returns a 4-tuple like (xmin, ymin, xmax, ymax). As long as that property exists, meridian is agnostic to the query geometry's implementation. It does, however, use shapely geometry under the hood for the records it stores.

poi = geometry.shape({
    'type': 'Point',
    'coordinates': [-72.319261, 43.648956]
})

for county in counties:
    print(county.geojson)  # get back the record as GeoJSON
    print(county.bounds)  # The bounds of the geometry
    print(county.name) 

    # Record objects are fully compatible with all of the
    # objects & operations defined in the shapely package.
    print(poi.intersects(county))


# Even advanced operations like cascaded union work as expected.
# (Newer Shapely releases deprecate cascaded_union in favor of shapely.ops.unary_union.)
from shapely.ops import cascaded_union

subset = counties.intersection(poi)

unioned = cascaded_union(subset)
print(unioned.wkt)
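Because the query methods are described as needing only a bounds property, a query object does not have to be a shapely geometry. The following is a minimal sketch of such an object; SearchWindow and its fields are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SearchWindow:
    """Hypothetical query object: a square window around a center point."""

    center_x: float
    center_y: float
    half_width: float

    @property
    def bounds(self) -> Tuple[float, float, float, float]:
        # Same ordering the document describes: (xmin, ymin, xmax, ymax).
        return (
            self.center_x - self.half_width,
            self.center_y - self.half_width,
            self.center_x + self.half_width,
            self.center_y + self.half_width,
        )


window = SearchWindow(center_x=-72.319261, center_y=43.648956, half_width=0.5)
print(window.bounds)
```

Under the requirement stated above, an object like this could presumably be passed to a query method such as counties.intersection(window), although that has not been verified against meridian itself.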

Finally, Meridian also includes utilities to easily and efficiently relate multiple datasets.

For now, see the examples directory.

TO BE FILLED IN:

  • Product / intersection helpers
  • Model behavior
    • Field defaults
    • Derived attributes

Installation

meridian requires GEOS (for the shapely library), GDAL/OGR for reading data formats, and Rtree/libspatialindex to create the spatial index used for querying.

Rtree does not have wheels, and thus the libspatialindex library must be installed independently. See the libspatialindex installation instructions for details.

On Ubuntu you can use apt:

apt install -y libspatialindex-dev

On Arch you can use pacman:

pacman -Syu spatialindex

On most systems, libspatialindex can be compiled from source. These instructions should work on Linux & macOS:

wget -qO- http://download.osgeo.org/libspatialindex/spatialindex-src-1.8.5.tar.gz | tar xz -C /tmp
cd /tmp/spatialindex-src-1.8.5 && ./configure && make && make install

On Linux, you might need to run ldconfig afterwards to ensure that the Rtree Python library can find libspatialindex correctly.
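A quick way to check whether the shared library is discoverable is to ask Python's own library lookup. This sketch assumes the library is registered under the name spatialindex_c (the C API that Rtree binds to); the function name spatialindex_available is hypothetical.

```python
from ctypes.util import find_library


def spatialindex_available() -> bool:
    """Return True if the system linker can locate libspatialindex's C API."""
    # Assumption: the shared library is named "spatialindex_c".
    return find_library("spatialindex_c") is not None


if spatialindex_available():
    print("libspatialindex found")
else:
    print("libspatialindex not found; on Linux, try running ldconfig")
```

A negative result does not prove the library is missing, only that it is not on the default search path.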

From PyPI:

pip install meridian

Or, clone the repo and run

cd path/to/repo && python setup.py install

You can also use pip to install directly from the GitHub repo:

pip install git+https://github.com/tomplex/meridian.git

If you use Docker, images with all dependencies and the latest version of meridian pre-installed are available on Docker Hub.

Opinions

meridian is opinionated and believes that data should generally be immutable. If you need your data to change, you should create new data representing your input + processing instead of changing old data. Thus, a Dataset is more like a frozenset in behavior than a list.
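The "new data instead of changed data" idea can be sketched with standard-library types alone. The County tuple below is a hypothetical stand-in for a tuple-like record, not meridian's Record class, and the sample values are only illustrative.

```python
from typing import NamedTuple


class County(NamedTuple):
    """Hypothetical tuple-like record (NOT meridian.Record)."""

    name: str
    fips: str


original = frozenset({County("Windsor", "50027"), County("Grafton", "33009")})

# Tuple-like records reject in-place modification.
first = next(iter(original))
try:
    first.name = "changed"
    mutated = True
except AttributeError:
    mutated = False

# Instead, derive a new collection representing input + processing.
uppercased = frozenset(c._replace(name=c.name.upper()) for c in original)

print(mutated)
print(sorted(c.name for c in original))
print(sorted(c.name for c in uppercased))
```

The original collection is left untouched, so any other code holding a reference to it keeps seeing the same data.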
