Access and process geospatial raster data in PySpark DataFrames

PyRasterFrames

PyRasterFrames enables access and processing of geospatial raster data in PySpark DataFrames.

Getting started

The quickest way to get started is to pip install the pyrasterframes package.

pip install pyrasterframes
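
To confirm that the install succeeded, one option is to look up the installed version from Python. This is a minimal sketch that uses only the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_version` is hypothetical and is not part of pyrasterframes.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pyrasterframes")
    print(found if found else "pyrasterframes is not installed")
```

If this reports that the package is not installed, the interpreter you are running is probably not the one that pip installed into.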

You can then get a PySpark SparkSession that uses the local[*] master in your Python interpreter as follows.

from pyrasterframes.utils import create_rf_spark_session
spark = create_rf_spark_session()

Then you can read a raster and do some work with it.

from pyrasterframes.rasterfunctions import *
from pyspark.sql.functions import lit
# Read a MODIS surface reflectance granule
df = spark.read.raster('https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF')
# Add 3 element-wise, show some rows of the dataframe
df.select(rf_local_add(df.tile, lit(3))).show(5, False)
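
As a rough illustration of what "add 3 element-wise" means, the sketch below applies the same idea to a small grid of cell values in plain Python. It does not use RasterFrames or Spark. `local_add_scalar` is a hypothetical helper, and the 2x3 grid is made-up sample data standing in for a tile.

```python
from typing import List

Grid = List[List[int]]


def local_add_scalar(cells: Grid, value: int) -> Grid:
    """Return a new grid with `value` added to every cell (hypothetical helper)."""
    return [[cell + value for cell in row] for row in cells]


# Made-up sample cells standing in for a raster tile.
sample = [[10, 20, 30], [40, 50, 60]]
shifted = local_add_scalar(sample, 3)
print(shifted)  # [[13, 23, 33], [43, 53, 63]]
```

The input grid is left unchanged and a new grid is returned, much as selecting a derived column leaves the original DataFrame untouched.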

Support

Reach out to us on Gitter!

Issue tracking is through GitHub.

Contributing

Community contributions are always welcome. To get started, please review our contribution guidelines, code of conduct, and developer's guide. Reach out to us on Gitter so the community can help you get started!

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

pyrasterframes-0.8.5-py3-none-any.whl (64.9 MB, Python 3)
