High performance rasterization tool for Python built in Rust

rusterize

High performance rasterization tool for Python built in Rust. This repository stems from the fasterize package, built in C++ for R, and ports parts of its logic to Python with a Rust backend, along with some useful improvements.

rusterize is designed to work on (multi)polygons and (multi)linestrings. Functionally, it takes an input geopandas dataframe and returns an xarray object.

Installation

Install the current version with pip:

pip install rusterize

Contributing

Any contribution is welcome! You can install rusterize directly from this repo using maturin as an editable package. For this to work, you’ll need to have Rust and cargo installed.

# Clone repo
git clone https://github.com/<username>/rusterize.git
cd rusterize

# Install the Rust nightly toolchain
rustup toolchain install nightly-2025-01-05

# Install maturin
pip install maturin

# Install editable version with optimized code
maturin develop --profile dist-release

API

This function has a simple API:

from rusterize.core import rusterize

# gdf = <import/modify dataframe as needed>

# rusterize
rusterize(gdf,
          res=(30, 30),
          out_shape=(10, 10),
          extent=(0, 0, 300, 300),
          field="field",
          by="by",
          fun="sum",
          background=0)
  • gdf: geopandas dataframe to rasterize
  • res: tuple of (xres, yres) for desired resolution
  • out_shape: tuple of (nrows, ncols) for desired output shape
  • extent: tuple of (xmin, ymin, xmax, ymax) for desired output extent
  • field: field to rasterize. Default is None (a value of 1 is rasterized).
  • by: column to group by. Each group is assigned to its own band in the stack, with values taken from field. Default is None (single-band raster)
  • fun: pixel function to use when multiple values overlap. Default is last. Available options are sum, first, last, min, max, count, or any
  • background: background value in final raster. Default is None (NaN)
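
To make the fun options concrete, here is a minimal pure-Python sketch of how each pixel function could combine the values of geometries that overlap on one pixel. The helper name combine_pixel is hypothetical and not part of rusterize, and treating any as a 0/1 presence flag is an assumption borrowed from fasterize's convention.

```python
def combine_pixel(values, fun="last"):
    """Combine the values of all geometries covering one pixel (illustrative only)."""
    if not values:
        return None  # pixel not covered: the background value would apply
    reducers = {
        "sum": sum,
        "first": lambda v: v[0],
        "last": lambda v: v[-1],
        "min": min,
        "max": max,
        "count": len,
        "any": lambda v: 1,  # assumption: presence flag, as in fasterize
    }
    if fun not in reducers:
        raise ValueError(f"unknown pixel function: {fun!r}")
    return reducers[fun](values)


# Three geometries with field values 1, 2 and 3 overlap on the same pixel.
overlap = [1, 2, 3]
results = {
    name: combine_pixel(overlap, name)
    for name in ("sum", "first", "last", "min", "max", "count", "any")
}
print(results)
```

With field left at None, every geometry contributes a value of 1, so sum and count would give the same result in this sketch.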

Note that control over the desired extent is not as strict as for resolution and shape. When resolution, output shape, and extent are all specified, resolution and shape take priority: they are guaranteed, but the extent is not. If extent is not given, it is taken from the input geometries and is left unmodified unless you specify a resolution. If you specify only an output shape, the extent is maintained. This mimics the logic of gdal_rasterize.
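
The priority rules above can be sketched in a few lines of plain Python. This is not rusterize's actual implementation: the helper resolve_grid is hypothetical, and both anchoring the grid at the upper-left corner and rounding up to whole cells are assumptions made for illustration.

```python
import math


def resolve_grid(bounds, res=None, out_shape=None):
    """Return (extent, shape, res) for a raster grid (hypothetical helper)."""
    xmin, ymin, xmax, ymax = bounds
    if res is None and out_shape is None:
        raise ValueError("specify res, out_shape, or both")

    if res is None:
        # Shape only: the extent is kept and the resolution is derived from it.
        nrows, ncols = out_shape
        xres, yres = (xmax - xmin) / ncols, (ymax - ymin) / nrows
        return (xmin, ymin, xmax, ymax), (nrows, ncols), (xres, yres)

    xres, yres = res
    if out_shape is None:
        # Resolution only: use enough whole cells to cover the bounds.
        ncols = math.ceil((xmax - xmin) / xres)
        nrows = math.ceil((ymax - ymin) / yres)
    else:
        nrows, ncols = out_shape

    # Resolution and shape win; the extent is recomputed from the
    # upper-left corner (the anchoring corner is an assumption).
    xmax = xmin + ncols * xres
    ymin = ymax - nrows * yres
    return (xmin, ymin, xmax, ymax), (nrows, ncols), (xres, yres)


# All three given: res and shape are honored, so the extent shrinks.
print(resolve_grid((0, 0, 300, 300), res=(20, 20), out_shape=(10, 10)))
# Shape only: the extent is left untouched.
print(resolve_grid((0, 0, 300, 300), out_shape=(10, 10)))
# Resolution only: bounds of 0..95 grow to whole 30-unit cells.
print(resolve_grid((0, 0, 95, 95), res=(30, 30)))
```

In the first call the requested extent of 300 units per side cannot hold alongside 10 cells of 20 units, so the sketch keeps the resolution and shape and lets the extent change.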

Usage

rusterize consists of a single function, rusterize(). The Rust implementation returns an array that is converted to an xarray object on the Python side for simplicity.

from rusterize.core import rusterize
import geopandas as gpd
from shapely import wkt
import matplotlib.pyplot as plt

# example from fasterize
geoms = [
    "POLYGON ((-180 -20, -140 55, 10 0, -140 -60, -180 -20), (-150 -20, -100 -10, -110 20, -150 -20))",
    "POLYGON ((-10 0, 140 60, 160 0, 140 -55, -10 0))",
    "POLYGON ((-125 0, 0 60, 40 5, 15 -45, -125 0))",
    "MULTILINESTRING ((-180 -70, -140 -50), (-140 -50, -100 -70), (-100 -70, -60 -50), (-60 -50, -20 -70), (-20 -70, 20 -50), (20 -50, 60 -70), (60 -70, 100 -50), (100 -50, 140 -70), (140 -70, 180 -50))"
]

# Convert WKT strings to Shapely geometries
geometries = [wkt.loads(geom) for geom in geoms]

# Create a GeoDataFrame
gdf = gpd.GeoDataFrame({'value': range(1, len(geoms) + 1)}, geometry=geometries, crs='EPSG:32619')

# rusterize
output = rusterize(
    gdf,
    res=(1, 1),
    field="value",
    fun="sum"
).squeeze()

# plot it
fig, ax = plt.subplots(figsize=(12, 6))
output.plot.imshow(ax=ax)
plt.show()

Benchmarks

rusterize is fast! Let’s try it on small and large datasets.

from rusterize.core import rusterize
import geopandas as gpd
import requests
import zipfile
from io import BytesIO

# large dataset (~380 MB)
url = "https://s3.amazonaws.com/hp3-shapefiles/Mammals_Terrestrial.zip"
response = requests.get(url)

# unzip
with zipfile.ZipFile(BytesIO(response.content), 'r') as zip_ref:
    zip_ref.extractall()
    
# read
gdf_large = gpd.read_file("Mammals_Terrestrial/Mammals_Terrestrial.shp")

# small dataset (first 1000 rows)
gdf_small = gdf_large.iloc[:1000, :]

# rusterize at 1/6 degree resolution
def test_large(benchmark):
    benchmark(rusterize, gdf_large, (1/6, 1/6), fun="sum")

def test_small(benchmark):
    benchmark(rusterize, gdf_small, (1/6, 1/6), fun="sum")

Then you can run it with pytest and pytest-benchmark:

pytest <python file> --benchmark-min-rounds=20 --benchmark-time-unit='s'

--------------------------------------------- benchmark: 1 tests --------------------------------------------
Name (time in s)         Min      Max     Mean  StdDev   Median     IQR  Outliers     OPS  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------
test_large           10.5870  11.2302  10.8633  0.1508  10.8417  0.1594       4;1  0.0921      20           1
test_small            0.5083   0.6416   0.5265  0.0393   0.5120  0.0108       2;2  1.8995      20           1
-------------------------------------------------------------------------------------------------------------
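
When reading the table, note that pytest-benchmark's OPS column is operations per second, computed as the reciprocal of the mean time. A quick check with the means reported above (small differences in the last digit come from the means being rounded in the table):

```python
# Mean times in seconds, copied from the table above.
means = {"test_large": 10.8633, "test_small": 0.5265}

# OPS (operations per second) is the reciprocal of the mean time.
ops = {name: 1 / mean for name, mean in means.items()}

for name, value in ops.items():
    print(f"{name}: {value:.4f} ops/s")
```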

And fasterize:

library(sf)
library(raster)
library(fasterize)
library(microbenchmark)

large <- st_read("Mammals_Terrestrial/Mammals_Terrestrial.shp", quiet = TRUE)
small <- large[1:1000, ]
fn <- function(v) {
  r <- raster(v, res = 1/6)
  return(fasterize(v, r, fun = "sum"))
}
microbenchmark(
  fasterize_large = f <- fn(large),
  fasterize_small = f <- fn(small),
  times=20L,
  unit='s'
)
Unit: seconds
      expr             min        lq      mean    median        uq       max  neval
 fasterize_large  9.565781  9.815375  10.02838  9.984965  10.18532  10.66656     20
 fasterize_small  0.469389  0.500616  0.571851  0.558818  0.613419  0.795159     20

And on even larger datasets? This is a benchmark with 350K+ geometries rasterized at 30 meters (20 rounds) with no field value and pixel function sum.

# rusterize
--------------------------------------------- benchmark: 1 tests --------------------------------------------
Name (time in s)         Min      Max     Mean  StdDev   Median     IQR  Outliers     OPS  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------
test_sbw             46.5711  49.0212  48.4340  0.5504  48.5812  0.5054       3;1  0.0206      20           1
-------------------------------------------------------------------------------------------------------------

# fasterize
Unit: seconds
      expr      min       lq     mean   median       uq      max neval
 fasterize 62.12409 72.13832 74.53424 75.12375 77.72899 84.77415    20

In terms of (multi)line rasterization speed, here's a benchmark against gdal_rasterize using a layer from the province of Quebec, Canada, representing water courses for a total of ~4.5 million multilinestrings.

Comparison with other tools

While rusterize is fast, there are other fast alternatives out there, including:

  • GDAL
  • rasterio
  • geocube

However, rusterize offers seamless, Rust-native processing with a similar or lower memory footprint, without requiring you to leave Python. It also returns the geoinformation you need for downstream processing, with ample control over resolution, shape, and extent.

The following is a time comparison run on a dataset with 340K+ geometries, rasterized at 2m resolution.

rusterize:   24 sec
fasterize:   47 sec
GDAL (cli):  40 sec (read from fast drive, write to fast drive)
rasterio:    20 sec (but no spatial information)
geocube:     42 sec (larger memory footprint)
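
To put these numbers on a common scale, the short sketch below expresses each time relative to rusterize. It only rearranges the figures listed above and says nothing about memory use or output format.

```python
# Wall-clock times in seconds, copied from the comparison above.
timings = {
    "rusterize": 24,
    "fasterize": 47,
    "GDAL (cli)": 40,
    "rasterio": 20,
    "geocube": 42,
}

baseline = timings["rusterize"]
# A ratio above 1 means the tool took longer than rusterize on this dataset.
ratios = {tool: seconds / baseline for tool, seconds in timings.items()}

for tool, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{tool:<11} {ratio:.2f}x")
```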
