High performance rasterization tool for Python built in Rust

rusterize

High performance rasterization tool for Python built in Rust. This repository stems from the fasterize package, built in C++ for R, and ports parts of its logic to Python with a Rust backend, along with some useful improvements.

rusterize is designed to work on (multi)polygons and (multi)linestrings. Functionally, it takes an input geopandas dataframe and returns an xarray object.

Installation

Install the current version with pip:

pip install rusterize

Contributing

Any contribution is welcome! You can install rusterize directly from this repo using maturin as an editable package. For this to work, you’ll need to have Rust and cargo installed.

# Clone repo
git clone https://github.com/<username>/rusterize.git
cd rusterize

# Install the Rust nightly toolchain
rustup toolchain install nightly-2025-01-05

# Install maturin
pip install maturin

# Install editable version with optimized code
maturin develop --profile dist-release

API

This package has a simple API:

from rusterize.core import rusterize

# gdf = <import/modify dataframe as needed>

# rusterize
rusterize(gdf,
          res=(30, 30),
          out_shape=(10, 10),
          extent=(0, 0, 300, 300),
          field="field",
          by="by",
          fun="sum",
          background=0) 
  • gdf: geopandas dataframe to rasterize
  • res: tuple of (xres, yres) for desired resolution (default: None)
  • out_shape: tuple of (nrows, ncols) for desired output shape (default: None)
  • extent: tuple of (xmin, ymin, xmax, ymax) for desired output extent (default: None)
  • field: field to rasterize. (default: None -> a value of 1 is rasterized).
  • by: column to group by. Each group is assigned to a band in the stack. Values are taken from field. (default: None -> singleband raster)
  • fun: pixel function to use when multiple values overlap. Available options are sum, first, last, min, max, count, or any. (default: last)
  • background: background value in final raster. (default: np.nan)
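
To make the fun options concrete, here is a minimal pure-Python sketch of how each pixel function could combine the values that land on the same pixel. It only illustrates the semantics: combine_pixel is a hypothetical helper, not part of rusterize, and treating any as a 0/1 presence flag is an assumption.

```python
def combine_pixel(values, fun="last"):
    """Combine the values of all geometries that overlap one pixel.

    `values` lists the burn values in the order the geometries are processed.
    Hypothetical helper for illustration only; not part of the rusterize API.
    """
    if not values:
        raise ValueError("no geometry touches this pixel")
    reducers = {
        "sum": sum,
        "first": lambda v: v[0],
        "last": lambda v: v[-1],
        "min": min,
        "max": max,
        "count": len,
        "any": lambda v: 1,  # assumption: 1 marks presence of any geometry
    }
    if fun not in reducers:
        raise ValueError(f"unknown pixel function: {fun}")
    return reducers[fun](values)


# Three overlapping geometries with values 1, 3 and 2 on the same pixel
overlap = [1, 3, 2]
for name in ("sum", "first", "last", "min", "max", "count", "any"):
    print(name, combine_pixel(overlap, name))
```

Pixels that no geometry touches would keep the background value instead.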

Note that extent is controlled less strictly than resolution and shape. When resolution, output shape, and extent are all specified, resolution and shape take priority, so they are guaranteed while the extent is not. If no extent is given, it is taken from the polygons and is left unmodified unless you specify a resolution. If you specify only an output shape, the extent is maintained. This mimics the logic of gdal_rasterize.
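
The priority rules in this note can be sketched in a few lines of Python. This is one possible reading, not rusterize's actual implementation: resolve_grid is a hypothetical helper, and both the top-left anchoring at (xmin, ymax) and the rounding up of partial cells are assumptions.

```python
import math


def resolve_grid(bounds, res=None, out_shape=None):
    """Return (res, out_shape, extent) for the output grid.

    bounds:    (xmin, ymin, xmax, ymax) requested or taken from the data
    res:       (xres, yres) or None
    out_shape: (nrows, ncols) or None
    Hypothetical helper; assumes the grid is anchored at (xmin, ymax).
    """
    xmin, ymin, xmax, ymax = bounds
    if res is None and out_shape is None:
        raise ValueError("specify res, out_shape, or both")

    if res is None:
        # Shape only: the extent is maintained and the resolution follows.
        nrows, ncols = out_shape
        derived = ((xmax - xmin) / ncols, (ymax - ymin) / nrows)
        return derived, (nrows, ncols), tuple(bounds)

    xres, yres = res
    if out_shape is None:
        # Resolution only: use enough whole cells to cover the bounds.
        ncols = math.ceil((xmax - xmin) / xres)
        nrows = math.ceil((ymax - ymin) / yres)
    else:
        # Resolution and shape both given: they win over the extent.
        nrows, ncols = out_shape

    # The extent is whatever res and shape imply from the anchor corner.
    extent = (xmin, ymax - nrows * yres, xmin + ncols * xres, ymax)
    return (xres, yres), (nrows, ncols), extent


print(resolve_grid((0, 0, 100, 100), res=(30, 30)))
print(resolve_grid((0, 0, 100, 100), out_shape=(10, 20)))
```

Under these assumptions, bounds of (0, 0, 100, 100) with res=(30, 30) give a 4 x 4 grid whose extent grows to (0, -20, 120, 100), while out_shape=(10, 20) alone leaves the extent untouched and yields a resolution of (5.0, 10.0).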

Usage

rusterize consists of a single function, rusterize(). The Rust implementation returns a dictionary that is converted to an xarray object on the Python side for simplicity.

from rusterize.core import rusterize
import geopandas as gpd
from shapely import wkt
import matplotlib.pyplot as plt

# Construct geometries
geoms = [
    "POLYGON ((-180 -20, -140 55, 10 0, -140 -60, -180 -20), (-150 -20, -100 -10, -110 20, -150 -20))",
    "POLYGON ((-10 0, 140 60, 160 0, 140 -55, -10 0))",
    "POLYGON ((-125 0, 0 60, 40 5, 15 -45, -125 0))",
    "MULTILINESTRING ((-180 -70, -140 -50), (-140 -50, -100 -70), (-100 -70, -60 -50), (-60 -50, -20 -70), (-20 -70, 20 -50), (20 -50, 60 -70), (60 -70, 100 -50), (100 -50, 140 -70), (140 -70, 180 -50))"
]

# Convert WKT strings to Shapely geometries
geometries = [wkt.loads(geom) for geom in geoms]

# Create a GeoDataFrame
gdf = gpd.GeoDataFrame({'value': range(1, len(geoms) + 1)}, geometry=geometries, crs='EPSG:32619')

# rusterize
output = rusterize(
    gdf,
    res=(1, 1),
    field="value",
    fun="sum"
).squeeze()

# plot it
fig, ax = plt.subplots(figsize=(12, 6))
output.plot.imshow(ax=ax)
plt.show()

Benchmarks

rusterize is fast! Let’s try it on small and large datasets.

from rusterize.core import rusterize
import geopandas as gpd
import requests
import zipfile
from io import BytesIO

# large dataset (~380 MB)
url = "https://s3.amazonaws.com/hp3-shapefiles/Mammals_Terrestrial.zip"
response = requests.get(url)

# unzip
with zipfile.ZipFile(BytesIO(response.content), 'r') as zip_ref:
    zip_ref.extractall()
    
# read
gdf_large = gpd.read_file("Mammals_Terrestrial/Mammals_Terrestrial.shp")

# small dataset (first 1000 rows)
gdf_small = gdf_large.iloc[:1000, :]

# rusterize at 1/6 degree resolution
def test_large(benchmark):
    benchmark(rusterize, gdf_large, res=(1/6, 1/6), fun="sum")

def test_small(benchmark):
    benchmark(rusterize, gdf_small, res=(1/6, 1/6), fun="sum")

Then you can run it with pytest and pytest-benchmark:

pytest <python file> --benchmark-min-rounds=20 --benchmark-time-unit='s'

--------------------------------------------- benchmark: 1 tests --------------------------------------------
Name (time in s)         Min      Max     Mean  StdDev   Median     IQR  Outliers     OPS  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------
rusterize_large       1.6430   1.9249   1.7442  0.1024   1.6878   0.1974      6;0  0.5733      20           1
rusterize_small       0.0912   0.1194   0.1014  0.0113   0.0953   0.0223      7;0  9.8633      20           1 
-------------------------------------------------------------------------------------------------------------

And fasterize:

library(sf)
library(raster)
library(fasterize)
library(microbenchmark)

large <- st_read("Mammals_Terrestrial/Mammals_Terrestrial.shp", quiet = TRUE)
small <- large[1:1000, ]
fn <- function(v) {
  r <- raster(v, res = 1/6)
  return(fasterize(v, r, fun = "sum"))
}
microbenchmark(
  fasterize_large = f <- fn(large),
  fasterize_small = f <- fn(small),
  times=20L,
  unit='s'
)
Unit: seconds
            expr       min         lq       mean     median         uq        max neval
 fasterize_large 9.9450280 10.6674467 10.8632224 10.9182963 11.1943478 11.3768210    20
 fasterize_small 0.4906411  0.5140836  0.5581061  0.5320919  0.5603512  0.8750579    20

What about an even larger dataset? Here we use a layer from the province of Quebec, Canada, representing ~2M polygons of forest stands, rasterized at 30 meters (20 rounds) with no field value and pixel function any. The comparison with gdal_rasterize was run with hyperfine --runs 20 "gdal_rasterize -tr 30 30 -burn 1 <data_in> <data_out>".

# rusterize
--------------------------------------------- benchmark: 1 tests --------------------------------------------
Name (time in s)         Min      Max     Mean  StdDev   Median     IQR  Outliers     OPS  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------
rusterize             6.7270   7.0098   6.7824  0.0646   6.7686   0.0266      2;2  0.1474      20           1
-------------------------------------------------------------------------------------------------------------

# fasterize
Unit: seconds
      expr      min       lq     mean   median       uq      max neval
 fasterize 157.4734 177.2055 194.3222 194.6455 213.9195 230.6504    20

# gdal_rasterize (CLI) - read from fast drive, write to fast drive
Time (mean ± σ):      5.801 s ±  0.124 s    [User: 4.381 s, System: 1.396 s]
Range (min … max):    5.649 s …  6.023 s    20 runs

In terms of (multi)line rasterization speed, here's a benchmark against gdal_rasterize using a layer from the province of Quebec, Canada, representing a subset of the road network for a total of ~535K multilinestrings.

# rusterize
--------------------------------------------- benchmark: 1 tests --------------------------------------------
Name (time in s)         Min      Max     Mean  StdDev   Median     IQR  Outliers     OPS  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------
test                  4.5272   5.9488   4.7171  0.3236   4.6360   0.1680      2;2  0.2120      20           1
-------------------------------------------------------------------------------------------------------------

# gdal_rasterize (CLI) - read from fast drive, write to fast drive
Time (mean ± σ):      8.719 s ±  0.063 s    [User: 3.782 s, System: 4.917 s]
Range (min … max):    8.658 s …  8.874 s    20 runs
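
To put the numbers in perspective, the short script below turns the reported mean runtimes into ratios relative to rusterize. The means are copied from the benchmark outputs in this section; the dataset labels and the relative_runtime helper are made up for this example.

```python
# Mean runtimes in seconds, copied from the benchmark outputs in this section.
# The dictionary keys are illustrative labels, not names used by any tool.
means = {
    "mammals_large": {"rusterize": 1.7442, "fasterize": 10.8632224},
    "mammals_small": {"rusterize": 0.1014, "fasterize": 0.5581061},
    "forest_stands": {
        "rusterize": 6.7824,
        "fasterize": 194.3222,
        "gdal_rasterize": 5.801,
    },
    "roads": {"rusterize": 4.7171, "gdal_rasterize": 8.719},
}


def relative_runtime(dataset):
    """Return each tool's mean runtime divided by the rusterize mean.

    A value above 1 means the tool was slower than rusterize on that dataset.
    """
    base = dataset["rusterize"]
    return {
        tool: round(seconds / base, 2)
        for tool, seconds in dataset.items()
        if tool != "rusterize"
    }


ratios = {name: relative_runtime(dataset) for name, dataset in means.items()}
for name, by_tool in ratios.items():
    for tool, ratio in by_tool.items():
        print(f"{name}: {tool} took {ratio}x the rusterize mean")
```

A ratio below 1, as for gdal_rasterize on the forest stands layer, means that tool had the lower mean runtime in that benchmark.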

Comparison with other tools

While rusterize is fast, there are other fast alternatives out there, including GDAL, rasterio and geocube. However, rusterize offers seamless, Rust-native processing with a similar or lower memory footprint, doesn't require you to leave Python, and returns the geoinformation you need for downstream processing with ample control over resolution, shape, and extent.

The following is a time comparison on a single run on the same forest stands dataset used earlier.

rusterize:    6.7 sec
rasterio:     68  sec (but no spatial information)
fasterize:    157 sec (including raster creation)
geocube:      260 sec (larger memory footprint)
