rusterize

High-performance rasterization tool for Python built in Rust. This repository stems from the fasterize package, built in C++ for R, and ports parts of its logic to Python with a Rust backend, along with some useful improvements (see API).

rusterize is designed to work on (multi)polygons and (multi)linestrings, even when they are nested inside complex geometry collections. Functionally, it takes an input geopandas dataframe and returns an xarray object.

Installation

Install the current version with pip:

pip install rusterize

Contributing

Any contribution is welcome! You can install rusterize directly from this repo using maturin as an editable package. For this to work, you’ll need to have Rust and cargo installed.

# Clone repo
git clone https://github.com/<username>/rusterize.git
cd rusterize

# Install the Rust nightly toolchain
rustup toolchain install nightly-2025-07-31

# Install maturin
pip install maturin

# Install editable version with optimized code
maturin develop --profile dist-release

API

This package has a simple API:

from rusterize import rusterize

# gdf = <import/modify dataframe as needed>

# rusterize
rusterize(
    gdf,
    like=None,
    res=(30, 30),
    out_shape=(10, 10),
    extent=(0, 10, 10, 20),
    field="field",
    by="by",
    burn=None,
    fun="sum",
    background=0,
    dtype="uint8"
)
  • gdf: geopandas dataframe to rasterize
  • like: xr.DataArray to use as template for res, out_shape, and extent. Mutually exclusive with these parameters (default: None)
  • res: (xres, yres) for desired resolution (default: None)
  • out_shape: (nrows, ncols) for desired output shape (default: None)
  • extent: (xmin, ymin, xmax, ymax) for desired output extent (default: None)
  • field: column to rasterize. Mutually exclusive with burn. (default: None -> a value of 1 is rasterized)
  • by: column for grouping. Assign each group to a band in the stack. Values are taken from field if specified, else burn is rasterized. (default: None -> singleband raster)
  • burn: a single value to burn. Mutually exclusive with field. (default: None). If no field is found in gdf or if field is None, then burn=1
  • fun: pixel function to use when multiple values overlap. Available options are sum, first, last, min, max, count, or any. (default: last)
  • background: background value in final raster. (default: np.nan). A None value corresponds to the default of the specified dtype. An illegal value for a dtype will be replaced with the default of that dtype. For example, a background=np.nan for dtype="uint8" will become background=0, where 0 is the default for uint8.
  • dtype: dtype of the final raster. Possible values are uint8, uint16, uint32, uint64, int8, int16, int32, int64, float32, float64 (default: float64)
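
The interaction between background and dtype described above can be sketched in plain Python. This is only an illustration of the documented rule for the integer dtypes, not the library's implementation: the helper name resolve_int_background is hypothetical, and it assumes that 0 is the default for every integer dtype (the text only states this for uint8) and that out-of-range values count as illegal.

```python
# Value ranges for the integer dtypes listed above.
INT_RANGES = {
    "uint8": (0, 2**8 - 1),
    "uint16": (0, 2**16 - 1),
    "uint32": (0, 2**32 - 1),
    "uint64": (0, 2**64 - 1),
    "int8": (-(2**7), 2**7 - 1),
    "int16": (-(2**15), 2**15 - 1),
    "int32": (-(2**31), 2**31 - 1),
    "int64": (-(2**63), 2**63 - 1),
}

# Assumption: every integer dtype falls back to 0 (documented for uint8 only).
INT_DEFAULT = 0


def resolve_int_background(background, dtype):
    """Hypothetical helper: pick a background value that is legal for an integer dtype."""
    low, high = INT_RANGES[dtype]
    if background is None:
        return INT_DEFAULT
    # NaN, infinity, and fractional floats cannot be stored in an integer raster.
    if isinstance(background, float) and not background.is_integer():
        return INT_DEFAULT
    value = int(background)
    # Assumption: values outside the dtype's range are treated as illegal too.
    return value if low <= value <= high else INT_DEFAULT


print(resolve_int_background(float("nan"), "uint8"))  # NaN is illegal for uint8
print(resolve_int_background(255, "uint8"))           # legal value is kept
```

Under these assumptions, a NaN background with dtype="uint8" resolves to 0, matching the example in the parameter description, while a legal value such as 255 is kept unchanged.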

Note that control over the desired extent is not as strict as for resolution and shape. When resolution, output shape, and extent are all specified, priority is given to resolution and shape, so resolution and shape are guaranteed but extent is not. If extent is not given, it is taken from the input geometries and is not modified unless you specify a resolution. If you only specify an output shape, the extent is maintained. This mimics the logic of gdal_rasterize.
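
To see why the extent cannot always be honored, here is a small arithmetic sketch of how resolution, shape, and extent constrain each other. It is not taken from the rusterize source: the function names are hypothetical, and the choices to round the cell count up and to anchor the grid at the top-left corner are assumptions made for illustration.

```python
import math


def grid_from_res(extent, res):
    """Derive (nrows, ncols) and the extent actually covered for a requested resolution."""
    xmin, ymin, xmax, ymax = extent
    xres, yres = res
    # Assumption: round up so the grid covers the whole requested extent.
    ncols = math.ceil((xmax - xmin) / xres)
    nrows = math.ceil((ymax - ymin) / yres)
    # Assumption: the grid is anchored at (xmin, ymax), so xmax and ymin may move.
    covered = (xmin, ymax - nrows * yres, xmin + ncols * xres, ymax)
    return (nrows, ncols), covered


def res_from_shape(extent, out_shape):
    """Derive (xres, yres) when only the output shape is fixed; the extent is untouched."""
    xmin, ymin, xmax, ymax = extent
    nrows, ncols = out_shape
    return ((xmax - xmin) / ncols, (ymax - ymin) / nrows)


extent = (0, 10, 10, 20)  # (xmin, ymin, xmax, ymax), as in the API example
print(grid_from_res(extent, (3, 3)))
print(res_from_shape(extent, (10, 10)))
```

With a 10-unit-wide extent and a resolution of 3, no whole number of cells fits exactly, so in this sketch the grid grows to 4 x 4 cells and the covered extent becomes (0, 8, 12, 20). When only the shape is given, the resolution is simply derived from the extent, which therefore stays as it is.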

Usage

rusterize consists of a single function, rusterize(). The Rust implementation returns a dictionary that is converted to an xarray object on the Python side for simplicity.

from rusterize import rusterize
import geopandas as gpd
from shapely import wkt
import matplotlib.pyplot as plt

# Construct geometries
geoms = [
    "POLYGON ((-180 -20, -140 55, 10 0, -140 -60, -180 -20), (-150 -20, -100 -10, -110 20, -150 -20))",
    "POLYGON ((-10 0, 140 60, 160 0, 140 -55, -10 0))",
    "POLYGON ((-125 0, 0 60, 40 5, 15 -45, -125 0))",
    "MULTILINESTRING ((-180 -70, -140 -50), (-140 -50, -100 -70), (-100 -70, -60 -50), (-60 -50, -20 -70), (-20 -70, 20 -50), (20 -50, 60 -70), (60 -70, 100 -50), (100 -50, 140 -70), (140 -70, 180 -50))",
    "GEOMETRYCOLLECTION (POINT (50 -40), POLYGON ((75 -40, 75 -30, 100 -30, 100 -40, 75 -40)), LINESTRING (80 -40, 100 0), GEOMETRYCOLLECTION (POLYGON ((100 20, 100 30, 110 30, 110 20, 100 20))))"
]

# Convert WKT strings to Shapely geometries
geometries = [wkt.loads(geom) for geom in geoms]

# Create a GeoDataFrame
gdf = gpd.GeoDataFrame({'value': range(1, len(geoms) + 1)}, geometry=geometries, crs='EPSG:32619')

# rusterize
output = rusterize(
    gdf,
    res=(1, 1),
    field="value",
    fun="sum",
).squeeze()

# plot it
fig, ax = plt.subplots(figsize=(12, 6))
output.plot.imshow(ax=ax)
plt.show()
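
Several of the polygons above overlap, so the choice of fun decides what ends up in those pixels. The following sketch shows one reading of the pixel functions on the list of values that land on a single pixel, in row order. The helper combine is hypothetical, and the semantics of count (number of overlapping geometries) and any (1 wherever something is burned) are assumptions based on the option names.

```python
def combine(values, fun="last"):
    """Hypothetical helper: reduce the values burned on one pixel, given in row order."""
    if not values:
        return None  # nothing burned: the pixel would keep the background value
    reducers = {
        "sum": sum,
        "first": lambda v: v[0],
        "last": lambda v: v[-1],
        "min": min,
        "max": max,
        "count": len,            # assumption: number of geometries covering the pixel
        "any": lambda v: 1,      # assumption: 1 wherever at least one geometry is burned
    }
    return reducers[fun](values)


# A pixel covered by the first and third polygons (values 1 and 3)
for fun in ("sum", "first", "last", "min", "max", "count", "any"):
    print(fun, combine([1, 3], fun))
```

With fun="sum", as in the example above, such a pixel would hold 4, whereas fun="last" would keep only the value of the last overlapping row.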

Benchmarks

rusterize is fast! Let’s try it on small and large datasets.

from rusterize import rusterize
import geopandas as gpd
import requests
import zipfile
from io import BytesIO

# large dataset (~380 MB)
url = "https://s3.amazonaws.com/hp3-shapefiles/Mammals_Terrestrial.zip"
response = requests.get(url)
response.raise_for_status()  # stop early if the download failed

# unzip
with zipfile.ZipFile(BytesIO(response.content), 'r') as zip_ref:
    zip_ref.extractall()

# read
gdf_large = gpd.read_file("Mammals_Terrestrial/Mammals_Terrestrial.shp")

# small dataset (first 1000 rows)
gdf_small = gdf_large.iloc[:1000, :]

# rusterize at 1/6 degree resolution
def test_large(benchmark):
    benchmark(rusterize, gdf_large, res=(1/6, 1/6), fun="sum")

def test_small(benchmark):
    benchmark(rusterize, gdf_small, res=(1/6, 1/6), fun="sum")

Then you can run it with pytest and pytest-benchmark:

pytest <python file> --benchmark-min-rounds=20 --benchmark-time-unit='s'

--------------------------------------------- benchmark: 1 tests --------------------------------------------
Name (time in s)         Min      Max     Mean  StdDev   Median     IQR  Outliers     OPS  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------
rusterize_small       0.0791    0.0899   0.0812  0.0027   0.0803  0.0020       2;2  12.3214     20          1
rusterize_large     1.379545    1.4474   1.4006  0.0178   1.3966  0.0214       5;1   0.7140     20          1
-------------------------------------------------------------------------------------------------------------

And fasterize:

library(sf)
library(raster)
library(fasterize)
library(microbenchmark)

large <- st_read("Mammals_Terrestrial/Mammals_Terrestrial.shp", quiet = TRUE)
small <- large[1:1000, ]
fn <- function(v) {
  r <- raster(v, res = 1/6)
  return(fasterize(v, r, fun = "sum"))
}
microbenchmark(
  fasterize_large = f <- fn(large),
  fasterize_small = f <- fn(small),
  times=20L,
  unit='s'
)
Unit: seconds
            expr       min         lq       mean     median        uq        max neval
 fasterize_small 0.4741043  0.4926114  0.5191707  0.5193289  0.536741  0.5859029    20
 fasterize_large 9.2199426 10.3595465 10.6653139 10.5369429 11.025771 11.7944567    20

And on an even larger dataset? Here we use a layer from the province of Quebec, Canada, representing ~2M polygons of forest stands, rasterized at 30 meters (20 rounds) with no field value and pixel function any. The comparison with gdal_rasterize was run with hyperfine --runs 20 "gdal_rasterize -tr 30 30 -burn 1 <data_in> <data_out>".

# rusterize
--------------------------------------------- benchmark: 1 tests --------------------------------------------
Name (time in s)         Min      Max     Mean  StdDev   Median     IQR  Outliers     OPS  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------
rusterize             5.9331   7.2308   6.1302  0.3183  5.9903   0.1736       2;4  0.1631      20           1
-------------------------------------------------------------------------------------------------------------

# fasterize
Unit: seconds
      expr      min       lq     mean   median       uq      max neval
 fasterize 157.4734 177.2055 194.3222 194.6455 213.9195 230.6504    20

# gdal_rasterize (CLI) - read from fast drive, write to fast drive
Time (mean ± σ):      5.495 s ±  0.038 s    [User: 4.268 s, System: 1.225 s]
Range (min … max):    5.452 s …  5.623 s    20 runs

In terms of (multi)line rasterization speed, here's a benchmark against gdal_rasterize using a layer from the province of Quebec, Canada, representing a subset of the road network for a total of ~535K multilinestrings.

# rusterize
--------------------------------------------- benchmark: 1 tests --------------------------------------------
Name (time in s)         Min      Max     Mean  StdDev   Median     IQR  Outliers     OPS  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------
test                  4.5272   5.9488   4.7171  0.3236   4.6360  0.1680       2;2  0.2120      20           1
-------------------------------------------------------------------------------------------------------------

# gdal_rasterize (CLI) - read from fast drive, write to fast drive
Time (mean ± σ):      8.719 s ±  0.063 s    [User: 3.782 s, System: 4.917 s]
Range (min … max):    8.658 s …  8.874 s    20 runs
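
To put the numbers in this section side by side, the short script below divides each tool's mean time by the rusterize mean on the same dataset. The timings are copied from the outputs above; the dataset labels and the speedups helper are just names chosen for this sketch.

```python
# Mean wall-clock times in seconds, copied from the benchmark outputs above.
MEANS = {
    "mammals_small": {"rusterize": 0.0812, "fasterize": 0.5191707},
    "mammals_large": {"rusterize": 1.4006, "fasterize": 10.6653139},
    "forest_stands": {"rusterize": 6.1302, "fasterize": 194.3222, "gdal_rasterize": 5.495},
    "road_network": {"rusterize": 4.7171, "gdal_rasterize": 8.719},
}


def speedups(means):
    """Return {dataset: {tool: tool_mean / rusterize_mean}} for every other tool."""
    ratios = {}
    for dataset, times in means.items():
        base = times["rusterize"]
        ratios[dataset] = {
            tool: seconds / base
            for tool, seconds in times.items()
            if tool != "rusterize"
        }
    return ratios


RATIOS = speedups(MEANS)
for dataset, per_tool in RATIOS.items():
    for tool, ratio in per_tool.items():
        print(f"{dataset}: {tool} mean is {ratio:.2f}x the rusterize mean")
```

A ratio above 1 means rusterize was faster on average. On these means, fasterize takes about 6.4x and 7.6x as long on the small and large mammal datasets and about 31.7x as long on the forest stands, while gdal_rasterize is slightly faster on the forest stands (ratio of about 0.90) and about 1.85x slower on the road network. Keep in mind that the gdal_rasterize timings come from CLI runs that read from and write to disk, so the comparison is not strictly like-for-like.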

Comparison with other tools

While rusterize is fast, there are other fast alternatives out there, including GDAL, rasterio, and geocube. However, rusterize offers seamless, Rust-native processing with a similar or lower memory footprint that doesn't require you to leave Python, and it returns the geoinformation you need for downstream processing, with ample control over resolution, shape, extent, and data type.

The following is a time comparison on a single run on the same forest stands dataset used earlier.

rusterize:    5.9 sec
rasterio:     68  sec (but no spatial information)
fasterize:    157 sec (including raster creation)
geocube:      260 sec (larger memory footprint)
