A Python package to rasterize GeoDataFrames

Rasterizer

rasterizer is a lightweight Python package that speeds up rasterization of geopandas GeoDataFrames by specializing in regular, axis-aligned rectangular grids.

Features

  • Rasterize lines into a binary (presence/absence) or length-based grid.
  • Rasterize polygons into a binary (presence/absence) or area-based grid.
  • Fast because it targets regular rectilinear grids described by 1D x and y cell-center coordinates with constant spacing.
  • Hybrid polygon rasterization for large polygon bounding boxes: exact clipping on boundary cells, faster scanline filling for interior cells.
  • Weighted rasterization: rasterize geometries while weighting the output by a numeric column of the GeoDataFrame.
  • Works with geopandas GeoDataFrames.
  • Outputs an xarray.DataArray for easy integration with other scientific Python libraries.
  • No GDAL dependency for the rasterization algorithm itself.

For detailed usage and API documentation, please see the full documentation.

Installation

You can install the package directly from PyPI:

pip install rasterizer

Usage

Here are some examples of what you can do with rasterizer.

import geopandas as gpd
from rasterizer import rasterize_polygons

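polys = gpd.read_file("polygons.gpkg")
# your_x_grid and your_y_grid are placeholders for 1D arrays of cell-center
# coordinates with constant spacing, expressed in the CRS passed as polys.crs.
area_raster = rasterize_polygons(polys, your_x_grid, your_y_grid, polys.crs, mode="area")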

# Enable a tqdm progress bar when processing large geometry collections.
area_raster = rasterize_polygons(
    polys,
    your_x_grid,
    your_y_grid,
    polys.crs,
    mode="area",
    progress_bar=True,
)
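
The `your_x_grid` and `your_y_grid` placeholders above stand for 1D arrays of cell-center coordinates with constant spacing. The following is a minimal sketch of one way to build such arrays from a bounding box with NumPy. The helper name `cell_centers` and the example bounds are hypothetical and are not part of rasterizer.

```python
import numpy as np


def cell_centers(vmin: float, vmax: float, step: float) -> np.ndarray:
    """Return the centers of the cells of width `step` that tile [vmin, vmax).

    Hypothetical helper for illustration; not part of rasterizer.
    """
    n_cells = int(np.ceil((vmax - vmin) / step))
    # The first center sits half a cell inside the lower bound.
    return vmin + step * (np.arange(n_cells) + 0.5)


# Example bounds in projected units (made up for this sketch).
x = cell_centers(0.0, 100.0, 10.0)
y = cell_centers(0.0, 50.0, 10.0)

# Constant spacing is what makes the grid regular.
assert np.allclose(np.diff(x), 10.0)
assert np.allclose(np.diff(y), 10.0)
```

Whether the first center should sit on the lower bound or half a cell inside it depends on how you want the grid aligned with your data.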

Rasterizing Lines

You can rasterize lines in either binary or length mode.
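
To make the difference between the two modes concrete, here is a minimal pure-Python sketch of what a length-based grid contains for a single segment: the length of the segment that falls inside each cell. It uses Liang-Barsky clipping as a stand-in technique; it is not rasterizer's implementation, and the function names `clipped_length` and `length_grid` are hypothetical.

```python
import math


def clipped_length(p0, p1, xmin, ymin, xmax, ymax):
    """Length of segment p0-p1 that lies inside an axis-aligned cell.

    Liang-Barsky clipping; illustrative only, not rasterizer's code.
    """
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    t_enter, t_exit = 0.0, 1.0
    # Each (p, q) pair encodes one cell edge: left, right, bottom, top.
    edges = ((-dx, x0 - xmin), (dx, xmax - x0), (-dy, y0 - ymin), (dy, ymax - y0))
    for p, q in edges:
        if p == 0.0:
            if q < 0.0:
                return 0.0  # Parallel to this edge and outside the cell.
            continue
        t = q / p
        if p < 0.0:
            t_enter = max(t_enter, t)
        else:
            t_exit = min(t_exit, t)
    if t_enter >= t_exit:
        return 0.0
    return (t_exit - t_enter) * math.hypot(dx, dy)


def length_grid(p0, p1, x_centers, y_centers, step):
    """Per-cell length of one segment on a regular grid of cell centers."""
    half = step / 2.0
    return [
        [
            clipped_length(p0, p1, xc - half, yc - half, xc + half, yc + half)
            for xc in x_centers
        ]
        for yc in y_centers
    ]


# A diagonal segment across a 2 x 2 grid of unit cells.
grid = length_grid((0.0, 0.0), (2.0, 2.0), [0.5, 1.5], [0.5, 1.5], 1.0)
total = sum(sum(row) for row in grid)

# Binary mode only records whether a cell is touched by a nonzero length.
binary = [[value > 0.0 for value in row] for row in grid]
```

The per-cell lengths sum to the length of the segment, which is the property that distinguishes length mode from a presence/absence grid. In this sketch a segment running exactly along a shared cell edge would be counted in both neighboring cells; a real implementation needs a tie-breaking rule.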

[Figures: Lines - Binary (binary mode) and Lines - Length (length mode)]

Rasterizing Polygons

You can rasterize polygons in either binary or area mode.

For polygon workloads, rasterizer uses two internal strategies. Polygons with small bounding boxes are handled with exact per-cell clipping. Polygons with larger bounding boxes switch to a hybrid path that still clips boundary cells exactly but fills interior spans with a scanline pass, which reduces the amount of geometric clipping required. The resulting area and binary outputs stay exact at cell boundaries while scaling better on large polygons.
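
As an illustration of the exact per-cell clipping idea, the sketch below clips a simple polygon to each cell rectangle and measures the clipped area. It uses Sutherland-Hodgman clipping and the shoelace formula in pure Python as stand-in techniques; it is not rasterizer's implementation, it ignores holes and multipolygons, and the names `clip_to_cell`, `shoelace_area`, and `area_grid` are hypothetical.

```python
def clip_to_cell(polygon, xmin, ymin, xmax, ymax):
    """Clip a simple polygon (list of (x, y) vertices) to an axis-aligned cell.

    Sutherland-Hodgman clipping; illustrative only, not rasterizer's code.
    """

    def clip_edge(points, inside, intersect):
        output = []
        for i, current in enumerate(points):
            previous = points[i - 1]
            if inside(current):
                if not inside(previous):
                    output.append(intersect(previous, current))
                output.append(current)
            elif inside(previous):
                output.append(intersect(previous, current))
        return output

    def cross_x(x):
        # Intersection of segment a-b with the vertical line at x.
        return lambda a, b: (x, a[1] + (b[1] - a[1]) * (x - a[0]) / (b[0] - a[0]))

    def cross_y(y):
        # Intersection of segment a-b with the horizontal line at y.
        return lambda a, b: (a[0] + (b[0] - a[0]) * (y - a[1]) / (b[1] - a[1]), y)

    edges = (
        (lambda p: p[0] >= xmin, cross_x(xmin)),
        (lambda p: p[0] <= xmax, cross_x(xmax)),
        (lambda p: p[1] >= ymin, cross_y(ymin)),
        (lambda p: p[1] <= ymax, cross_y(ymax)),
    )
    points = list(polygon)
    for inside, intersect in edges:
        if not points:
            break
        points = clip_edge(points, inside, intersect)
    return points


def shoelace_area(points):
    """Absolute area of a polygon from its vertices (shoelace formula)."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x0, y0 = points[i - 1]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def area_grid(polygon, x_centers, y_centers, step):
    """Per-cell covered area of one polygon on a regular grid of cell centers."""
    half = step / 2.0
    return [
        [
            shoelace_area(clip_to_cell(polygon, xc - half, yc - half, xc + half, yc + half))
            for xc in x_centers
        ]
        for yc in y_centers
    ]


# A right triangle covering the lower-left half of a 2 x 2 block of unit cells.
triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
areas = area_grid(triangle, [0.5, 1.5], [0.5, 1.5], 1.0)
total_area = sum(sum(row) for row in areas)
```

Clipping every cell in the bounding box costs one clip per cell, which is why a fully interior cell is worth detecting cheaply: its area is simply the cell area, with no clipping needed.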

[Figures: Polygons - Binary (binary mode) and Polygons - Area (area mode)]

Large Dataset Showcase

This real-world example uses 606,667 building polygons on a 10 m Lambert-93 grid covering Paris. On the resulting 2804 x 1978 grid, the area rasterization step completes in 13.1 s on a regular laptop (the machine used to build the documentation locally).

import geopandas as gpd
import numpy as np
from rasterizer import rasterize_polygons

buildings = gpd.read_file(
    "BDT_3-5_GPKG_LAMB93_D075-ED2026-03-15.gpkg",
    layer="batiment",
    columns=[],
)

xmin, ymin, xmax, ymax = buildings.total_bounds
x = np.arange(xmin, xmax, 10.0)
y = np.arange(ymin, ymax, 10.0)

coverage = rasterize_polygons(buildings, x=x, y=y, crs=buildings.crs, mode="area")

[Figure: large dataset showcase]

The full walkthrough, including the benchmark context and reproduction script, is available in the large dataset showcase documentation.

Why rasterizer

This package provides functionality that is missing from rasterio.features, such as area-based and length-based rasterization. It is also lighter and faster than more general GDAL-based solutions because it is specialized for regular rectilinear grids instead of arbitrary raster layouts. GDAL's rasterization only burns a value into each pixel; it cannot return exact fractional area or length contributions. The common workaround is to rasterize at a much finer resolution and then downsample with averaging, which approximates the true area or length but is not exact and can be slow, e.g.:

gdal_rasterize -burn 1 -tr 1 1 -ot Float32 -of GTiff input.gpkg tmp_fine.tif
gdalwarp -tr 10 10 -r average tmp_fine.tif out_area_approx.tif
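
To show why this workaround is only approximate, the NumPy sketch below mimics it on a toy example: a disk is burned onto a fine 1-unit grid and then block-averaged by a factor of 10. The disk, the grid size, and the burn rule (a pixel is set when its center lies inside the shape) are assumptions made for this illustration, not a reproduction of GDAL's behavior.

```python
import numpy as np

FINE = 1.0     # fine pixel size, as in `-tr 1 1`
FACTOR = 10    # fine pixels per coarse cell side, as in `-tr 10 10`
N_COARSE = 4   # coarse cells per side in this toy example

# Centers of the fine pixels.
n_fine = N_COARSE * FACTOR
centers = (np.arange(n_fine) + 0.5) * FINE
xx, yy = np.meshgrid(centers, centers)

# Burn a disk of radius 15 centered at (20, 20): a pixel is set when its
# center falls inside the shape (assumed burn rule for this sketch).
radius = 15.0
fine = ((xx - 20.0) ** 2 + (yy - 20.0) ** 2 <= radius**2).astype(float)

# Block-average FACTOR x FACTOR fine pixels into each coarse cell.
coarse = fine.reshape(N_COARSE, FACTOR, N_COARSE, FACTOR).mean(axis=(1, 3))

# Compare the area implied by the coarse fractions with the true disk area.
approx_area = coarse.sum() * (FACTOR * FINE) ** 2
exact_area = np.pi * radius**2
```

Each coarse value is a multiple of 1/100, so the covered fraction is quantized by the supersampling factor, while the fine raster needs 100 times as many pixels as the target grid.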

Doing this purely in geopandas by generating one polygon per grid cell and overlaying it with the input geometry is also slow because it creates a huge number of tiny geometries, triggers expensive overlay operations, and scales poorly with grid size.

That speed-up comes with a deliberate constraint: rasterizer is built for regular, axis-aligned rectangular grids, not for arbitrary affine transforms or irregular meshes.
