Skip to main content

A Python package to rasterize GeoDataFrames

Project description

Rasterizer

PyPI - Version Documentation Status Pytest

rasterizer is a lightweight Python package that speeds up rasterization of geopandas GeoDataFrames by specializing in regular, axis-aligned rectangular grids.

Features

  • Rasterize lines into a binary (presence/absence) or length-based grid.
  • Rasterize polygons into a binary (presence/absence) or area-based grid.
  • Fast because it targets regular rectilinear grids described by 1D x and y cell-center coordinates with constant spacing.
  • Hybrid polygon rasterization for large polygon bounding boxes: exact clipping on boundary cells, faster scanline filling for interior cells.
  • Weighted rasterization: Rasterize geometries while weighting the output by a numerical column in the GeoDataFrame.
  • Works with geopandas GeoDataFrames.
  • Outputs an xarray.DataArray for easy integration with other scientific Python libraries.
  • No GDAL dependency for the rasterization algorithm itself.

For detailed usage and API documentation, please see the full documentation.

Installation

You can install the package directly from PyPI:

pip install rasterizer

Usage

Here are some examples of what you can do with rasterizer.

import geopandas as gpd
from rasterizer import rasterize_polygons

polys = gpd.read_file("polygons.gpkg")
area_raster = rasterize_polygons(polys, your_x_grid, your_y_grid, polys.crs, mode="area")

# Enable a tqdm progress bar when processing large geometry collections.
area_raster = rasterize_polygons(
    polys,
    your_x_grid,
    your_y_grid,
    polys.crs,
    mode="area",
    progress_bar=True,
)

Rasterizing Lines

You can rasterize lines in either binary or length mode.

Binary Mode Length Mode
Lines - Binary Lines - Length

Rasterizing Polygons

You can rasterize polygons in either binary or area mode.

For polygon workloads, rasterizer now uses two internal strategies. Small polygon bounding boxes are handled with exact per-cell clipping. Larger ones switch to a hybrid path that still clips boundary cells exactly, but fills interior spans with a scanline pass to reduce the amount of geometric clipping required. The resulting area and binary outputs stay exact at cell boundaries while scaling better on large polygons.

Binary Mode Area Mode
Polygons - Binary Polygons - Area

Large Dataset Showcase

This real-world example uses 606,667 building polygons on a 10 m Lambert-93 grid covering Paris. The area rasterization step completes in 13.1 s on a regular laptop used as the local documentation machine for a 2804 x 1978 grid.

import geopandas as gpd
import numpy as np
from rasterizer import rasterize_polygons

buildings = gpd.read_file(
    "BDT_3-5_GPKG_LAMB93_D075-ED2026-03-15.gpkg",
    layer="batiment",
    columns=[],
)

xmin, ymin, xmax, ymax = buildings.total_bounds
x = np.arange(xmin, xmax, 10.0)
y = np.arange(ymin, ymax, 10.0)

coverage = rasterize_polygons(buildings, x=x, y=y, crs=buildings.crs, mode="area")

Large dataset showcase

The full walkthrough, including the benchmark context and reproduction script, is available in the large dataset showcase documentation.

Why rasterizer

This package provides functionalities that are not present in rasterio.features, such as area and length-based rasterization. It is also lighter and faster than using more general GDAL-based solutions because it is specialized for regular rectilinear grids instead of arbitrary raster layouts. GDAL's rasterization only burns values per pixel; it cannot return exact fractional area or length contributions without an expensive workaround. The common workaround is to rasterize at a much finer resolution and then downsample with averaging, which approximates the true area/length but is not exact and can be slow, e.g.:

gdal_rasterize -burn 1 -tr 1 1 -ot Float32 -of GTiff input.gpkg tmp_fine.tif
gdalwarp -tr 10 10 -r average tmp_fine.tif out_area_approx.tif

Doing this purely in geopandas by generating one polygon per grid cell and overlaying it with the input geometry is also slow because it creates a huge number of tiny geometries, triggers expensive overlay operations, and scales poorly with grid size.

That speed-up comes with a deliberate constraint: rasterizer is built for regular, axis-aligned rectangular grids, not for arbitrary affine transforms or irregular meshes.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rasterizer-0.3.2.tar.gz (19.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

rasterizer-0.3.2-py3-none-any.whl (14.8 kB view details)

Uploaded Python 3

File details

Details for the file rasterizer-0.3.2.tar.gz.

File metadata

  • Download URL: rasterizer-0.3.2.tar.gz
  • Upload date:
  • Size: 19.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for rasterizer-0.3.2.tar.gz
Algorithm Hash digest
SHA256 583d2706bc29a574a349db89e2f639d7b61011cb9fbe19a22c853e7fafc87f22
MD5 5ce5b6642f2048a3b6e330e3f87c7937
BLAKE2b-256 76f0ddcf77d68b71f3938ad369a69796b339e758688a11d8e593e2f55dddb8d5

See more details on using hashes here.

File details

Details for the file rasterizer-0.3.2-py3-none-any.whl.

File metadata

  • Download URL: rasterizer-0.3.2-py3-none-any.whl
  • Upload date:
  • Size: 14.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for rasterizer-0.3.2-py3-none-any.whl
Algorithm Hash digest
SHA256 fa2474c701bd4d98704592c44c0b4a9ce75c964a682790d5b3b88dec8bea0070
MD5 ec231e65c6d1dcc72c9c2ba0b2c167ce
BLAKE2b-256 e482b2f619e70590f4e44877644f7646242ef6663a283d4b5a6b42772aa5821f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page