RaQuet - Raster data in Parquet format with QUADBIN spatial indexing
RaQuet is a specification for storing and querying raster data using Apache Parquet, a column-oriented data file format. Users of data warehouse platforms rely on the simple interoperability of Parquet files to move data and perform queries.
Documentation | Online Viewer | Specification
Overview
Each row in a RaQuet file represents a single rectangular block of data. The block's location and zoom level are given by a Web Mercator z/x/y tile, stored in the block column as a single 64-bit Quadbin cell identifier. Empty tiles can be omitted to reduce file size.
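As a rough illustration of how a z/x/y tile can be packed into one 64-bit integer, the sketch below follows the bit layout used by CARTO's open-source Quadbin libraries as understood here: a fixed header bit, a mode field, the zoom level, the interleaved x/y bits, and trailing 1-bits as padding. The names tile_to_quadbin and quadbin_to_tile are illustrative and are not part of the raquet-io package. Treat the constants as assumptions and check them against the specification before relying on them.

```python
HEADER = 0x4000000000000000  # assumed fixed header bit
MODE_CELL = 1 << 59          # assumed "cell" mode flag
MAX_ZOOM = 26                # 2 * 26 = 52 payload bits


def tile_to_quadbin(z: int, x: int, y: int) -> int:
    """Pack a Web Mercator tile into a 64-bit Quadbin-style integer."""
    if not 0 <= z <= MAX_ZOOM:
        raise ValueError("zoom must be between 0 and 26")
    if not (0 <= x < (1 << z) and 0 <= y < (1 << z)):
        raise ValueError("x and y must lie within the tile grid for this zoom")
    interleaved = 0
    for i in range(z):
        interleaved |= ((x >> i) & 1) << (2 * i)      # x bits on even positions
        interleaved |= ((y >> i) & 1) << (2 * i + 1)  # y bits on odd positions
    unused = 52 - 2 * z          # payload bits not needed at this zoom
    padding = (1 << unused) - 1  # unused low bits are filled with 1s
    return HEADER | MODE_CELL | (z << 52) | (interleaved << unused) | padding


def quadbin_to_tile(cell: int) -> tuple[int, int, int]:
    """Unpack a Quadbin-style integer back into (z, x, y)."""
    z = (cell >> 52) & 0x1F
    interleaved = (cell & ((1 << 52) - 1)) >> (52 - 2 * z)
    x = y = 0
    for i in range(z):
        x |= ((interleaved >> (2 * i)) & 1) << i
        y |= ((interleaved >> (2 * i + 1)) & 1) << i
    return z, x, y


cell = tile_to_quadbin(12, 655, 1583)
assert quadbin_to_tile(cell) == (12, 655, 1583)
```

Under this layout every valid cell has its header bit set and is therefore non-zero, which is consistent with reserving block = 0 for something other than a tile.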
Pixels are stored as binary-packed blobs in row-major order. By default, each band is stored in a separate column (band_1, band_2, etc.) with optional gzip compression. For RGB imagery, an interleaved layout stores all bands in a single pixels column, which enables lossy compression (JPEG/WebP) and yields files 10-15x smaller.
Pixel bands can be decoded via simple binary unpacking in any programming environment and converted to wire image formats like PNG or displayed directly in web visualization libraries like MapLibre.
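The "simple binary unpacking" mentioned above can be sketched with the Python standard library alone. The helper decode_band is hypothetical, and the sketch assumes a gzip-compressed, little-endian, row-major blob whose width, height, and pixel type are already known (in a real file these would come from the metadata). It builds a tiny block by hand rather than reading a real RaQuet file.

```python
import gzip
import struct


def decode_band(blob: bytes, width: int, height: int,
                fmt: str = "B", compressed: bool = True) -> list[list]:
    """Unpack one band blob into a list of pixel rows (row-major order).

    fmt is a struct type code, e.g. "B" for uint8 or "f" for float32.
    """
    raw = gzip.decompress(blob) if compressed else blob
    count = width * height
    values = struct.unpack(f"<{count}{fmt}", raw)  # "<" assumes little-endian
    return [list(values[r * width:(r + 1) * width]) for r in range(height)]


# Build a tiny 3x2 uint8 block the same way, then decode it again.
pixels = [[1, 2, 3], [4, 5, 6]]
flat = [value for row in pixels for value in row]
blob = gzip.compress(struct.pack("<6B", *flat))
decoded = decode_band(blob, width=3, height=2)
assert decoded == pixels
```

struct.unpack raises an error if the blob length does not match width * height, which gives a cheap sanity check on the assumed block size and pixel type.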
Similar to GeoParquet, RaQuet metadata is stored as a JSON object with details on coverage area, raster resolution, pixel data format, and other required information. For compatibility with data warehouses, the metadata is stored in a special “0” row of the Parquet file (block=0x00).
Installation
# Basic installation (GeoTIFF conversion)
pip install raquet-io
# With rich output for CLI
pip install "raquet-io[rich]"
# With ImageServer support
pip install "raquet-io[imageserver]"
# All features
pip install "raquet-io[all]"
Note: GDAL must be installed separately. On macOS: brew install gdal. On Ubuntu: apt install gdal-bin libgdal-dev.
CLI Usage
The raquet-io CLI provides commands for inspecting, converting, and exporting RaQuet files.
Inspect a Raquet file
# Display metadata and statistics
raquet-io inspect landcover.parquet
# With verbose output
raquet-io inspect landcover.parquet -v
Convert to Raquet
From GeoTIFF
# Basic conversion
raquet-io convert geotiff input.tif output.parquet
# With custom options
raquet-io convert geotiff input.tif output.parquet \
--resampling bilinear \
--block-size 512 \
-v
Options:
| Option | Description |
|---|---|
| --zoom-strategy | Zoom level strategy: auto, lower, upper (default: auto) |
| --resampling | Resampling algorithm: near, bilinear, cubic, etc. (default: near) |
| --block-size | Block size: 256 (default), 512, or 1024. Use 512 for fewer HTTP requests. |
| --target-size | Target size for auto zoom calculation |
| --overviews | Overview generation: auto (full pyramid) or none (native resolution only) |
| --min-zoom | Minimum zoom level for overviews (overrides auto calculation) |
| --streaming | Memory-safe two-pass conversion for large files |
| --band-layout | Band storage: sequential (default) or interleaved |
| --compression | Compression: gzip (default), jpeg, webp, or none |
| --compression-quality | Quality for lossy compression (1-100, default: 85) |
| -v, --verbose | Enable verbose output |
Lossy compression for satellite imagery:
# WebP compression (best quality/size ratio for RGB imagery)
raquet-io convert geotiff satellite.tif output.parquet \
--band-layout interleaved \
--compression webp \
--compression-quality 85
# JPEG compression (wider compatibility)
raquet-io convert geotiff satellite.tif output.parquet \
--band-layout interleaved \
--compression jpeg
# 512px blocks for fewer HTTP requests (recommended for mobile/high-latency)
raquet-io convert geotiff satellite.tif output.parquet \
--block-size 512 \
--band-layout interleaved \
--compression webp
Large file conversion:
# Skip overviews for faster conversion (native resolution only)
raquet-io convert geotiff large.tif output.parquet --overviews none
# Memory-safe streaming mode for very large files
raquet-io convert geotiff huge.tif output.parquet --streaming -v
# Limit overview pyramid to zoom 5 and above
raquet-io convert geotiff input.tif output.parquet --min-zoom 5
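The overview options describe a tile pyramid. The sketch below shows only the standard Web Mercator pyramid arithmetic, in which each tile at one zoom level covers four tiles at the next level down. The helpers overview_zooms and parent_tile are hypothetical and are not part of raquet-io; how the converter actually resamples pixels into overviews is not shown here.

```python
def overview_zooms(native_zoom: int, min_zoom: int = 0) -> list[int]:
    """Zoom levels in a pyramid, from native resolution down to min_zoom."""
    if min_zoom > native_zoom:
        raise ValueError("min_zoom cannot exceed native_zoom")
    return list(range(native_zoom, min_zoom - 1, -1))


def parent_tile(z: int, x: int, y: int) -> tuple[int, int, int]:
    """Tile one zoom level up that contains tile z/x/y."""
    if z == 0:
        raise ValueError("zoom 0 has no parent")
    return z - 1, x // 2, y // 2


# A pyramid limited to zoom 5 and above, for data whose native zoom is 8.
levels = overview_zooms(native_zoom=8, min_zoom=5)
assert levels == [8, 7, 6, 5]
```

Because each level holds at most a quarter as many tiles as the one below it, a full pyramid adds roughly a third to the native-resolution tile count, while raising the minimum zoom trims only the smallest, coarsest levels.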
From ArcGIS ImageServer
# Basic conversion
raquet-io convert imageserver https://server/arcgis/rest/services/dem/ImageServer dem.parquet
# With bounding box filter
raquet-io convert imageserver https://server/.../ImageServer output.parquet \
--bbox "-122.5,37.5,-122.0,38.0"
# With specific resolution
raquet-io convert imageserver https://server/.../ImageServer output.parquet \
--resolution 12 \
-v
Options:
| Option | Description |
|---|---|
| --token | ArcGIS authentication token |
| --bbox | Bounding box filter in WGS84: xmin,ymin,xmax,ymax |
| --block-size | Block size: 256 (default), 512, or 1024. Use 512 for fewer HTTP requests. |
| --resolution | Target QUADBIN pixel resolution (auto if not specified) |
| --no-compression | Disable gzip compression for block data |
| -v, --verbose | Enable verbose output |
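To get a feel for how a WGS84 bounding box relates to Web Mercator tiles, the following sketch applies the standard slippy-map tile formula to the bounding box used in the example above. The helper names and the choice of zoom 12 are illustrative assumptions; this is not necessarily how the converter selects blocks internally.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Web Mercator tile column/row containing a WGS84 point."""
    n = 1 << zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp to the valid grid in case the point sits on the map edge.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(bbox: str, zoom: int) -> tuple[range, range]:
    """Column and row ranges covering an "xmin,ymin,xmax,ymax" string."""
    xmin, ymin, xmax, ymax = (float(v) for v in bbox.split(","))
    x0, y0 = lonlat_to_tile(xmin, ymax, zoom)  # top-left corner
    x1, y1 = lonlat_to_tile(xmax, ymin, zoom)  # bottom-right corner
    return range(x0, x1 + 1), range(y0, y1 + 1)


xs, ys = tiles_for_bbox("-122.5,37.5,-122.0,38.0", zoom=12)
print(f"{len(xs) * len(ys)} tiles: columns {xs.start}-{xs.stop - 1}, "
      f"rows {ys.start}-{ys.stop - 1}")
```

Tile rows count downward from the north, which is why the top-left corner uses ymax and the bottom-right corner uses ymin.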
Export from Raquet
To GeoTIFF
# Export to GeoTIFF
raquet-io export geotiff input.parquet output.tif
# Include RaQuet overviews in GeoTIFF (uses pre-computed overview tiles)
raquet-io export geotiff input.parquet output.tif --overviews
# With verbose output
raquet-io export geotiff input.parquet output.tif -v
Legacy Commands
For backwards compatibility, standalone commands are also available:
geotiff2raquet input.tif output.parquet
raquet2geotiff input.parquet output.tif
Python API
from raquet import geotiff2raquet, raquet2geotiff
# Convert GeoTIFF to Raquet
geotiff2raquet.main(
    "input.tif",
    "output.parquet",
    geotiff2raquet.ZoomStrategy.AUTO,
    geotiff2raquet.ResamplingAlgorithm.NEAR,
    block_zoom=8,  # 256px blocks
    target_size=None,
)
# Convert Raquet to GeoTIFF
raquet2geotiff.main("input.parquet", "output.tif")
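The "256px blocks" comment next to block_zoom=8 suggests that block_zoom is the base-2 logarithm of the block size in pixels (2^8 = 256). Assuming that reading is correct, the block sizes accepted by the CLI would map to block_zoom values as sketched below. The helper block_size_to_zoom is hypothetical and is not part of the raquet package.

```python
def block_size_to_zoom(block_size: int) -> int:
    """Return log2(block_size) for a power-of-two block size in pixels."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a positive power of two")
    return block_size.bit_length() - 1


block_zooms = {size: block_size_to_zoom(size) for size in (256, 512, 1024)}
print(block_zooms)  # {256: 8, 512: 9, 1024: 10}
```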
ImageServer Conversion
from raquet.imageserver import imageserver_to_raquet
# Convert ImageServer to Raquet
result = imageserver_to_raquet(
    "https://server/arcgis/rest/services/dem/ImageServer",
    "output.parquet",
    bbox=(-122.5, 37.5, -122.0, 38.0),  # Optional WGS84 bounds
    block_size=256,
    target_resolution=12,  # Optional, auto-calculated if not specified
)
print(f"Created {result['num_blocks']} blocks with {result['num_bands']} bands")
Testing
# Run unit tests (skips integration tests)
python -m pytest -m "not integration"
Querying with DuckDB
Raquet files can be queried directly with DuckDB using the DuckDB Raquet Extension:
INSTALL raquet FROM community;
LOAD raquet;
-- Load Raquet file (automatically excludes metadata row)
SELECT * FROM read_raquet('raster.parquet') LIMIT 10;
-- Get metadata
SELECT metadata FROM read_raquet_metadata('raster.parquet');
-- Query specific tiles using QUADBIN
SELECT block, band_1
FROM read_raquet('raster.parquet')
WHERE block = quadbin_from_tile(x, y, z);
RaQuet files also work as standard Parquet without the extension:
-- Without extension (manual metadata filtering)
SELECT * FROM read_parquet('raster.parquet') WHERE block != 0 LIMIT 10;
SELECT metadata FROM read_parquet('raster.parquet') WHERE block = 0;
Online Viewer
Try the RaQuet Viewer - a client-side viewer powered by DuckDB-WASM that runs entirely in your browser. Load any publicly accessible RaQuet file and explore it interactively.
Performance Tips
For optimal remote query performance:
- Block sorting: Blocks are automatically sorted by QUADBIN ID during conversion, enabling Parquet row group pruning
- Row group size: Use smaller row groups (default: 200) for cloud storage access
- Zoom splitting: For large datasets, use raquet-io split-zoom to create per-zoom-level files
# Convert with optimized settings for remote access
raquet-io convert geotiff input.tif output.parquet --row-group-size 200
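Why sorting helps can be shown with a small simulation. Parquet readers can skip a row group when the value being searched for falls outside that group's min/max statistics. The sketch below uses plain integers as stand-ins for QUADBIN IDs and hypothetical helper names; it models the pruning decision only and does not read a Parquet file.

```python
import random


def row_group_stats(block_ids: list[int], row_group_size: int = 200):
    """Return the (min, max) block ID of each consecutive row group."""
    chunks = (
        block_ids[i:i + row_group_size]
        for i in range(0, len(block_ids), row_group_size)
    )
    return [(min(chunk), max(chunk)) for chunk in chunks]


def groups_to_scan(stats, target: int) -> list[int]:
    """Indexes of row groups whose min/max range could contain target."""
    return [i for i, (lo, hi) in enumerate(stats) if lo <= target <= hi]


ids = list(range(1, 2001))  # stand-in for 2,000 sorted block IDs
shuffled = ids[:]
random.Random(0).shuffle(shuffled)  # same IDs in arbitrary order

sorted_hits = groups_to_scan(row_group_stats(ids), target=1234)
shuffled_hits = groups_to_scan(row_group_stats(shuffled), target=1234)
print(len(sorted_hits), "row group(s) to scan when sorted,",
      len(shuffled_hits), "when unsorted")
```

With sorted IDs the row group ranges do not overlap, so a lookup touches a single row group. With unsorted IDs most row groups span nearly the whole ID range and have to be fetched.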
See the full documentation for more details.
Specification
See format-specs/raquet.md for the full specification.
Examples
See examples/example_metadata.json for an example of the metadata.
See examples/example_data.parquet for an example of the data.
License
See LICENSE for the license.