RaQuet - Raster data in Parquet format with QUADBIN spatial indexing

RaQuet

RaQuet is a specification for storing and querying raster data using Apache Parquet, a column-oriented data file format. Users of data warehouse platforms rely on the simple interoperability of Parquet files to move data and perform queries.

Documentation | Online Viewer | Specification

Overview

Each row in a RaQuet file represents a single rectangular block of data. Location and zoom are given by a Web Mercator z/x/y tile identifier, stored in the block column as a single 64-bit Quadbin cell identifier. Empty tiles can be omitted to reduce file size.
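
To make the block identifier concrete, here is a minimal sketch of how a z/x/y tile could be packed into a 64-bit Quadbin cell. The constants and bit layout reflect our understanding of the Quadbin scheme and are assumptions here, not something this README defines; the function names are illustrative. Check the specification or the reference quadbin library before relying on it.

```python
# Assumed Quadbin layout: header bits, a "cell" mode bit, 5 bits of zoom,
# then interleaved x/y bits, padded with trailing one-bits.
HEADER = 0x4000000000000000
MODE_CELL = 1 << 59
FOOTER = 0xFFFFFFFFFFFFF  # 52 one-bits


def _spread_bits(v: int) -> int:
    """Move bit i of a 32-bit value to bit 2*i (Morton-style spreading)."""
    v = (v | (v << 16)) & 0x0000FFFF0000FFFF
    v = (v | (v << 8)) & 0x00FF00FF00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F0F0F0F0F
    v = (v | (v << 2)) & 0x3333333333333333
    v = (v | (v << 1)) & 0x5555555555555555
    return v


def tile_to_cell(x: int, y: int, z: int) -> int:
    """Pack a Web Mercator tile (x, y, z) into a 64-bit Quadbin cell."""
    if not 0 <= z <= 26:
        raise ValueError("zoom must be between 0 and 26")
    if not (0 <= x < (1 << z) and 0 <= y < (1 << z)):
        raise ValueError("x and y must lie inside the tile grid for this zoom")
    xs = _spread_bits(x << (32 - z))
    ys = _spread_bits(y << (32 - z))
    interleaved = (xs | (ys << 1)) >> 12
    return HEADER | MODE_CELL | (z << 52) | interleaved | (FOOTER >> (2 * z))


def cell_zoom(cell: int) -> int:
    """Read the zoom level back out of a cell."""
    return (cell >> 52) & 0x1F


print(hex(tile_to_cell(0, 0, 0)))
print(cell_zoom(tile_to_cell(9, 8, 4)))
```

Because the zoom and the interleaved coordinates share one integer, a single equality or range filter on the block column is enough to select a tile.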

Raster pixels are stored as binary-packed blobs in row-major order, one column per band, named band_1, band_2, etc. Pixel values may be integers or floating-point numbers. The blobs can optionally be compressed with gzip to further reduce file size.
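
As an illustration of what a band blob might look like, the sketch below packs a small block in row-major order and reads it back using only the Python standard library. The little-endian byte order and the helper names are assumptions made for this example; the metadata in a real file states the actual pixel format.

```python
import gzip
import struct


def encode_band(rows, fmt="B", compress=True):
    """Pack a 2D list of pixel values row by row into a binary blob.

    fmt is a struct type code, e.g. "B" for uint8 or "f" for float32.
    Little-endian byte order ("<") is an assumption of this sketch.
    """
    flat = [value for row in rows for value in row]
    blob = struct.pack(f"<{len(flat)}{fmt}", *flat)
    return gzip.compress(blob) if compress else blob


def decode_band(blob, width, height, fmt="B", compressed=True):
    """Reverse encode_band: decompress, unpack, and rebuild the rows."""
    raw = gzip.decompress(blob) if compressed else blob
    flat = struct.unpack(f"<{width * height}{fmt}", raw)
    return [list(flat[r * width:(r + 1) * width]) for r in range(height)]


block = [[0, 1, 2], [3, 4, 5]]  # 3 pixels wide, 2 pixels high
blob = encode_band(block)
print(decode_band(blob, width=3, height=2))
```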

Pixel bands can be decoded via simple binary unpacking in any programming environment and converted to wire image formats like PNG or displayed directly in web visualization libraries like MapLibre.
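
For example, an unpacked 8-bit band can be wrapped as a grayscale PNG with nothing more than the standard library. This is a minimal sketch, not the project's own encoder; the function names are hypothetical, and it assumes one byte per pixel in row-major order.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(tag: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, tag, data, CRC over tag and data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def gray_png(pixels: bytes, width: int, height: int) -> bytes:
    """Encode row-major 8-bit pixels as a grayscale PNG."""
    if len(pixels) != width * height:
        raise ValueError("pixel buffer does not match width * height")
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # default compression, filter, and interlace methods.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    scanlines = b"".join(
        b"\x00" + pixels[r * width:(r + 1) * width] for r in range(height)
    )
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(scanlines))
        + _chunk(b"IEND", b"")
    )


png = gray_png(bytes([0, 85, 170, 255]), width=2, height=2)
print(len(png), "bytes")
```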

Similar to GeoParquet, RaQuet metadata is stored as a JSON object with details on coverage area, raster resolution, pixel data format, and other required information. For compatibility with data warehouses, the metadata is stored in the Parquet file itself, in a special “0” row (block=0x00).
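
A reader therefore has to separate the metadata row from the data rows. The sketch below shows the idea on plain Python dictionaries standing in for Parquet rows. The `metadata` column name follows the DuckDB query later in this document; the JSON key is hypothetical and only there to have something to parse.

```python
import json

# Stand-in rows; a real reader would get these from a Parquet library.
rows = [
    {"block": 0, "metadata": json.dumps({"hypothetical_key": "value"}), "band_1": None},
    {"block": 0x480FFFFFFFFFFFFF, "metadata": None, "band_1": b"\x00\x01\x02\x03"},
]


def split_metadata(rows):
    """Return (metadata dict, data rows), treating block == 0 as the metadata row."""
    metadata = None
    data_rows = []
    for row in rows:
        if row["block"] == 0:
            metadata = json.loads(row["metadata"])
        else:
            data_rows.append(row)
    return metadata, data_rows


metadata, data_rows = split_metadata(rows)
print(metadata, len(data_rows))
```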

Installation

# Basic installation (GeoTIFF conversion)
pip install raquet-io

# With rich output for CLI
pip install "raquet-io[rich]"

# With ImageServer support
pip install "raquet-io[imageserver]"

# All features
pip install "raquet-io[all]"

Note: GDAL must be installed separately. On macOS: brew install gdal. On Ubuntu: apt install gdal-bin libgdal-dev.

CLI Usage

The raquet-io CLI provides commands for inspecting, converting, and exporting RaQuet files.

Inspect a Raquet file

# Display metadata and statistics
raquet-io inspect landcover.parquet

# With verbose output
raquet-io inspect landcover.parquet -v

Convert to Raquet

From GeoTIFF

# Basic conversion
raquet-io convert geotiff input.tif output.parquet

# With custom options
raquet-io convert geotiff input.tif output.parquet \
  --resampling bilinear \
  --block-size 512 \
  -v

Options:

Option            Description
--zoom-strategy   Zoom level strategy: auto, lower, upper (default: auto)
--resampling      Resampling algorithm: near, bilinear, cubic, etc. (default: near)
--block-size      Block size in pixels (default: 256)
--target-size     Target size for auto zoom calculation
--overviews       Overview generation: auto (full pyramid) or none (native resolution only)
--min-zoom        Minimum zoom level for overviews (overrides auto calculation)
--streaming       Memory-safe two-pass conversion for large files
-v, --verbose     Enable verbose output

Large file conversion:

# Skip overviews for faster conversion (native resolution only)
raquet-io convert geotiff large.tif output.parquet --overviews none

# Memory-safe streaming mode for very large files
raquet-io convert geotiff huge.tif output.parquet --streaming -v

# Limit overview pyramid to zoom 5 and above
raquet-io convert geotiff input.tif output.parquet --min-zoom 5

From ArcGIS ImageServer

# Basic conversion
raquet-io convert imageserver https://server/arcgis/rest/services/dem/ImageServer dem.parquet

# With bounding box filter
raquet-io convert imageserver https://server/.../ImageServer output.parquet \
  --bbox "-122.5,37.5,-122.0,38.0"

# With specific resolution
raquet-io convert imageserver https://server/.../ImageServer output.parquet \
  --resolution 12 \
  -v

Options:

Option             Description
--token            ArcGIS authentication token
--bbox             Bounding box filter in WGS84: xmin,ymin,xmax,ymax
--block-size       Block size in pixels (default: 256)
--resolution       Target QUADBIN pixel resolution (auto if not specified)
--no-compression   Disable gzip compression for block data
-v, --verbose      Enable verbose output
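
To see what a WGS84 bounding box means in tile terms, the sketch below converts a box into the range of Web Mercator tiles it touches at a given zoom, using the standard slippy-map tile formulas. It illustrates the geometry only and is not taken from the raquet-io implementation. The function names are hypothetical, and how `--resolution` maps to a tile zoom is not covered here.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int):
    """Return the (x, y) Web Mercator tile containing a WGS84 point."""
    n = 1 << zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    y = math.floor((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the east or south edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def bbox_to_tile_range(bbox, zoom: int):
    """Return (x_min, y_min, x_max, y_max) in tile units for xmin,ymin,xmax,ymax."""
    xmin, ymin, xmax, ymax = bbox
    x0, y1 = lonlat_to_tile(xmin, ymin, zoom)  # south-west corner has the larger y
    x1, y0 = lonlat_to_tile(xmax, ymax, zoom)  # north-east corner has the smaller y
    return x0, y0, x1, y1


print(bbox_to_tile_range((-122.5, 37.5, -122.0, 38.0), zoom=12))
```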

Export from Raquet

To GeoTIFF

# Export to GeoTIFF
raquet-io export geotiff input.parquet output.tif

# Include RaQuet overviews in GeoTIFF (uses pre-computed overview tiles)
raquet-io export geotiff input.parquet output.tif --overviews

# With verbose output
raquet-io export geotiff input.parquet output.tif -v

Legacy Commands

For backwards compatibility, standalone commands are also available:

geotiff2raquet input.tif output.parquet
raquet2geotiff input.parquet output.tif

Python API

from raquet import geotiff2raquet, raquet2geotiff

# Convert GeoTIFF to Raquet
geotiff2raquet.main(
    "input.tif",
    "output.parquet",
    geotiff2raquet.ZoomStrategy.AUTO,
    geotiff2raquet.ResamplingAlgorithm.NEAR,
    block_zoom=8,  # 256px blocks
    target_size=None,
)

# Convert Raquet to GeoTIFF
raquet2geotiff.main("input.parquet", "output.tif")
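
The comment on `block_zoom=8` above suggests that this argument is the base-2 logarithm of the block edge length in pixels. Assuming that reading is right, a small hypothetical helper can translate the CLI's `--block-size` values into a `block_zoom`:

```python
def block_zoom_for_size(block_size: int) -> int:
    """Return log2(block_size), assuming block sizes are powers of two."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a positive power of two")
    return block_size.bit_length() - 1


for size in (256, 512):
    print(size, "px ->", "block_zoom", block_zoom_for_size(size))
```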

ImageServer Conversion

from raquet.imageserver import imageserver_to_raquet

# Convert ImageServer to Raquet
result = imageserver_to_raquet(
    "https://server/arcgis/rest/services/dem/ImageServer",
    "output.parquet",
    bbox=(-122.5, 37.5, -122.0, 38.0),  # Optional WGS84 bounds
    block_size=256,
    target_resolution=12,  # Optional, auto-calculated if not specified
)

print(f"Created {result['num_blocks']} blocks with {result['num_bands']} bands")

Querying with DuckDB

Raquet files can be queried directly with DuckDB:

-- Load Raquet file
SELECT * FROM read_parquet('raster.parquet') WHERE block != 0 LIMIT 10;

-- Get metadata
SELECT metadata FROM read_parquet('raster.parquet') WHERE block = 0;

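-- Query a specific tile by its QUADBIN ID
-- (replace x, y, z with tile coordinates; quadbin_from_tile must be available in your DuckDB session)
SELECT block, band_1
FROM read_parquet('raster.parquet')
WHERE block = quadbin_from_tile(x, y, z);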

Online Viewer

Try the RaQuet Viewer - a client-side viewer powered by DuckDB-WASM that runs entirely in your browser. Load any publicly accessible RaQuet file and explore it interactively.

Performance Tips

For optimal remote query performance:

  1. Block sorting: Blocks are automatically sorted by QUADBIN ID during conversion, enabling Parquet row group pruning
  2. Row group size: Use smaller row groups (default: 200) for cloud storage access
  3. Zoom splitting: For large datasets, use raquet-io split-zoom to create per-zoom-level files

# Convert with optimized settings for remote access
raquet-io convert geotiff input.tif output.parquet --row-group-size 200
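
The effect of block sorting on row group pruning can be simulated without Parquet at all. The sketch below is a simplified model, not how any particular Parquet reader is implemented. It chunks block IDs into row groups of 200 and keeps a min/max per group, then counts how many groups a lookup must touch. The IDs are small stand-in integers rather than real Quadbin cells.

```python
def row_group_stats(block_ids, row_group_size=200):
    """Return a (min, max) pair for each consecutive chunk of block IDs."""
    stats = []
    for start in range(0, len(block_ids), row_group_size):
        chunk = block_ids[start:start + row_group_size]
        stats.append((min(chunk), max(chunk)))
    return stats


def groups_to_scan(stats, target):
    """Indices of row groups whose min/max range could contain the target."""
    return [i for i, (low, high) in enumerate(stats) if low <= target <= high]


ids = list(range(1, 1001))                          # stand-in block IDs
scattered = [(i * 7) % 1000 + 1 for i in range(1000)]  # same IDs, unsorted

sorted_hits = groups_to_scan(row_group_stats(sorted(ids)), target=450)
scattered_hits = groups_to_scan(row_group_stats(scattered), target=450)
print("sorted:", len(sorted_hits), "group(s); unsorted:", len(scattered_hits), "group(s)")
```

With sorted IDs each row group covers a narrow, non-overlapping range, so a reader can skip most of the file based on the statistics alone.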

See the full documentation for more details.

Specification

See format-specs/raquet.md for the full specification.

Examples

See examples/example_metadata.json for an example of the metadata.

See examples/example_data.parquet for an example of the data.

License

See LICENSE for the license.
