Experimental CLI tool for converting TIFF raster data to FLAC format
FLAC-Raster: Experimental Raster to FLAC Converter
An experimental CLI tool that converts TIFF raster data files into FLAC audio format while preserving geospatial metadata, including CRS and bounds. This proof-of-concept explores using FLAC's lossless compression for geospatial data storage and adds HTTP range streaming for efficient geospatial data access: "Zarr for geospatial data using audio compression".
NEW: Lazy Loading & HTTP Range Streaming
FLAC-Raster now supports lazy loading with HTTP range request streaming - like "Zarr for geospatial data" but using audio compression!
- Lazy loading: Load only metadata first (1MB), stream tiles on-demand
- Web-optimized: Upload FLAC files to any HTTP server, query by bbox with range requests
- Spatial indexing: Each tile has bbox metadata for efficient spatial queries
- Streaming: Download only the data you need, not entire files (up to 98.8% bandwidth savings for small-area queries)
- Precise: Query specific geographic areas with pixel-perfect accuracy
- HTTP compatible: Works with standard web servers, CDNs, and browsers
- URL support: Query remote FLAC files directly via HTTPS URLs
Features
- Bidirectional conversion: TIFF → FLAC and FLAC → TIFF
- Complete metadata preservation: CRS, bounds, transform, data type, nodata values
- Embedded metadata: All geospatial metadata stored directly in FLAC files (no sidecar files!)
- Spatial tiling: Convert rasters to tiled FLAC with bbox metadata per tile
- HTTP range streaming: Query and stream data by bounding box (60% to 98.8% bandwidth savings in the examples below, depending on query size)
- Exceptional compression: 7-15× file size reduction while maintaining lossless quality
- Intelligent audio parameters: Automatically selects sample rate and bit depth based on raster properties
- Multi-band support: Seamlessly handles multi-band rasters (RGB, multispectral) as multi-channel audio
- Lossless compression: Perfect reconstruction verified - no data loss
- FLAC chunking: Uses FLAC's frame-based compression (4096 samples/frame)
- Comprehensive logging: Verbose mode with detailed progress tracking
- Colorful CLI: Built with Typer and Rich for an intuitive experience
Installation
Prerequisites
First, install pixi:
# Install pixi (cross-platform package manager)
curl -fsSL https://pixi.sh/install.sh | bash
# or via conda: conda install -c conda-forge pixi
Clone and Setup
git clone https://github.com/Youssef-Harby/flac-raster.git
cd flac-raster
pixi install # Install all dependencies
Install the CLI tool
# For regular use:
pixi run pip install .
# For development (editable install):
pixi run pip install -e .
Alternative: Direct pip installation
pip install rasterio numpy typer rich tqdm pyflac mutagen
# Or install from PyPI (when published):
pip install flac-raster
Usage
Basic Commands
After installation, you can use the CLI directly:
- Convert TIFF to FLAC:
  flac-raster convert input.tif -o output.flac
- Convert FLAC back to TIFF:
  flac-raster convert input.flac -o output.tif
- Get file information:
  flac-raster info file.tif
  flac-raster info file.flac
- Compare two TIFF files:
  flac-raster compare original.tif reconstructed.tif
Spatial Tiling & HTTP Range Streaming
- Create spatial FLAC with tiling:
  # Enable spatial tiling with 512x512 tiles (default)
  flac-raster convert input.tif --spatial -o spatial.flac
  # Custom tile size (256x256)
  flac-raster convert input.tif --spatial --tile-size 256 -o spatial.flac
- Query spatial FLAC by bounding box:
  # Query local file
  flac-raster query spatial.flac --bbox "xmin,ymin,xmax,ymax"
  # Query remote file (lazy loading!)
  flac-raster query https://example.com/data.flac --bbox "34.1,28.6,34.3,28.8"
  # Example with real coordinates
  flac-raster query spatial.flac --bbox "-105.3,40.3,-105.1,40.5"
- View spatial index information:
  flac-raster spatial-info spatial.flac
Alternative: Use python main.py if you haven't installed the package:
python main.py convert input.tif # Direct script usage
Options
Convert command:
- --output, -o: Specify output file path (auto-generates if not provided)
- --compression, -c: FLAC compression level 0-8 (default: 5)
- --force, -f: Overwrite existing output files
- --verbose, -v: Enable verbose logging for detailed progress
Compare command:
- --show-bands/--no-bands: Show per-band statistics (default: True)
- --export, -e: Export comparison results to JSON file
- --help: Show help message
Example Workflow
# Create sample data
python examples/create_test_data.py
# Convert DEM to FLAC
flac-raster convert test_data/sample_dem.tif -v
# Check the FLAC file info
flac-raster info test_data/sample_dem.flac
# Convert back to TIFF
flac-raster convert test_data/sample_dem.flac -o test_data/dem_reconstructed.tif
# Compare original and reconstructed
flac-raster compare test_data/sample_dem.tif test_data/dem_reconstructed.tif
# Export comparison to JSON
flac-raster compare test_data/sample_dem.tif test_data/dem_reconstructed.tif --export comparison.json
# Test with multi-band data
flac-raster convert test_data/sample_rgb.tif
flac-raster convert test_data/sample_rgb.flac -o test_data/rgb_reconstructed.tif
flac-raster compare test_data/sample_rgb.tif test_data/rgb_reconstructed.tif
# Open in QGIS to verify
# The reconstructed files should be viewable in QGIS with all metadata intact
How It Works
TIFF to FLAC Conversion
- Read raster data and extract all metadata (CRS, bounds, transform, etc.)
- Spatial tiling (if enabled): Divide raster into configurable tile sizes
- Calculate audio parameters:
  - Sample rate: Based on image resolution (44.1kHz to 192kHz)
  - Bit depth: Matches the raster's bit depth (16 or 24-bit, minimum 16-bit due to FLAC decoder limitations)
- Normalize data to audio range (-1 to 1)
- Reshape data: Bands become audio channels, pixels become samples
  - Single-band → Mono audio
  - Multi-band (RGB, multispectral) → Multi-channel audio
- Encode to FLAC with configurable compression
- Embed metadata directly in FLAC using VORBIS_COMMENT blocks
- Generate spatial index with bbox and byte range information for each tile
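The tiling and bbox-indexing steps above can be illustrated with a small sketch. It assumes a north-up raster described by a top-left origin and a pixel size; the `Tile` class and `make_tiles` function are hypothetical names for illustration, not the package's API, and the tool's actual tile layout may differ.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    col_off: int   # pixel column of the tile's left edge
    row_off: int   # pixel row of the tile's top edge
    width: int     # tile width in pixels (edge tiles may be narrower)
    height: int    # tile height in pixels (edge tiles may be shorter)
    bbox: tuple    # (xmin, ymin, xmax, ymax) in map units


def make_tiles(width, height, tile_size, x_origin, y_origin, pixel_w, pixel_h):
    """Split a north-up raster into tiles and compute each tile's bbox.

    (x_origin, y_origin) is the top-left corner of the raster. pixel_h is
    given as a positive number and subtracted as rows go down.
    """
    tiles = []
    for row_off in range(0, height, tile_size):
        for col_off in range(0, width, tile_size):
            w = min(tile_size, width - col_off)
            h = min(tile_size, height - row_off)
            xmin = x_origin + col_off * pixel_w
            xmax = x_origin + (col_off + w) * pixel_w
            ymax = y_origin - row_off * pixel_h
            ymin = y_origin - (row_off + h) * pixel_h
            tiles.append(Tile(col_off, row_off, w, h, (xmin, ymin, xmax, ymax)))
    return tiles


# A 1201x1201 raster with 512-pixel tiles yields a 3x3 grid.
tiles = make_tiles(1201, 1201, 512, x_origin=0.0, y_origin=1201.0,
                   pixel_w=1.0, pixel_h=1.0)
print(len(tiles))  # 9
```

Edge tiles are clipped to the raster extent, so the last column and row are 177 pixels wide here rather than 512.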
FLAC to TIFF Conversion
- Decode FLAC file and extract audio samples
- Load metadata from embedded FLAC metadata (with JSON sidecar fallback)
- Reconstruct spatial index for tiled data
- Reshape audio back to raster dimensions
  - Mono → Single-band raster
  - Multi-channel → Multi-band raster
- Denormalize to original data range
- Write GeoTIFF with all original metadata preserved
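The normalize and denormalize steps can be sketched as a linear min/max rescaling. This is a minimal illustration under that assumption: the function names are hypothetical, and the tool's actual scaling, including quantization to 16 or 24-bit samples, may differ.

```python
def normalize(values, vmin, vmax):
    """Map raster values linearly from [vmin, vmax] to [-1.0, 1.0]."""
    span = vmax - vmin
    if span == 0:
        # Constant raster: every pixel maps to the middle of the audio range.
        return [0.0 for _ in values]
    return [2.0 * (v - vmin) / span - 1.0 for v in values]


def denormalize(samples, vmin, vmax):
    """Invert normalize() and round back to integer pixel values."""
    span = vmax - vmin
    return [round((s + 1.0) / 2.0 * span + vmin) for s in samples]


pixels = [-32768, 0, 1200, 32767]          # int16-style elevation values
samples = normalize(pixels, -32768, 32767)
restored = denormalize(samples, -32768, 32767)
print(restored == pixels)
```

The stored min/max values are what make the inverse possible, which is why they appear in the preserved metadata.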
Metadata Preservation
The tool preserves all geospatial metadata directly embedded in FLAC files:
- Width and height dimensions
- Number of bands
- Data type (uint8, int16, float32, etc.)
- Coordinate Reference System (CRS)
- Geospatial transform (affine transformation matrix)
- Bounding box coordinates
- Original data min/max values
- NoData values
- Spatial index: Compressed tile bbox and byte range information
- Original driver information
Embedded Metadata Format
Metadata is stored in FLAC VORBIS_COMMENT blocks:
GEOSPATIAL_CRS=EPSG:4326
GEOSPATIAL_WIDTH=1201
GEOSPATIAL_HEIGHT=1201
GEOSPATIAL_SPATIAL_INDEX=<base64(gzip(spatial_index_json))>
...
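The `base64(gzip(json))` encoding shown for `GEOSPATIAL_SPATIAL_INDEX` can be reproduced with the Python standard library. The sketch below is illustrative only: the layout of the index dictionary (`tiles`, `bbox`, `start`, `end`) and the helper names are assumptions, not the package's actual schema.

```python
import base64
import gzip
import json


def encode_spatial_index(index):
    """Serialize an index dict as base64(gzip(json)) text."""
    raw = json.dumps(index, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def decode_spatial_index(value):
    """Reverse encode_spatial_index()."""
    return json.loads(gzip.decompress(base64.b64decode(value)).decode("utf-8"))


def to_comments(metadata, index):
    """Build KEY=value pairs in the GEOSPATIAL_* naming shown above."""
    comments = {f"GEOSPATIAL_{k.upper()}": str(v) for k, v in metadata.items()}
    comments["GEOSPATIAL_SPATIAL_INDEX"] = encode_spatial_index(index)
    return comments


# Hypothetical index layout: one entry per tile with bbox and byte range.
index = {"tiles": [{"bbox": [0.0, 0.0, 1.0, 1.0], "start": 48152, "end": 73513}]}
comments = to_comments({"crs": "EPSG:4326", "width": 1201, "height": 1201}, index)
print(comments["GEOSPATIAL_CRS"])  # EPSG:4326
```

Base64 keeps the compressed bytes safe inside a text comment field, at the cost of roughly one third more characters than the raw gzip output.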
Lazy Loading & HTTP Range Streaming for Web GIS
Concept: "Zarr for Geospatial Data using Audio Compression"
The lazy loading feature transforms FLAC-Raster into a web-native geospatial format that enables efficient HTTP range request streaming:
FLAC URL: https://cdn.example.com/elevation.flac
↓
Lazy Load: Download first 1MB for metadata only
↓
Query Spatial Index: Find intersecting tiles for bbox
↓
HTTP Range Request: bytes=48152-73513,87850-113211
↓
Smart Download: Only 76KB instead of 189KB (60% savings!)
↓
Decode FLAC: Get pixels for visible area only
Lazy Loading Workflow
- Metadata First: Download only 1MB to read embedded spatial index
- On-Demand Streaming: Query specific geographic areas
- Precise Downloads: HTTP Range requests for intersecting tiles only
- Progressive Loading: Cache tiles for repeated access
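The query step of this workflow, finding the tiles that intersect a bbox and turning them into a `Range` header, can be sketched as follows. The index layout and function names are assumptions for illustration; only the header format and the example byte offsets come from the flow above.

```python
def intersects(a, b):
    """True when two (xmin, ymin, xmax, ymax) boxes overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def ranges_for_bbox(index, bbox):
    """Return sorted (start, end) byte ranges of tiles touching bbox."""
    hits = [t for t in index["tiles"] if intersects(t["bbox"], bbox)]
    return sorted((t["start"], t["end"]) for t in hits)


def range_header(ranges):
    """Format inclusive byte ranges as an HTTP Range header value."""
    return "bytes=" + ",".join(f"{start}-{end}" for start, end in ranges)


# Hypothetical 2x2 tile index; the offsets are made up around the example above.
index = {"tiles": [
    {"bbox": (0, 1, 1, 2), "start": 48152, "end": 73513},
    {"bbox": (1, 1, 2, 2), "start": 73514, "end": 87849},
    {"bbox": (0, 0, 1, 1), "start": 87850, "end": 113211},
    {"bbox": (1, 0, 2, 1), "start": 113212, "end": 138573},
]}
header = range_header(ranges_for_bbox(index, (0.2, 0.2, 0.8, 1.8)))
print(header)  # bytes=48152-73513,87850-113211
```

A query that covers only the left column of tiles skips the two right-hand tiles entirely, which is where the bandwidth savings come from.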
Use Cases
- Interactive Web Maps
  - Progressive loading as users pan/zoom
  - Only download visible area data
  - Works with any HTTP server/CDN
- Cloud-Native GIS
  - Stream large rasters without specialized servers
  - Compatible with S3, CloudFront, etc.
  - No need for complex tiling servers
- Bandwidth-Constrained Applications
  - Mobile mapping apps
  - Satellite/field data collection
  - IoT sensor networks
Web Server Integration
// JavaScript lazy loading client example.
// extractEmbeddedMetadata, calculateRanges, and decodeFLACTiles are
// placeholder helpers that the application must supply.
async function loadRasterData(bbox) {
  const flacUrl = '/data/elevation.flac';
  // 1. Lazy load: get metadata only (first 1MB)
  const metadataResponse = await fetch(flacUrl, {
    headers: { 'Range': 'bytes=0-1048575' }
  });
  // Read the response body before parsing the embedded metadata
  const spatialIndex = extractEmbeddedMetadata(await metadataResponse.arrayBuffer());
  // 2. Find byte ranges for bbox
  const ranges = calculateRanges(bbox, spatialIndex);
  // 3. Stream only needed tiles via HTTP ranges
  //    (a multi-range response arrives as multipart/byteranges)
  const rangeHeader = ranges.map(r => `${r.start}-${r.end}`).join(',');
  const dataResponse = await fetch(flacUrl, {
    headers: { 'Range': `bytes=${rangeHeader}` }
  });
  // 4. Decode FLAC data for bbox (dataResponse.body is a ReadableStream)
  return decodeFLACTiles(dataResponse.body, bbox);
}
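For a Python client, the same range request can be sketched with the standard library. This is a hedged example rather than the package's own client: `build_range_request` and `fetch_range` are hypothetical helpers, and the sketch fetches one range per request because some servers ignore multi-range headers or answer them with a `multipart/byteranges` body that needs extra parsing.

```python
import urllib.request


def build_range_request(url, start, end):
    """Create (but do not send) a GET request for an inclusive byte range."""
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})


def fetch_range(url, start, end, timeout=30):
    """Download one byte range; fail if the server did not honor it."""
    req = build_range_request(url, start, end)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        if resp.status != 206:  # 206 Partial Content signals a ranged reply
            raise RuntimeError(f"Range not honored (status {resp.status})")
        return resp.read()


# Building the request does not touch the network.
req = build_range_request("https://cdn.example.com/elevation.flac", 0, 1048575)
print(req.get_header("Range"))  # bytes=0-1048575
```

Checking for status 206 guards against servers that silently return the whole file with status 200.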
Technical Details
- FLAC frames: Utilizes FLAC's frame structure for efficient chunking (4096 samples/frame)
- Spatial tiling: Each tile becomes a separate FLAC stream with bbox metadata
- HTTP byte ranges: Precise byte offsets enable partial downloads
- Embedded metadata: All geospatial info stored in FLAC VORBIS_COMMENT blocks
- Multi-band support: Each raster band becomes an audio channel (up to 8 channels supported by FLAC)
- Lossless conversion: Data is normalized but the process is completely reversible
- Exceptional compression: Leverages FLAC's compression algorithms (7-15× size reduction)
- Self-contained files: No external dependencies or sidecar files required
- Data type mapping:
  - uint8 → 16-bit FLAC (due to decoder limitations)
  - int16/uint16 → 16-bit FLAC
  - int32/uint32/float32 → 24-bit FLAC
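The data type mapping and the 8-channel limit listed above can be written as a small lookup. The table values come from this section; the `audio_layout` function and its return shape are hypothetical and only illustrate how the two rules combine.

```python
# Raster dtype -> FLAC bits per sample, as listed in the mapping above.
BIT_DEPTH_BY_DTYPE = {
    "uint8": 16,
    "int16": 16,
    "uint16": 16,
    "int32": 24,
    "uint32": 24,
    "float32": 24,
}
MAX_CHANNELS = 8  # FLAC channel limit, one channel per raster band


def audio_layout(dtype, band_count):
    """Pick bits per sample and channel count for a raster."""
    if dtype not in BIT_DEPTH_BY_DTYPE:
        raise ValueError(f"unsupported dtype: {dtype}")
    if not 1 <= band_count <= MAX_CHANNELS:
        raise ValueError(f"band count must be 1..{MAX_CHANNELS}, got {band_count}")
    return {"bits_per_sample": BIT_DEPTH_BY_DTYPE[dtype], "channels": band_count}


print(audio_layout("uint8", 3))  # {'bits_per_sample': 16, 'channels': 3}
```

Sample rate selection is left out because the exact rule the tool uses is not stated here.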
Performance Examples
Results from the testing documented in report.md:
Compression Results
- DEM file (1201×1201, int16): 2.8 MB → 185 KB FLAC (15.25× compression)
- Multispectral (200×200×6, uint8): 235 KB → 32 KB FLAC (7.38× compression)
- RGB (256×256×3, uint8): 193 KB → 27 KB FLAC (7.26× compression)
HTTP Range Streaming Efficiency
- Small area queries: Up to 98.8% bandwidth savings vs full download
- Geographic precision: Query exact areas with pixel-perfect accuracy
- Optimized ranges: Smart merging of contiguous tiles reduces HTTP requests
All conversions are perfectly lossless (verified with numpy array comparison)
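The range merging and bandwidth savings mentioned under HTTP Range Streaming Efficiency can be sketched as below. This is one plausible approach, not necessarily the tool's algorithm; both function names are hypothetical.

```python
def merge_ranges(ranges):
    """Merge inclusive byte ranges that touch or overlap."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Adjacent or overlapping: extend the previous range.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def bandwidth_savings(ranges, file_size):
    """Fraction of the file that does NOT need to be downloaded."""
    fetched = sum(end - start + 1 for start, end in merge_ranges(ranges))
    return 1.0 - fetched / file_size


ranges = [(100, 199), (0, 99), (400, 499)]
print(merge_ranges(ranges))             # [(0, 199), (400, 499)]
print(bandwidth_savings(ranges, 1200))  # 0.75
```

Merging contiguous tiles turns three requests into two here, while the number of bytes fetched stays the same.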
Limitations
- Maximum 8 bands (FLAC channel limitation)
- Minimum 16-bit encoding (pyflac decoder limitation)
- Large rasters may take time to process
- FLAC format limitations apply (specific bit depths: 16, 24-bit)
- Requires mutagen library for embedded metadata support
- Experimental: Not recommended for production use without thorough testing
Project Structure
flac-raster/
├── src/flac_raster/ # Main package
│   ├── __init__.py # Package initialization
│   ├── cli.py # Command-line interface
│   ├── converter.py # Core conversion logic
│   ├── spatial_encoder.py # Spatial tiling & HTTP range streaming
│   ├── metadata_encoder.py # Embedded metadata handling
│   └── compare.py # Comparison utilities
├── examples/ # Example scripts
│   ├── create_test_data.py # Generate test datasets
│   └── spatial_streaming_example.py # HTTP range streaming demo
├── test_data/ # Test datasets
│   ├── dem-raw.tif # Large DEM for testing
│   ├── sample_multispectral.tif # 6-band multispectral
│   └── sample_rgb.tif # RGB test data
├── report.md # Comprehensive analysis & benchmarks
├── main.py # Main entry point
├── pyproject.toml # Project configuration
├── README.md # This file
└── pixi.toml # Pixi package configuration
CI/CD & Publishing
This project uses GitHub Actions for:
- Continuous Integration: Tests on Python 3.9-3.12 across Windows, macOS, and Linux
- Automated Building: Package building and validation
- PyPI Publishing: Automatic publishing on release creation
- Quality Assurance: Integration testing via CLI commands
Publishing to PyPI
See PUBLISHING.md for detailed instructions on publishing releases.
Contributing
- Fork the repository
- Create a feature branch:
  git checkout -b feature-name
- Make your changes and test them
- Commit your changes:
  git commit -am 'Add feature'
- Push to the branch:
  git push origin feature-name
- Create a Pull Request
Future Improvements
- Adaptive tiling: Variable tile sizes based on data complexity
- Temporal support: Time-series data with temporal indexing
- Band selection: Spectral subsetting for multispectral data
- Compression tuning: Automatic optimization of FLAC parameters
- Caching strategy: Intelligent tile caching for frequently accessed areas
- JavaScript client: Browser-based FLAC decoder for web mapping
- Parallel processing: Multi-threaded encoding/decoding
- More formats: Support for HDF5, NetCDF, Zarr integration
- Performance optimization: Memory usage and processing speed improvements