IGN LiDAR HD Dataset Processing Library for Building LOD Classification
IGN LiDAR HD Processing Library
Version 2.3.2 | Full Documentation
Transform IGN LiDAR HD point clouds into ML-ready datasets for building classification
Quick Start • Features • Documentation • Examples
Overview
A comprehensive Python library for processing French IGN LiDAR HD data into machine learning-ready datasets. Features include GPU acceleration, rich geometric features, RGB/NIR augmentation, and flexible YAML-based configuration.
Key Capabilities:
- GPU Acceleration: 6-20x speedup with RAPIDS cuML
- Multi-modal Data: Geometry + RGB + Infrared (NDVI-ready)
- Building Classification: LOD2/LOD3 schemas with 15-30+ classes
- Flexible Output: NPZ, HDF5, PyTorch, LAZ formats
- YAML Configuration: Reproducible workflows with example configs
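To illustrate what "NDVI-ready" means in practice, the sketch below computes NDVI from red and near-infrared arrays using the conventional definition, (NIR - Red) / (NIR + Red). The function name `compute_ndvi` and the choice to return 0 where the denominator is zero are assumptions made for this example, not the library's API.

```python
import numpy as np

def compute_ndvi(red, nir):
    """NDVI = (NIR - Red) / (NIR + Red); returns 0 where NIR + Red == 0."""
    red = np.asarray(red, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    denom = nir + red
    ndvi = np.zeros_like(denom)
    # Divide only where the denominator is non-zero to avoid NaN/inf values.
    np.divide(nir - red, denom, out=ndvi, where=denom != 0)
    return ndvi

red = np.array([50, 200, 0])
nir = np.array([150, 200, 0])
ndvi = compute_ndvi(red, nir)
```

Values near 1 usually indicate vegetation, which can help separate trees from building surfaces.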
What's New in v2.3.2
Input Data Preservation & RGB Bug Fix:
- Preserve RGB/NIR/NDVI from Input LAZ: Automatically detects and preserves RGB, NIR, and NDVI from input files
- CRITICAL RGB Bug Fix: Fixed coordinate mismatch in augmented patches - RGB now applied at tile level before extraction
- 3x Faster RGB Processing: Fetch RGB once per tile instead of per patch
- Patch Metadata: Added `_patch_center` and `_patch_bounds` for debugging and validation
- Comprehensive Testing: RGB consistency verified across all augmentation types
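The reasoning behind the RGB fix can be sketched with plain NumPy: when colors are attached to points at tile level and then carried by index through patch extraction, geometric augmentation cannot shift the color lookup. Everything here is hypothetical, including `lookup_rgb`, which is only a stand-in for orthophoto sampling and not how the library fetches colors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tile: XYZ coordinates in metres.
tile_xyz = rng.uniform(0.0, 100.0, size=(1000, 3))

def lookup_rgb(xyz):
    """Stand-in for orthophoto sampling: color depends on the X/Y position."""
    return np.stack([xyz[:, 0], xyz[:, 1], np.zeros(len(xyz))], axis=1)

# 1) Color the whole tile once, in original coordinates.
tile_rgb = lookup_rgb(tile_xyz)

# 2) Extract a patch by index so points and colors stay aligned.
mask = (tile_xyz[:, 0] < 50.0) & (tile_xyz[:, 1] < 50.0)
patch_xyz, patch_rgb = tile_xyz[mask], tile_rgb[mask]

# 3) Augment geometry only (rotation about Z); colors are left untouched.
theta = np.pi / 4
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
aug_xyz = patch_xyz @ rot.T

# Looking colors up after augmentation would sample the wrong locations.
late_rgb = lookup_rgb(aug_xyz)
colors_preserved = np.array_equal(patch_rgb, lookup_rgb(patch_xyz))
late_lookup_differs = not np.allclose(late_rgb, patch_rgb)
```

Coloring once per tile also matches the "fetch RGB once per tile" speedup noted above, since the lookup runs a single time regardless of how many patches are cut.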
v2.3.1 - Memory Optimization & System Compatibility:
- Memory-optimized configurations for 8GB-32GB+ systems
- Automatic worker scaling based on memory pressure detection
- Sequential processing mode for minimal memory footprint
- Comprehensive memory optimization guide (`examples/MEMORY_OPTIMIZATION.md`)
- Three configuration profiles: Original (32GB+), Optimized (16-24GB), Sequential (8-16GB)
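As a rough illustration of memory-based worker scaling, the sketch below picks a worker count from the memory available. The function name, the 4 GB per-worker estimate, and the 2 GB reserve are assumptions chosen for the example; the library's actual heuristic is not described here.

```python
import os

def pick_num_workers(available_gb, per_worker_gb=4.0, reserve_gb=2.0, max_workers=None):
    """Return how many workers fit in memory; never fewer than 1 (sequential)."""
    if max_workers is None:
        max_workers = os.cpu_count() or 1
    # Keep some memory back for the OS and the parent process.
    usable = max(available_gb - reserve_gb, 0.0)
    fit = int(usable // per_worker_gb)
    return max(1, min(fit, max_workers))

workers_8gb = pick_num_workers(8, max_workers=8)
workers_32gb = pick_num_workers(32, max_workers=8)
```

With these assumed numbers, an 8 GB machine falls back to a single worker, which corresponds to the sequential profile.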
v2.3.0 - Processing Modes & Custom Configurations:
- Clear processing modes: `patches_only`, `both`, `enriched_only`
- YAML config files in `examples/` directory for common workflows
- CLI parameter overrides with `--config-file` and `--show-config`
Full Release History
Quick Start
Installation
# Standard installation (CPU)
pip install ign-lidar-hd
# Optional: GPU acceleration (6-20x speedup)
./install_cuml.sh # or follow GPU_SETUP.md
Basic Usage
# Download sample data
ign-lidar-hd download --bbox 2.3,48.8,2.4,48.9 --output data/ --max-tiles 5
# Enrich with features (GPU accelerated if available)
ign-lidar-hd enrich --input-dir data/ --output enriched/ --use-gpu
# Create training patches
ign-lidar-hd patch --input-dir enriched/ --output patches/ --lod-level LOD2
Python API
from ign_lidar import LiDARProcessor
# Initialize and process
processor = LiDARProcessor(lod_level="LOD2")
patches = processor.process_tile("data.laz", "output/")
Key Features
Core Processing
- Pure LiDAR - Geometric analysis without RGB dependencies
- Multi-level Classification - LOD2 (15 classes) and LOD3 (30+ classes)
- Rich Features - Normals, curvature, planarity, verticality, density, wall/roof scores
- Augmentation - Optional RGB from orthophotos, NIR for NDVI
- Auto-parameters - Intelligent tile analysis for optimal settings
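Features such as planarity and verticality are commonly derived from the eigen-decomposition of a point's local covariance matrix. The sketch below uses one common set of definitions (planarity = (λ2 - λ3) / λ1 with λ1 ≥ λ2 ≥ λ3, horizontality = |n_z|, verticality = 1 - |n_z|); these are assumptions, and the library's exact formulas may differ. The wall and roof scores use the products listed in the output format (planarity × verticality and planarity × horizontality).

```python
import numpy as np

def local_features(neighbors):
    """Compute normal, planarity and verticality for one neighborhood [K, 3]."""
    centered = neighbors - neighbors.mean(axis=0)
    cov = centered.T @ centered / len(neighbors)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    l3, l2, l1 = eigvals                     # so that l1 >= l2 >= l3
    normal = eigvecs[:, 0]                   # direction of least variance
    planarity = (l2 - l3) / l1 if l1 > 0 else 0.0
    horizontality = abs(normal[2])
    verticality = 1.0 - horizontality
    return {
        "normal": normal,
        "planarity": planarity,
        "verticality": verticality,
        "wall_score": planarity * verticality,
        "roof_score": planarity * horizontality,
    }

rng = np.random.default_rng(0)
xy = rng.uniform(-1, 1, size=(200, 2))
flat_roof = np.column_stack([xy, np.zeros(200)])             # plane z = 0
wall = np.column_stack([xy[:, 0], np.zeros(200), xy[:, 1]])  # plane y = 0
roof_f = local_features(flat_roof)
wall_f = local_features(wall)
```

On these synthetic planes the horizontal surface scores high as a roof and near zero as a wall, and the vertical surface does the opposite.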
Performance
- GPU Acceleration - RAPIDS cuML support (6-20x faster)
- Parallel Processing - Multi-worker with automatic CPU detection
- Memory Optimized - Per-chunk architecture, 50-60% reduction
- Smart Skip - Resume interrupted workflows automatically
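The "Smart Skip" idea can be sketched as a check for existing outputs before a tile is queued. The naming convention below (one `.npz` output named after each tile's stem) and the helper `tiles_to_process` are hypothetical; the library may track completed work differently.

```python
import tempfile
from pathlib import Path

def tiles_to_process(input_dir, output_dir, suffix=".npz"):
    """Return names of input tiles that have no matching output yet."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    pending = []
    for tile in sorted(input_dir.glob("*.laz")):
        # Assumed convention: one output file per tile, named after its stem.
        if not (output_dir / f"{tile.stem}{suffix}").exists():
            pending.append(tile.name)
    return pending

with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp, "raw")
    out = Path(tmp, "patches")
    raw.mkdir()
    out.mkdir()
    for name in ("tile_a.laz", "tile_b.laz", "tile_c.laz"):
        (raw / name).touch()
    (out / "tile_a.npz").touch()   # pretend tile_a finished in an earlier run
    pending = tiles_to_process(raw, out)
```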
Flexibility
- Processing Modes - Three clear modes: patches only, both, or LAZ only
- YAML Configs - Declarative workflows with example templates
- Multiple Formats - NPZ, HDF5, PyTorch, LAZ (single or multi-format)
- CLI & API - Command-line tool and Python library
Usage Examples
Mode 1: Create Training Patches (Default)
# Using example config
ign-lidar-hd process \
--config-file examples/config_training_dataset.yaml \
input_dir=data/raw \
output_dir=data/patches
# Or with CLI parameters
ign-lidar-hd process \
input_dir=data/raw \
output_dir=data/patches \
output.processing_mode=patches_only
Mode 2: Both Patches & Enriched LAZ
ign-lidar-hd process \
--config-file examples/config_complete.yaml \
input_dir=data/raw \
output_dir=data/both
Mode 3: LAZ Enrichment Only
ign-lidar-hd process \
--config-file examples/config_quick_enrich.yaml \
input_dir=data/raw \
output_dir=data/enriched
Note on Enriched LAZ Files: When generating enriched LAZ tile files, geometric features (normals, curvature, planarity, etc.) may show artifacts at tile boundaries due to the nature of the source data. These artifacts are inherent to tile-based processing and do not appear in patch exports, which provide the best results for machine learning applications. For optimal quality, use `patches_only` or `both` modes.
GPU-Accelerated Processing
ign-lidar-hd process \
--config-file examples/config_gpu_processing.yaml \
input_dir=data/raw \
output_dir=data/output
Preview Configuration
ign-lidar-hd process \
--config-file examples/config_training_dataset.yaml \
--show-config \
input_dir=data/raw
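The commands above combine a YAML file with `key=value` overrides such as `output.processing_mode=patches_only`. As a sketch of how dotted overrides can be merged into a nested configuration, the plain-Python helper below walks the dotted path and sets the leaf value. `apply_overrides` and the `format` key are hypothetical and this is not the library's actual config loader.

```python
import copy

def apply_overrides(config, overrides):
    """Return a copy of config with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        key, _, value = item.partition("=")
        node = result
        *parents, leaf = key.split(".")
        # Descend through nested dicts, creating missing levels as needed.
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result

base = {"input_dir": None, "output": {"processing_mode": "both", "format": "npz"}}
merged = apply_overrides(
    base, ["input_dir=data/raw", "output.processing_mode=patches_only"]
)
```

Working on a deep copy keeps the base configuration reusable across runs with different overrides.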
Python API Examples
from ign_lidar import LiDARProcessor, IGNLiDARDownloader
# Download tiles
downloader = IGNLiDARDownloader("downloads/")
tiles = downloader.download_by_bbox(bbox=(2.3, 48.8, 2.4, 48.9), max_tiles=5)
# Process with custom config
processor = LiDARProcessor(
    lod_level="LOD3",
    patch_size=150.0,
    num_points=16384,
    use_gpu=True
)
# Single tile
patches = processor.process_tile("input.laz", "output/")
# Batch processing
patches = processor.process_directory("input_dir/", "output_dir/", num_workers=4)
# PyTorch integration
# (LiDARPatchDataset is not imported in this snippet; import it from wherever it is defined)
from torch.utils.data import DataLoader
dataset = LiDARPatchDataset("patches/")
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
Output Format
NPZ Structure
Each patch is saved as NPZ with:
{
    'points': np.ndarray,       # [N, 3] XYZ coordinates
    'normals': np.ndarray,      # [N, 3] surface normals
    'curvature': np.ndarray,    # [N] principal curvature
    'intensity': np.ndarray,    # [N] normalized intensity
    'planarity': np.ndarray,    # [N] planarity measure
    'verticality': np.ndarray,  # [N] verticality measure
    'density': np.ndarray,      # [N] local point density
    'labels': np.ndarray,       # [N] building class labels
    # Optional features:
    'wall_score': np.ndarray,   # [N] wall likelihood (planarity * verticality)
    'roof_score': np.ndarray,   # [N] roof likelihood (planarity * horizontality)
    # Optional with augmentation:
    'red': np.ndarray,          # [N] RGB red
    'green': np.ndarray,        # [N] RGB green
    'blue': np.ndarray,         # [N] RGB blue
    'infrared': np.ndarray,     # [N] NIR values
}
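A patch stored this way can be read back with `np.load`. The sketch below writes a tiny synthetic patch containing a subset of the documented keys, then loads it and stacks the per-point features into a single matrix. The file name `patch_0000.npz` and the choice of features are assumptions for illustration.

```python
import os
import tempfile
import numpy as np

# Build a tiny synthetic patch with a subset of the documented keys.
n = 8
rng = np.random.default_rng(0)
patch = {
    "points": rng.normal(size=(n, 3)).astype(np.float32),
    "normals": rng.normal(size=(n, 3)).astype(np.float32),
    "curvature": rng.random(n).astype(np.float32),
    "planarity": rng.random(n).astype(np.float32),
    "labels": rng.integers(0, 15, size=n),
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "patch_0000.npz")   # hypothetical file name
    np.savez_compressed(path, **patch)

    with np.load(path) as data:
        keys = sorted(data.files)
        # Stack per-point features into one [N, F] matrix for a model.
        features = np.column_stack(
            [data["points"], data["normals"], data["curvature"], data["planarity"]]
        )
        labels = data["labels"]
        has_rgb = "red" in data.files   # optional keys may be absent
```

Checking `data.files` before reading optional keys such as `red` or `infrared` avoids a `KeyError` on patches created without augmentation.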
Available Formats
- NPZ - Default NumPy format (recommended for ML)
- HDF5 - Hierarchical data format
- PyTorch - `.pt` files for PyTorch
- LAZ - Point cloud format for visualization (may show boundary artifacts in tile mode)
- Multi-format - Save in multiple formats: hdf5,laz,npz,torch
Tip: For machine learning applications, NPZ/HDF5/PyTorch patch formats provide cleaner geometric features than enriched LAZ tiles.
Documentation
Quick Links
- Full Documentation
- Installation Guide
- GPU Setup
- Quick Reference
- QGIS Integration
Examples & Workflows
- `examples/` - Python usage examples
- `examples/*.yaml` - Configuration templates
- PyTorch Integration
- Parallel Processing
Architecture & API
Development
# Clone and install in development mode
git clone https://github.com/sducournau/IGN_LIDAR_HD_DATASET
cd IGN_LIDAR_HD_DATASET
pip install -e ".[dev]"
# Run tests
pytest tests/
# Format code
black ign_lidar/
Requirements
Core:
- Python 3.8+
- NumPy >= 1.21.0
- laspy >= 2.3.0
- scikit-learn >= 1.0.0
Optional GPU Acceleration:
- CUDA >= 12.0
- CuPy >= 12.0.0
- RAPIDS cuML >= 24.10 (recommended)
License
MIT License - see LICENSE file for details.
Support & Contributing
- Report Issues
- Feature Requests
- Contributing Guide
Cite Me
If you use this library in your research or projects, please cite:
@software{ign_lidar_hd_dataset,
author = {Simon Ducournau},
title = {IGN LiDAR HD Processing Library},
year = {2025},
publisher = {ImagoData},
url = {https://github.com/sducournau/IGN_LIDAR_HD_DATASET},
version = {2.3.0}
}
Project maintained by: ImagoData
Made with ❤️ for the LiDAR and Machine Learning communities