IGN LiDAR HD Dataset Processing Library for Building LOD Classification

IGN LiDAR HD Processing Library

Version 2.3.4 | 📚 Full Documentation

Transform IGN LiDAR HD point clouds into ML-ready datasets for building classification

Quick Start • Features • Documentation • Examples


📊 Overview

A comprehensive Python library for processing French IGN LiDAR HD data into machine learning-ready datasets. Features include GPU acceleration, rich geometric features, RGB/NIR augmentation, and flexible YAML-based configuration.

Key Capabilities:

  • 🚀 GPU Acceleration: 6-20x speedup with RAPIDS cuML
  • 🎨 Multi-modal Data: Geometry + RGB + Infrared (NDVI-ready)
  • 🏗️ Building Classification: LOD2/LOD3 schemas with 15-30+ classes
  • 📦 Flexible Output: NPZ, HDF5, PyTorch, LAZ formats
  • ⚙️ YAML Configuration: Reproducible workflows with example configs

✨ What's New in v2.3.4

Robust Feature Validation:

  • 🔧 Feature Robustness: All geometric features now guaranteed within valid ranges [0, 1]
  • 🎯 Eigenvalue Clamping: Prevents negative eigenvalues from numerical artifacts
  • 📊 Density Normalization: Capped at 1000 points/m³ for ML stability
  • ✅ Boundary Feature Parity: Complete feature set across all computation paths
  • 🔄 Formula Standardization: Consistent λ0 normalization (Weinmann et al.)
  • 📈 Zero Overhead: <1% performance impact from validation
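
The validation steps listed above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the function names `eigen_features` and `capped_density` are hypothetical, and the formulas are the commonly used eigenvalue features normalized by the largest eigenvalue (λ0), assumed here to match what the release notes describe.

```python
import numpy as np

def eigen_features(neighbors: np.ndarray, eps: float = 1e-12) -> dict:
    """Shape features for one neighborhood of points, array shape [K, 3]."""
    cov = np.cov(neighbors, rowvar=False)        # 3x3 covariance matrix
    evals = np.linalg.eigvalsh(cov)[::-1]        # sorted so that l0 >= l1 >= l2
    evals = np.clip(evals, 0.0, None)            # clamp tiny negative eigenvalues
    l0, l1, l2 = evals
    denom = max(l0, eps)                         # normalize by lambda_0, avoid 0/0
    feats = {
        "linearity": (l0 - l1) / denom,
        "planarity": (l1 - l2) / denom,
        "sphericity": l2 / denom,
    }
    # Final guard: keep every feature inside [0, 1]
    return {k: float(np.clip(v, 0.0, 1.0)) for k, v in feats.items()}

def capped_density(count: int, volume_m3: float, cap: float = 1000.0) -> float:
    """Points per cubic metre, capped at `cap` (1000 in the notes above)."""
    return min(count / max(volume_m3, 1e-12), cap)

# A flat 5x5 grid should score as almost perfectly planar
xs, ys = np.meshgrid(np.arange(5.0), np.arange(5.0))
plane = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(25)])
print(eigen_features(plane))
```

Clamping before the division matters: a slightly negative smallest eigenvalue would otherwise push planarity above 1 and sphericity below 0.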

v2.3.3 - Input Data Preservation & RGB Bug Fix:

  • 🎨 Preserve RGB/NIR/NDVI from Input LAZ: Automatically detects and preserves RGB, NIR, and NDVI from input files
  • 🐛 CRITICAL RGB Bug Fix: Fixed coordinate mismatch in augmented patches - RGB now applied at tile level before extraction
  • ⚡ 3x Faster RGB Processing: Fetch RGB once per tile instead of per patch
  • 📊 Patch Metadata: Added _patch_center and _patch_bounds for debugging and validation
  • ✅ Comprehensive Testing: RGB consistency verified across all augmentation types

v2.3.1 - Memory Optimization & System Compatibility:

  • 🧠 Memory-optimized configurations for 8GB-32GB+ systems
  • 📊 Automatic worker scaling based on memory pressure detection
  • ⚙️ Sequential processing mode for minimal memory footprint
  • 📖 Comprehensive memory optimization guide (examples/MEMORY_OPTIMIZATION.md)
  • 🔧 Three configuration profiles: Original (32GB+), Optimized (16-24GB), Sequential (8-16GB)
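
As a rough illustration of memory-based worker scaling, the sketch below picks a worker count from the memory that is currently free. The function name `pick_num_workers` and the 4 GB per-worker budget are assumptions made for this example; the library's actual detection logic is not shown in this README.

```python
import os
from typing import Optional

def pick_num_workers(available_gb: float,
                     per_worker_gb: float = 4.0,
                     max_workers: Optional[int] = None) -> int:
    """Return how many workers fit in memory, never fewer than one."""
    cpu_limit = max_workers or os.cpu_count() or 1
    mem_limit = int(available_gb // per_worker_gb)   # workers that fit in RAM
    return max(1, min(cpu_limit, mem_limit))

# 8 GB free with a 4 GB budget per worker allows two workers
print(pick_num_workers(8.0, per_worker_gb=4.0, max_workers=8))
```

A result of 1 corresponds to the sequential mode described above.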

v2.3.0 - Processing Modes & Custom Configurations:

  • Clear processing modes: patches_only, both, enriched_only
  • YAML config files in examples/ directory for common workflows
  • CLI parameter overrides with --config-file and --show-config

📖 Full Release History


🚀 Quick Start

Installation

# Standard installation (CPU)
pip install ign-lidar-hd

# Optional: GPU acceleration (6-20x speedup)
./install_cuml.sh  # or follow GPU_SETUP.md

Basic Usage

# Download sample data
ign-lidar-hd download --bbox 2.3,48.8,2.4,48.9 --output data/ --max-tiles 5

# Enrich with features (GPU accelerated if available)
ign-lidar-hd enrich --input-dir data/ --output enriched/ --use-gpu

# Create training patches
ign-lidar-hd patch --input-dir enriched/ --output patches/ --lod-level LOD2

Python API

from ign_lidar import LiDARProcessor

# Initialize and process
processor = LiDARProcessor(lod_level="LOD2")
patches = processor.process_tile("data.laz", "output/")

📋 Key Features

Core Processing

  • Pure LiDAR - Geometric analysis without RGB dependencies
  • Multi-level Classification - LOD2 (15 classes) and LOD3 (30+ classes)
  • Rich Features - Normals, curvature, planarity, verticality, density, wall/roof scores
  • Augmentation - Optional RGB from orthophotos, NIR for NDVI
  • Auto-parameters - Intelligent tile analysis for optimal settings
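
When both red and near-infrared values are available, NDVI follows from the standard formula (NIR - Red) / (NIR + Red). The sketch below is a generic NumPy version; the helper name `ndvi` is hypothetical and is not taken from the library.

```python
import numpy as np

def ndvi(nir, red, eps: float = 1e-8) -> np.ndarray:
    """Per-point NDVI in [-1, 1]; points with NIR + Red close to 0 get 0."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    total = nir + red
    out = np.zeros_like(total)
    # Divide only where the denominator is safely above zero
    np.divide(nir - red, total, out=out, where=total > eps)
    return out

print(ndvi([200, 0, 50], [100, 0, 50]))
```

Vegetation reflects strongly in the near infrared, so high values tend to mark trees and grass, which helps separate them from roofs and walls.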

Performance

  • GPU Acceleration - RAPIDS cuML support (6-20x faster)
  • Parallel Processing - Multi-worker with automatic CPU detection
  • Memory Optimized - Per-chunk architecture, 50-60% lower memory use
  • Smart Skip - Resume interrupted workflows automatically

Flexibility

  • Processing Modes - Three clear modes: patches only, both, or LAZ only
  • YAML Configs - Declarative workflows with example templates
  • Multiple Formats - NPZ, HDF5, PyTorch, LAZ (single or multi-format)
  • CLI & API - Command-line tool and Python library

💡 Usage Examples

Mode 1: Create Training Patches (Default)

# Using example config
ign-lidar-hd process \
  --config-file examples/config_training_dataset.yaml \
  input_dir=data/raw \
  output_dir=data/patches

# Or with CLI parameters
ign-lidar-hd process \
  input_dir=data/raw \
  output_dir=data/patches \
  output.processing_mode=patches_only
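
The key=value arguments above use dotted keys such as output.processing_mode to reach nested configuration entries. As a hedged illustration of that idea, and not the library's own parser, the sketch below merges such overrides into a nested dictionary. The function name `apply_overrides` and the `format` key in the sample config are hypothetical.

```python
def apply_overrides(config: dict, overrides: list) -> dict:
    """Merge 'a.b=value' style overrides into a nested dict (in place)."""
    for item in overrides:
        dotted_key, _, value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})   # create missing levels
        node[leaf] = value                     # values stay as strings here
    return config

base = {"output": {"format": "npz"}}           # hypothetical base config
merged = apply_overrides(base, [
    "input_dir=data/raw",
    "output.processing_mode=patches_only",
])
print(merged)
```

Overrides applied this way take precedence over values from the config file, which is the behavior the CLI examples above rely on.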

Mode 2: Both Patches & Enriched LAZ

ign-lidar-hd process \
  --config-file examples/config_complete.yaml \
  input_dir=data/raw \
  output_dir=data/both

Mode 3: LAZ Enrichment Only

ign-lidar-hd process \
  --config-file examples/config_quick_enrich.yaml \
  input_dir=data/raw \
  output_dir=data/enriched

⚠️ Note on Enriched LAZ Files: When generating enriched LAZ tile files, geometric features (normals, curvature, planarity, etc.) may show artifacts at tile boundaries due to the nature of the source data. These artifacts are inherent to tile-based processing and do not appear in patch exports, which provide the best results for machine learning applications. For optimal quality, use patches_only or both modes.

GPU-Accelerated Processing

ign-lidar-hd process \
  --config-file examples/config_gpu_processing.yaml \
  input_dir=data/raw \
  output_dir=data/output

Preview Configuration

ign-lidar-hd process \
  --config-file examples/config_training_dataset.yaml \
  --show-config \
  input_dir=data/raw

Python API Examples

from ign_lidar import LiDARProcessor, IGNLiDARDownloader

# Download tiles
downloader = IGNLiDARDownloader("downloads/")
tiles = downloader.download_by_bbox(bbox=(2.3, 48.8, 2.4, 48.9), max_tiles=5)

# Process with custom config
processor = LiDARProcessor(
    lod_level="LOD3",
    patch_size=150.0,
    num_points=16384,
    use_gpu=True
)

# Single tile
patches = processor.process_tile("input.laz", "output/")

# Batch processing
patches = processor.process_directory("input_dir/", "output_dir/", num_workers=4)

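# PyTorch integration
# Note: LiDARPatchDataset is not imported in this snippet; import it from
# the module where your installed version of the library defines it.
from torch.utils.data import DataLoader
dataset = LiDARPatchDataset("patches/")
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)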

📦 Output Format

NPZ Structure

Each patch is saved as NPZ with:

{
    'points': np.ndarray,        # [N, 3] XYZ coordinates
    'normals': np.ndarray,       # [N, 3] surface normals
    'curvature': np.ndarray,     # [N] principal curvature
    'intensity': np.ndarray,     # [N] normalized intensity
    'planarity': np.ndarray,     # [N] planarity measure
    'verticality': np.ndarray,   # [N] verticality measure
    'density': np.ndarray,       # [N] local point density
    'labels': np.ndarray,        # [N] building class labels
    # Optional derived features:
    'wall_score': np.ndarray,    # [N] wall likelihood (planarity * verticality)
    'roof_score': np.ndarray,    # [N] roof likelihood (planarity * horizontality)
    # Optional with augmentation:
    'red': np.ndarray,           # [N] RGB red
    'green': np.ndarray,         # [N] RGB green
    'blue': np.ndarray,          # [N] RGB blue
    'infrared': np.ndarray,      # [N] NIR values
}
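
A patch saved with this layout can be read back with plain NumPy. The sketch below is one possible way to assemble a per-point feature matrix from the keys listed above; the helper name `load_patch_features` and the choice of scalar features are assumptions made for illustration.

```python
import numpy as np

def load_patch_features(path,
                        keys=("intensity", "planarity", "verticality", "density")):
    """Return (features [N, 6 + len(found keys)], labels [N]) from one NPZ patch."""
    with np.load(path) as patch:
        xyz = patch["points"]                  # [N, 3]
        normals = patch["normals"]             # [N, 3]
        # Keep only the scalar features that are actually stored in the file
        scalars = [patch[k][:, None] for k in keys if k in patch.files]
        labels = patch["labels"]               # [N]
    features = np.hstack([xyz, normals, *scalars]).astype(np.float32)
    return features, labels
```

Checking `patch.files` before reading keeps the loader usable for patches written without the optional arrays.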

Available Formats

  • NPZ - Default NumPy format (recommended for ML)
  • HDF5 - Hierarchical data format
  • PyTorch - .pt files for PyTorch
  • LAZ - Point cloud format for visualization (may show boundary artifacts in tile mode)
  • Multi-format - Save in several formats at once, e.g. hdf5,laz or npz,torch

💡 Tip: For machine learning applications, NPZ/HDF5/PyTorch patch formats provide cleaner geometric features than enriched LAZ tiles.


📚 Documentation

Quick Links

Examples & Workflows

Architecture & API


🛠️ Development

# Clone and install in development mode
git clone https://github.com/sducournau/IGN_LIDAR_HD_DATASET
cd IGN_LIDAR_HD_DATASET
pip install -e ".[dev]"

# Run tests
pytest tests/

# Format code
black ign_lidar/

📋 Requirements

Core:

  • Python 3.8+
  • NumPy >= 1.21.0
  • laspy >= 2.3.0
  • scikit-learn >= 1.0.0

Optional GPU Acceleration:

  • CUDA >= 12.0
  • CuPy >= 12.0.0
  • RAPIDS cuML >= 24.10 (recommended)

📄 License

MIT License - see LICENSE file for details.


🤝 Support & Contributing


📝 Cite Me

If you use this library in your research or projects, please cite:

@software{ign_lidar_hd_dataset,
  author       = {Simon Ducournau},
  title        = {IGN LiDAR HD Processing Library},
  year         = {2025},
  publisher    = {ImagoData},
  url          = {https://github.com/sducournau/IGN_LIDAR_HD_DATASET},
  version      = {2.3.0}
}

Project maintained by: ImagoData


Made with ❤️ for the LiDAR and Machine Learning communities
