IGN LiDAR HD Dataset Processing Library for Building LOD Classification

IGN LiDAR HD Processing Library

Version 2.5.0 | 📚 Full Documentation

[Figure: LoD3 Building Model]

Transform IGN LiDAR HD point clouds into ML-ready datasets for building classification

Quick Start • What's New • Features • Documentation • Examples


📊 Overview

A comprehensive Python library for processing French IGN LiDAR HD data into machine learning-ready datasets. Features include GPU acceleration, rich geometric features, RGB/NIR augmentation, and flexible YAML-based configuration.

Key Capabilities:

  • 🚀 GPU Acceleration: 6-20x speedup with RAPIDS cuML
  • 🎨 Multi-modal Data: Geometry + RGB + Infrared (NDVI-ready)
  • 🏗️ Building Classification: LOD2/LOD3 schemas with 15-30+ classes
  • 📦 Flexible Output: NPZ, HDF5, PyTorch, LAZ formats
  • ⚙️ YAML Configuration: Reproducible workflows with example configs

✨ What's New in v2.5.0

🎉 Major Release: System Consolidation & Modernization

v2.5.0 modernizes the library's internals while remaining fully backward compatible.

Unified Feature System ✨

  • FeatureOrchestrator: New unified class replaces FeatureManager + FeatureComputer
  • Simpler API: One class handles all feature computation with automatic strategy selection
  • Better organized: Clear separation of concerns with strategy pattern
  • Fully compatible: All existing code works without changes

Improved Code Quality

  • 67% reduction in feature orchestration code complexity
  • Enhanced error messages and validation throughout
  • Complete type hints for better IDE support
  • Modular architecture for easier maintenance and extension

Migration Made Easy

  • Zero breaking changes: Your v1.x code continues to work
  • Deprecation warnings: Clear guidance for future-proofing your code
  • Migration guide: Step-by-step instructions in MIGRATION_GUIDE.md
  • Backward compatible: Legacy APIs will be maintained through v2.x series

# NEW (v2.5.0) - Recommended unified API
from ign_lidar import LiDARProcessor

processor = LiDARProcessor(
    config_path="config.yaml",
    feature_mode="lod3"  # Clearer mode specification
)

# Access unified orchestrator
orchestrator = processor.feature_orchestrator
print(f"Feature mode: {orchestrator.mode}")
print(f"Has RGB: {orchestrator.has_rgb}")
print(f"Available features: {orchestrator.get_feature_list('lod3')}")

# OLD (v1.x) - Still works with deprecation warnings
# feature_manager = processor.feature_manager  # Deprecated but functional
# feature_computer = processor.feature_computer  # Deprecated but functional
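
Deprecation warnings are easy to miss because Python hides many of them by default. The sketch below shows one way to surface them with the standard `warnings` module. `LegacyProcessor` is a hypothetical stand-in written for this example, not the library's class; only the `warnings` calls are real Python API.

```python
import warnings


class LegacyProcessor:
    """Hypothetical stand-in for a class that keeps a deprecated attribute."""

    def __init__(self):
        self.feature_orchestrator = "orchestrator"

    @property
    def feature_manager(self):
        # Old name: still works, but tells the caller what to use instead.
        warnings.warn(
            "feature_manager is deprecated; use feature_orchestrator",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.feature_orchestrator


def deprecated_accesses(func):
    """Run func() and return the deprecation messages it triggered."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let default filters hide anything
        func()
    return [
        str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
    ]


processor = LegacyProcessor()
messages = deprecated_accesses(lambda: processor.feature_manager)
print(messages)
```

Running a script with `python -W error::DeprecationWarning` turns such warnings into exceptions, which makes leftover uses of a deprecated attribute fail loudly during testing.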

Why upgrade?

  • Future-proof your code for v3.0
  • Access to new features and improvements
  • Better performance and error handling
  • Professional, maintainable codebase

📖 See MIGRATION_GUIDE.md for complete upgrade instructions
📖 Full Release History


🚀 Quick Start

Installation

# Standard installation (CPU)
pip install ign-lidar-hd

# Optional: GPU acceleration (6-20x speedup)
./install_cuml.sh  # or follow GPU_SETUP.md
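
Since GPU support is optional, calling code may want to check which backend is usable before requesting it. A minimal sketch, assuming the GPU stack is importable as `cupy` and `cuml` (the usual module names for CuPy and RAPIDS cuML); the helper name `pick_backend` is made up for this example and is not part of ign-lidar-hd.

```python
from importlib.util import find_spec


def pick_backend(required=("cupy", "cuml")):
    """Return "gpu" if every required module is importable, else "cpu"."""
    missing = [name for name in required if find_spec(name) is None]
    return "cpu" if missing else "gpu"


print(f"Feature computation backend: {pick_backend()}")
```

This only checks that the packages are installed. It does not confirm that a CUDA device is actually present, so a real check would also need to try a small GPU operation.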

Basic Usage

# Download sample data
ign-lidar-hd download --bbox 2.3,48.8,2.4,48.9 --output data/ --max-tiles 5

# Enrich with features (GPU accelerated if available)
ign-lidar-hd enrich --input-dir data/ --output enriched/ --use-gpu

# Create training patches
ign-lidar-hd patch --input-dir enriched/ --output patches/ --lod-level LOD2

Python API

from ign_lidar import LiDARProcessor

# Initialize and process
processor = LiDARProcessor(lod_level="LOD2")
patches = processor.process_tile("data.laz", "output/")

📋 Key Features

Core Processing

  • 🎯 Complete Feature Export - All 35-45 computed geometric features saved to disk (v2.4.2+)
  • 🏗️ Multi-level Classification - LOD2 (12 features), LOD3 (38 features), Full (43+ features) modes
  • 📊 Rich Geometry - Normals, curvature, eigenvalues, shape descriptors, architectural features, building scores
  • 🎨 Optional Augmentation - RGB from orthophotos, NIR, NDVI for vegetation analysis
  • ⚙️ Auto-parameters - Intelligent tile analysis for optimal settings
  • 📝 Feature Tracking - Metadata includes feature names and counts for reproducibility
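
NDVI is computed from the red and near-infrared channels as (NIR - Red) / (NIR + Red). A minimal NumPy sketch of that standard formula follows; the function name and the zero-denominator handling are choices made for this example, not necessarily what the library does.

```python
import numpy as np


def ndvi(red, nir):
    """Normalized Difference Vegetation Index, in [-1, 1] for non-negative inputs."""
    red = np.asarray(red, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    total = nir + red
    out = np.zeros_like(total)
    # Leave NDVI at 0 where both channels are 0 to avoid dividing by zero.
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


red = np.array([50, 200, 0])
nir = np.array([150, 200, 0])
print(ndvi(red, nir))
```

Values near 1 indicate strong near-infrared reflectance relative to red, which is typical of healthy vegetation, so the index can help separate trees from building surfaces.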

Performance

  • 🚀 GPU Acceleration - RAPIDS cuML support (6-20x faster)
  • ⚡ Parallel Processing - Multi-worker with automatic CPU detection
  • 🧠 Memory Optimized - Chunked processing, 50-60% lower memory use
  • 💾 Smart Skip - Resumes interrupted workflows automatically by skipping work that is already done (~1800x faster)
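
The skip-on-resume idea can be sketched as a check for an existing output before any work starts. This is only an illustration: the `<tile stem>.npz` naming convention and the function name are assumptions for the example, not the library's actual bookkeeping.

```python
import tempfile
from pathlib import Path


def pending_tiles(input_dir, output_dir, suffix=".npz"):
    """Return input tiles that have no non-empty output file yet."""
    todo = []
    for tile in sorted(Path(input_dir).glob("*.laz")):
        target = Path(output_dir) / (tile.stem + suffix)
        # A zero-byte file usually means an earlier run was interrupted mid-write.
        if not target.exists() or target.stat().st_size == 0:
            todo.append(tile)
    return todo


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    raw, out = Path(tmp, "raw"), Path(tmp, "out")
    raw.mkdir()
    out.mkdir()
    for name in ("a", "b", "c"):
        (raw / f"{name}.laz").touch()
    (out / "a.npz").write_bytes(b"done")  # finished in a previous run
    (out / "b.npz").touch()               # interrupted: empty output
    remaining = [tile.name for tile in pending_tiles(raw, out)]

print(remaining)
```

Checking for an output file is far cheaper than recomputing features, which is why resuming a mostly finished run can be orders of magnitude faster than starting over.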

Flexibility

  • 📁 Processing Modes - Three clear modes: patches only, both, or LAZ only
  • 📋 YAML Configs - Declarative workflows with example templates
  • 📦 Multiple Formats - NPZ, HDF5, PyTorch, LAZ (single or multi-format)
  • 🔧 CLI & API - Command-line tool and Python library

💡 Usage Examples

Mode 1: Create Training Patches (Default)

# Using example config
ign-lidar-hd process \
  --config-file examples/config_training_dataset.yaml \
  input_dir=data/raw \
  output_dir=data/patches

# Or with CLI parameters
ign-lidar-hd process \
  input_dir=data/raw \
  output_dir=data/patches \
  output.processing_mode=patches_only
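
The `key=value` arguments in these commands set configuration entries, with dots addressing nested keys (the syntax looks like Hydra-style overrides). A rough sketch of that merge logic in plain Python, assuming values stay plain strings; the real CLI may parse and type-check overrides differently, and `apply_overrides` is a name invented here.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of config with dotted key=value overrides applied."""
    merged = copy.deepcopy(config)
    for item in overrides:
        dotted_key, sep, value = item.partition("=")
        if not sep or not dotted_key:
            raise ValueError(f"expected key=value, got {item!r}")
        *parents, leaf = dotted_key.split(".")
        node = merged
        for part in parents:
            node = node.setdefault(part, {})  # create missing sections on the way down
        node[leaf] = value
    return merged


# Illustrative base config; "format" is a made-up key for this example.
base = {"input_dir": None, "output": {"processing_mode": "both", "format": "npz"}}
config = apply_overrides(
    base,
    [
        "input_dir=data/raw",
        "output_dir=data/patches",
        "output.processing_mode=patches_only",
    ],
)
print(config)
```

Working on a deep copy keeps the loaded YAML defaults untouched, so the same base config can be reused with different overrides.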

Mode 2: Both Patches & Enriched LAZ

ign-lidar-hd process \
  --config-file examples/config_complete.yaml \
  input_dir=data/raw \
  output_dir=data/both

Mode 3: LAZ Enrichment Only

ign-lidar-hd process \
  --config-file examples/config_quick_enrich.yaml \
  input_dir=data/raw \
  output_dir=data/enriched

⚠️ Note on Enriched LAZ Files: In enriched LAZ tiles, geometric features (normals, curvature, planarity, etc.) may show artifacts at tile boundaries due to the nature of the source data. These artifacts are inherent to tile-based processing and do not appear in patch exports, which give the best results for machine learning. For optimal quality, use the patches_only or both mode and train on the patches.
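
One plausible reading of these boundary artifacts is that a point near a tile edge is missing the neighbors that fall in the adjacent tile, so neighborhood-based features are estimated from a truncated neighborhood. Under that assumption, affected points in an enriched LAZ tile could be flagged and masked out with a sketch like the following. The function name and margin are illustrative, and the tile extent is approximated by the bounding box of the points.

```python
import numpy as np


def near_tile_edge(points_xy, margin):
    """Boolean mask of points closer than `margin` to the points' bounding box."""
    xy = np.asarray(points_xy, dtype=np.float64)
    lower, upper = xy.min(axis=0), xy.max(axis=0)
    # Distance to the nearest of the four bounding-box sides, per point.
    dist_to_edge = np.minimum(xy - lower, upper - xy).min(axis=1)
    return dist_to_edge < margin


points_xy = np.array([[0.0, 0.0], [50.0, 50.0], [100.0, 100.0], [99.0, 50.0]])
edge_mask = near_tile_edge(points_xy, margin=2.0)
print(edge_mask)
```

A margin on the order of the neighborhood search radius is a reasonable starting point, since only points within that distance of the edge can have truncated neighborhoods.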

GPU-Accelerated Processing

ign-lidar-hd process \
  --config-file examples/config_gpu_processing.yaml \
  input_dir=data/raw \
  output_dir=data/output

Preview Configuration

ign-lidar-hd process \
  --config-file examples/config_training_dataset.yaml \
  --show-config \
  input_dir=data/raw

Python API Examples

from ign_lidar import LiDARProcessor, IGNLiDARDownloader

# Download tiles
downloader = IGNLiDARDownloader("downloads/")
tiles = downloader.download_by_bbox(bbox=(2.3, 48.8, 2.4, 48.9), max_tiles=5)

# Process with custom config
processor = LiDARProcessor(
    lod_level="LOD3",
    patch_size=150.0,
    num_points=16384,
    use_gpu=True
)

# Single tile
patches = processor.process_tile("input.laz", "output/")

# Batch processing
patches = processor.process_directory("input_dir/", "output_dir/", num_workers=4)

# PyTorch integration
from torch.utils.data import DataLoader
dataset = LiDARPatchDataset("patches/")
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
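
The snippet above does not show where `LiDARPatchDataset` is imported from. As an illustration of what such a dataset does, here is a minimal map-style dataset over NPZ patches written with NumPy only. `NpzPatchDataset` is a hypothetical class for this sketch, not the library's implementation; it assumes each file holds the `points` and `labels` arrays listed under NPZ Structure.

```python
import tempfile
from pathlib import Path

import numpy as np


class NpzPatchDataset:
    """Map-style dataset: one (points, labels) pair per .npz file."""

    def __init__(self, patch_dir):
        self.files = sorted(Path(patch_dir).glob("*.npz"))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, index):
        with np.load(self.files[index]) as patch:
            points = patch["points"].astype(np.float32)  # [N, 3]
            labels = patch["labels"].astype(np.int64)    # [N]
        return points, labels


# Demo on two synthetic patches written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    rng = np.random.default_rng(0)
    for i in range(2):
        np.savez(
            Path(tmp) / f"patch_{i}.npz",
            points=rng.random((1024, 3)),
            labels=rng.integers(0, 15, size=1024),
        )
    dataset = NpzPatchDataset(tmp)
    n_patches = len(dataset)
    points, labels = dataset[0]

print(n_patches, points.shape, labels.shape)
```

Because it implements `__len__` and `__getitem__`, an object like this can be passed to `torch.utils.data.DataLoader`, whose default collate function converts NumPy arrays to tensors.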

🎓 Feature Modes (LOD2 vs LOD3 vs Full)

LOD2 Mode (12 features) - Fast Training

Best for: Basic building classification, quick prototyping, baseline models

Features: XYZ, normal_z, planarity, linearity, height, verticality, RGB, NDVI

Performance: ~15s per 1M points (CPU), fast convergence
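
The count of 12 LOD2 features follows from the list above if XYZ and RGB each count as three columns. The sketch below makes that arithmetic explicit and derives a column slice per feature group. The column order is an assumption for illustration; the authoritative order is whatever `feature_names` reports in the output metadata.

```python
# Feature groups from the LOD2 list and the number of columns each occupies.
LOD2_GROUPS = {
    "xyz": 3,
    "normal_z": 1,
    "planarity": 1,
    "linearity": 1,
    "height": 1,
    "verticality": 1,
    "rgb": 3,
    "ndvi": 1,
}


def column_slices(groups):
    """Map each group name to the slice of columns it occupies."""
    slices, start = {}, 0
    for name, width in groups.items():
        slices[name] = slice(start, start + width)
        start += width
    return slices


total = sum(LOD2_GROUPS.values())
slices = column_slices(LOD2_GROUPS)
print(total, slices["rgb"])
```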

LOD3 Mode (38 features) - Detailed Modeling

Best for: Architectural modeling, fine structure detection, research

Additional Features: Complete normals (3), eigenvalues (5), curvature (2), shape descriptors (6), height features (3), building scores (3), density features (5), architectural features (4)

Performance: ~45s per 1M points (CPU), best accuracy

Full Mode (43+ features) - Complete Feature Set

Best for: Research, feature analysis, maximum information extraction

All Features: Everything from LOD3 plus additional height variants (z_absolute, z_from_ground, z_from_median), distance_to_center, local_roughness, horizontality

Performance: ~50s per 1M points (CPU), complete geometric description
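
The per-mode timings quoted in this section can be scaled into rough tile-level estimates. A back-of-the-envelope helper, assuming runtime grows linearly with point count (only per-million figures are given, so this is an approximation) and using an arbitrary example tile size:

```python
# Approximate CPU seconds per million points, as quoted for each mode.
SECONDS_PER_MILLION = {"lod2": 15.0, "lod3": 45.0, "full": 50.0}


def estimate_cpu_seconds(num_points, mode):
    """Linear estimate of CPU feature-computation time for one tile."""
    return SECONDS_PER_MILLION[mode] * num_points / 1_000_000


for mode in SECONDS_PER_MILLION:
    # 20 million points is an arbitrary example tile size.
    minutes = estimate_cpu_seconds(20_000_000, mode) / 60
    print(f"{mode}: ~{minutes:.1f} min")
```

Under these assumptions, moving from LOD2 to LOD3 roughly triples feature-computation time, while Full mode adds only about 10% on top of LOD3.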

Output Format:

  • NPZ/HDF5/PyTorch: Full feature matrix with all features
  • LAZ: All features as extra dimensions for GIS tools
  • Metadata: feature_names and num_features for tracking
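
Because the metadata records `feature_names`, a training script can select columns by name instead of hard-coding indices. A small sketch, assuming the feature matrix is an [N, F] array whose columns follow the order of `feature_names`; the function name and sample values are illustrative.

```python
import numpy as np


def select_features(matrix, feature_names, wanted):
    """Return the columns of `matrix` named in `wanted`, in that order."""
    index = {name: i for i, name in enumerate(feature_names)}
    missing = [name for name in wanted if name not in index]
    if missing:
        raise KeyError(f"features not present in this patch: {missing}")
    return matrix[:, [index[name] for name in wanted]]


feature_names = ["planarity", "linearity", "verticality", "height"]
features = np.arange(8, dtype=np.float32).reshape(2, 4)
subset = select_features(features, feature_names, ["height", "planarity"])
print(subset)
```

Failing early on a missing name is useful when patches produced in different feature modes end up in the same training directory.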

📖 See Feature Modes Documentation for complete details.


📦 Output Format

NPZ Structure

Each patch is saved as NPZ with:

{
    'points': np.ndarray,        # [N, 3] XYZ coordinates
    'normals': np.ndarray,       # [N, 3] surface normals
    'curvature': np.ndarray,     # [N] principal curvature
    'intensity': np.ndarray,     # [N] normalized intensity
    'planarity': np.ndarray,     # [N] planarity measure
    'verticality': np.ndarray,   # [N] verticality measure
    'density': np.ndarray,       # [N] local point density
    'labels': np.ndarray,        # [N] building class labels
    # Optional derived features:
    'wall_score': np.ndarray,    # [N] wall likelihood (planarity * verticality)
    'roof_score': np.ndarray,    # [N] roof likelihood (planarity * horizontality)
    # Optional with augmentation:
    'red': np.ndarray,           # [N] RGB red
    'green': np.ndarray,         # [N] RGB green
    'blue': np.ndarray,          # [N] RGB blue
    'infrared': np.ndarray,      # [N] NIR values
}
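
The comments above define `wall_score` and `roof_score` as products of per-point features. The sketch below reproduces those two products with NumPy. How the library derives verticality and horizontality is not stated here, so this example assumes `horizontality = |n_z|` and `verticality = 1 - |n_z|` for unit normals; treat those two definitions as placeholders.

```python
import numpy as np


def building_scores(normals, planarity):
    """Wall and roof likelihoods from unit normals [N, 3] and planarity [N]."""
    abs_nz = np.abs(np.asarray(normals, dtype=np.float64)[:, 2])
    horizontality = abs_nz      # assumption: 1 for a flat surface
    verticality = 1.0 - abs_nz  # assumption: 1 for an upright surface
    planarity = np.asarray(planarity, dtype=np.float64)
    return planarity * verticality, planarity * horizontality


normals = np.array([[1.0, 0.0, 0.0],   # facade facing +X
                    [0.0, 0.0, 1.0]])  # flat roof
planarity = np.array([0.9, 0.8])
wall_score, roof_score = building_scores(normals, planarity)
print(wall_score, roof_score)
```

Both scores stay low on non-planar surfaces such as vegetation, because planarity multiplies each of them.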

Available Formats

  • NPZ - Default NumPy format (recommended for ML)
  • HDF5 - Hierarchical data format
  • PyTorch - .pt files for PyTorch
  • LAZ - Point cloud format for visualization (may show boundary artifacts in tile mode)
  • Multi-format - Save in several formats at once, e.g. hdf5,laz or npz,torch
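
A multi-format value is a comma-separated list of format names. The following sketch shows how such a string could be validated before anything is written; the accepted names are taken from the examples in this list, and the function is illustrative, not the library's parser.

```python
KNOWN_FORMATS = ("npz", "hdf5", "torch", "laz")


def parse_formats(spec):
    """Split "hdf5,laz" style strings into a validated, de-duplicated list."""
    names = [part.strip().lower() for part in spec.split(",") if part.strip()]
    unknown = [name for name in names if name not in KNOWN_FORMATS]
    if unknown:
        raise ValueError(f"unknown output format(s): {unknown}")
    return list(dict.fromkeys(names))  # keeps first-seen order


print(parse_formats("hdf5, laz"))
print(parse_formats("npz,torch,npz"))
```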

💡 Tip: For machine learning applications, NPZ/HDF5/PyTorch patch formats provide cleaner geometric features than enriched LAZ tiles.


📚 Documentation

Quick Links

Examples & Workflows

  • examples/ - Python usage examples and configuration templates
  • examples/config_lod2_simplified_features.yaml - Fast LOD2 training (12 features)
  • examples/config_lod3_full_features.yaml - Detailed LOD3 modeling (38 features)
  • examples/config_complete.yaml - Full mode with all 43+ features
  • examples/config_multiscale_hybrid.yaml - Multi-scale adaptive features
  • PyTorch Integration
  • Parallel Processing

Architecture & API


🛠️ Development

# Clone and install in development mode
git clone https://github.com/sducournau/IGN_LIDAR_HD_DATASET
cd IGN_LIDAR_HD_DATASET
pip install -e ".[dev]"

# Run tests
pytest tests/

# Format code
black ign_lidar/

📋 Requirements

Core:

  • Python 3.8+
  • NumPy >= 1.21.0
  • laspy >= 2.3.0
  • scikit-learn >= 1.0.0

Optional GPU Acceleration:

  • CUDA >= 12.0
  • CuPy >= 12.0.0
  • RAPIDS cuML >= 24.10 (recommended)

📄 License

MIT License - see LICENSE file for details.


🤝 Support & Contributing


📝 Cite Me

If you use this library in your research or projects, please cite:

@software{ign_lidar_hd,
  author       = {Ducournau, Simon},
  title        = {IGN LiDAR HD Processing Library},
  year         = {2024},
  publisher    = {GitHub},
  url          = {https://github.com/sducournau/IGN_LIDAR_HD_DATASET},
  version      = {2.5.0}
}

Project maintained by: ImagoData


Made with ❤️ for the LiDAR and Machine Learning communities
