Skip to main content

A flexible and efficient semantic segmentation pipeline for processing images and videos

Project description

CitySeg: Urban Semantic Segmentation Pipeline

CitySeg is a flexible and efficient semantic segmentation pipeline for processing images and videos of urban environments. It supports multiple segmentation models and datasets, with capabilities for processing high-resolution inputs and comprehensive result analysis.

Features

  • Support for multiple segmentation models (OneFormer)
  • Compatible with various datasets (Cityscapes, ADE20k, Mapillary Vistas)
  • Flexible image resizing for processing high-resolution inputs
  • Comprehensive analysis of segmentation results, including category-wise statistics
  • Support for both image and video inputs
  • Multi-video processing capability for entire directories
  • Caching of processed segmentation maps in HDF5 format for quick re-analysis
  • Output includes segmentation maps, colored segmentations, overlay visualizations, and detailed CSV reports
  • Configurable logging with console and file outputs

Installation

You can install CitySeg using pip:

pip install cityseg

Alternatively, clone the repository and install the dependencies:

git clone https://github.com/your-username/cityseg.git
cd cityseg
pip install -e .

Dependencies

CitySeg requires the following main packages:

  • PyTorch
  • torchvision
  • transformers
  • opencv-python (cv2)
  • numpy
  • Pillow (PIL)
  • h5py
  • pandas
  • tqdm
  • pyyaml
  • loguru

For a complete list of dependencies, please refer to the pyproject.toml file.

Usage

  1. Prepare a configuration YAML file (e.g., config.yaml) with your desired settings.

  2. Run the pipeline using the following command:

    python -m cityseg.main --config path/to/your/config.yaml
    

    Optional arguments:

    • --log-level: Set the logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
    • --verbose: Enable verbose logging
  3. The pipeline will process the input and generate various outputs including segmentation maps, visualizations, and analysis reports.

Configuration

Here's an example of a basic configuration file:

input: path/to/your/input/file_or_directory
output_prefix: path/to/your/output/directory/output
model:
  name: shi-labs/oneformer_cityscapes_swin_large
  max_size: 1920  # Set to null to maintain original resolution
  device: cuda  # or cpu or mps
frame_step: 5  # For video processing, process every 5th frame
save_raw_segmentation: true
save_colored_segmentation: true
save_overlay: true
visualization:
  alpha: 0.5
  colormap: default

Models

CitySeg currently supports Mask2Former and BEIT models. The verified models include:

  • "facebook/mask2former-swin-large-cityscapes-semantic"
  • "facebook/mask2former-swin-large-mapillary-vistas-semantic"
  • "facebook/maskformer-swin-small-ade" (sort of, this often leads to segfaults. Recommend using disable_tqdm in the config.)
  • "microsoft/beit-large-finetuned-ade-640-640"
  • "nvidia/segformer-b5-finetuned-cityscapes-1024-1024"
  • "zoheb/mit-b5-finetuned-sidewalk-semantic"
  • "nickmuchi/segformer-b4-finetuned-segments-sidewalk"

Mask2Former are by far the most stable.

Some models which seem to load correctly but continually produce segfault errors on my machine are:

  • "facebook/maskformer-swin-large-ade"
  • "nvidia/segformer-b5-finetuned-ade-640-640"
  • "nvidia/segformer-b0-finetuned-cityscapes-1024-1024"
  • "zoheb/mit-b5-finetuned-sidewalk-semantic" (use model_type: segformer in the config)

Confirmed not to work due to issues with the Hugging Face pipeline:

  • "shi-labs/oneformer_ade20k_dinat_large"

Note on dinat models: The dinat backbone models require the natten package, which may have installation issues on some systems. These models are also significantly slower than the swin backbone models, especially when forced to run on CPU. However, they may produce better quality outputs in some cases.

Project Structure

The project is organized into several Python modules, each serving a specific purpose within the CitySeg pipeline:

  • main.py: Entry point of the application, responsible for initializing and running the segmentation pipeline.
  • config.py: Defines configuration classes and handles loading and validating configuration settings.
  • pipeline.py: Implements the core segmentation pipeline, including model loading and inference.
  • processors.py: Contains classes for processing images, videos, and directories, managing the segmentation workflow.
  • segmentation_analyzer.py: Provides functionality for analyzing segmentation results, including computing statistics and generating reports.
  • video_file_iterator.py: Implements an iterator for efficiently processing multiple video files in a directory.
  • visualization_handler.py: Handles the visualization of segmentation results using color palettes.
  • file_handler.py: Manages file operations related to saving and loading segmentation data and metadata.
  • utils.py: Provides utility functions for various tasks, including data handling and logging.
  • palettes.py: Defines color palettes for different datasets used in segmentation.
  • exceptions.py: Custom exception classes for error handling throughout the pipeline.

This modular structure allows for easy maintenance and extension of the CitySeg pipeline, facilitating the addition of new features and models.

Logging

The pipeline uses the loguru library for flexible and configurable logging. You can set the log level and enable verbose output using command-line arguments:

python main.py --config path/to/your/config.yaml --log-level INFO # or DEBUG, WARNING, ERROR, CRITICAL or --verbose

Logs are output to both the console and a file (segmentation.log). The file log is in JSON format for easy parsing and analysis.

Extending CitySeg

To add support for new segmentation models:

  1. Implement a new model class in pipeline.py
  2. Update the create_segmentation_pipeline function in pipeline.py
  3. Update the configuration handling in config.py if necessary

Contributing

Contributions to CitySeg are welcome! Please refer to the CONTRIBUTING.md file for guidelines on how to contribute to this project.

License

CitySeg is released under the BSD 3-Clause License. See the LICENSE file for details.

Contact

For support or inquiries, please open an issue on the GitHub repository or contact Andrew Mitchell.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cityseg-0.3.1rc0.tar.gz (118.4 kB view details)

Uploaded Source

Built Distribution

cityseg-0.3.1rc0-py3-none-any.whl (30.4 kB view details)

Uploaded Python 3

File details

Details for the file cityseg-0.3.1rc0.tar.gz.

File metadata

  • Download URL: cityseg-0.3.1rc0.tar.gz
  • Upload date:
  • Size: 118.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for cityseg-0.3.1rc0.tar.gz
Algorithm Hash digest
SHA256 a9cd34d72969429c3b54fe34d568221823ec361bdef3ded800aaa7fc2485625d
MD5 ded9b39370948e52db274d71f0e9668d
BLAKE2b-256 672473c541d98b0e29d468a31f9689d02ded694f0bf8bde0dab08b105683a271

See more details on using hashes here.

Provenance

The following attestation bundles were made for cityseg-0.3.1rc0.tar.gz:

Publisher: tag-release.yml on MitchellAcoustics/cityseg

Attestations:

File details

Details for the file cityseg-0.3.1rc0-py3-none-any.whl.

File metadata

  • Download URL: cityseg-0.3.1rc0-py3-none-any.whl
  • Upload date:
  • Size: 30.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for cityseg-0.3.1rc0-py3-none-any.whl
Algorithm Hash digest
SHA256 dfa889e0748ac3aa70b066a5c3898741ce6bcaa3981e0df3d0fda935d944e3d4
MD5 5f715a4dd594120c47c35bd2af9ded68
BLAKE2b-256 bb1fd187c41c4e20f69472d6517c2f399172597e47eb870ad00baa58993d604b

See more details on using hashes here.

Provenance

The following attestation bundles were made for cityseg-0.3.1rc0-py3-none-any.whl:

Publisher: tag-release.yml on MitchellAcoustics/cityseg

Attestations:

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page