Python library for water segmentation in high to moderate resolution remotely sensed imagery

OmniWaterMask

OmniWaterMask is a Python library for high-accuracy water segmentation in high to moderate resolution satellite imagery. It supports a wide range of resolutions, sensors, and processing levels.

Check out the paper here

Features

  • Processes imagery at resolutions from 0.2 m to 50 m
  • Works with imagery at any processing level
  • Only requires Red, Green, Blue and NIR bands
  • Known to work well with Sentinel-2, Landsat 8, PlanetScope, Maxar and NAIP

How it works

OmniWaterMask integrates a sensor-agnostic deep learning segmentation model with NDWI and vector datasets to detect water bodies within remote sensing products.
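
For reference, NDWI is conventionally computed (McFeeters' definition) as (Green - NIR) / (Green + NIR), with open water tending towards positive values. The sketch below shows that formula for a single pixel. The function name `ndwi` and the zero-denominator handling are illustrative assumptions and not part of OmniWaterMask's API; how OWM combines NDWI with the model output is not shown here.

```python
def ndwi(green: float, nir: float) -> float:
    """Normalised Difference Water Index for one pixel: (G - NIR) / (G + NIR)."""
    denominator = green + nir
    if denominator == 0:
        # Assumption for this sketch: treat an all-zero pixel as neutral
        return 0.0
    return (green - nir) / denominator


# Water reflects more green than NIR, so NDWI is positive
print(ndwi(green=300, nir=100))  # → 0.5
# Vegetation reflects strongly in NIR, so NDWI is negative
print(ndwi(green=200, nir=600))  # → -0.5
```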

Installation

To use OmniWaterMask, you need to install the package. It is recommended to use an environment manager such as conda or uv to avoid conflicts with other packages.

Install the package using pip

pip install omniwatermask

Install the package using uv

uv add omniwatermask

Create a new conda environment and install from conda-forge

conda create -n owm python=3.12
conda activate owm
conda install -c conda-forge omniwatermask

Install the package from source

pip install git+https://github.com/DPIRD-DMA/OmniWaterMask.git

Usage

To predict water masks for a list of scenes, pass a list of GeoTIFF files to the make_water_mask function along with the band order for the Red, Green, Blue and NIR bands. Predictions are saved to disk alongside the inputs as GeoTIFFs, and a list of prediction file paths is returned:

from pathlib import Path
from omniwatermask import make_water_mask

scene_paths = [Path("path/to/scene1.tif"), Path("path/to/scene2.tif")]

# Predict water masks for the scenes; a list of prediction paths is returned
water_mask_paths = make_water_mask(
    scene_paths=scene_paths,  # a single path or a list of paths
    band_order=[1, 2, 3, 4],  # band indices of Red, Green, Blue and NIR in the input images
)
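
A slightly fuller sketch collects every GeoTIFF in a folder and writes the predictions to a separate directory. `find_scenes` and `predict_folder` are hypothetical helpers written for this example. The `make_water_mask` keyword arguments used are the ones documented under Parameters, and the import sits inside the function so that the file-discovery helper can be used without the package installed.

```python
from pathlib import Path


def find_scenes(folder: Path, pattern: str = "*.tif") -> list[Path]:
    """Return the scenes in `folder` matching `pattern`, sorted by name."""
    return sorted(Path(folder).glob(pattern))


def predict_folder(folder: Path, output_dir: Path) -> list:
    """Hypothetical wrapper: run OWM on every scene found in `folder`."""
    from omniwatermask import make_water_mask  # requires the package

    scene_paths = find_scenes(folder)
    if not scene_paths:
        raise FileNotFoundError(f"No .tif scenes found in {folder}")

    output_dir.mkdir(parents=True, exist_ok=True)
    return make_water_mask(
        scene_paths=scene_paths,
        band_order=[1, 2, 3, 4],  # Red, Green, Blue, NIR
        output_dir=output_dir,  # write predictions here instead of beside the inputs
        overwrite=False,  # do not overwrite existing output files
    )
```

Calling `predict_folder(Path("path/to/scenes"), Path("path/to/predictions"))` would then return the list of prediction paths.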

Output

Output classes are:

  • 0 = Non-water
  • 1 = Water
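
As an illustration of working with these class values, the sketch below computes the water-covered area from a mask held in memory as rows of 0/1 integers. `water_area_m2` is a hypothetical helper that assumes square pixels. The plain nested-list input is an assumption to keep the example dependency-free; in practice the mask would be read from the output GeoTIFF with a raster library.

```python
def water_area_m2(mask: list[list[int]], pixel_size_m: float) -> float:
    """Area covered by class 1 (water) pixels, in square metres."""
    water_pixels = sum(value == 1 for row in mask for value in row)
    return water_pixels * pixel_size_m**2


mask = [
    [0, 0, 1],
    [0, 1, 1],
]
# Three water pixels at 10 m resolution
print(water_area_m2(mask, pixel_size_m=10.0))  # → 300.0
```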

Usage tips

  • OWM requires an active internet connection to function properly, as it needs to download OpenStreetMap (OSM) data.

  • Hardware acceleration is strongly recommended:

    • NVIDIA GPU
    • Apple Silicon Mac
    • Other PyTorch-compatible accelerators
  • Consider setting inference_dtype to "bf16" on compatible hardware; this typically speeds up processing.

  • If experiencing VRAM limitations even with batch_size=1, switching the 'mosaic_device' parameter to 'cpu' can help.

  • Improve accuracy by providing known water body locations as 'aux_vector_sources' - simply pass a list of file paths pointing to your water polygon datasets.

  • Reduce false positives by including vector data for common misidentification sources (buildings, roads) through the 'aux_negative_vector_sources' parameter.

  • When working with scenes containing no-data regions, explicitly set the 'no_data_value' parameter to ensure proper handling of these areas.
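
The tips above can be combined into a single set of keyword arguments. The sketch below builds such a dictionary for a memory-constrained GPU, intended to be unpacked into a `make_water_mask` call alongside `scene_paths` and `band_order`. `low_vram_kwargs` is a hypothetical helper and the file paths are placeholders. Passing `inference_dtype` as the string "bf16" follows the tip above; the documented default is `torch.float32`, so check which forms your installed version accepts.

```python
from pathlib import Path


def low_vram_kwargs(
    water_polygons: list[Path],
    false_positive_polygons: list[Path],
    no_data_value: int = 0,
) -> dict:
    """Collect the documented tuning parameters for a memory-constrained GPU."""
    return {
        "batch_size": 1,  # smallest inference batch
        "mosaic_device": "cpu",  # keep mosaic operations off the GPU
        "inference_dtype": "bf16",  # only on hardware that supports bf16
        "aux_vector_sources": water_polygons,
        "aux_negative_vector_sources": false_positive_polygons,
        "no_data_value": no_data_value,
    }


kwargs = low_vram_kwargs(
    water_polygons=[Path("path/to/known_water.gpkg")],  # placeholder path
    false_positive_polygons=[Path("path/to/buildings.gpkg")],  # placeholder path
)
print(sorted(kwargs))
```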

Cloudy imagery

If you are working with cloudy imagery, either:

  • use a temporal mosaic that is already cloud and cloud-shadow free (e.g. via s2mosaic for Sentinel-2), or
  • apply a high quality cloud and cloud shadow mask and set those pixels to 0 (the no_data_value) before running OWM.

This matters because OWM optimises its detection thresholds both locally (per region/patch) and globally (across the whole scene). Cloud and cloud-shadow pixels are out-of-distribution and can skew those optimisations, so bad data in one part of a scene can degrade the water prediction in other, otherwise-clean parts. Masking those pixels to no-data removes them from the optimisation entirely.

OmniCloudMask is a good choice for the masking step. See the cloudy Sentinel-2 example for an end-to-end mask-then-infer workflow.
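
The masking step amounts to overwriting every cloud or cloud-shadow pixel with the no-data value in all bands. The sketch below shows this on plain nested lists so that it runs without dependencies. `mask_clouds` is a hypothetical helper, and treating any non-zero cloud-mask value as cloud or shadow is an assumption about the mask's encoding that should be checked against the tool that produced it. With real imagery, the same logic would be applied to the raster arrays before writing the GeoTIFF passed to OWM.

```python
Band = list[list[int]]


def mask_clouds(
    bands: list[Band], cloud_mask: Band, no_data_value: int = 0
) -> list[Band]:
    """Set pixels flagged in `cloud_mask` to `no_data_value` in every band."""
    return [
        [
            [no_data_value if flag != 0 else value for value, flag in zip(row, flags)]
            for row, flags in zip(band, cloud_mask)
        ]
        for band in bands
    ]


red = [[120, 130], [900, 140]]
nir = [[300, 310], [950, 320]]
cloud_mask = [[0, 0], [1, 0]]  # assumed encoding: non-zero = cloud or shadow

masked_red, masked_nir = mask_clouds([red, nir], cloud_mask)
print(masked_red)  # → [[120, 130], [0, 140]]
print(masked_nir)  # → [[300, 310], [0, 320]]
```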

Parameters

  • scene_paths: List of paths or single path (supports both Path and string types) to the input satellite/aerial imagery

  • band_order: List of integers specifying the band order for input imagery (e.g., [1,2,3,4] if your input image is stored with band order red, green, blue then NIR data). This tells OWM which bands correspond to Red, Green, Blue, and Near-Infrared channels

  • batch_size: Number of patches processed simultaneously during inference. Defaults to 1; increase for better GPU utilization

  • version: Version identifier for the output files. Defaults to current OmniWaterMask version

  • output_dir: Optional path for output files. If not specified, outputs are saved alongside input files

  • mosaic_device: Device for mosaic operations ("cpu", "cuda" or "mps"). Defaults to system's default device

  • inference_device: Device for model inference ("cpu", "cuda" or "mps"). Defaults to system's default device

  • aux_vector_sources: List of paths to supplementary water body vector data to aid detection

  • aux_negative_vector_sources: List of paths to vector data marking areas commonly misidentified as water

  • inference_dtype: Data type for inference operations. Defaults to torch.float32

  • no_data_value: Value indicating no-data regions in the input imagery. Defaults to 0

  • inference_patch_size: Size of image patches for inference. Defaults to 1000 pixels

  • inference_overlap_size: Overlap between adjacent patches during inference. Defaults to 300 pixels

  • overwrite: Whether to overwrite existing output files. Defaults to True

  • use_cache: Whether to cache vector data processing results. Defaults to True

  • use_osm_building: Whether to use OpenStreetMap building data to reduce false positives. Defaults to True

  • use_osm_roads: Whether to use OpenStreetMap road data to reduce false positives. Defaults to True

  • cache_dir: Directory for storing cached vector data. Defaults to "OWM_cache" in current directory

  • destination_model_dir: Directory to save the model weights. Defaults to None

  • model_download_source: Source from which to download the model weights. Defaults to "hugging_face", can also be "google_drive".
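
To make inference_patch_size and inference_overlap_size concrete, the sketch below computes where overlapping patches would start along one image axis under a generic sliding-window scheme. It illustrates the idea only and is not OWM's actual tiling code; `patch_starts` and the choice to clamp the final patch to the image edge are assumptions.

```python
def patch_starts(length: int, patch_size: int = 1000, overlap: int = 300) -> list[int]:
    """Start offsets of overlapping patches covering `length` pixels."""
    if not 0 <= overlap < patch_size:
        raise ValueError("overlap must be non-negative and smaller than patch_size")
    if length <= patch_size:
        return [0]  # a single patch covers the whole axis

    stride = patch_size - overlap  # 700 pixels with the documented defaults
    starts = list(range(0, length - patch_size, stride))
    starts.append(length - patch_size)  # clamp the last patch to the image edge
    return starts


# A 2400-pixel axis is covered by three patches of 1000 pixels
print(patch_starts(2400))  # → [0, 700, 1400]
```

With these defaults each patch shares 300 pixels with its neighbour, so a larger overlap means more patches (and more inference work) for the same scene.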

Examples

Example notebooks are available in the examples/ directory.

Changelog

See CHANGELOG.md for a full list of changes across versions.

Contributing

Contributions are welcome! Please submit a pull request or open an issue to discuss any changes.

Development setup

Clone the repository and install the dependencies (including the dev group) with uv:

uv sync --all-extras --dev

Optionally install the git hooks (ruff lint/format on commit, mypy + the fast tests on push):

uv run pre-commit install
uv run pre-commit install --hook-type pre-push

Running the tests

Tests use pytest. The fast suite (unit tests + model-mocked pipeline tests) runs in a few seconds and is what CI runs by default:

uv run pytest                              # full fast suite
uv run pytest tests/test_orchestration.py  # one file
uv run pytest -k make_water_mask           # match by name

End-to-end tests that download the real model weights and run inference on real imagery are marked e2e and excluded by default (see addopts in pyproject.toml). To run them explicitly:

uv run pytest -m e2e                        # only the e2e/inference tests
uv run pytest -m ""                         # everything, including e2e

Lint, format and type-check:

uv run ruff check .
uv run ruff format .
uv run mypy omniwatermask/

For maintainers: pushing a version tag (e.g. git tag v0.4.4 && git push --tags) builds the package and publishes it to PyPI via GitHub Actions trusted publishing; no tokens are required.

License

This project is licensed under the MIT License.

Acknowledgements

Special thanks to the authors of the S1S2-Water and FLAIR #1 datasets for providing valuable training data.
