Easily apply foundational CV models (SAM, etc.) to detect and generate river masks from imagery.

RiverCV

RiverCV is a Python library for detecting and segmenting rivers in imagery using foundational computer-vision models. A single unified API works across SAM 1, SAM 2/2.1, YOLOv8-seg, and Florence 2, with built-in text-driven prompting via GroundingDINO.


Installation

Install the core package:

pip install rivercv

Then install the backend(s) you want to use:

pip install "rivercv[sam1]"          # SAM 1 (segment-anything)
pip install "rivercv[sam2]"          # SAM 2 / 2.1
pip install "rivercv[yolo]"          # YOLOv8-seg
pip install "rivercv[florence2]"     # Florence 2 (text-native)
pip install "rivercv[grounding-dino]" # GroundingDINO (auto-prompting for SAM 1/2)
pip install "rivercv[all]"           # everything

PyTorch note: SAM 2.1 requires torch >= 2.5.1, which rivercv installs by default. If you only need SAM 1 or YOLOv8 and want to keep an older torch, install that version first (pip install torch==2.x.x), then run pip install rivercv --no-deps. Because --no-deps skips every dependency, not just torch, install any other packages rivercv needs yourself.
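To check whether an existing torch install meets the 2.5.1 minimum before choosing a backend, a comparison along these lines may help. This is a minimal standard-library sketch: parse_release, meets_minimum, and installed_torch_version are hypothetical helper names, not part of rivercv, and the parser deliberately ignores build and pre-release tags such as +cu121 or rc1.

```python
import re
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Return the leading numeric release segment, e.g. '2.5.1+cu121' -> (2, 5, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str = "2.5.1") -> bool:
    """Compare release tuples, padding the shorter one with zeros."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_torch_version():
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None
```

If installed_torch_version() returns a string for which meets_minimum() is False, the SAM 1 or YOLOv8 route described above is the one that avoids upgrading torch.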


Quick Start

from rivercv.utils import load_image, save_mask
from rivercv import predict_mask

image = load_image("river.jpg")

# Point-prompt with SAM 2 (HuggingFace weights — no local download needed)
mask = predict_mask(
    "sam2", image,
    hf_model_id="facebook/sam2-hiera-large",
    points=[[512, 256]],   # (x, y) on the river
    point_labels=[1],      # 1 = foreground
    device="cuda",
)

save_mask(mask, "river_mask.png")

Usage

Spatial prompts — points and boxes

Every backend accepts the same three optional prompts. Pass at least one.

from rivercv import predict_mask

# --- Point prompt ---
mask = predict_mask(
    "sam1", image,
    checkpoint="sam_vit_h_4b8939.pth",
    points=[[512, 256]],   # list of [x, y]
    point_labels=[1],      # 1 = foreground, 0 = background
)

# --- Box prompt ---
mask = predict_mask(
    "sam2", image,
    hf_model_id="facebook/sam2-hiera-large",
    box=[100, 200, 800, 600],  # [x_min, y_min, x_max, y_max]
)

# --- Mixed (point + background exclusion) ---
mask = predict_mask(
    "yolov8", image,
    checkpoint="yolov8n-seg.pt",
    points=[[512, 256], [50, 50]],
    point_labels=[1, 0],   # river point, background point
)

All predict_mask() calls return a single np.ndarray of dtype bool, shape (H, W), where True = river pixel.
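Because the result is a plain boolean (H, W) array, ordinary NumPy operations apply to it directly. The sketch below uses a small synthetic array in place of a real predict_mask() result; river_fraction and mask_bounding_box are hypothetical helpers, not rivercv functions.

```python
import numpy as np


def river_fraction(mask: np.ndarray) -> float:
    """Fraction of pixels flagged as river in a boolean (H, W) mask."""
    if mask.dtype != np.bool_ or mask.ndim != 2:
        raise ValueError("expected a 2-D boolean mask")
    return float(mask.mean())


def mask_bounding_box(mask: np.ndarray):
    """Tight [x_min, y_min, x_max, y_max] around True pixels (inclusive indices).

    Returns None when the mask contains no True pixel.
    """
    rows = np.flatnonzero(mask.any(axis=1))  # row indices containing river
    cols = np.flatnonzero(mask.any(axis=0))  # column indices containing river
    if rows.size == 0:
        return None
    return [int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1])]


# Synthetic stand-in for a predict_mask() result: a 4x6 image with a 2x3 "river".
demo = np.zeros((4, 6), dtype=bool)
demo[1:3, 2:5] = True
```

Checking for an empty mask first (mask.any()) is a cheap way to spot images where the prompt missed the river entirely.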


Text prompts — auto-prompting

For SAM 1 and SAM 2, GroundingDINO converts a text description into a bounding box automatically. Florence 2 handles text natively — no extra model needed.

from rivercv import predict_from_text

# SAM 2 + GroundingDINO (GroundingDINO auto-detects the river, SAM segments it)
mask = predict_from_text(
    "sam2", image, "river",
    hf_model_id="facebook/sam2-hiera-large",
    device="cuda",
)

# SAM 1 + GroundingDINO with a custom box threshold
mask = predict_from_text(
    "sam1", image, "river",
    checkpoint="sam_vit_h_4b8939.pth",
    prompter_kwargs={"box_threshold": 0.35},
)

# Florence 2 — text-native, no GroundingDINO required
mask = predict_from_text(
    "florence2", image, "river",
    model_id="microsoft/Florence-2-large",
    device="cuda",
)

predict_from_text dispatches automatically:

  • Text-native (Florence 2, SAM 3) → text forwarded directly to the model.
  • Spatial (SAM 1, SAM 2, YOLOv8) → GroundingDINO runs first, the resulting box is forwarded to the segmentation model.

If GroundingDINO finds no detections above the threshold, a zero mask is returned rather than raising an exception.
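The dispatch and the zero-mask fallback described above can be sketched as follows. This is an illustration under stated assumptions, not rivercv's implementation: dispatch_text_prompt is a hypothetical name, segment and detect are stand-in callables for the segmentation model and GroundingDINO, the 0.35 default simply mirrors the example value above, and forwarding only the highest-scoring box is an assumption about how multiple detections are handled.

```python
import numpy as np

TEXT_NATIVE = {"florence2", "sam3"}      # text goes straight to the model
SPATIAL = {"sam1", "sam2", "yolov8"}     # text is first turned into a box


def dispatch_text_prompt(backend, image, text, segment, detect=None, box_threshold=0.35):
    """Route a text prompt to a text-native or a spatial backend.

    segment: callable(image, text=..., box=...) -> bool mask  (stand-in for the model)
    detect:  callable(image, text) -> list of (box, score)    (stand-in for GroundingDINO)
    """
    if backend in TEXT_NATIVE:
        return segment(image, text=text, box=None)
    if backend not in SPATIAL:
        raise ValueError(f"unknown backend: {backend!r}")

    # Keep only detections at or above the threshold.
    kept = [(box, score) for box, score in detect(image, text) if score >= box_threshold]
    if not kept:
        # Nothing detected: return an all-False mask instead of raising.
        return np.zeros(image.shape[:2], dtype=bool)

    # Assumption: forward the single highest-scoring box.
    best_box, _ = max(kept, key=lambda item: item[1])
    return segment(image, text=None, box=best_box)
```

Callers that need to distinguish "no river found" from "river found but tiny" can test the returned mask with mask.any().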


Reusable predictor — multiple images

create_predictor avoids reloading model weights for each image. The image encoder result is cached by object identity, so calling predict() twice with the same array object skips the image-encoder pass.

from rivercv.models import create_predictor

# Build once
pred = create_predictor(
    "sam2",
    hf_model_id="facebook/sam2-hiera-large",
    device="cuda",
)

# Reuse across many images
for image in image_stack:
    mask = pred.predict(image, points=[[512, 256]], point_labels=[1])
    process(mask)

# Release GPU memory when done (optional)
pred.close()

For text-native models:

pred = create_predictor("florence2", model_id="microsoft/Florence-2-large")
mask = pred.predict(image, text="river")
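Caching by object identity, as mentioned at the start of this section, can be illustrated with a small wrapper. This is a generic sketch of the technique rather than rivercv's code; IdentityCachedEncoder and its encode argument are hypothetical.

```python
_EMPTY = object()  # sentinel meaning "nothing cached yet"


class IdentityCachedEncoder:
    """Cache the last encoder output, keyed on the identity of the input object."""

    def __init__(self, encode):
        self._encode = encode            # expensive function: image -> embedding
        self._last_image = _EMPTY        # strong reference, so the identity stays valid
        self._last_embedding = None
        self.encoder_calls = 0           # counts real encoder runs, for illustration

    def __call__(self, image):
        # "is" compares identity, not contents: an equal copy is a cache miss.
        if image is not self._last_image:
            self._last_embedding = self._encode(image)
            self._last_image = image
            self.encoder_calls += 1
        return self._last_embedding

    def clear(self):
        """Drop the cached image and embedding."""
        self._last_image = _EMPTY
        self._last_embedding = None
```

Two consequences of this design are worth keeping in mind: a copy of the same image triggers a fresh encoder pass, and in this sketch an array modified in place would still hit the stale cache entry. When trying several prompts on one image, pass the same array object each time.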

River mask helper

generate_river_mask is a thin convenience wrapper around predict() that enforces the "at least one prompt" contract:

from rivercv.masks import generate_river_mask
from rivercv.models import create_predictor

pred = create_predictor("sam1", checkpoint="sam_vit_h_4b8939.pth")

mask = generate_river_mask(
    pred, image,
    point_coords=[[512, 256]],
    point_labels=[1],
)
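The "at least one prompt" contract can be pictured as a small validation step that runs before the model is called. The sketch below is an assumption about what such a check might look like, not the library's actual code; prompt_errors is a hypothetical name, and the label and box checks go beyond what is documented here.

```python
def prompt_errors(points=None, point_labels=None, box=None):
    """Return a list of problems with the spatial prompts; empty means valid."""
    errors = []
    if points is None and box is None:
        errors.append("pass at least one prompt (points or box)")

    if points is not None:
        if point_labels is None or len(point_labels) != len(points):
            errors.append("point_labels needs exactly one label per point")
        elif any(label not in (0, 1) for label in point_labels):
            errors.append("labels must be 1 (foreground) or 0 (background)")

    if box is not None:
        if len(box) != 4:
            errors.append("box must be [x_min, y_min, x_max, y_max]")
        else:
            x_min, y_min, x_max, y_max = box
            if x_min >= x_max or y_min >= y_max:
                errors.append("box needs x_min < x_max and y_min < y_max")
    return errors
```

Returning a list rather than raising on the first problem lets a caller report every issue at once; a wrapper could raise ValueError whenever the list is non-empty.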

Backends

| Name | Install extra | Weights source | Text-native |
|---|---|---|---|
| sam1 / sam | rivercv[sam1] | Manual download | No |
| sam2 / sam2.1 | rivercv[sam2] | Manual or HuggingFace | No |
| yolov8 / yolo | rivercv[yolo] | Auto-download or local | No |
| florence2 | rivercv[florence2] | HuggingFace | Yes |
| sam3 | Not yet released | | |

Accepted model name aliases — all case-insensitive:

sam, sam1, sam1.0
sam2, sam2.0, sam2.1
yolo, yolov8, yolov8-seg
florence2, florence-2, florence2-base, florence2-large
sam3, sam3.0
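Case-insensitive alias handling of this kind usually reduces to a lookup table built once. The sketch below uses the aliases listed above; canonical_backend is a hypothetical helper, and the choice of which alias counts as the canonical name for each group is an assumption made for illustration.

```python
# Canonical name (assumed) -> aliases listed in this README.
_ALIASES = {
    "sam1": ("sam", "sam1", "sam1.0"),
    "sam2": ("sam2", "sam2.0", "sam2.1"),
    "yolov8": ("yolo", "yolov8", "yolov8-seg"),
    "florence2": ("florence2", "florence-2", "florence2-base", "florence2-large"),
    "sam3": ("sam3", "sam3.0"),
}

# Flatten into alias -> canonical for O(1) lookups.
_LOOKUP = {alias: canonical for canonical, aliases in _ALIASES.items() for alias in aliases}


def canonical_backend(name: str) -> str:
    """Map any accepted alias, in any letter case, to its canonical backend name."""
    key = name.strip().lower()
    try:
        return _LOOKUP[key]
    except KeyError:
        known = ", ".join(sorted(_LOOKUP))
        raise ValueError(f"unknown backend {name!r}; expected one of: {known}") from None
```

Note that this sketch collapses sam2.1 into sam2 and drops the base/large hint from the Florence aliases; a real implementation may keep that information to select a variant.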

Model Weights

SAM 1

Download from Meta's SAM repository:

| Variant | File | model_type |
|---|---|---|
| ViT-H (best) | sam_vit_h_4b8939.pth | "vit_h" |
| ViT-L | sam_vit_l_0b3195.pth | "vit_l" |
| ViT-B (fastest) | sam_vit_b_01ec64.pth | "vit_b" |

pred = create_predictor("sam1", checkpoint="sam_vit_h_4b8939.pth", model_type="vit_h")

SAM 2 / 2.1

Option A — HuggingFace (recommended, no manual download):

pred = create_predictor("sam2", hf_model_id="facebook/sam2-hiera-large")
# also: facebook/sam2-hiera-small, facebook/sam2.1-hiera-large, etc.

Option B — local checkpoint (download from Meta's SAM 2 repository):

pred = create_predictor("sam2", checkpoint="sam2_hiera_large.pt")
# Config is inferred from the filename automatically.
# Supported stems: sam2_hiera_{tiny,small,base_plus,large}
#                  sam2.1_hiera_{tiny,small,base_plus,large}
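Inferring the variant from a checkpoint filename might work roughly as sketched below. This is an assumption about the matching rule, not the library's code: parse_sam2_checkpoint is a hypothetical helper, and it accepts only the exact stems listed above, so a renamed checkpoint would be rejected.

```python
import re
from pathlib import Path

# Matches the supported stems: sam2_hiera_* and sam2.1_hiera_*.
_STEM = re.compile(r"^(sam2(?:\.1)?)_hiera_(tiny|small|base_plus|large)$")


def parse_sam2_checkpoint(path):
    """Split a checkpoint filename into (family, size), e.g. ('sam2.1', 'large')."""
    stem = Path(path).stem  # drops the directory and the final ".pt" suffix
    match = _STEM.match(stem)
    if match is None:
        raise ValueError(f"cannot infer a SAM 2 config from {stem!r}")
    return match.group(1), match.group(2)
```

Keeping the original filename from Meta's release is therefore the safest way to rely on automatic config inference.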

YOLOv8-seg

Ultralytics auto-downloads weights on first use when passed a model name:

pred = create_predictor("yolov8", checkpoint="yolov8n-seg.pt")
# downloads to the ultralytics cache on first call

Or pass an absolute path to use a locally fine-tuned model.

Florence 2

Weights are downloaded from HuggingFace on first use and cached at ~/.cache/huggingface/hub/:

pred = create_predictor("florence2", model_id="microsoft/Florence-2-large")
# also: microsoft/Florence-2-base

GroundingDINO (auto-prompter)

HuggingFace mode (default) — requires rivercv[grounding-dino]:

from rivercv.prompts import create_prompter
prompter = create_prompter("grounding_dino")
# uses IDEA-Research/grounding-dino-tiny by default
prompter = create_prompter("grounding_dino", hf_model_id="IDEA-Research/grounding-dino-base")

Local mode — install GroundingDINO separately (pip install -e . from the cloned repo), then:

prompter = create_prompter(
    "grounding_dino",
    hf_model_id=None,
    checkpoint="groundingdino_swint_ogc.pth",
    config="GroundingDINO_SwinT_OGC.py",
)

Local GroundingDINO may conflict with SAM 2.1 if it was built against an older PyTorch version. Use HuggingFace mode when combining both.


Development

git clone https://github.com/DSHydro/RiverCV.git
cd RiverCV
pip install -e ".[dev]"
pytest

Running tests — all tests are mock-based and run without GPU or model weights:

pytest tests/ -v

Building for PyPI:

python -m build
twine check dist/*          # lint the package metadata
twine upload --repository testpypi dist/*   # test first
twine upload dist/*                         # then publish

License

MIT
