Easily apply foundational CV models (SAM, etc.) to detect and generate river masks from imagery.
RiverCV
RiverCV is a Python library for detecting and segmenting rivers in imagery using foundational computer-vision models. A single unified API works across SAM 1, SAM 2/2.1, YOLOv8-seg, and Florence 2, with built-in text-driven prompting via GroundingDINO.
Installation
Install the core package:
pip install rivercv
Then install the backend(s) you want to use:
pip install "rivercv[sam1]" # SAM 1 (segment-anything)
pip install "rivercv[sam2]" # SAM 2 / 2.1
pip install "rivercv[yolo]" # YOLOv8-seg
pip install "rivercv[florence2]" # Florence 2 (text-native)
pip install "rivercv[grounding-dino]" # GroundingDINO (auto-prompting for SAM 1/2)
pip install "rivercv[all]" # everything
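PyTorch note: SAM 2.1 requires torch >= 2.5.1, which is the version rivercv installs by default. If you only need SAM 1 or YOLOv8 and already have an older torch, pin it before installing: run pip install torch==2.x.x, then pip install rivercv --no-deps. Because --no-deps skips every dependency, not just torch, install rivercv's remaining requirements yourself afterwards.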
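Before choosing between the default install and the pinned route, it can help to check which torch is already in the environment. The following is a minimal sketch using only the standard library; the helper names (`parse_version`, `torch_supports_sam21`, `installed_torch_version`) are hypothetical and not part of rivercv.

```python
import re
from importlib import metadata


def parse_version(version):
    """Turn a string such as '2.5.1+cu121' into (2, 5, 1), ignoring build suffixes."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component (e.g. '2.6') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def torch_supports_sam21(version):
    """True if the given torch version meets the 2.5.1 minimum quoted above."""
    return parse_version(version) >= (2, 5, 1)


def installed_torch_version():
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


version = installed_torch_version()
if version is None:
    print("torch is not installed")
else:
    print(f"torch {version}: meets SAM 2.1 minimum = {torch_supports_sam21(version)}")
```

The check reads package metadata instead of importing torch, so it stays fast and works even when the installed build cannot be imported.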
Quick Start
from rivercv.utils import load_image, save_mask
from rivercv import predict_mask
image = load_image("river.jpg")
# Point-prompt with SAM 2 (HuggingFace weights — no local download needed)
mask = predict_mask(
"sam2", image,
hf_model_id="facebook/sam2-hiera-large",
points=[[512, 256]], # (x, y) on the river
point_labels=[1], # 1 = foreground
device="cuda",
)
save_mask(mask, "river_mask.png")
Usage
Spatial prompts — points and boxes
Every backend accepts the same three optional prompts. Pass at least one.
from rivercv import predict_mask
# --- Point prompt ---
mask = predict_mask(
"sam1", image,
checkpoint="sam_vit_h_4b8939.pth",
points=[[512, 256]], # list of [x, y]
point_labels=[1], # 1 = foreground, 0 = background
)
# --- Box prompt ---
mask = predict_mask(
"sam2", image,
hf_model_id="facebook/sam2-hiera-large",
box=[100, 200, 800, 600], # [x_min, y_min, x_max, y_max]
)
# --- Mixed (point + background exclusion) ---
mask = predict_mask(
"yolov8", image,
checkpoint="yolov8n-seg.pt",
points=[[512, 256], [50, 50]],
point_labels=[1, 0], # river point, background point
)
All predict_mask() calls return a single np.ndarray of dtype bool,
shape (H, W), where True = river pixel.
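Since the result is a plain boolean array, ordinary NumPy operations apply to it. The sketch below uses a small hand-built array as a stand-in for a `predict_mask()` result (no model is run) and shows three common follow-ups: measuring coverage, testing a prompt point, and converting to an 8-bit image.

```python
import numpy as np

# Stand-in for a predict_mask() result: a 4x6 boolean mask with a "river" band.
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, :] = True

# Fraction of pixels classified as river (True counts as 1, False as 0).
coverage = mask.mean()

# Prompts are given as (x, y), but NumPy indexes as [row, col], i.e. [y, x].
x, y = 2, 1
point_on_river = bool(mask[y, x])

# 8-bit version (0 or 255), convenient for tools that do not accept bool arrays.
mask_u8 = mask.astype(np.uint8) * 255

print(coverage, point_on_river, mask_u8.dtype)
```

The `[y, x]` index order is the usual source of off-by-axis bugs when checking a prompt point against the returned mask.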
Text prompts — auto-prompting
For SAM 1 and SAM 2, GroundingDINO converts a text description into a bounding box automatically. Florence 2 handles text natively — no extra model needed.
from rivercv import predict_from_text
# SAM 2 + GroundingDINO (GroundingDINO auto-detects the river, SAM segments it)
mask = predict_from_text(
"sam2", image, "river",
hf_model_id="facebook/sam2-hiera-large",
device="cuda",
)
# SAM 1 + GroundingDINO with a custom box threshold
mask = predict_from_text(
"sam1", image, "river",
checkpoint="sam_vit_h_4b8939.pth",
prompter_kwargs={"box_threshold": 0.35},
)
# Florence 2 — text-native, no GroundingDINO required
mask = predict_from_text(
"florence2", image, "river",
model_id="microsoft/Florence-2-large",
device="cuda",
)
predict_from_text dispatches automatically:
- Text-native (Florence 2, SAM 3) → text forwarded directly to the model.
- Spatial (SAM 1, SAM 2, YOLOv8) → GroundingDINO runs first, the resulting box is forwarded to the segmentation model.
If GroundingDINO finds no detections above the threshold, a zero mask is returned rather than raising an exception.
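The dispatch rules above can be sketched in a few lines of plain Python. This is an illustration of the described behaviour, not rivercv's actual code: the function name `route_text_prompt`, the detection dictionaries, the 0.35 default (borrowed from the example value above), and the choice of the highest-scoring box are all assumptions.

```python
# Backends that take text directly, per the list above.
TEXT_NATIVE = {"florence2", "sam3"}


def route_text_prompt(model_name, detections, box_threshold=0.35):
    """Decide how a text prompt would be handled.

    Returns ("text", None) for text-native models, ("box", box) when a
    detection passes the threshold, and ("empty", None) when none does,
    which corresponds to returning an all-False mask.
    """
    if model_name.lower() in TEXT_NATIVE:
        return ("text", None)

    # Spatial models: keep only detections at or above the threshold.
    kept = [d for d in detections if d["score"] >= box_threshold]
    if not kept:
        return ("empty", None)

    # Assumption: forward the single highest-scoring box.
    best = max(kept, key=lambda d: d["score"])
    return ("box", best["box"])


detections = [
    {"box": [0, 0, 10, 10], "score": 0.5},
    {"box": [5, 5, 20, 20], "score": 0.9},
]
print(route_text_prompt("sam2", detections))
print(route_text_prompt("florence2", detections))
```

Because a failed detection yields an empty mask instead of an exception, callers that need to tell "no river found" apart from a valid result can test the returned array with `mask.any()`.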
Reusable predictor — multiple images
create_predictor avoids reloading model weights for each image. The image
encoder result is cached by object identity, so calling predict() twice with
the same array skips the ViT pass.
from rivercv.models import create_predictor
# Build once
pred = create_predictor(
"sam2",
hf_model_id="facebook/sam2-hiera-large",
device="cuda",
)
# Reuse across many images
for image in image_stack:
    mask = pred.predict(image, points=[[512, 256]], point_labels=[1])
    process(mask)
# Release GPU memory when done (optional)
pred.close()
For text-native models:
pred = create_predictor("florence2", model_id="microsoft/Florence-2-large")
mask = pred.predict(image, text="river")
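To make "cached by object identity" concrete, here is a minimal sketch of an identity-keyed cache in plain Python. The class `IdentityCache` is hypothetical and only illustrates the idea; it is not how rivercv is necessarily implemented.

```python
class IdentityCache:
    """Remember the most recent input object and the result computed for it."""

    def __init__(self, compute):
        self._compute = compute
        self._last_input = None
        self._last_result = None
        self.calls = 0  # how many times compute() actually ran

    def get(self, obj):
        # `is` compares object identity, not contents.
        if self._last_input is not obj:
            self._last_result = self._compute(obj)
            self._last_input = obj
            self.calls += 1
        return self._last_result


# A cheap stand-in for the expensive image-encoder pass.
cache = IdentityCache(lambda image: sum(image))

image = [1, 2, 3]
cache.get(image)
cache.get(image)        # same object: served from the cache
cache.get(list(image))  # equal copy, but a different object: recomputed
print(cache.calls)
```

One consequence of identity-based caching in a scheme like this: a copy of the same image is encoded again, and an array modified in place would still hit the cache. If rivercv behaves the same way, passing a fresh array after in-place edits avoids stale results.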
River mask helper
generate_river_mask is a thin convenience wrapper around predict() that
enforces the "at least one prompt" contract:
from rivercv.masks import generate_river_mask
from rivercv.models import create_predictor
pred = create_predictor("sam1", checkpoint="sam_vit_h_4b8939.pth")
mask = generate_river_mask(
pred, image,
point_coords=[[512, 256]],
point_labels=[1],
)
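The "at least one prompt" contract amounts to a small validation step before the predictor is called. The sketch below shows one way such a check could look; `check_prompts` is a hypothetical helper, and the `box` parameter name is assumed from the spatial-prompt examples, not taken from `generate_river_mask` itself.

```python
def check_prompts(point_coords=None, point_labels=None, box=None):
    """Raise ValueError unless at least one well-formed prompt is supplied."""
    if point_coords is None and box is None:
        raise ValueError("provide at least one prompt: point_coords or box")

    if point_coords is not None:
        # Every point needs exactly one label.
        if point_labels is None or len(point_labels) != len(point_coords):
            raise ValueError("point_labels must contain one label per point")
        # 1 = foreground, 0 = background.
        if any(label not in (0, 1) for label in point_labels):
            raise ValueError("point labels must be 0 or 1")

    if box is not None and len(box) != 4:
        raise ValueError("box must be [x_min, y_min, x_max, y_max]")


# Valid: one foreground point with its label.
check_prompts(point_coords=[[512, 256]], point_labels=[1])

# Invalid: no prompt at all.
try:
    check_prompts()
except ValueError as err:
    print(f"rejected: {err}")
```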
Backends
| Name | Install extra | Weights source | Text-native |
|---|---|---|---|
| sam1 / sam | rivercv[sam1] | Manual download | No |
| sam2 / sam2.1 | rivercv[sam2] | Manual or HuggingFace | No |
| yolov8 / yolo | rivercv[yolo] | Auto-download or local | No |
| florence2 | rivercv[florence2] | HuggingFace | Yes |
| sam3 | n/a | Not yet released | n/a |
Accepted model name aliases — all case-insensitive:
sam, sam1, sam1.0
sam2, sam2.0, sam2.1
yolo, yolov8, yolov8-seg
florence2, florence-2, florence2-base, florence2-large
sam3, sam3.0
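Case-insensitive alias handling of this kind usually comes down to a lookup table keyed on the lowercased name. The sketch below builds one from the alias list above; the canonical names on the left and the function `canonical_backend` are illustrative choices, not rivercv internals.

```python
# Canonical name -> accepted aliases, copied from the list above.
ALIASES = {
    "sam1": ("sam", "sam1", "sam1.0"),
    "sam2": ("sam2", "sam2.0", "sam2.1"),
    "yolov8": ("yolo", "yolov8", "yolov8-seg"),
    "florence2": ("florence2", "florence-2", "florence2-base", "florence2-large"),
    "sam3": ("sam3", "sam3.0"),
}

# Flatten into alias -> canonical for constant-time lookup.
_LOOKUP = {alias: canonical for canonical, names in ALIASES.items() for alias in names}


def canonical_backend(name):
    """Map any accepted alias, in any letter case, to its canonical backend name."""
    key = name.strip().lower()
    try:
        return _LOOKUP[key]
    except KeyError:
        raise ValueError(
            f"unknown backend {name!r}; expected one of {sorted(_LOOKUP)}"
        ) from None


print(canonical_backend("SAM2.1"))
print(canonical_backend("Florence-2"))
```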
Model Weights
SAM 1
Download from Meta's SAM repository:
| Variant | File | model_type |
|---|---|---|
| ViT-H (best) | sam_vit_h_4b8939.pth | "vit_h" |
| ViT-L | sam_vit_l_0b3195.pth | "vit_l" |
| ViT-B (fastest) | sam_vit_b_01ec64.pth | "vit_b" |
pred = create_predictor("sam1", checkpoint="sam_vit_h_4b8939.pth", model_type="vit_h")
SAM 2 / 2.1
Option A — HuggingFace (recommended, no manual download):
pred = create_predictor("sam2", hf_model_id="facebook/sam2-hiera-large")
# also: facebook/sam2-hiera-small, facebook/sam2.1-hiera-large, etc.
Option B — local checkpoint (download from Meta's SAM 2 repository):
pred = create_predictor("sam2", checkpoint="sam2_hiera_large.pt")
# Config is inferred from the filename automatically.
# Supported stems: sam2_hiera_{tiny,small,base_plus,large}
# sam2.1_hiera_{tiny,small,base_plus,large}
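Inferring the configuration from a filename depends on the checkpoint keeping one of the supported stems. A minimal sketch of that kind of parsing follows; `parse_sam2_checkpoint` is a hypothetical helper that only extracts the model family and size, and it does not reproduce rivercv's actual mapping to config files.

```python
import re
from pathlib import Path

# Matches the supported stems listed above, e.g. "sam2.1_hiera_base_plus".
_STEM = re.compile(r"^(sam2(?:\.1)?)_hiera_(tiny|small|base_plus|large)$")


def parse_sam2_checkpoint(path):
    """Return (family, size) parsed from a checkpoint filename."""
    # Path.stem drops only the final suffix, so "sam2.1_hiera_large.pt"
    # becomes "sam2.1_hiera_large" with the version dot intact.
    stem = Path(path).stem
    match = _STEM.match(stem)
    if match is None:
        raise ValueError(f"cannot infer a SAM 2 config from {stem!r}")
    return match.group(1), match.group(2)


print(parse_sam2_checkpoint("sam2_hiera_large.pt"))
print(parse_sam2_checkpoint("weights/sam2.1_hiera_base_plus.pt"))
```

A renamed checkpoint (for example `river_finetune.pt`) would not match any supported stem, so keeping the original filename is the safer choice when relying on inference.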
YOLOv8-seg
Ultralytics auto-downloads weights on first use when passed a model name:
pred = create_predictor("yolov8", checkpoint="yolov8n-seg.pt")
# downloads to the ultralytics cache on first call
Or pass an absolute path to use a locally fine-tuned model.
Florence 2
Weights are downloaded from HuggingFace on first use and cached at
~/.cache/huggingface/hub/:
pred = create_predictor("florence2", model_id="microsoft/Florence-2-large")
# also: microsoft/Florence-2-base
GroundingDINO (auto-prompter)
HuggingFace mode (default) — requires rivercv[grounding-dino]:
from rivercv.prompts import create_prompter
prompter = create_prompter("grounding_dino")
# uses IDEA-Research/grounding-dino-tiny by default
prompter = create_prompter("grounding_dino", hf_model_id="IDEA-Research/grounding-dino-base")
Local mode: install GroundingDINO separately (pip install -e . from the cloned repo), then:
prompter = create_prompter(
"grounding_dino",
hf_model_id=None,
checkpoint="groundingdino_swint_ogc.pth",
config="GroundingDINO_SwinT_OGC.py",
)
Local GroundingDINO may conflict with SAM 2.1 if it was built against an older PyTorch version. Use HuggingFace mode when combining both.
Development
git clone https://github.com/DSHydro/RiverCV.git
cd RiverCV
pip install -e ".[dev]"
pytest
Running tests — all tests are mock-based and run without GPU or model weights:
pytest tests/ -v
Building for PyPI:
python -m build
twine check dist/* # lint the package metadata
twine upload --repository testpypi dist/* # test first
twine upload dist/* # then publish
License
MIT