VD3Storage
Content database library and CLI for VisData 3. Manages video and imageset assets, annotations, and worksets backed by MP4/JSON media and CSV-based metadata, with DVC-managed remote storage.
Installation
uv sync
To use as a dependency:
# pyproject.toml
[project]
dependencies = ["vd3"]
Quick Start
# Initialize a content database in the current directory
vd3 init
# ...or in a specific directory
vd3 init /path/to/mydb
# Add a video under a datasource
vd3 add video clip.mp4 -d my-datasource -p /path/to/mydb
# Add multiple videos with a glob (quote to prevent shell expansion)
vd3 add video '*.mp4' -d my-datasource -p /path/to/mydb
# List assets
vd3 list assets -p /path/to/mydb
# Show media availability
vd3 media status -p /path/to/mydb
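Quoting the glob passes the pattern to vd3 unexpanded, so the tool rather than the shell decides which files match. As a rough illustration of recursive glob matching in general, here is a sketch using Python's standard library. It is not VD3's own code, and the helper name `find_videos` is hypothetical.

```python
from pathlib import Path


def find_videos(root, pattern="**/*.mp4"):
    """Return files under root that match a glob pattern, sorted for stable output.

    Hypothetical helper for illustration; VD3's own matching may differ.
    """
    return sorted(p for p in Path(root).glob(pattern) if p.is_file())
```

With the default pattern, `**` also matches zero directories, so files directly under `root` are included alongside nested ones.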
Core Concepts
- Datasource: groups assets by origin (e.g. dashcam-2024, test-data). Required when importing.
- Asset: a single video (MP4 + JSON metadata) or imageset (directory of images).
- Workset: a named subset of assets, optionally organized into packages (folders). Independent of storage layout.
- Annotation layer: detections or tracks attached to an asset, with a key (e.g. gt, det/yolo-v8) and a human or machine source.
Adding Assets
Videos
# Single file
vd3 add video clip.mp4 -d dashcam
# Glob (recursive)
vd3 add video 'rawdata/**/*.mp4' -d dashcam
# Force re-import of a duplicate (matched by SHA-256)
vd3 add video clip.mp4 -d dashcam --force
# Add and assign to a workset/package
vd3 add video clip.mp4 -d dashcam -w my-workset -k batch1
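Duplicates are matched by SHA-256, as noted above. A minimal sketch of how such a content hash can be computed for a large media file follows. The function name `sha256_of_file` and the chunk size are assumptions for illustration, not VD3's internals.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large videos never load fully into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical bytes produce the same digest regardless of file name, which is why a renamed copy would still count as a duplicate under this kind of check.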
Imagesets
# Directory of images
vd3 add imageset /path/to/images -d my-datasource
# Tar archive
vd3 add imageset images.tar -d my-datasource
Annotation results
Import VD3 JSON detections/tracks into an existing asset:
vd3 add result results.json -a clip -p /path/to/mydb
COCO
Import COCO annotations into an existing imageset:
vd3 add coco annotations.json -a my-imageset \
--layer gt --source human --reviewed-all
Import a full COCO dataset (creates the imageset and imports annotations):
vd3 add coco-dataset annotations.json -d my-datasource \
--image-root /path/to/images --layer gt
Worksets
# Create
vd3 workset create "My Experiment"
# Add assets by name or ID
vd3 workset add-asset my-experiment clip-001 clip-002
# ...or by media-path glob (run from the database root; files must be on disk)
cd /path/to/mydb
vd3 workset add-asset my-experiment 'db/media/videos/fc/*.mp4'
# Inspect
vd3 workset list
vd3 workset show my-experiment
# Remove an asset / delete the workset
vd3 workset remove-asset my-experiment clip-001
vd3 workset delete my-experiment
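The examples above create a workset as "My Experiment" and then refer to it as my-experiment. The exact normalization rule is not stated in this README. One plausible reading, shown purely as an assumption, is a lowercase, hyphen-separated slug:

```python
import re


def slugify(name):
    """Lowercase a display name and join its alphanumeric runs with hyphens.

    Assumed rule for illustration only; VD3 may normalize names differently.
    """
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))
```

If in doubt, run `vd3 workset list` and use the name it reports rather than relying on this rule.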
Remote Storage
Media files are tracked by DVC. A content database has a single configured remote.
# Set the remote (replaces any existing one)
vd3 media remote set gs://my-bucket/vd3-data
vd3 media remote show
# Sync
vd3 media push
vd3 media pull
vd3 media status
Supported backends:
| Backend | URL form | Notes |
|---|---|---|
| Google Cloud Storage | gs://bucket/path | gcloud auth application-default login |
| Amazon S3 | s3://bucket/path | Standard AWS credential chain |
| Azure Blob Storage | azure://container/path | |
| Google Drive | gdrive://folder-id | via dvc-gdrive |
| Local / NAS | /mnt/nas/vd3-backup | |
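When a remote URL comes from configuration, it can help to check its scheme before passing it to `vd3 media remote set`. The sketch below maps the URL forms in the table above to backend names. The helper `backend_for` is hypothetical and is not part of VD3.

```python
from urllib.parse import urlparse

# Scheme-to-backend mapping taken from the table above.
BACKENDS = {
    "gs": "Google Cloud Storage",
    "s3": "Amazon S3",
    "azure": "Azure Blob Storage",
    "gdrive": "Google Drive",
    "": "Local / NAS",  # plain filesystem paths have no scheme
}


def backend_for(remote_url):
    """Name the storage backend implied by a remote URL's scheme."""
    scheme = urlparse(remote_url).scheme
    if scheme not in BACKENDS:
        raise ValueError(f"unsupported remote scheme: {scheme!r}")
    return BACKENDS[scheme]
```

This only inspects the URL form. It does not verify credentials or that the bucket or folder exists.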
Listing & Inspection
vd3 list assets # all assets (filterable)
vd3 list datasources # all datasources
vd3 list layers -a clip # annotation layers on an asset
vd3 show clip # asset details
vd3 info # database overview
vd3 query "SELECT ..." # raw DuckDB SQL against the CSV tables
Exporting
# Extract frames from a video or images from an imageset
vd3 export frames clip -o ./out
Library API
The CLI is a thin wrapper around VD3Storage, which is also usable directly.
from vd3storage import VD3Storage
# Open an existing database (or use VD3Storage.init(path) to create one)
storage = VD3Storage("/path/to/mydb")
# Browse assets
for a in storage.list_assets(datasource="dashcam"):
    print(f"{a.name} ({a.asset_type}): {a.frame_count} frames @ {a.nominal_fps} fps")
# Look up by (datasource, name) or by ID
clip = storage.get_asset("dashcam", "clip-001")
clip = storage.get_asset_by_id("3f1a...")
# Import a video
asset = storage.import_video("clip.mp4", datasource="dashcam")
# Resolve where the media file lives on disk
storage.resolve_media_path(clip)
# Annotation layers
storage.list_annotation_layers(clip.asset_id)
storage.read_annotation_layer(clip.asset_id, "gt")
# Worksets
ws = storage.create_workset("My Experiment")
storage.add_asset_to_workset(ws.workset_id, clip.asset_id, package="batch1")
storage.list_workset_assets(ws.workset_id)
# Raw DuckDB SQL against the underlying CSV tables
rows = storage.execute_sql("SELECT name, frame_count FROM assets WHERE asset_type = 'video'")
Other useful methods: import_imageset, import_coco, import_coco_dataset, import_result, export_coco, open_video, open_imageset, get_frame_image, add_tag, is_media_available, pull, push. Inspect help(VD3Storage) for the full surface.
CLI Reference
vd3 --help Top-level help
vd3 <command> --help Help for a specific command
| Command | Description |
|---|---|
| init | Initialize a content database (defaults to cwd) |
| info | Show database overview |
| show | Show asset details |
| query | Run raw DuckDB SQL against the CSV tables |
| remove | Delete an asset |
| add video | Import video files |
| add imageset | Import an imageset (directory or tar) |
| add result | Import VD3 JSON detections/tracks |
| add coco | Import COCO annotations into an existing imageset |
| add coco-dataset | Import a COCO dataset (imageset + annotations) |
| list assets | List assets |
| list datasources | List datasources |
| list layers | List annotation layers for an asset |
| workset create | Create a workset |
| workset list | List worksets |
| workset show | Show workset details |
| workset add-asset | Add assets to a workset |
| workset remove-asset | Remove an asset from a workset |
| workset delete | Delete a workset (assets are kept) |
| media status | Show media availability |
| media push | Push media to remote storage |
| media pull | Pull media from remote storage |
| media remote set | Set the remote storage URL |
| media remote show | Show the configured remote |
| export frames | Extract frames from a video or imageset |
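For scripted imports where the library API is not convenient, the CLI can also be driven from Python. The sketch below builds a `vd3 add video` invocation using only the flags that appear in the examples earlier in this README (-d, -p, -w, -k, --force). The function names are hypothetical.

```python
import subprocess


def add_video_command(pattern, datasource, db_path=None, workset=None,
                      package=None, force=False):
    """Build the argument list for `vd3 add video` from documented flags."""
    cmd = ["vd3", "add", "video", pattern, "-d", datasource]
    if db_path:
        cmd += ["-p", db_path]
    if workset:
        cmd += ["-w", workset]
    if package:
        cmd += ["-k", package]
    if force:
        cmd.append("--force")
    return cmd


def run_add_video(*args, **kwargs):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(add_video_command(*args, **kwargs), check=True)
```

Because the arguments are passed as a list without a shell, a glob such as 'rawdata/**/*.mp4' reaches vd3 unexpanded and needs no extra quoting.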