DataFlow-CV
Where Vibe Coding meets CV data. Convert & visualize datasets. Built with the flow of Claude Code.
A data processing library for computer vision datasets, focusing on format conversion and visualization between LabelMe, COCO, and YOLO formats. Provides both a CLI and Python API.
Project Structure
dataflow/
├── __init__.py  # Package exports and convenience functions
├── cli.py  # Command-line interface
├── config.py  # Configuration management
├── convert/  # Format conversion module
│   ├── __init__.py
│   ├── base.py  # Converter base class
│   ├── coco_to_yolo.py  # COCO to YOLO converter
│   └── yolo_to_coco.py  # YOLO to COCO converter
├── visualize/  # Annotation visualization module
│   ├── __init__.py
│   ├── base.py  # Visualizer base class
│   ├── yolo.py  # YOLO annotation visualizer
│   ├── coco.py  # COCO annotation visualizer
│   └── labelme.py  # LabelMe annotation visualizer
└── label/  # Label format handlers module
    ├── __init__.py
    ├── yolo.py  # YOLO format handler
    ├── coco.py  # COCO format handler
    └── labelme.py  # LabelMe format handler
tests/
├── __init__.py
├── convert/  # Conversion tests
│   ├── __init__.py
│   ├── test_coco_to_yolo.py
│   └── test_yolo_to_coco.py
├── visualize/  # Visualization tests
│   ├── __init__.py
│   ├── test_yolo.py
│   ├── test_coco.py
│   ├── test_labelme.py
│   └── test_generic.py  # Generic visualizer tests
└── run_tests.py  # Test runner
samples/
├── __init__.py
├── example_usage.py  # Quick usage demonstration
├── cli/  # CLI usage examples
│   ├── __init__.py
│   ├── convert/
│   │   ├── cli_coco_to_yolo.py
│   │   └── cli_yolo_to_coco.py
│   └── visualize/
│       ├── cli_yolo.py
│       ├── cli_coco.py
│       └── cli_labelme.py
└── api/  # Python API examples
    ├── __init__.py
    ├── convert/
    │   ├── api_coco_to_yolo.py
    │   └── api_yolo_to_coco.py
    └── visualize/
        ├── api_yolo.py
        ├── api_coco.py
        └── api_labelme.py
Requirements
Core Dependencies
- Python 3.8 or higher
- Linux environment (POSIX compatible, assumes POSIX paths)
- click >= 8.1.0: CLI framework
- numpy >= 2.0.0: numerical operations
- opencv-python >= 4.8.0: image processing (optional, used for some image operations)
- Pillow >= 10.0.0: image reading (optional, used for reading image dimensions)
Quick Start
Installation
# Regular installation from source
pip install .
# Editable installation (development mode)
# Due to setuptools compatibility, use python setup.py develop (not pip install -e .)
python setup.py develop
# After editable installation, use python -m dataflow.cli instead of the dataflow command
Command Line Usage
Global options: --verbose (-v) for progress output, --overwrite to replace existing files.
# COCO to YOLO conversion (use --segmentation for polygon annotations)
dataflow convert coco2yolo annotations.json output_dir/
dataflow convert coco2yolo annotations.json output_dir/ --segmentation
# YOLO to COCO conversion
dataflow convert yolo2coco images/ labels/ classes.names output.json
# Visualize YOLO annotations (use --save to export images)
dataflow visualize yolo images/ labels/ classes.names
dataflow visualize yolo images/ labels/ classes.names --save output_dir/
# Visualize COCO annotations (use --save to export images)
dataflow visualize coco images/ annotations.json
dataflow visualize coco images/ annotations.json --save output_dir/
# Visualize LabelMe annotations (use --save to export images)
dataflow visualize labelme images/ labels/
dataflow visualize labelme images/ labels/ --save output_dir/
# Show configuration
dataflow config
# Get help
dataflow --help
dataflow convert coco2yolo --help
dataflow visualize yolo --help
dataflow visualize labelme --help
See the CLI Reference below for detailed usage.
Python API Usage
import dataflow
# COCO to YOLO conversion (pass segmentation=True for polygon annotations)
result = dataflow.coco_to_yolo("annotations.json", "output_dir")
result = dataflow.coco_to_yolo("annotations.json", "output_dir", segmentation=True)
print(f"Processed {result['images_processed']} images")
# YOLO to COCO conversion
result = dataflow.yolo_to_coco("images/", "labels/", "classes.names", "output.json")
print(f"Generated {result['annotations_processed']} annotations")
# Visualize YOLO annotations (save_dir is optional)
result = dataflow.visualize_yolo("images/", "labels/", "classes.names")
result = dataflow.visualize_yolo("images/", "labels/", "classes.names", save_dir="output_dir/")
print(f"Visualized {result['images_processed']} images")
# Visualize COCO annotations (save_dir is optional)
result = dataflow.visualize_coco("images/", "annotations.json")
result = dataflow.visualize_coco("images/", "annotations.json", save_dir="output_dir/")
print(f"Visualized {result['images_processed']} images")
# Visualize LabelMe annotations (save_dir is optional)
result = dataflow.visualize_labelme("images/", "labels/")
result = dataflow.visualize_labelme("images/", "labels/", save_dir="output_dir/")
print(f"Visualized {result['images_processed']} images")
print(f"Classes found: {result['classes_found']}")
CLI Reference
The CLI follows a hierarchical structure: dataflow <main-task> <sub-task> [arguments]. Global options can be placed before the main task.
Global Options
- --verbose, -v: Enable verbose output (progress information)
- --overwrite: Overwrite existing files
Conversion Commands
COCO to YOLO
dataflow convert coco2yolo COCO_JSON_PATH OUTPUT_DIR [--segmentation]
- COCO_JSON_PATH: Path to COCO JSON annotation file
- OUTPUT_DIR: Directory where labels/ and class.names will be created
- --segmentation, -s: Handle segmentation annotations (polygon format)
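The core of a COCO to YOLO bounding-box conversion is a change of coordinate convention. The following is a minimal sketch, not the library's actual implementation: it assumes the standard COCO box layout [x_min, y_min, width, height] in pixels and the YOLO detection layout described in this README. The function names are hypothetical, and the mapping from COCO category IDs to YOLO class indices is not shown.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to
    YOLO (x_center, y_center, width, height) normalized by image size."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,
        (y_min + h / 2) / img_h,
        w / img_w,
        h / img_h,
    )


def yolo_detection_line(class_id, bbox, img_w, img_h):
    """Format one YOLO detection line: class_id x_center y_center width height."""
    values = coco_bbox_to_yolo(bbox, img_w, img_h)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# A 200x100 box at (100, 50) in a 400x200 image
print(yolo_detection_line(0, [100, 50, 200, 100], 400, 200))
```

The six-decimal formatting is an arbitrary choice for the sketch; the precision the library writes may differ.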
YOLO to COCO
dataflow convert yolo2coco IMAGE_DIR YOLO_LABELS_DIR YOLO_CLASS_PATH COCO_JSON_PATH
- IMAGE_DIR: Directory containing image files
- YOLO_LABELS_DIR: Directory containing YOLO label files (.txt)
- YOLO_CLASS_PATH: Path to YOLO class names file (e.g., class.names)
- COCO_JSON_PATH: Path to save COCO JSON file
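IMAGE_DIR is needed because YOLO coordinates are normalized, so the image dimensions are required to recover pixel values. The sketch below illustrates that step under stated assumptions: the function names are hypothetical, the annotation dictionary uses the standard COCO keys, and how the library assigns IDs is not known from this README.

```python
def yolo_bbox_to_coco(x_center, y_center, width, height, img_w, img_h):
    """Convert normalized YOLO box values to a COCO box
    [x_min, y_min, width, height] in pixels."""
    w = width * img_w
    h = height * img_h
    x_min = x_center * img_w - w / 2
    y_min = y_center * img_h - h / 2
    return [x_min, y_min, w, h]


def coco_annotation(ann_id, image_id, category_id, yolo_values, img_w, img_h):
    """Build one COCO-style annotation dict from four YOLO box values."""
    bbox = yolo_bbox_to_coco(*yolo_values, img_w, img_h)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": bbox,
        "area": bbox[2] * bbox[3],  # box area in square pixels
        "iscrowd": 0,
    }


ann = coco_annotation(1, 1, 0, (0.5, 0.5, 0.5, 0.5), 400, 200)
print(ann["bbox"])
```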
Visualization Commands
Visualize YOLO annotations
dataflow visualize yolo IMAGE_DIR LABEL_DIR CLASS_PATH [--save SAVE_DIR]
- IMAGE_DIR: Directory containing image files
- LABEL_DIR: Directory containing YOLO label files (.txt)
- CLASS_PATH: Path to class names file (e.g., class.names)
- --save SAVE_DIR: Optional directory to save visualized images
Visualize COCO annotations
dataflow visualize coco IMAGE_DIR ANNOTATION_JSON [--save SAVE_DIR]
- IMAGE_DIR: Directory containing image files
- ANNOTATION_JSON: Path to COCO JSON annotation file
- --save SAVE_DIR: Optional directory to save visualized images
Visualize LabelMe annotations
dataflow visualize labelme IMAGE_DIR LABEL_DIR [--save SAVE_DIR]
- IMAGE_DIR: Directory containing image files
- LABEL_DIR: Directory containing LabelMe JSON files
- --save SAVE_DIR: Optional directory to save visualized images
Configuration Command
dataflow config
Shows the current configuration (file extensions, default values, CLI context).
Getting Help
dataflow --help
dataflow convert --help
dataflow convert coco2yolo --help
dataflow convert yolo2coco --help
dataflow visualize --help
dataflow visualize yolo --help
dataflow visualize coco --help
dataflow visualize labelme --help
Segmentation Support
DataFlow-CV supports both bounding box and polygon segmentation annotations across all formats:
YOLO Segmentation Format
- Detection format: class_id x_center y_center width height (normalized coordinates)
- Segmentation format: class_id x1 y1 x2 y2 ... (polygon vertices, normalized)
- YOLO segmentation files have the same .txt extension as detection files
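Because both formats share the .txt extension, a reader has to tell them apart by the number of values on each line. The following sketch shows one way to do that; parse_yolo_line is a hypothetical helper, not a function exported by the library, and it assumes a line with exactly four values is a detection box while an even count of six or more is a polygon.

```python
def parse_yolo_line(line):
    """Return (kind, class_id, values) for one YOLO label line.

    kind is "detection" (values = [x_center, y_center, width, height])
    or "segmentation" (values = list of (x, y) vertex tuples).
    """
    parts = line.split()
    if not parts:
        raise ValueError("empty label line")
    class_id = int(parts[0])
    values = [float(p) for p in parts[1:]]
    if len(values) == 4:
        return "detection", class_id, values
    if len(values) >= 6 and len(values) % 2 == 0:
        # Pair up alternating x and y coordinates into vertices
        points = list(zip(values[0::2], values[1::2]))
        return "segmentation", class_id, points
    raise ValueError(f"unexpected number of values: {len(values)}")


print(parse_yolo_line("0 0.5 0.5 0.2 0.3"))
print(parse_yolo_line("1 0.1 0.1 0.9 0.1 0.5 0.9"))
```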
COCO Segmentation Format
- Polygon coordinates in segmentation field (list of [x1, y1, x2, y2, ...])
- Both single-polygon and multi-polygon annotations are supported
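As an illustration of how a COCO polygon list could map onto YOLO segmentation lines, here is a minimal sketch. It assumes each polygon in a multi-polygon annotation becomes its own output line and that polygons with fewer than three points are skipped; the library's actual handling may differ. The function name is hypothetical, and RLE-encoded masks are not covered.

```python
def coco_polygons_to_yolo(class_id, segmentation, img_w, img_h):
    """Turn a COCO polygon list [[x1, y1, x2, y2, ...], ...] in pixels
    into YOLO segmentation lines with normalized coordinates."""
    lines = []
    for polygon in segmentation:
        # Need an even number of values and at least 3 points
        if len(polygon) < 6 or len(polygon) % 2 != 0:
            continue
        coords = []
        for x, y in zip(polygon[0::2], polygon[1::2]):
            coords.append(f"{x / img_w:.6f}")
            coords.append(f"{y / img_h:.6f}")
        lines.append(f"{class_id} " + " ".join(coords))
    return lines


segmentation = [[0, 0, 100, 0, 100, 50], [10, 10, 20, 20]]
for line in coco_polygons_to_yolo(2, segmentation, 200, 100):
    print(line)
```

In this example only the first polygon produces a line, since the second has just two points.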
LabelMe Segmentation Format
- Rectangle shapes (shape_type: "rectangle") for bounding box annotations
- Polygon shapes (shape_type: "polygon") for segmentation annotations
- Each JSON file contains shapes array with annotation data
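To make the shapes array concrete, the sketch below reads it from a LabelMe JSON string. It relies on the label, points, and shape_type keys that LabelMe writes, and on rectangles being stored as two corner points. The function name is hypothetical, and treating a missing shape_type as a polygon is an assumption of this sketch rather than documented library behavior.

```python
import json


def read_labelme_shapes(json_text):
    """Return a list of (label, kind, geometry) tuples.

    For rectangles, geometry is (x_min, y_min, width, height) in pixels.
    For polygons, geometry is a list of (x, y) vertex tuples.
    Other shape types are ignored.
    """
    data = json.loads(json_text)
    result = []
    for shape in data.get("shapes", []):
        points = [tuple(p) for p in shape["points"]]
        shape_type = shape.get("shape_type", "polygon")  # assumed default
        if shape_type == "rectangle":
            (x1, y1), (x2, y2) = points  # two opposite corners
            box = (min(x1, x2), min(y1, y2), abs(x2 - x1), abs(y2 - y1))
            result.append((shape["label"], "rectangle", box))
        elif shape_type == "polygon":
            result.append((shape["label"], "polygon", points))
    return result


sample = json.dumps({
    "shapes": [
        {"label": "cat", "shape_type": "rectangle", "points": [[10, 20], [110, 70]]},
        {"label": "dog", "shape_type": "polygon", "points": [[0, 0], [10, 0], [5, 8]]},
    ]
})
print(read_labelme_shapes(sample))
```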
Usage Examples
# Convert COCO to YOLO with segmentation annotations
dataflow convert coco2yolo annotations.json output_dir/ --segmentation
# Visualize YOLO annotations in strict segmentation mode (only polygons)
dataflow visualize yolo images/ labels/ classes.names --segmentation
# Visualize COCO annotations in strict segmentation mode
dataflow visualize coco images/ annotations.json --segmentation
# Visualize LabelMe annotations in strict segmentation mode (only polygons)
dataflow visualize labelme images/ labels/ --segmentation
Python API
# Convert COCO to YOLO with segmentation
result = dataflow.coco_to_yolo("annotations.json", "output_dir", segmentation=True)
# Visualize in strict segmentation mode
result = dataflow.visualize_yolo("images/", "labels/", "classes.names", segmentation=True)
result = dataflow.visualize_labelme("images/", "labels/", segmentation=True)
Notes
- Without the --segmentation flag, both bounding boxes and polygons are processed automatically
- With the --segmentation flag, only valid polygon annotations are processed (strict mode)
- YOLO segmentation format requires at least 3 points (6 coordinates)
- COCO segmentation polygons are automatically converted to YOLO normalized coordinates
- LabelMe format supports both rectangle (shape_type: "rectangle") and polygon (shape_type: "polygon") shapes
- In segmentation mode, the LabelMe visualizer rejects rectangle shapes and only accepts polygon shapes
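The difference between the default and strict modes described in these notes can be sketched as a simple filter over LabelMe-style shapes. This is an illustration only: select_shapes is a hypothetical helper, and it assumes that "rejects" means rectangles are silently skipped in strict mode, whereas the library may instead report an error.

```python
def select_shapes(shapes, segmentation=False):
    """Filter LabelMe-style shape dicts by mode.

    Default mode keeps rectangles and valid polygons.
    Strict mode (segmentation=True) keeps only valid polygons.
    A polygon is treated as valid when it has at least 3 points.
    """
    selected = []
    for shape in shapes:
        kind = shape["shape_type"]
        if kind == "polygon" and len(shape["points"]) >= 3:
            selected.append(shape)
        elif kind == "rectangle" and not segmentation:
            selected.append(shape)
    return selected


shapes = [
    {"shape_type": "rectangle", "points": [[0, 0], [4, 4]]},
    {"shape_type": "polygon", "points": [[0, 0], [4, 0], [2, 3]]},
    {"shape_type": "polygon", "points": [[0, 0], [4, 0]]},  # too few points
]
print(len(select_shapes(shapes)))
print(len(select_shapes(shapes, segmentation=True)))
```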
Running Tests
# Run all tests
python tests/run_tests.py
# Run specific test
python tests/run_tests.py --test TestCocoToYoloConverter
# With verbose output
python tests/run_tests.py -v
Examples
Check the samples/ directory for detailed usage examples:
- samples/cli/convert/ - CLI conversion examples
- samples/cli/visualize/ - CLI visualization examples
- samples/api/convert/ - Python API conversion examples
- samples/api/visualize/ - Python API visualization examples
License
MIT License © 2026 zjykzj