A computer vision dataset processing library

DataFlow-CV

Where Vibe Coding meets CV data. 🌊 Convert & visualize datasets. Built with the flow of Claude Code.

Requires Python 3.8+. Runs on Linux, Windows, and macOS.

A computer vision dataset processing library that converts annotations between the LabelMe, COCO, and YOLO formats and visualizes them. It is designed for researchers and developers working with multi-format annotation pipelines.

Features

  • Bidirectional Conversion: Convert between LabelMe, COCO, and YOLO formats in any direction
  • Multi-format Support: Handle object detection bounding boxes and instance segmentation polygons
  • Lossless Round-trip: Preserve original coordinates through conversion chains
  • Visualization: Visualize annotations with OpenCV, supporting both display and save modes
  • Command-line Interface: User-friendly CLI with convert and visualize subcommands
  • Python API: Programmatic access for integration into larger pipelines
  • Verbose Logging: Detailed logging with file output for debugging
  • Cross-platform: Full support for Windows, Linux, and macOS

Installation

From PyPI

pip install dataflow-cv

From Source

# Clone the repository
git clone https://github.com/zjykzj/DataFlow-CV.git
cd DataFlow-CV

# Regular installation
pip install .

# Editable installation (for development)
pip install -e .

Note: When installed in editable mode, use python -m dataflow.cli instead of the dataflow-cv command.

Optional Dependencies

  • pycocotools: Required for COCO RLE segmentation support
    pip install pycocotools
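
For readers unfamiliar with the term, COCO's uncompressed RLE stores a binary mask as run lengths counted in column-major order, beginning with the run of zeros. The sketch below is a pure-Python illustration of that encoding only. It is not how DataFlow-CV or pycocotools implements it, and the function name mask_to_uncompressed_rle is hypothetical.

```python
from typing import Dict, List


def mask_to_uncompressed_rle(mask: List[List[int]]) -> Dict[str, list]:
    """Encode a binary mask (list of rows) as COCO-style uncompressed RLE.

    Hypothetical helper for illustration; not part of DataFlow-CV.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0

    counts: List[int] = []
    previous = 0  # runs always start with background (0)
    run = 0

    # COCO walks the mask column by column (column-major order).
    for col in range(width):
        for row in range(height):
            value = 1 if mask[row][col] else 0
            if value != previous:
                counts.append(run)  # close the current run
                run = 0
                previous = value
            run += 1
    counts.append(run)  # close the final run

    return {"size": [height, width], "counts": counts}
```

A mask whose first pixel is foreground yields a leading zero-length run, because the first count always refers to background pixels.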
    

Quick Start

Command-line Interface

Format Conversion

# YOLO to COCO
dataflow-cv convert yolo2coco images/ yolo_labels/ classes.txt coco_annotations.json

# With RLE encoding
dataflow-cv convert yolo2coco images/ yolo_labels/ classes.txt coco_annotations.json --do-rle

# YOLO to LabelMe
dataflow-cv convert yolo2labelme images/ yolo_labels/ classes.txt labelme_json/

# COCO to YOLO
dataflow-cv convert coco2yolo coco_annotations.json yolo_labels/

# COCO to LabelMe
dataflow-cv convert coco2labelme coco_annotations.json labelme_json/

# LabelMe to YOLO
dataflow-cv convert labelme2yolo labelme_json/ classes.txt yolo_labels/

# LabelMe to COCO
dataflow-cv convert labelme2coco labelme_json/ classes.txt coco_annotations.json

# With RLE encoding
dataflow-cv convert labelme2coco labelme_json/ classes.txt coco_annotations.json --do-rle

# Enable verbose logging
dataflow-cv convert yolo2coco images/ yolo_labels/ classes.txt coco_annotations.json --verbose
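
When the conversion has to run from a script, one option is to assemble the same command line in Python and hand it to subprocess. The sketch below assumes the dataflow-cv command is on PATH and uses only the yolo2coco arguments and the --do-rle and --verbose flags shown above. The function names build_yolo2coco_command and run_yolo2coco are hypothetical.

```python
import subprocess
from typing import List


def build_yolo2coco_command(
    image_dir: str,
    label_dir: str,
    class_file: str,
    output_json: str,
    do_rle: bool = False,
    verbose: bool = False,
) -> List[str]:
    """Build the argument list for `dataflow-cv convert yolo2coco`.

    Hypothetical helper; argument order mirrors the CLI examples above.
    """
    command = [
        "dataflow-cv", "convert", "yolo2coco",
        image_dir, label_dir, class_file, output_json,
    ]
    if do_rle:
        command.append("--do-rle")
    if verbose:
        command.append("--verbose")
    return command


def run_yolo2coco(*args, **kwargs) -> None:
    """Run the conversion; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_yolo2coco_command(*args, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.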

Visualization

# Visualize YOLO annotations
dataflow-cv visualize yolo images/ yolo_labels/ classes.txt --save visualized/

# Visualize COCO annotations
dataflow-cv visualize coco images/ coco_annotations.json --save visualized/

# Visualize LabelMe annotations
dataflow-cv visualize labelme images/ labelme_json/ --save visualized/

Python API

from dataflow.convert import YoloAndCocoConverter
from dataflow.visualize import YOLOVisualizer

# Convert YOLO to COCO
converter = YoloAndCocoConverter(source_to_target=True, verbose=True, strict_mode=True)
result = converter.convert(
    source_path="yolo_labels/",
    target_path="coco_annotations.json",
    class_file="classes.txt",
    image_dir="images/",
    do_rle=False  # Set to True for RLE encoding
)

# Visualize YOLO annotations
visualizer = YOLOVisualizer(
    label_dir="yolo_labels/",
    image_dir="images/",
    class_file="classes.txt",
    is_show=True,
    is_save=True,
    output_dir="visualized/",
    verbose=True,
    strict_mode=True
)
result = visualizer.visualize()

See the samples/ directory for complete examples:

  • samples/visualize/yolo_demo.py - YOLO visualization example
  • samples/visualize/labelme_demo.py - LabelMe visualization example
  • samples/visualize/coco_demo.py - COCO visualization example
  • samples/convert/ - Conversion examples

Documentation

  • CLAUDE.md: Detailed architecture and development guide
  • docs/formats/: Format specifications (YOLO, COCO, LabelMe)
  • docs/specs/: Module specifications and design documents
  • CHANGELOG.md: Version history and breaking changes

Key Concepts

  • Normalized Coordinates: All internal coordinates are normalized to the 0-1 range
  • Original Data Preservation: Lossless round-trip conversion through the OriginalData system
  • Strict Mode: Validation errors raise exceptions (enabled by default in the CLI; in the Python API, pass strict_mode=False to disable it)
  • Verbose Logging: Detailed debug logs saved to files when --verbose is used
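
To make the normalized-coordinate idea concrete, the sketch below converts a bounding box between the YOLO convention (center x, center y, width, height, all in the 0-1 range) and the COCO convention (top-left x, top-left y, width, height, in pixels). It illustrates the arithmetic only and is not DataFlow-CV's internal code. The function names are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def yolo_to_coco_bbox(cx: float, cy: float, w: float, h: float,
                      image_width: int, image_height: int) -> Box:
    """Normalized center-based box -> pixel box with top-left origin."""
    box_w = w * image_width
    box_h = h * image_height
    x_min = cx * image_width - box_w / 2
    y_min = cy * image_height - box_h / 2
    return x_min, y_min, box_w, box_h


def coco_to_yolo_bbox(x_min: float, y_min: float, box_w: float, box_h: float,
                      image_width: int, image_height: int) -> Box:
    """Pixel box with top-left origin -> normalized center-based box."""
    cx = (x_min + box_w / 2) / image_width
    cy = (y_min + box_h / 2) / image_height
    return cx, cy, box_w / image_width, box_h / image_height


# A box covering the central quarter of a 640x480 image.
print(yolo_to_coco_bbox(0.5, 0.5, 0.5, 0.5, 640, 480))
# → (160.0, 120.0, 320.0, 240.0)
```

Repeated conversions of this kind can accumulate floating-point rounding for arbitrary values, which is the problem that preserving the original data is meant to avoid.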

Development

For detailed developer guidance including advanced test commands, debugging, and architecture overview, see CLAUDE.md.

Testing

# Run all tests
pytest

# Run tests with coverage
pytest --cov=dataflow

# Run specific test module
pytest tests/convert/test_yolo_and_coco.py

Linting and Formatting

# Install development dependencies
pip install -e ".[dev]"  # quoted so shells such as zsh do not expand the brackets

# Format code
black dataflow tests samples

# Sort imports
isort dataflow tests samples

# Type checking
mypy dataflow

# Linting
flake8 dataflow tests samples

Project Structure

dataflow/
├── label/           # Annotation handlers (YOLO, LabelMe, COCO)
├── convert/         # Format converters
├── visualize/       # Visualization modules
├── util/            # Utilities (logging, file operations)
└── cli/             # Command-line interface
tests/               # Comprehensive test suite
samples/             # Usage examples
assets/              # Sample data for testing

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Before contributing, review CLAUDE.md for architecture and development patterns.

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add or update tests as needed
  5. Ensure code passes formatting and linting checks
  6. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Thanks to the creators of LabelMe, COCO, and YOLO formats for establishing these annotation standards
  • Built with OpenCV, NumPy, and Click
  • Inspired by the need for seamless format conversion in multi-tool CV pipelines
