
YOLO <-> COCO conversion tools with COCO dataset merging


YOLO-COCO-Converter

YOLO <-> COCO conversion tools with an extra COCO dataset merger. Use the unified CLI for conversions and merging, or import functions in notebooks.

Features

  • YOLO -> COCO: Build COCO JSON from YOLO labels and image sizes
  • Categories in COCO output include a supercategory field (defaults to the class name or a user override)
  • Optional COCO info metadata in outputs
  • COCO -> YOLO: Write YOLO .txt labels and classes.txt from COCO
  • Merge COCO: Merge multiple COCO datasets with id remapping, optional file-name prefixes, category alignment by name, and duplicate-filename dropping
  • Optional Pillow for image size detection; or provide a sizes CSV
  • Progress bars for conversions (CLI and notebooks) using tqdm

Installation

Install from PyPI for the CLI and library:

pip install yolococo

This provides the yolococo CLI (with coco-merge as an alias).

For development or testing against the latest code:

pip install -e ".[test]"

CLI Usage

Run as a module from the repo root (no install required):

python -m yolococo ...

Or after installing:

yolococo ...

Subcommands

  • YOLO -> COCO:

    # Optional flags: --classes, --sizes (overrides Pillow sizes),
    # --image-size (skip per-image size reads), --info (COCO info),
    # --supercategory (set all supercategories).
    # --bbox-round: decimals for bbox/area (use <0 to disable).
    # --file-name-mode: name | relative.
    yolococo yolo2coco \
      --images ./images \
      --labels ./labels \
      --classes ./classes.txt \
      --sizes ./sizes.csv \
      --image-size 1920 1080 \
      --bbox-round 2 \
      --file-name-mode name \
      --info '{"description":"my dataset"}' \
      --supercategory object \
      --out ./coco.json
    

    sizes.csv format (no header): filename,width,height.

  • COCO -> YOLO:

    yolococo coco2yolo \
      --coco ./instances.json \
      --out-labels ./yolo_labels \
      --out-classes ./classes.txt \
      [--keep-category-ids] \
      [--skip-empty-labels]
    
  • Merge COCO:

    yolococo merge \
      --inputs path/to/ds1.json path/to/ds2.json \
      --out merged.json \
      [--prefix-mode none|basename|custom] \
      [--custom-prefixes A_ B_] \
      [--align-by-name] \
      [--drop-duplicate-filenames]
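
To make "id remapping" and name-based category alignment concrete, here is a minimal sketch of what a merge can involve. It is not yolococo's implementation: the function name merge_coco is hypothetical, it works on already-loaded COCO dicts, and it assumes only the standard images / annotations / categories lists.

```python
def merge_coco(datasets):
    """Merge COCO dicts, aligning categories by name and renumbering ids.

    Illustrative only; not the yolococo merge implementation.
    """
    merged = {"images": [], "annotations": [], "categories": []}
    cat_id_by_name = {}  # category name -> id in the merged dataset

    for ds in datasets:
        # Map this dataset's category ids to merged ids, matching on name.
        cat_map = {}
        for cat in ds.get("categories", []):
            name = cat["name"]
            if name not in cat_id_by_name:
                new_id = len(cat_id_by_name) + 1
                cat_id_by_name[name] = new_id
                merged["categories"].append({**cat, "id": new_id})
            cat_map[cat["id"]] = cat_id_by_name[name]

        # Give every image a fresh id so ids from different inputs cannot collide.
        img_map = {}
        for img in ds.get("images", []):
            new_id = len(merged["images"]) + 1
            img_map[img["id"]] = new_id
            merged["images"].append({**img, "id": new_id})

        # Rewrite annotation ids and their image/category references.
        for ann in ds.get("annotations", []):
            merged["annotations"].append({
                **ann,
                "id": len(merged["annotations"]) + 1,
                "image_id": img_map[ann["image_id"]],
                "category_id": cat_map[ann["category_id"]],
            })
    return merged
```

Matching on name means two inputs that number "dog" differently still end up with a single "dog" category in the result.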
    

Programmatic Use (incl. Jupyter)

from pathlib import Path
import json
from yolococo import yolo_to_coco, coco_to_yolo_files, merge_datasets

# YOLO -> COCO
coco = yolo_to_coco(
    images_dir=Path("./images"),
    labels_dir=Path("./labels"),
    classes_path=Path("./classes.txt"),  # or None
    sizes_csv=None,  # or Path("./sizes.csv")
    image_size=(1920, 1080),  # optional uniform size
    info={"description": "my dataset"},  # optional COCO info
    supercategory="object",  # optional: set all supercategories
)
with open("coco.json", "w", encoding="utf-8") as f:
    json.dump(coco, f, ensure_ascii=False, indent=2)

# COCO -> YOLO (writes files)
coco_to_yolo_files(Path("./instances.json"), Path("./yolo_labels"), Path("./classes.txt"))

# Merge COCO
merged = merge_datasets([Path("a.json"), Path("b.json")], prefix_mode="basename")

Tip for notebooks: run from the repo root (so import yolococo works), or add the repo path to sys.path.
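
The sizes_csv argument (and the --sizes flag) expects a headerless filename,width,height file. The sketch below shows one way to produce and read back such a file with the standard library. The helper names write_sizes_csv and read_sizes_csv are hypothetical and not part of yolococo; the sizes themselves are assumed to be known already.

```python
import csv


def write_sizes_csv(sizes, out_path):
    """Write {filename: (width, height)} as filename,width,height rows, no header."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for filename, (width, height) in sorted(sizes.items()):
            writer.writerow([filename, int(width), int(height)])


def read_sizes_csv(path):
    """Read the same layout back into {filename: (width, height)}."""
    sizes = {}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if not row:
                continue  # tolerate blank lines
            filename, width, height = row
            sizes[filename] = (int(width), int(height))
    return sizes
```

For example, write_sizes_csv({"a.jpg": (1920, 1080)}, "sizes.csv") writes a file that can then be passed as Path("./sizes.csv").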

Notes & Assumptions

  • YOLO labels follow the common format: <cls> <xc> <yc> <w> <h> normalized to [0,1].
  • COCO expects absolute pixel bbox as [x_min, y_min, width, height].
  • Pillow is installed by default and used to read image sizes; --sizes CSV (if provided) takes precedence per matching filename.
  • Bounding boxes and area are rounded to --bbox-round decimals (default 2). Set a negative value to disable rounding.
  • Merge assumes consistent semantic classes across inputs; use --align-by-name if ids differ but names match.
  • --file-name-mode controls whether COCO images[].file_name stores just the basename (name) or the path relative to --images (relative). When using relative, directory separators are /.
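
The two bbox conventions in the notes above differ only by a change of origin and scale. As a minimal sketch (the function names are hypothetical and not part of yolococo; the rounding mirrors the described --bbox-round behavior but is an assumption about the details):

```python
def yolo_to_coco_bbox(xc, yc, w, h, img_w, img_h, ndigits=2):
    """Normalized YOLO (xc, yc, w, h) -> COCO [x_min, y_min, width, height] in pixels."""
    box_w = w * img_w
    box_h = h * img_h
    x_min = xc * img_w - box_w / 2  # center minus half the width
    y_min = yc * img_h - box_h / 2  # center minus half the height
    bbox = [x_min, y_min, box_w, box_h]
    if ndigits >= 0:  # a negative value disables rounding
        bbox = [round(v, ndigits) for v in bbox]
    return bbox


def coco_to_yolo_bbox(bbox, img_w, img_h):
    """COCO [x_min, y_min, width, height] in pixels -> normalized YOLO (xc, yc, w, h)."""
    x_min, y_min, box_w, box_h = bbox
    return (
        (x_min + box_w / 2) / img_w,
        (y_min + box_h / 2) / img_h,
        box_w / img_w,
        box_h / img_h,
    )
```

For a 1920x1080 image, yolo_to_coco_bbox(0.5, 0.5, 0.25, 0.5, 1920, 1080) gives [720.0, 270.0, 480.0, 540.0], and coco_to_yolo_bbox maps that box back to (0.5, 0.5, 0.25, 0.5).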

License

MIT - see LICENSE.

Testing & Visualization

  • Install dev deps: pip install -e ".[test]"
  • Run tests: pytest -q
  • Artifacts (COCO JSON and annotated images) are written to tests/_artifacts/ for manual inspection.

Manual visualization script:

python scripts/visualize_labels.py

It converts the sample in test/ to COCO and writes three files to tests/_artifacts/: the overlays sample_annotated_from_yolo.jpg and sample_annotated_from_coco.jpg, plus the original image.
