YOLO datasets, training queue, runs analytics — workspace-first CLI

Russian version: docs/ru/README.md

Smart Train (smartrain)

A CLI toolkit for preparing YOLO datasets, training models, running queues, and analyzing runs.

Quick start

Requirements: Python 3.10+.

git clone <repo-url>
cd smart-train
pip install -e .

Work from the project root (current directory):

smartrain deploy
smartrain scan
smartrain fusion --dataset ds_a --dataset ds_b --classes "class_a,class_b"
smartrain train --data 2026-01-01_12-00-00-merged -y

What's included

  • Single entry point: smartrain (module smartrain.cli).
  • Single-workspace model: raw_data/, datasets/, runs/, analytics/, models/, inference/, tmp/.
  • Pipeline support: scan -> fusion -> train -> analyze.
  • Additional tools: queue, registry, report, model, normalize-data-yaml, migrate-models, clearml-upload, plot, cvat, sahi, heatmap, orient.
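As a small illustration of the single-workspace layout listed above, the following sketch reports which of the expected top-level directories are missing under a given root. It is not part of smartrain: the helper name missing_workspace_dirs is hypothetical, and only the directory names come from the list above.

```python
from pathlib import Path

# Directory names copied from the workspace layout described in this README.
WORKSPACE_DIRS = (
    "raw_data",
    "datasets",
    "runs",
    "analytics",
    "models",
    "inference",
    "tmp",
)


def missing_workspace_dirs(root):
    """Return the expected workspace directories that do not exist under root.

    Hypothetical helper for illustration; smartrain itself may check the
    workspace differently.
    """
    root = Path(root)
    return [name for name in WORKSPACE_DIRS if not (root / name).is_dir()]
```

Calling missing_workspace_dirs(".") from the project root after smartrain deploy should, under these assumptions, return an empty list.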

How it works

smartrain uses a single workspace root and organizes its pipeline around file contracts, where each stage reads and writes files in the workspace:

  • scan synchronizes sources and updates the dataset catalog;
  • fusion generates the final dataset for training;
  • train creates a run directory with metrics and metadata;
  • analyze and registry work on artifacts in runs/.
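The stages above can be chained from a script. The sketch below builds the same command lines shown in the Quick start and runs them in order, stopping at the first failure. The function names build_pipeline and run_pipeline are hypothetical, and the name of the merged dataset is passed in by the caller because this README does not say how a script should discover it.

```python
import subprocess


def build_pipeline(datasets, classes, train_data):
    """Build argv lists for scan -> fusion -> train.

    Only flags that appear in the Quick start are used. train_data is the
    name of the dataset produced by fusion, supplied by the caller.
    """
    fusion = ["smartrain", "fusion"]
    for name in datasets:
        fusion += ["--dataset", name]
    fusion += ["--classes", ",".join(classes)]
    return [
        ["smartrain", "scan"],
        fusion,
        ["smartrain", "train", "--data", train_data, "-y"],
    ]


def run_pipeline(commands):
    """Run each command in order; raise CalledProcessError on the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

For example, run_pipeline(build_pipeline(["ds_a", "ds_b"], ["class_a", "class_b"], "2026-01-01_12-00-00-merged")) would reproduce the Quick start sequence, assuming smartrain is installed and the workspace has been deployed.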

Key commands

Command                                 Purpose
smartrain deploy                        Initialize the workspace structure
smartrain scan                          Synchronize sources and update the dataset catalog
smartrain fusion                        Build the final training dataset
smartrain train                         Train and validate YOLO models
smartrain inference                     Run inference on a folder or dataset split and save a JSON report
smartrain queue / smartrain queue-run   Manage and run the command queue
smartrain analyze                       Summaries, run comparison, PR curves, and inference benchmarks
smartrain registry                      Catalog run artifacts and promoted models

Documentation

Current documentation is organized into sections in docs/.

Testing

pip install -e ".[dev]"
pytest

Important details

  • Interactive mode starts only when a command is launched with zero arguments (TTY required).
  • Interactive dataset commands: fusion, augment, balance, stats, roi, orient, inference; plus train.
  • Dataset cleanup command: prune (prune empty for empty pairs, prune dedup for duplicate images by content).
  • If any arguments are provided but required ones are missing, commands return a clear "incomplete arguments" error instead of interactive prompts.
  • Command help now includes practical Examples / Quick examples blocks for common workflows.
  • smartrain balance presets:
    • --preset weights-safe for conservative balancing
    • --preset rfs-aggressive for stronger tail upsampling
    • --preset hybrid-default as a general default
  • smartrain balance eval splits: --eval-coverage is on by default (keeps val/test non-empty when possible and improves class coverage there); use --no-eval-coverage to disable. The interactive wizard asks for this option.
  • Exit codes for hash --validate: 0 for a match, 1 for a mismatch, 2 for an error.
  • By default, the workspace queue uses queue.txt and tmp/status.txt.
  • Dependency extras:
    • pip install -e ".[dev]" for development and testing
    • pip install -e ".[clearml]" for ClearML
    • pip install -e ".[sahi]" for SAHI
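The interactive-mode rules in the list above amount to a small decision table. The sketch below restates them in code for clarity; it is not smartrain's actual implementation. The function name choose_mode and the returned strings are hypothetical, and the behavior for zero arguments without a TTY is an assumption, since the list only says a TTY is required.

```python
def choose_mode(args, is_tty, missing_required):
    """Decide how a command should proceed.

    args: arguments given after the command name.
    is_tty: whether standard input is attached to a terminal.
    missing_required: names of required arguments that were not supplied.
    """
    if not args:
        if is_tty:
            # Zero arguments on a terminal: start the interactive wizard.
            return "interactive"
        # Assumption: without a TTY the command fails instead of prompting.
        return "error: interactive mode requires a TTY"
    if missing_required:
        # Some arguments given but required ones missing: no prompts.
        return "error: incomplete arguments: " + ", ".join(missing_required)
    return "run"
```

In a real script, is_tty would typically come from sys.stdin.isatty().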

Common workflows

Scan with an explicit source list:

smartrain scan --datasets-list /path/to/workspace/raw_data/datasets_list.txt

Check a dataset hash:

smartrain hash --dataset my_dataset
smartrain hash /path/to/dataset --validate a1b2c3d4
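When hash --validate is called from a script, the 0 / 1 / 2 values listed under Important details can be mapped to outcomes. The sketch below assumes those values are process exit codes; the helper names describe_validate_exit and validate_dataset_hash are hypothetical, and only the command line itself is taken from the example above.

```python
import subprocess

# Meanings copied from the "Important details" list; treating them as
# process exit codes is an assumption.
VALIDATE_OUTCOMES = {0: "match", 1: "mismatch", 2: "error"}


def describe_validate_exit(code):
    """Translate an exit code from `smartrain hash --validate` into a label."""
    return VALIDATE_OUTCOMES.get(code, f"unexpected exit code {code}")


def validate_dataset_hash(dataset_path, expected_hash):
    """Run the validation command and return a label for its result."""
    result = subprocess.run(
        ["smartrain", "hash", dataset_path, "--validate", expected_hash]
    )
    return describe_validate_exit(result.returncode)
```

Keeping the code-to-label mapping separate from the subprocess call makes the mapping easy to test without smartrain installed.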

Start a queue without opening a GUI terminal:

smartrain queue run --no-gui

Quick run overview:

smartrain analyze scan
smartrain analyze export-table -o runs_summary.csv
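The exported table can then be inspected with the standard library. This README does not document the columns of runs_summary.csv, so the sketch below only reports the header names and the number of rows, and it assumes a comma-separated file with a header row. The function name summarize_table is hypothetical.

```python
import csv


def summarize_table(lines):
    """Return (column_names, row_count) for CSV text.

    lines is any iterable of CSV lines, such as an open file object.
    Assumes the first line is a header row.
    """
    reader = csv.DictReader(lines)
    row_count = sum(1 for _ in reader)
    return list(reader.fieldnames or []), row_count
```

For example: open the file with open("runs_summary.csv", newline="", encoding="utf-8") and pass the file object to summarize_table.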
