datasety
CLI tool for dataset preparation — resize, caption, align, shuffle, synthetic editing, masking, degradation, character generation, LoRA training, audio TTS datasets, upload to HuggingFace, and multi-step workflows.
Installation
pip install datasety # core (resize, align, shuffle, degrade)
pip install datasety[caption] # + Florence-2 captioning
pip install datasety[synthetic] # + image editing (FLUX, Qwen, SDXL)
pip install datasety[mask] # + segmentation masks (SAM 3, CLIPSeg)
pip install datasety[filter] # + content filtering (CLIP, NudeNet)
pip install datasety[character] # + character dataset generation
pip install datasety[workflow] # + YAML workflow support
pip install datasety[train] # + LoRA training (FLUX, Qwen) & TTS (Piper)
pip install datasety[audio] # + TTS audio datasets (YouTube, VAD, Piper)
pip install datasety[upload] # + upload to HuggingFace Hub
pip install datasety[all] # everything
Commands
resize — Resize & Crop Images
Batch resize images to exact dimensions with configurable crop positions.
datasety resize --input ./raw --output ./resized --resolution 768x1024 --crop-position top
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Input directory | required* |
| `--output, -o` | Output directory | required* |
| `--input-image` | Single input image (alternative to dir mode) | |
| `--output-image` | Single output image (use with `--input-image`) | |
| `--resolution, -r` | Target resolution (WIDTHxHEIGHT) | |
| `--megapixel` | Target megapixel count (e.g., 0.5, 1.0) | |
| `--aspect-ratio` | Aspect ratio W:H (e.g., 1:1, 16:9) | |
| `--crop-position` | top, center, bottom, left, right | center |
| `--input-format` | Comma-separated input formats | jpg,jpeg,png,webp |
| `--output-format` | jpg, png, webp | jpg |
| `--output-name-numbers` | Rename output files to 1.jpg, 2.jpg, ... | off |
| `--upscale` | Upscale images smaller than target | off |
| `--min-resolution` | Skip images below this size (e.g., 256x256) | |
| `--workers` | Parallel workers for processing | 1 |
| `--recursive, -R` | Search input directory recursively | off |
| `--progress` | Show tqdm progress bar | off |
| `--dry-run` | Preview without modifying files | off |
# Single image
datasety resize --input-image photo.jpg --output-image resized.jpg -r 512x512
# Batch with sequential numbering
datasety resize -i ./photos -o ./dataset -r 1024x1024 --output-name-numbers --crop-position top
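The crop-position geometry can be sketched as follows. `crop_box` is an illustrative helper, not datasety's actual code; it assumes a cover-then-crop strategy with Pillow-style (left, upper, right, lower) boxes:

```python
def crop_box(src_w, src_h, dst_w, dst_h, position="center"):
    """Scale so the target fits fully inside the scaled source, then crop
    the excess on the axis chosen by `position` (illustrative sketch)."""
    scale = max(dst_w / src_w, dst_h / src_h)  # "cover" scaling
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    extra_w, extra_h = new_w - dst_w, new_h - dst_h  # pixels to crop away
    offsets = {
        "center": (extra_w // 2, extra_h // 2),
        "top": (extra_w // 2, 0),
        "bottom": (extra_w // 2, extra_h),
        "left": (0, extra_h // 2),
        "right": (extra_w, extra_h // 2),
    }
    left, upper = offsets[position]
    return (left, upper, left + dst_w, upper + dst_h)

# A landscape 1500x1000 photo cropped to portrait 768x1024 with --crop-position top:
box = crop_box(1500, 1000, 768, 1024, "top")  # (384, 0, 1152, 1024)
```

With `position="top"` the crop keeps the upper edge, which is why it is the usual choice for portrait datasets where faces sit near the top of the frame.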
caption — Generate Image Captions
Generate captions using Florence-2 (local) or OpenAI-compatible vision APIs.
datasety caption --input ./images --output ./captions --trigger-word "[trigger]"
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Input directory | required* |
| `--output, -o` | Output directory for .txt files | required* |
| `--input-image` | Single input image | |
| `--output-caption` | Single output .txt path | |
| `--device` | auto, cpu, cuda, mps | auto |
| `--trigger-word` | Text to prepend to each caption | |
| `--prompt` | Florence-2 task prompt | `<MORE_DETAILED_CAPTION>` |
| `--model` | HF model name or API model ID | |
| `--num-beams` | Beam search width (1 = greedy) | 3 |
| `--florence-2-base` | Use Florence-2-base (0.23B, faster) | default |
| `--florence-2-large` | Use Florence-2-large (0.77B, more accurate) | |
| `--llm-api` | Use OpenAI-compatible vision API | |
| `--max-tokens` | Max response tokens (API mode) | 300 |
| `--temperature` | Temperature (API mode) | 0.3 |
| `--skip-existing` | Skip images that already have a .txt file | off |
| `--append` | Append text to existing captions | |
| `--prepend` | Prepend text to existing captions | |
| `--recursive, -R` | Search input directory recursively | off |
| `--progress` | Show tqdm progress bar | off |
| `--dry-run` | Preview without processing | off |
# Florence-2 with trigger word
datasety caption -i ./dataset -o ./dataset --trigger-word "photo of sks person," --device cuda
# OpenAI vision API (supports OPENAI_MODEL env var)
datasety caption -i ./images -o ./captions --llm-api --model gpt-5-nano
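Captions are written as sidecar `.txt` files that share the image's file stem, with the trigger word prepended. A minimal sketch of that convention (the `write_caption` helper is illustrative, not datasety's internals):

```python
import tempfile
from pathlib import Path

def write_caption(image_path, caption, out_dir, trigger_word=""):
    """Write a sidecar .txt named after the image stem, optionally
    prefixed with a trigger word (sketch of the sidecar convention)."""
    text = f"{trigger_word} {caption}".strip() if trigger_word else caption
    out = Path(out_dir) / (Path(image_path).stem + ".txt")
    out.write_text(text, encoding="utf-8")
    return out

tmp = tempfile.mkdtemp()
p = write_caption("photos/001.jpg", "a person wearing a hat", tmp,
                  trigger_word="photo of sks person,")
# p is <tmp>/001.txt containing "photo of sks person, a person wearing a hat"
```

Trainers such as LoRA pipelines pick up captions by this stem-matching rule, which is why `--output` often points at the same directory as `--input`.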
align — Align Control/Target Pairs
Match dimensions, enforce multiples of 32, and unify formats for control/target training pairs. Includes a built-in web server for visual comparison with a compare slider, caption editing, and pair management.
datasety align --target ./target --control ./control --dry-run
Options
| Option | Description | Default |
|---|---|---|
| `--target, -t` | Target images directory | required |
| `--control, -c` | Control images directory | required |
| `--multiple-of` | Align dimensions to this multiple | 32 |
| `--output-format` | Convert all images: jpg, png, webp | keep original |
| `--recursive, -R` | Search input directories recursively | off |
| `--dry-run` | Preview changes without modifying files | off |
# Preview, then apply
datasety align -t ./target -c ./control --dry-run
datasety align -t ./target -c ./control --output-format jpg
Visual comparison: use `datasety server -i ./target --control ./control` to browse and compare aligned pairs in the browser.
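Enforcing "multiples of 32" presumably means rounding each dimension down to the nearest multiple so both sides of a pair share valid model-input sizes. The arithmetic can be sketched as (`aligned_size` is illustrative; datasety may crop or resize to reach this size):

```python
def aligned_size(w, h, multiple=32):
    """Round both dimensions down to the nearest multiple of `multiple`
    (a sketch of the constraint --multiple-of enforces)."""
    return (w - w % multiple, h - h % multiple)

print(aligned_size(1000, 750))  # (992, 736)
```

Latent-diffusion models downsample by fixed factors, so off-by-a-few pixel sizes fail or get silently padded; snapping both images of a pair to the same multiple avoids that.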
shuffle — Random Caption Generation
Generate random captions by picking one variant from each text group.
datasety shuffle -i ./images -o ./captions \
--group "A photo of a person.|Portrait of someone." \
--group "Remove the hat.|Take off the hat."
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Input directory containing images | required |
| `--output, -o` | Output directory for .txt files | required |
| `--group, -g` | Inline `\|`-separated list, .txt file, or URL | required |
| `--separator` | Separator between groups | " " |
| `--seed` | Random seed for reproducibility | |
| `--dry-run` | Preview captions without writing | off |
| `--show-distribution` | Show caption distribution after generation | off |
# Mix file, URL, and inline sources
datasety shuffle -i ./images -o ./captions \
--group subjects.txt \
--group "ending A|ending B" \
--seed 42 --show-distribution
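The sampling rule is: pick one variant from each group, then join the picks with the separator. A sketch of that logic with a seeded RNG (`shuffle_caption` is illustrative, not datasety's code):

```python
import random

def shuffle_caption(groups, separator=" ", seed=None):
    """Pick one pipe-separated variant from each group and join the
    picks (sketch of the shuffle command's sampling)."""
    rng = random.Random(seed)
    return separator.join(rng.choice(g.split("|")) for g in groups)

caption = shuffle_caption(
    ["A photo of a person.|Portrait of someone.",
     "Remove the hat.|Take off the hat."],
    seed=42,
)
```

With N groups of k variants each there are k^N possible captions, so even a few groups give enough variety to avoid caption repetition across a dataset.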
synthetic — Synthetic Image Editing
Generate synthetic variations using image editing models (FLUX.2-klein FP8, FLUX.2-klein-9b-kv, Qwen-Image-Edit-2511, SDXL, LongCat, HunyuanImage). The default model FLUX.2-klein-4b-fp8 requires no HuggingFace token and fits in ~5 GB VRAM.
datasety synthetic --input ./images --output ./synthetic --prompt "add a winter hat" --steps 4
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Input directory | required* |
| `--output, -o` | Output directory | required* |
| `--input-image` | Single input image | |
| `--output-image` | Single output image | |
| `--prompt, -p` | Edit instruction | required |
| `--model` | Model (auto-detects family or API model) | black-forest-labs/FLUX.2-klein-4b-fp8 |
| `--image-api` | Use OpenAI-compatible API for generation | off |
| `--api-aspect-ratio` | Aspect ratio for `--image-api` (e.g. 16:9, 9:16, 1:1) | auto |
| `--api-image-size` | Resolution for `--image-api`: 0.5K, 1K, 2K, 4K | 1K |
| `--weights` | Fine-tuned weights file | |
| `--lora` | LoRA adapter (repeatable, `:WEIGHT`) | |
| `--device` | auto, cpu, cuda, mps | auto |
| `--cpu-offload` | Force CPU offload | auto |
| `--steps` | Inference steps | 4 |
| `--cfg-scale` | Guidance scale | 2.5 |
| `--true-cfg-scale` | True CFG (Qwen only) | 4.0 |
| `--negative-prompt` | Negative prompt | " " |
| `--num-images` | Images per input | 1 |
| `--seed` | Random seed | |
| `--gguf` | GGUF path/URL for quantized loading | |
| `--strength` | Img2img strength (SDXL/FLUX.2, 0.0-1.0) | 0.7 |
| `--recursive, -R` | Search input directory recursively | off |
| `--output-format` | png, jpg, webp | png |
| `--skip-existing` | Skip images with existing output | off |
| `--batch-size` | Flush GPU memory every N images | 0 (off) |
| `--progress` | Show tqdm progress bar | off |
| `--dry-run` | Preview without loading models | off |
# Single image edit
datasety synthetic --input-image photo.jpg --output-image edited.png \
--prompt "add sunglasses" --steps 4
# Cloud API — FLUX.2-flex (no GPU needed)
OPENAI_API_KEY=sk-... OPENAI_BASE_URL=https://openrouter.ai/api/v1 \
datasety synthetic -i ./images -o ./synthetic \
--prompt "add a winter hat" --image-api --model black-forest-labs/flux.2-flex \
--api-aspect-ratio 1:1
# Cloud API — Gemini 2.5 Flash (text+image, supports image-to-image)
OPENAI_API_KEY=sk-... OPENAI_BASE_URL=https://openrouter.ai/api/v1 \
datasety synthetic -i ./images -o ./synthetic \
--prompt "transform into oil painting style" \
--model google/gemini-2.5-flash-image --image-api \
--api-aspect-ratio 3:4 --api-image-size 2K
# FLUX.2-klein-9b-kv (KV-cache, faster multi-reference, ~29 GB VRAM)
datasety synthetic -i ./images -o ./synthetic \
--model "black-forest-labs/FLUX.2-klein-9b-kv" \
--prompt "add sunglasses" --steps 4
# Qwen-Image-Edit-2511 with LoRA
datasety synthetic -i ./dataset -o ./synthetic \
--model "Qwen/Qwen-Image-Edit-2511" \
--lora "adapter.safetensors:0.8" \
--prompt "add a red scarf" --steps 40
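The `--batch-size` option flushes GPU memory every N processed images, which keeps long runs from accumulating allocator fragmentation. The bookkeeping can be sketched like this, with a stand-in `flush` callback where the real pipeline would clear the CUDA cache (illustrative, not datasety's code):

```python
def process_all(images, batch_size, edit, flush):
    """Run `edit` on each image, calling `flush` after every
    `batch_size` images; 0 disables flushing (sketch of the
    --batch-size bookkeeping)."""
    outputs = []
    for i, img in enumerate(images, start=1):
        outputs.append(edit(img))
        if batch_size and i % batch_size == 0:
            flush()  # e.g. torch.cuda.empty_cache() in a real pipeline
    return outputs

flushes = []
process_all(range(7), 3, lambda x: x, lambda: flushes.append(1))
print(len(flushes))  # flushed after images 3 and 6
```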
mask — Text-Prompted Segmentation Masks
Generate binary masks from images using text keywords. Supports SAM 3, SAM 2, and CLIPSeg.
datasety mask --input ./dataset --output ./masks --keywords "face,hair" --device cuda
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Input directory | required* |
| `--output, -o` | Output directory for masks | required* |
| `--input-image` | Single input image | |
| `--output-image` | Single output mask | |
| `--keywords, -k` | Comma-separated keywords | required |
| `--model` | sam3, sam2, clipseg | sam3 |
| `--device` | auto, cpu, cuda, mps | auto |
| `--threshold` | Confidence threshold (0.0-1.0) | 0.3 |
| `--padding` | Pixels to expand mask (dilation) | 0 |
| `--blur` | Gaussian blur radius for edges | 0 |
| `--invert` | Invert mask colors | off |
| `--naming` | folder or suffix (`_mask`) | folder |
| `--output-format` | png, jpg, webp | png |
| `--skip-existing` | Skip images with existing masks | off |
| `--dry-run` | Preview detections without saving | off |
| `--recursive, -R` | Search input directory recursively | off |
| `--progress` | Show tqdm progress bar | off |
# CLIPSeg (lightweight, no extra deps)
datasety mask -i ./dataset -o ./masks -k "face" --model clipseg --threshold 0.5
# SAM 2 with mask refinement
datasety mask -i ./dataset -o ./masks -k "hat,glasses" --model sam2 --padding 5 --blur 3
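The `--padding`, `--blur`, and `--invert` post-processing steps map to standard binary-mask operations. A plausible Pillow sketch (datasety's actual implementation may differ; `postprocess_mask` is illustrative):

```python
from PIL import Image, ImageFilter, ImageOps

def postprocess_mask(mask, padding=0, blur=0, invert=False):
    """Dilate a grayscale mask by `padding` pixels (MaxFilter), feather
    its edges with Gaussian blur, then optionally invert it
    (illustrative sketch of --padding/--blur/--invert)."""
    if padding:
        # MaxFilter takes an odd kernel size; 2*padding + 1 grows the
        # white region by `padding` pixels in every direction.
        mask = mask.filter(ImageFilter.MaxFilter(2 * padding + 1))
    if blur:
        mask = mask.filter(ImageFilter.GaussianBlur(blur))
    if invert:
        mask = ImageOps.invert(mask)
    return mask

# A single white pixel dilated by 2 becomes a 5x5 white square.
m = Image.new("L", (9, 9), 0)
m.putpixel((4, 4), 255)
out = postprocess_mask(m, padding=2)
```

Padding is useful when a segmentation model under-covers an object (e.g. hair wisps), and a small blur prevents hard mask edges from leaving seams in inpainting.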
filter — Filter Dataset by Content
Filter, curate, or clean datasets based on image content. Use CLIP for arbitrary text queries or NudeNet for NSFW label detection.
datasety filter --input ./dataset --output ./rejected --query "leg,male face" --action move
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Input directory | required |
| `--output, -o` | Output directory for matched/rejected images | |
| `--query, -q` | Comma-separated text queries (CLIP) | |
| `--labels, -l` | Comma-separated NudeNet labels | |
| `--model` | clip, nudenet | clip |
| `--action` | move, copy, delete, keep | move |
| `--threshold` | Confidence threshold (0.0-1.0) | 0.5 |
| `--device` | auto, cpu, cuda, mps | auto |
| `--confirm` | Required for destructive actions (delete, keep) | off |
| `--preserve-structure` | Keep subfolder hierarchy in output (with `--recursive`) | off |
| `--invert` | Invert match logic (act on non-matches) | off |
| `--log` | Write CSV log of all decisions to this path | |
| `--dry-run` | Preview detections without modifying files | off |
| `--recursive, -R` | Search input directory recursively | off |
| `--progress` | Show tqdm progress bar | off |
# Move images containing legs or male faces to a reject folder
datasety filter -i ./dataset -o ./rejected --query "leg,male face" --action move
# Delete NSFW images using NudeNet labels
datasety filter -i ./dataset --labels "FEMALE_BREAST_EXPOSED,MALE_GENITALIA_EXPOSED" \
--action delete --model nudenet --threshold 0.6 --confirm
# Keep only images with "hat and socks", move the rest out
datasety filter -i ./dataset -o ./rejected --query "hat and socks" --action keep
# Dry-run to preview what would be filtered
datasety filter -i ./dataset --query "blurry,low quality" --action delete --dry-run -R
# Write a decision log for review
datasety filter -i ./dataset -o ./rejected --query "outdoor" --action copy --log filter_log.csv
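The match decision combines per-query scores with `--threshold` and `--invert`. A minimal sketch of that logic (illustrative; the real command also handles NudeNet labels and per-action behavior):

```python
def decide(scores, threshold=0.5, invert=False):
    """Return True if the image should be acted on: it matches when any
    query's score reaches the threshold, and --invert flips the result
    (sketch of the filter decision, not datasety internals)."""
    matched = max(scores.values(), default=0.0) >= threshold
    return matched != invert  # XOR: invert acts on non-matches

# An image scoring 0.7 on "leg" is acted on at the default threshold:
print(decide({"leg": 0.7, "male face": 0.2}))  # True
```

Note the asymmetry between actions: `move`/`copy`/`delete` act on matches, while `keep` retains matches and moves everything else out, which is equivalent to inverting the decision.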
inspect — Dataset Statistics
Scan a dataset directory and report image count, resolution distribution, format breakdown, file sizes, caption coverage, and optionally detect duplicate images via perceptual hashing.
datasety inspect --input ./dataset --duplicates
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Input directory | required |
| `--duplicates` | Detect duplicate/near-duplicate images | off |
| `--json` | Export report as JSON to this path | |
| `--csv` | Export per-image data as CSV to this path | |
| `--recursive, -R` | Search input directory recursively | off |
# Full report with duplicate detection
datasety inspect -i ./dataset --duplicates
# Export report to JSON
datasety inspect -i ./dataset --json report.json
# Export per-image data to CSV
datasety inspect -i ./dataset --csv images.csv -R
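Perceptual hashing reduces each image to a small fingerprint so near-duplicates land at a small Hamming distance. One common scheme is the 64-bit average hash, sketched here with Pillow (datasety's exact hashing algorithm isn't documented in this README):

```python
from PIL import Image

def average_hash(img, size=8):
    """64-bit average hash: downscale to 8x8 grayscale and threshold
    each pixel against the mean (one common perceptual-hash scheme)."""
    small = img.convert("L").resize((size, size))
    px = list(small.getdata())
    avg = sum(px) / len(px)
    bits = 0
    for p in px:
        bits = (bits << 1) | (p > avg)
    return bits

def hamming(a, b):
    # Small distance => likely near-duplicates.
    return bin(a ^ b).count("1")
```

Because the hash survives resizing and mild recompression, it catches the "same photo saved twice at different sizes" duplicates that byte-level checksums miss.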
server — Dataset Management Dashboard
Start a universal web server for managing your entire dataset from the browser. Browse images in a gallery, edit and create captions, delete or compare images, view statistics, upload new images, and detect duplicates — all in one interface.
datasety server --input ./dataset
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Dataset directory to manage | required |
| `--control, -c` | Control images directory (enables Pairs tab) | |
| `--port` | Port for the web server | 8080 |
| `--recursive, -R` | Search directories recursively for images | off |
| `--duplicates` | Pre-compute perceptual hashes for duplicate detection | off |
# Start the dashboard on the default port
datasety server -i ./dataset
# With duplicate detection pre-computed
datasety server -i ./dataset --duplicates --port 9000
# Pairs comparison (align workflow)
datasety server -i ./target --control ./control
The dashboard provides:
- Gallery — thumbnail grid with sorting and filtering; click any image for the detail panel (caption editor, file info, delete)
- Compare — drag-slider side-by-side comparison for any two images
- Pairs (with `--control`) — compare control/target pairs with a drag slider; edit captions for both sides; delete pairs; arrow-key navigation
- Stats — live dataset overview: image count, total size, caption coverage, format and orientation breakdown
- Upload — drag images into the browser or use the Upload button to add images to the dataset
- Keyboard navigation — arrow keys to move through gallery or pairs, Ctrl+S to save, T to toggle theme, ? for help
degrade — Image Degradation
Create degraded versions of images for upscale/enhance training. Pure Pillow, no extra dependencies.
datasety degrade --input ./originals --output ./dataset --type random --intensity-range 0.2-0.8 --paired
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Input directory | required* |
| `--output, -o` | Output directory | required* |
| `--input-image` | Single input image | |
| `--output-image` | Single output image | |
| `--type, -t` | Degradation type(s), repeatable | random |
| `--intensity` | Global intensity (0.0-1.0) | 0.5 |
| `--intensity-range` | Random range MIN-MAX | |
| `--chain` | Apply multiple types sequentially | off |
| `--num-variants` | Variants per input image | 1 |
| `--paired` | Create `control/` + `target/` subdirs | off |
| `--seed` | Random seed | |
| `--output-format` | png, jpg, webp | png |
| `--skip-existing` | Skip images with existing output | off |
| `--workers` | Parallel workers for processing | 1 |
| `--progress` | Show tqdm progress bar | off |
| `--dry-run` | Preview without writing files | off |
Degradation types: lowres, oversharpen, noise, blur, jpeg, motion-blur, pixelate, color-bands, upscale-sim, random
# Chain specific degradations for paired output
datasety degrade -i ./images -o ./dataset --type jpeg --type noise --chain --paired --seed 42
# Multiple random variants per image
datasety degrade -i ./images -o ./degraded --type random --num-variants 3 --intensity-range 0.3-0.8
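As an example of what a single degradation type involves, a `jpeg` degradation presumably round-trips the image through low-quality JPEG encoding, with intensity mapped to encoder quality. A Pillow sketch (the quality mapping here is illustrative, not datasety's actual curve):

```python
import io
from PIL import Image

def jpeg_degrade(img, intensity=0.5):
    """Re-encode through low-quality JPEG in memory; higher intensity
    means lower quality (illustrative mapping: 0.0 -> q95, 1.0 -> q5)."""
    quality = max(1, round(95 - intensity * 90))
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf)

src = Image.new("RGB", (64, 64), (200, 30, 30))
out = jpeg_degrade(src, intensity=0.8)  # heavy compression artifacts
```

Chaining several such operations (`--chain`) with randomized intensities is what makes the degraded `control/` images a realistic training input for restoration models, with the untouched `target/` copies as ground truth.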
character — Character Dataset Generation
Generate character datasets using LLM-generated prompts + text-to-image (FLUX.2-klein local or cloud API).
datasety character --output ./dataset --llm-ollama qwen3.5:4b --num-images 20
Options
| Option | Description | Default |
|---|---|---|
| `--reference, -r` | Reference face image(s) (optional, prompt context) | |
| `--output, -o` | Output directory | required |
| `--num-images, -n` | Number of images to generate | 10 |
| `--model` | Model for generation (local HF or API model ID) | black-forest-labs/FLUX.2-klein-4b-fp8 |
| `--gguf` | GGUF path/URL for quantized loading | |
| `--image-api` | Use OpenAI-compatible API for image generation | off |
| `--api-aspect-ratio` | Aspect ratio for `--image-api` (e.g. 9:16, 1:1) | derived from `--width`/`--height` |
| `--api-image-size` | Resolution for `--image-api`: 0.5K, 1K, 2K, 4K | |
| `--character-description` | Text description of the character | |
| `--style` | Style guidance (e.g., photorealistic) | |
| `--prompts-only` | Only generate prompts, skip images | off |
| `--prompts-file` | Load prompts from file instead of LLM | |
| `--llm-api` | Use OpenAI-compatible API for prompts | |
| `--llm-ollama MODEL` | Use local Ollama server for prompts | |
| `--llm-gguf PATH` | Use local GGUF model for prompts | |
| `--llm-model REPO` | Use HuggingFace model for prompts | |
| `--device` | auto, cpu, cuda, mps | auto |
| `--steps` | Inference steps | 4 |
| `--cfg-scale` | Guidance scale | 4.0 |
| `--seed` | Random seed | |
| `--height` | Output image height | 1024 |
| `--width` | Output image width | 1024 |
| `--output-format` | png, jpg, webp | png |
| `--batch-size` | Flush GPU memory every N images | 0 (off) |
| `--dry-run` | Preview prompts without generating images | off |
# Generate with local pipeline + Ollama prompts
datasety character -o ./dataset --llm-ollama qwen3.5:4b --num-images 20
# Cloud API for images (no GPU needed)
OPENAI_API_KEY=sk-... OPENAI_BASE_URL=https://openrouter.ai/api/v1 \
datasety character -o ./dataset --prompts-file prompts.txt \
--image-api --model black-forest-labs/flux.2-flex --api-aspect-ratio 2:3
# Preview prompts only
datasety character -o ./dataset --llm-api --prompts-only
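When `--api-aspect-ratio` is omitted, the default is derived from `--width`/`--height`, presumably by reducing the ratio to lowest terms. A sketch of that derivation (`derive_aspect_ratio` is illustrative, not datasety's code):

```python
from math import gcd

def derive_aspect_ratio(width, height):
    """Reduce width:height to lowest terms, e.g. 768x1024 -> '3:4'
    (a sketch of how the --api-aspect-ratio default could be derived)."""
    g = gcd(width, height)
    return f"{width // g}:{height // g}"

print(derive_aspect_ratio(1024, 1024))  # 1:1
print(derive_aspect_ratio(768, 1024))   # 3:4
```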
sweep — Parameter Grid Search
Generate workflow YAML files with parameter grid combinations for synthetic editing. Computes the Cartesian product of sweep parameters.
datasety sweep -i ./images -o ./sweep_output -p "add a winter hat" --steps 4,8,16 --cfg-scale 1.0,2.5,5.0
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Input images directory | required |
| `--output, -o` | Base output directory | required |
| `--prompt, -p` | Edit prompt | required |
| `--steps` | Comma-separated step values to sweep | |
| `--cfg-scale` | Comma-separated CFG values to sweep | |
| `--true-cfg-scale` | Comma-separated true CFG values to sweep | |
| `--strength` | Comma-separated strength values to sweep | |
| `--lora` | Comma-separated LoRA specs to sweep | |
| `--model` | Comma-separated model names to sweep | |
| `--seed` | Random seed (passed through) | |
| `--output-file` | Output YAML path | sweep.yaml |
| `--run` | Generate and immediately execute | off |
# Generate YAML, inspect, then run
datasety sweep -i ./images -o ./sweep -p "add sunglasses" --steps 4,8,16 --cfg-scale 1.0,2.5
datasety workflow -f sweep.yaml
# Generate and run immediately
datasety sweep -i ./images -o ./sweep -p "add a hat" --steps 4,8 --cfg-scale 2.0,3.0 --run
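The Cartesian-product expansion behind the generated YAML can be sketched as follows; `sweep_steps` and the `run_<i>` output naming are illustrative assumptions, not datasety's exact output format:

```python
from itertools import product

def sweep_steps(base_args, grid):
    """Expand a parameter grid into one synthetic step per combination
    (sketch of what a generated sweep file contains)."""
    keys = sorted(grid)
    combos = [dict(zip(keys, vals)) for vals in product(*(grid[k] for k in keys))]
    return [{"command": "synthetic",
             "args": {**base_args, **combo,
                      # Hypothetical per-combination output subdirectory.
                      "output": f"{base_args['output']}/run_{i}"}}
            for i, combo in enumerate(combos)]

steps = sweep_steps(
    {"input": "./images", "output": "./sweep", "prompt": "add sunglasses"},
    {"steps": [4, 8, 16], "cfg-scale": [1.0, 2.5]},
)
print(len(steps))  # 3 x 2 = 6 combinations
```

Because the product grows multiplicatively, sweeping three parameters with four values each already means 64 runs; generating the YAML first lets you prune before executing.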
train — LoRA Fine-Tuning & TTS Training
Train a LoRA adapter for image generation models (FLUX, SDXL, Qwen) or a TTS voice model (Piper). The mode is auto-detected from --family (flux/sdxl/qwen) or --backend (piper/coqui/f5-tts).
Image parameters (`--family` flux/sdxl/qwen): `--lr`, `--lora-rank`, `--lora-alpha`, `--image-size`, `--optimizer`, `--lr-scheduler`, etc.
Audio parameters (`--backend piper`): `--sample-rate`, `--batch-size`, `--accelerator`, `--devices`, `--test-text`.
# Image: FLUX.2-klein LoRA (~8 GB VRAM)
datasety train --input ./dataset --output lora.safetensors --family flux --steps 500 --lr 1e-4 --lora-rank 16
# Audio: Piper TTS (auto-downloads base model, auto-installs Piper, multi-GPU, voice watcher)
datasety train -i ./tts_dataset -o ./tts_output --backend piper \
--model "rhasspy/piper-checkpoints:en/en_US/kristin/medium" \
--devices auto --test-text "Hello world"
Image (LoRA) Options
| Option | Description | Default |
|---|---|---|
| `--family` | Model family: flux, sdxl, qwen | auto-detected |
| `--model, -m` | HuggingFace repo ID (base model) | black-forest-labs/FLUX.2-klein-base-4B |
| `--output, -o` | Output .safetensors path | lora.safetensors |
| `--steps` | Training steps | 100 |
| `--lr` | Learning rate | 1e-4 |
| `--lora-rank` | LoRA rank | 16 |
| `--lora-alpha` | LoRA alpha | 16.0 |
| `--lora-dropout` | LoRA dropout rate | 0.0 |
| `--image-size` | Training resolution (square crop) | 512 |
| `--device` | auto, cpu, cuda, mps | auto |
| `--seed` | Random seed | 42 |
| `--save-every` | Save checkpoint every N steps | end only |
| `--resume` | Resume from a .safetensors checkpoint | |
| `--validation-split` | Fraction for validation (0.0–0.5) | |
| `--timestep-type` | Timestep sampling: sigmoid, lognorm, linear | sigmoid |
| `--caption-dropout` | Probability of dropping caption | 0.05 |
| `--gradient-checkpointing` | Enable gradient checkpointing (saves VRAM) | off |
| `--optimizer` | adamw or adamw8bit (requires bitsandbytes) | adamw |
| `--lr-scheduler` | LR schedule: constant, cosine, linear | constant |
| `--lr-warmup-steps` | Linear warmup steps | 0 |
| `--gradient-accumulation-steps` | Accumulate gradients over N steps | 1 |
| `--min-snr-gamma` | Min-SNR-γ for SDXL (recommended: 5.0) | disabled |
| `--noise-offset` | Per-channel noise offset for SDXL (recommended: 0.05–0.1) | 0.0 |
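The interaction of `--lr-warmup-steps` with `--lr-scheduler` follows the usual warmup-then-decay pattern: ramp linearly to the base rate, then hold it (constant) or decay it (cosine). A sketch of those formulas (illustrative; the trainer's exact schedule may differ in details):

```python
import math

def lr_at(step, base_lr, total_steps, scheduler="constant", warmup=0):
    """Learning rate at a given step: linear warmup, then a constant
    or cosine schedule (sketch of common formulas)."""
    if warmup and step < warmup:
        return base_lr * (step + 1) / warmup  # linear ramp to base_lr
    if scheduler == "cosine":
        progress = (step - warmup) / max(1, total_steps - warmup)
        return base_lr * 0.5 * (1 + math.cos(math.pi * progress))
    return base_lr  # constant

# With --lr 1e-4 --lr-warmup-steps 50, step 0 starts near zero
# and the rate reaches 1e-4 at step 50.
```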
Audio (TTS) Options
| Option | Description | Default |
|---|---|---|
| `--backend` | TTS backend: piper (coqui, f5-tts planned) | piper |
| `--model` | Piper base model (`repo_id:subfolder` or local path) | (required) |
| `--output, -o` | Output directory for .ckpt checkpoints | (required) |
| `--steps` | Training epochs | 100 |
| `--sample-rate` | Audio sample rate in Hz | 22050 |
| `--batch-size` | Training batch size | 32 |
| `--accelerator` | PyTorch Lightning accelerator: auto, gpu, cpu | auto |
| `--devices` | Number of GPUs: auto, 1, 2, -1 (all) | auto |
| `--test-text` | Background inference text to test checkpoints | |
| `--seed` | Random seed | 42 |
audio — Build TTS Audio Datasets
Build TTS (Text-to-Speech) audio datasets from video or audio files. Supports YouTube URLs, direct media URLs, local files, and text files containing lists of paths. Extracts audio, transcribes with faster-whisper, performs deep text cleaning, and outputs Piper/LJSpeech-compatible datasets.
datasety audio --input ./video.mp4 --output ./dataset
datasety audio --input ./clips/ --output ./dataset
datasety audio --input "https://www.youtube.com/watch?v=..." --output ./dataset --language uk
Options
| Option | Description | Default |
|---|---|---|
| `--input, -i` | Input: local file, URL, dir, or .txt list. Append `?start=X&end=Y` to slice | required |
| `--output, -o` | Output directory for the dataset | required |
| `--sample-rate` | Output audio sample rate in Hz | 22050 |
| `--demucs` | Enable Demucs vocal isolation | false |
| `--demucs-model` | Demucs model name | htdemucs |
| `--whisper-model` | Faster-Whisper model: tiny, base, small, medium, large-v3 | base |
| `--language` | Language code (e.g., en, es, fr, uk). Auto-detected if omitted | (auto) |
| `--device` | auto, cpu, cuda, mps | auto |
| `--vad` | Enable voice activity detection (VAD) to filter non-speech | false |
| `--min-duration` | Minimum segment duration in seconds | 1.5 |
| `--max-duration` | Maximum segment duration in seconds | 30.0 |
| `--merge-gap` | Merge segments closer than this many seconds | 0.0 (off) |
| `--normalize-numbers` | Expand digits into words | false |
| `--no-clean-text` | Disable special character stripping | false |
| `--phoneme-map` | Path to `config.json`/`phonemes.json` to filter bad text | |
| `--workers` | Parallel file workers | 1 |
| `--keep-temp` | Keep temporary audio files at this path | |
| `--resume` | Resume a previous run (skip existing chunks, append to CSV) | false |
| `--overwrite` | Overwrite existing output directory | false |
| `--dry-run` | Print pipeline steps without executing | false |
| `--verbose, -V` | Print detailed progress messages | false |
# Process a list of URLs from a text file, dropping unsupported characters
datasety audio --input urls.txt --output ./dataset --phoneme-map phonemes.json
# Extract a specific 40-second slice from a YouTube video
datasety audio --input "https://youtube.com/watch?v=...?start=50&end=90" -o ./dataset
# Local video with vocal isolation and high-quality transcription
datasety audio --input ./video.mp4 --output ./dataset --demucs --whisper-model large-v3
# Parallel processing of multiple files
datasety audio --input ./videos/ --output ./dataset --workers 4
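The `--merge-gap` option joins transcript segments separated by short silences, bounded by `--max-duration`. A sketch of that merge pass over (start, end) timestamps (`merge_segments` is illustrative, not datasety's code):

```python
def merge_segments(segments, merge_gap=0.0, max_duration=30.0):
    """Merge adjacent (start, end) speech segments whose gap is below
    `merge_gap` seconds, without letting a merged segment exceed
    `max_duration` (sketch of the --merge-gap/--max-duration interplay)."""
    merged = []
    for start, end in segments:
        if (merged and start - merged[-1][1] < merge_gap
                and end - merged[-1][0] <= max_duration):
            merged[-1] = (merged[-1][0], end)  # extend previous segment
        else:
            merged.append((start, end))
    return merged

segs = [(0.0, 2.0), (2.3, 5.0), (9.0, 12.0)]
print(merge_segments(segs, merge_gap=0.5))  # [(0.0, 5.0), (9.0, 12.0)]
```

Merging matters for TTS data quality: Whisper often splits one sentence across segments at a breath pause, and clips shorter than `--min-duration` or cut mid-sentence train poorly.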
upload — Upload to HuggingFace Hub
Upload datasets and model adapters to HuggingFace Hub. Auto-detects type (audio, image, video, document, model, generic) from directory structure and generates HF-compliant README dataset cards with YAML frontmatter.
datasety upload --path ./tts_dataset --repo-id user/my-voice --type audio
datasety upload --path ./lora_output --repo-id user/klein-lora --type model
datasety upload --path ./dataset --repo-id user/my-dataset --dry-run
Options
| Option | Description | Default |
|---|---|---|
| `--path, -p` | Path to the dataset or model directory to upload | required |
| `--repo-id, -r` | HuggingFace repo ID (e.g. `username/my-dataset`). Derived from dir name if omitted | (derived) |
| `--type, -t` | Dataset or model type | auto |
| `--private` | Make the repository private | false |
| `--token` | HuggingFace API token (or set `HF_TOKEN` env var) | `HF_TOKEN` |
| `--force` | Force regenerate README.md if it already exists | false |
| `--dry-run` | Show what would be uploaded without uploading | false |
| `--metadata` | Extra YAML `key: value` pairs for dataset card frontmatter | |
| `--yes, -y` | Skip all confirmation prompts | false |
| `--verbose, -V` | Print detailed progress messages | false |
# Upload a TTS dataset (auto-generates README with TTS task card)
datasety upload --path ./tts_dataset --repo-id your-username/my-voice --private
# Upload a LoRA adapter
datasety upload --path ./lora.safetensors --repo-id your-username/klein-lora --type model
# Dry-run to verify what will be uploaded
datasety upload --path ./dataset --repo-id user/dataset --dry-run --verbose
# With extra metadata
datasety upload --path ./dataset --repo-id user/dataset \
--metadata 'license:cc-by-4.0 language: [en,fr]'
workflow — Multi-Step Pipelines
Run multi-step datasety pipelines from YAML or JSON files with dry-run validation.
datasety workflow --file datasety.yaml --dry-run
Options
| Option | Description | Default |
|---|---|---|
| `--file, -f` | Path to workflow file | auto-detect |
| `--dry-run` | Validate steps without executing | off |
Create datasety.yaml:
steps:
  - command: resize
    args:
      input: ./raw
      output: ./resized
      resolution: 768x1024
  - command: caption
    args:
      input: ./resized
      output: ./resized
      llm-api: true
      model: gpt-5-nano
# Validate first, then execute
datasety workflow --dry-run
datasety workflow
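Since workflows can also be JSON, the kind of checking a dry run performs can be sketched without any YAML dependency. Both the command list and the checks here are illustrative (the real validator knows every command's options; the `KNOWN_COMMANDS` set below is a partial, assumed list):

```python
import json

# Hypothetical subset of valid step commands, for illustration only.
KNOWN_COMMANDS = {"resize", "caption", "align", "shuffle", "synthetic",
                  "mask", "filter", "degrade", "character", "audio", "upload"}

def validate_workflow(text):
    """Dry-run-style validation of a JSON workflow: every step needs a
    known command and an args mapping (sketch of what --dry-run checks)."""
    doc = json.loads(text)
    errors = []
    for i, step in enumerate(doc.get("steps", [])):
        if step.get("command") not in KNOWN_COMMANDS:
            errors.append(f"step {i}: unknown command {step.get('command')!r}")
        if not isinstance(step.get("args"), dict):
            errors.append(f"step {i}: missing args mapping")
    return errors

wf = '{"steps": [{"command": "resize", "args": {"input": "./raw"}}, {"command": "foo"}]}'
print(validate_workflow(wf))  # two errors, both for the second step
```

Validating before executing is worthwhile because a multi-step pipeline can run for hours; a typo in step three should fail in milliseconds, not after step two finishes.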
License
MIT