soup-cli

Fine-tune LLMs in one command. No SSH, no config hell.

Maintainers

AlpamysMakazhan

Project description

Soup

Python 3.10+ · Apache-2.0 License

Soup turns the pain of LLM fine-tuning into a simple workflow. One config, one command, done.

pip install 'soup-cli[train]'   # add [train] to fine-tune; bare `soup-cli` is the light CLI
soup init --template chat
soup train

Why Soup?

Training LLMs is still painful. Even experienced teams spend 30-50% of their time fighting infrastructure instead of improving models. Soup fixes that.

Zero SSH. Never SSH into a broken GPU box again.
One config. A simple YAML file is all you need.
Auto everything. Batch size, GPU detection, quantization — handled.
Works locally. Train on your own GPU with QLoRA. No cloud required.

What's New

v0.71.12 — Architecture, distillation & adapter-training (live). Seven schema-only surfaces from earlier releases are now real, validated end-to-end on tiny models:

soup serve --bank <bank.json> — multi-tenant VeRA / VB-LoRA serving: load N personas at a cost of kilobytes per user (one shared projection plus per-user vectors) and select the active persona per request via the X-User-Id header. An unknown or absent id applies a zero delta, so the request is a no-op on top of the base model and nothing leaks across requests.
task: moe_lora_routing — MoLE per-token routing trains a small gating network over N frozen task LoRAs (mole_task_adapters, mole_top_k, mole_temperature); only the router trains.
task: distill with distill_mode: sequence — sequence-level KD trains the student on the teacher's generated continuations (cross-tokenizer-friendly), alongside the existing token logit-KL.
task: classifier with a lora section — train a frozen encoder + LoRA adapter classifier instead of the full model.
use_mod (Mixture-of-Depths) · expand_layers (LLaMA Pro) · use_longlora (S² attention) — the architecture knobs that were schema-only are now live for Llama / Qwen / Mistral (+ Phi for LongLoRA).
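
To make the MoLE routing bullet above concrete, here is a minimal pure-Python sketch of per-token top-k gating with a temperature. It is an illustration under assumptions, not Soup's implementation: the function name route_token and the choice to renormalize the surviving weights are hypothetical; only the mole_top_k and mole_temperature field names come from the release notes.

```python
import math

def route_token(gate_logits, top_k=2, temperature=1.0):
    """Return {adapter_index: weight} for one token.

    gate_logits holds one router score per frozen task LoRA.
    Hypothetical sketch: softmax with temperature, keep the top_k
    adapters, then renormalize so the kept weights sum to 1.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 1 <= top_k <= len(gate_logits):
        raise ValueError("top_k must be between 1 and the number of adapters")
    scaled = [x / temperature for x in gate_logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices of the top_k most probable adapters.
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    kept_mass = sum(probs[i] for i in keep)
    return {i: probs[i] / kept_mass for i in keep}

weights = route_token([2.0, 0.5, 1.0], top_k=2, temperature=1.0)
print(weights)
```

A lower temperature sharpens the distribution toward the highest-scoring adapter; a higher one spreads weight more evenly across the kept adapters.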

Full history: CHANGELOG.md · GitHub Releases.

Quick Start

1. Install

pip install soup-cli            # light: CLI + config + data tools (no PyTorch)
pip install 'soup-cli[train]'   # add the training stack (torch, transformers, peft, trl, …)
pip install git+https://github.com/MakazhanAlpamys/Soup.git   # latest dev

soup init, soup data …, and the other data/inspection commands work on the light install. Fine-tuning (soup train) needs the [train] extra.
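
If it is unclear which install an environment has, a quick check is whether the training packages can be imported. The sketch below uses only the standard library; the helper name training_stack_status is hypothetical, and the package list is limited to the four names quoted above (the [train] extra pulls in more).

```python
import importlib.util

# Package names taken from the [train] extra described above (not exhaustive).
TRAINING_PACKAGES = ("torch", "transformers", "peft", "trl")

def training_stack_status(packages=TRAINING_PACKAGES):
    """Map each package name to True if it can be imported in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}

status = training_stack_status()
missing = [name for name, ok in status.items() if not ok]
if missing:
    print("Light install detected; missing:", ", ".join(missing))
else:
    print("Training stack is importable.")
```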

2. Create a config

soup init                       # interactive wizard
soup init --template chat       # or start from a template

Templates: chat, code, tool-calling, medical, reasoning, vision, kto, orpo, simpo, ipo, bco, rlhf, pretrain, moe, longcontext, embedding, audio.

3. Train, test, ship

soup train --config soup.yaml                 # LoRA, quantization, batching — all handled
soup chat  --model ./output                    # talk to your model
soup push  --model ./output --repo you/my-model

soup merge  --adapter ./output                              # merge LoRA into the base
soup export --model ./output --format gguf --quant q4_k_m   # GGUF for Ollama / llama.cpp

More export targets (ONNX, TensorRT, AWQ, GPTQ, BitNet) and deployment options live in docs/serving-and-export.md.

Configuration

A complete soup.yaml:

base: meta-llama/Llama-3.1-8B-Instruct
task: sft
# backend: unsloth  # 2-5x faster, pip install 'soup-cli[fast]'

data:
  train: ./data/train.jsonl
  format: alpaca
  val_split: 0.1

training:
  epochs: 3
  lr: 2e-5
  batch_size: auto
  lora:
    r: 64
    alpha: 16
  quantization: 4bit

output: ./output

config/schema.py is the single source of truth for every field. Advanced data, training, and PEFT options are documented under Documentation.
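
As a worked example of what the lora values in this config imply, the sketch below computes the update scaling and the number of trainable parameters for one adapted matrix. It assumes the conventional LoRA definitions (scaling of alpha / r, or alpha / sqrt(r) for rsLoRA); whether Soup applies exactly these conventions is not stated here, and the 4096 x 4096 matrix size is only an illustrative choice.

```python
import math

def lora_scaling(r, alpha, rank_stabilized=False):
    """Scaling applied to the LoRA update: alpha / r, or alpha / sqrt(r) for rsLoRA."""
    return alpha / math.sqrt(r) if rank_stabilized else alpha / r

def lora_params(d_in, d_out, r):
    """Trainable parameters for one adapted weight matrix: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)

print(lora_scaling(64, 16))                        # 0.25
print(lora_scaling(64, 16, rank_stabilized=True))  # 2.0
print(lora_params(4096, 4096, 64))                 # 524288
```

With r: 64 and alpha: 16 the conventional scaling is 0.25, so the adapter update is damped relative to a setup where alpha equals r.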

Documentation

The full feature reference lives in docs/. Start here:

Guide	Covers
Training tasks & methods	SFT, DPO/GRPO/PPO/KTO/ORPO/SimPO/IPO/BCO, tool-calling, PRM, pre-training, distillation, classification, vision/audio/TTS, unlearning, RAFT/RA-DIT, loop-hardening detectors
PEFT, long context & efficiency	DoRA, LoRA+, rsLoRA, VeRA, OLoRA, NEFTune, PiSSA, ReLoRA, optimizer & PEFT zoo, LLaMA Pro, GaLore, YaRN/LongLoRA, packing, curriculum, auto-tuning
Performance & quantization	QAT, FP8, Quant Menu (I + II), KV-cache, NVFP4, save formats, Cut Cross-Entropy, gradient checkpointing, kernels, activation offloading, multi-GPU / DeepSpeed / FSDP
Data engineering	Formats, the Axolotl/LlamaFactory-parity pipeline, data tools, synthetic generation & forge, quality scorecards, trace tooling, remote datasets, mixing, recipe DAGs
Evaluation & probes	Eval design/gate, eval-gated training, benchmarks, NLG metrics, calibration, Elo arena, diagnose, post-train X-ray probes, A/B, drift, tunability, `soup advise`
Serving & export	OpenAI-compatible server, batch inference, benchmarking, merge/export, Anthropic Messages endpoint, speculative decoding, deploy autopilot, Web UI, Agent Forge
Adapters, registry & governance	Adapter lifecycle/management, model registry, Soup Cans, the data flywheel (`soup loop`), knowledge editing, steering, supply-chain controls (scan/sign/BOM/attest/audit/airgap)
Backends, platform & ops	MLX/Unsloth backends, alternative hubs, HF Hub integration, autopilot, experiment tracking, plan/apply, env lockfiles, hardware-fit, completions, plugins, utility commands
Command reference	The full `soup` command list
Supported models & extras	Recommended model families, the VRAM size guide, the pip extras matrix

Data Formats

All formats are auto-detected from JSONL, JSON, CSV, Parquet, or TXT:

alpaca — {"instruction": ..., "input": ..., "output": ...}
sharegpt — {"conversations": [{"from": "human", "value": ...}, ...]}
chatml — {"messages": [{"role": "user", "content": ...}, ...]}
dpo / orpo / simpo / ipo — {"prompt": ..., "chosen": ..., "rejected": ...}
kto — {"prompt": ..., "completion": ..., "label": true}
llava / sharegpt4v (vision), audio, plaintext (pre-training), embedding, prm, pre_tokenized, video, multimodal
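
The record shapes above are distinct enough that a format can often be guessed from a record's keys. The following is a minimal sketch of that idea, not Soup's actual detector: the names SIGNATURES and guess_format are hypothetical, and because dpo, orpo, simpo, and ipo share one shape, a key-based guess can only report the shared shape (labelled "dpo" here).

```python
import json

# Required keys per format, following the shapes listed above.
# Hypothetical sketch: Soup's real detector may use different rules.
SIGNATURES = [
    ("sharegpt", {"conversations"}),
    ("chatml", {"messages"}),
    ("dpo", {"prompt", "chosen", "rejected"}),
    ("kto", {"prompt", "completion", "label"}),
    ("alpaca", {"instruction", "output"}),
]

def guess_format(record):
    """Return the first format whose required keys all appear in the record, else None."""
    keys = set(record)
    for name, required in SIGNATURES:
        if required <= keys:
            return name
    return None

line = '{"instruction": "Translate to French", "input": "Hello", "output": "Bonjour"}'
print(guess_format(json.loads(line)))  # alpaca
```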

Full schemas and the Axolotl/LlamaFactory-parity data pipeline (remote URIs, streaming, sharding, interleaving, vocab expansion, document ingestion) are in docs/data.md.

Common Commands

soup train  --config soup.yaml        # train (SFT/DPO/GRPO/PPO/KTO/ORPO/SimPO/IPO/...)
soup infer  --model ./output --input prompts.jsonl   # batch inference
soup chat   --model ./output          # interactive chat
soup serve  --model ./output          # OpenAI-compatible API server
soup merge  --adapter ./output        # merge LoRA into the base model
soup export --model ./output --format gguf           # export for deployment
soup eval   benchmark --model ./output               # evaluate
soup data   inspect ./data/train.jsonl               # dataset stats
soup recipes list                     # 100+ ready-made model recipes
soup autopilot --model <id> --data d.jsonl --goal chat  # zero-config
soup doctor                           # check GPU / deps / environment

The complete command list is in docs/commands.md.
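
Because soup serve exposes an OpenAI-compatible API, a client can talk to it with plain HTTP. The sketch below only builds the request, using the standard library. Several details are assumptions rather than documented behaviour: the host and port (localhost:8000), the /v1/chat/completions path (the usual OpenAI route), and the "default" model name. The X-User-Id header is the one described under What's New for --bank serving.

```python
import json
import urllib.request

def build_chat_request(base_url, prompt, user_id=None, model="default"):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    headers = {"Content-Type": "application/json"}
    if user_id is not None:
        headers["X-User-Id"] = user_id  # selects the persona when serving with --bank
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",  # assumed route
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )

req = build_chat_request("http://localhost:8000", "Hello!", user_id="alice")
print(req.full_url)
# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```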

Supported Models

Soup works with any text-generation model on the Hugging Face Hub: if it loads with AutoModelForCausalLM, it works with zero config changes. Llama 3.x/4, Qwen 2.5/3, Gemma 3, Mistral, Mixtral, DeepSeek R1/V3, Phi-4, and 100+ others ship as ready-made recipes (soup recipes list).

VRAM	Max model (QLoRA 4-bit)	Example
8 GB	~7B	Llama-3.1-8B, Mistral-7B
16 GB	~14B	Phi-4-14B, Qwen2.5-14B
24 GB	~34B	CodeLlama-34B, Yi-1.5-34B
48 GB	~70B	Llama-3.3-70B
80 GB+	70B+ (full) or MoE	Mixtral-8x22B, DeepSeek-V3
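
The table rows can be sanity-checked with back-of-envelope arithmetic: at 4 bits per parameter, weights take roughly half a byte each. The sketch below computes only that weight footprint. It is a rough estimate under stated assumptions (1 GB = 10^9 bytes; quantization constants, LoRA weights, optimizer state, activations, and CUDA overhead are all ignored), so real usage is higher, and the helper name is hypothetical.

```python
def quantized_weight_gb(params_billions, bits=4):
    """Approximate weight memory in GB: parameters x bits / 8 bytes."""
    return params_billions * bits / 8  # billions of params x bytes per param = GB

for vram_gb, params_b in [(8, 8), (16, 14), (24, 34), (48, 70)]:
    weights = quantized_weight_gb(params_b)
    print(f"{params_b}B model: ~{weights:.1f} GB of 4-bit weights on a {vram_gb} GB card")
```

In every row the 4-bit weights fit in well under the listed VRAM, which leaves headroom for the parts this estimate ignores.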

Full model + vision tables and the optional-extras matrix are in docs/models.md.

Docker

Run Soup without installing CUDA or PyTorch locally (image published to GHCR on every release):

docker pull ghcr.io/makazhanalpamys/soup:latest
docker run --gpus all -v "$(pwd)":/workspace ghcr.io/makazhanalpamys/soup train --config soup.yaml
docker compose up   # or build locally

Requirements

Python 3.10+
GPU with CUDA (recommended), Apple Silicon (MPS), or CPU (experimental — very slow)
8 GB+ VRAM for 7B models with QLoRA

All training tasks run on CPU for testing (quantization auto-disabled). Optional extras (train, all, fast, vision, qat, serve, serve-fast, ui, eval, deepspeed, liger, mlx, onnx, tensorrt, …) are listed in docs/models.md.

Troubleshooting

soup doctor    # GPU, system resources, dependencies, and version in one place

ImportError: DLL load failed while importing _C (Windows) — reinstall the PyTorch build that matches your CUDA version, e.g. for CUDA 12.1: pip install torch --index-url https://download.pytorch.org/whl/cu121
soup version ≠ pip show soup-cli — the two report different versions when several Python installs coexist; use a virtualenv.
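
When the two versions disagree, it helps to see which interpreter is running and which soup-cli version that interpreter can see. A minimal standard-library sketch (the helper name describe_install is hypothetical):

```python
import sys
from importlib import metadata

def describe_install(package="soup-cli"):
    """Report the running interpreter and the package version it can see."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # not installed for this interpreter
    return {"python": sys.executable, "version": version}

print(describe_install())
```

Running this with each Python on the machine shows which interpreter owns which install.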

Development

git clone https://github.com/MakazhanAlpamys/Soup.git
cd Soup
pip install -e ".[dev]"

ruff check src/soup_cli/ tests/    # lint
pytest tests/ -v                   # unit tests (fast, no GPU)
pytest tests/ -m smoke -v          # smoke tests (downloads a tiny model, trains)

pre-commit install                 # optional: ruff lint+format on commit

See CONTRIBUTING.md for the full workflow and SECURITY.md to report a vulnerability.

License

Apache-2.0

Release history

0.71.21

Jun 10, 2026

0.71.20

Jun 10, 2026

0.71.19

Jun 9, 2026

0.71.18

Jun 8, 2026

0.71.17

Jun 8, 2026

0.71.16

Jun 7, 2026

0.71.15

Jun 7, 2026

0.71.14

Jun 5, 2026

0.71.13

Jun 4, 2026

This version

0.71.12

Jun 4, 2026

0.71.11

Jun 4, 2026

0.71.10

Jun 3, 2026

0.71.9

Jun 3, 2026

0.71.8

Jun 3, 2026

0.71.7

Jun 3, 2026

0.71.6

Jun 2, 2026

0.71.5

Jun 2, 2026

0.71.4

Jun 2, 2026

0.71.3

Jun 1, 2026

0.71.2

Jun 1, 2026

0.71.1

Jun 1, 2026

0.71.0

Jun 1, 2026

0.70.0

May 25, 2026

0.69.0

May 25, 2026

0.68.0

May 24, 2026

0.67.0

May 24, 2026

0.66.0

May 22, 2026

0.65.0

May 21, 2026

0.64.0

May 21, 2026

0.63.0

May 20, 2026

0.62.0

May 20, 2026

0.61.0

May 19, 2026

0.60.0

May 19, 2026

0.59.0

May 18, 2026

0.58.0

May 15, 2026

0.57.0

May 15, 2026

0.56.0

May 15, 2026

0.55.0

May 15, 2026

0.54.0

May 14, 2026

0.53.11

May 14, 2026

0.53.10

May 14, 2026

0.53.9

May 14, 2026

0.53.8.1

May 13, 2026

0.53.7

May 13, 2026

0.53.6

May 13, 2026

0.53.5

May 13, 2026

0.53.4

May 13, 2026

0.53.3

May 13, 2026

0.53.2

May 13, 2026

0.53.1

May 12, 2026

0.53.0

May 12, 2026

0.52.0

May 12, 2026

0.51.0

May 12, 2026

0.50.0

May 11, 2026

0.49.0

May 11, 2026

0.48.0

May 11, 2026

0.47.0

May 11, 2026

0.46.0

May 11, 2026

0.45.0

May 10, 2026

0.44.0

May 10, 2026

0.43.0

May 10, 2026

0.42.0

May 10, 2026

0.41.0

May 10, 2026

0.40.6

May 9, 2026

0.40.5

May 9, 2026

0.40.4

May 9, 2026

0.40.3

May 8, 2026

0.40.2

May 8, 2026

0.40.1

May 8, 2026

0.40.0

May 1, 2026

0.39.0

May 1, 2026

0.38.0

May 1, 2026

0.37.0

Apr 30, 2026

0.36.0

Apr 30, 2026

0.35.0

Apr 28, 2026

0.34.0

Apr 28, 2026

0.33.0

Apr 27, 2026

0.32.0

Apr 26, 2026

0.31.0

Apr 25, 2026

0.30.0

Apr 24, 2026

0.29.0

Apr 24, 2026

0.28.0

Apr 22, 2026

0.27.0

Apr 21, 2026

0.26.0

Apr 20, 2026

0.25.1

Apr 13, 2026

0.25.0

Apr 13, 2026

0.24.3

Apr 7, 2026

0.24.2

Apr 7, 2026

0.24.1

Apr 3, 2026

0.24.0

Apr 3, 2026

0.23.1

Apr 3, 2026

0.23.0

Apr 3, 2026

0.22.1

Apr 3, 2026

0.22.0

Apr 3, 2026

0.21.1

Apr 2, 2026

0.21.0

Apr 2, 2026

0.20.2

Apr 1, 2026

0.20.1

Apr 1, 2026

0.20.0

Apr 1, 2026

0.19.0

Apr 1, 2026

0.18.2

Apr 1, 2026

0.18.1

Apr 1, 2026

0.18.0

Apr 1, 2026

0.17.3

Mar 26, 2026

0.17.2

Mar 26, 2026

0.17.1

Mar 26, 2026

0.17.0

Mar 26, 2026

0.16.0

Mar 26, 2026

0.15.0

Mar 26, 2026

0.14.3

Mar 25, 2026

0.14.2

Mar 25, 2026

0.14.1

Mar 25, 2026

0.14.0

Mar 25, 2026

0.13.2

Mar 25, 2026

0.13.1

Mar 25, 2026

0.13.0

Mar 25, 2026

0.12.0

Mar 25, 2026

0.10.10

Mar 25, 2026

0.10.9

Mar 24, 2026

0.10.8

Mar 24, 2026

0.10.7

Mar 24, 2026

0.10.6

Mar 24, 2026

0.10.5

Mar 24, 2026

0.10.4

Mar 24, 2026

0.10.3

Mar 24, 2026

0.10.2

Mar 24, 2026

0.10.1

Mar 24, 2026

0.10.0

Mar 23, 2026

0.9.0

Mar 23, 2026

0.8.0

Mar 23, 2026

0.7.3

Mar 23, 2026

0.7.2

Mar 23, 2026

0.7.1

Mar 23, 2026

0.6.0

Mar 23, 2026

0.5.0

Mar 23, 2026

0.4.3

Mar 23, 2026

0.4.2

Mar 23, 2026

0.4.1

Mar 23, 2026

0.4.0

Mar 23, 2026

0.3.2

Mar 5, 2026

0.3.1

Mar 5, 2026

0.3.0

Mar 5, 2026

0.2.2

Mar 5, 2026

0.2.1

Mar 3, 2026

0.2.0

Mar 3, 2026

0.1.0

Mar 2, 2026

Download files

Source Distribution

soup_cli-0.71.12.tar.gz (2.1 MB)

Uploaded Jun 4, 2026 Source

Built Distribution

soup_cli-0.71.12-py3-none-any.whl (1.4 MB)

Uploaded Jun 4, 2026 Python 3

File details

Details for the file soup_cli-0.71.12.tar.gz.

File metadata

Download URL: soup_cli-0.71.12.tar.gz
Upload date: Jun 4, 2026
Size: 2.1 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for soup_cli-0.71.12.tar.gz
Algorithm	Hash digest
SHA256	`b75321a5c1446055f1183bde1817d1d4eb95c67d1b203b4f62f3993707b2d597`
MD5	`81b7ae0ddf9448d3614bf855f59f2755`
BLAKE2b-256	`b4d13bb5a4edd35f15c024aa59dec03a1087b83bc5830092cf1fd068ca0d1822`
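
A downloaded archive can be checked against the published SHA256 digest with the standard library. The sketch below is one way to do it; the helper names are hypothetical, and the expected digest is copied from the table above.

```python
import hashlib

# SHA256 digest published above for soup_cli-0.71.12.tar.gz.
EXPECTED_SHA256 = "b75321a5c1446055f1183bde1817d1d4eb95c67d1b203b4f62f3993707b2d597"

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 equals the expected digest (case-insensitive)."""
    return sha256_of(path) == expected.lower()
```

For example, matches_published_hash("soup_cli-0.71.12.tar.gz") should return True for an intact download of the source distribution.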

Provenance

The following attestation bundles were made for soup_cli-0.71.12.tar.gz:

Publisher: publish.yml on MakazhanAlpamys/Soup

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: soup_cli-0.71.12.tar.gz
- Subject digest: b75321a5c1446055f1183bde1817d1d4eb95c67d1b203b4f62f3993707b2d597
- Sigstore transparency entry: 1721489256
- Sigstore integration time: Jun 4, 2026
Source repository:
- Permalink: MakazhanAlpamys/Soup@b46f22433c810a9ee79d37bd42c4d1b26641d494
- Branch / Tag: refs/tags/v0.71.12
- Owner: https://github.com/MakazhanAlpamys
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b46f22433c810a9ee79d37bd42c4d1b26641d494
- Trigger Event: push

File details

Details for the file soup_cli-0.71.12-py3-none-any.whl.

File metadata

Download URL: soup_cli-0.71.12-py3-none-any.whl
Upload date: Jun 4, 2026
Size: 1.4 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for soup_cli-0.71.12-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a2b3a9cacc09e1744088e4f5172ad74c1a1d7b7e8393639412d0748d474824e0`
MD5	`d7784283331eb05dfc5ad562602f8c3c`
BLAKE2b-256	`db7a3c528572a1a7a8289affc8e53bd3e19c2d59e1d58bceaecb91021ad97150`

Provenance

The following attestation bundles were made for soup_cli-0.71.12-py3-none-any.whl:

Publisher: publish.yml on MakazhanAlpamys/Soup

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: soup_cli-0.71.12-py3-none-any.whl
- Subject digest: a2b3a9cacc09e1744088e4f5172ad74c1a1d7b7e8393639412d0748d474824e0
- Sigstore transparency entry: 1721489473
- Sigstore integration time: Jun 4, 2026
Source repository:
- Permalink: MakazhanAlpamys/Soup@b46f22433c810a9ee79d37bd42c4d1b26641d494
- Branch / Tag: refs/tags/v0.71.12
- Owner: https://github.com/MakazhanAlpamys
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b46f22433c810a9ee79d37bd42c4d1b26641d494
- Trigger Event: push
