Flash — managed LoRA post-training (SFT/GRPO) for Freesolo environments, driven by the `flash` CLI

Flash

Managed LoRA post-training service: SFT and GRPO on managed RunPod Flash GPUs. The allocator picks the cheapest validated RunPod GPU class that fits the run.
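The cheapest-fit rule can be sketched in a few lines of Python. This is an illustration only, not the code in flash/providers/allocator.py: the `GpuClass` type, the class names, and the prices below are all hypothetical placeholders, and VRAM is assumed to be the only fit criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuClass:
    name: str            # hypothetical label, not a real RunPod SKU
    vram_gb: int
    usd_per_hour: float  # made-up price, for illustration only
    validated: bool


def pick_cheapest(classes: list[GpuClass], required_vram_gb: int) -> GpuClass:
    """Return the cheapest validated class with enough VRAM for the run."""
    fitting = [c for c in classes if c.validated and c.vram_gb >= required_vram_gb]
    if not fitting:
        raise ValueError(f"no validated GPU class fits {required_vram_gb} GB")
    return min(fitting, key=lambda c: c.usd_per_hour)


CLASSES = [
    GpuClass("small", 24, 0.50, True),
    GpuClass("medium", 48, 1.00, True),
    GpuClass("large", 80, 2.00, True),
    GpuClass("bargain", 80, 0.75, False),  # cheap but not validated, so skipped
]

print(pick_cheapest(CLASSES, 40).name)  # medium
```

Filtering before taking the minimum keeps the two concerns separate: a class that is cheap but unvalidated, or too small, never competes on price.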

Scope

flash train <cfg.toml> / control-plane POST /runs — submit a training job; one dedicated GPU per run, supervised server-side (stall watchdog, bounded auto-retry resuming from the last streamed checkpoint, endpoint GC).
flash deploy, flash chat — serving for trained adapters.
Freesolo SDK environments. Every run names a Freesolo environment id. Scaffold environment.py plus datasets/train.jsonl (flash env setup generates a starter), upload the current directory (.) or another folder with flash env push --name <name> <folder>, then reference the returned id. The worker loads the environment through freesolo.environments. There are no built-in task environments. Both single-turn and bounded multi-turn environments are supported.
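As a rough illustration of "bounded auto-retry resuming from the last streamed checkpoint", the loop below retries a failing worker a fixed number of times and passes the newest known checkpoint into each new attempt. Everything here is a hypothetical sketch: `WorkerFailed`, `supervise`, and the fake worker are invented for the example and do not mirror flash/runner.py.

```python
from typing import Callable, Optional


class WorkerFailed(Exception):
    """Raised by the hypothetical worker; carries the last checkpoint step."""

    def __init__(self, last_checkpoint: Optional[int]):
        super().__init__(f"worker failed, last checkpoint={last_checkpoint}")
        self.last_checkpoint = last_checkpoint


def supervise(run_once: Callable[[Optional[int]], int], max_retries: int = 2) -> int:
    """Run the job, retrying at most max_retries times from the newest checkpoint."""
    resume_from: Optional[int] = None
    for attempt in range(max_retries + 1):
        try:
            return run_once(resume_from)
        except WorkerFailed as err:
            if err.last_checkpoint is not None:
                resume_from = err.last_checkpoint  # keep the newest checkpoint
            if attempt == max_retries:
                raise  # retry budget exhausted, surface the failure
    raise AssertionError("unreachable")


def make_flaky_worker(failures: int):
    """Build a fake worker that fails `failures` times, then finishes at step 1000."""
    calls: list[Optional[int]] = []

    def run_once(resume_from: Optional[int]) -> int:
        calls.append(resume_from)
        if len(calls) <= failures:
            raise WorkerFailed(last_checkpoint=len(calls) * 100)
        return 1000

    return run_once, calls


worker, calls = make_flaky_worker(failures=2)
print(supervise(worker, max_retries=2), calls)  # 1000 [None, 100, 200]
```

The retry bound matters because each attempt occupies a dedicated GPU; an unbounded loop would turn a persistent failure into unbounded cost.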

Layout

flash/catalog.py — curated model catalog (Qwen3 dense supported tier; Qwen3.5/3.6 experimental tier) + model_policy = "allow" VRAM-fit check + each model's thinking capability (opt-in reasoning mode thinking = true)
flash/schema.py, flash/spec.py — TOML → JobSpec
flash/runner.py — server-side run supervisor (durable job handle, retries, cost guard, endpoint GC)
flash/providers/ — RunPod Flash provider code (pricing, gpus, durable submit/poll, preflight) behind the base.Provider protocol, with an allocator.py that picks the cheapest fitting class
flash/engine/ — the on-GPU worker (TRL + colocated vLLM rollouts) and the shared recipe; SFT targets and RL rewards route through the active environment (task-specific grading lives with its example, not in the engine)
flash/envs/ — environment machinery: registry and the adapter that loads Freesolo SDK environments onto the worker's interface
flash env setup — scaffold a starter local Freesolo env, datasets/train.jsonl, and ready-to-run configs to start from
flash/serve/, flash/server/ — adapter serving and the FastAPI control plane (run operator-side via the separate flash-server command)
Dockerfile — the control-plane image (used by the repo docker-compose)
tests/ — pytest suite (CPU-only; offline-by-default, no GPU/network)

Local commands

cd flash
uv sync --extra server
uv run pytest                           # CPU tests (offline-by-default, no GPU/network)
uv run ruff check . && uv run ruff format .
uv run flash --help
uv run flash-server                      # control plane (operator-side, run once)

The control plane owns provider credentials: RUNPOD_API_KEY is always required, plus the shared HF_TOKEN. The artifact repo is platform-managed and per-run (each run gets its own Freesolo-Co/flashrun-<run_id>, written by the operator HF_TOKEN); it is not a user knob and not an operator-wide env var. Clients authenticate with their freesolo API key (flash login).
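An operator might check for these credentials before starting the control plane. The helper below is a hypothetical sketch, not part of flash-server; it assumes only the two variable names given above.

```python
import os
from typing import Mapping

# Variable names taken from the paragraph above.
REQUIRED_CREDENTIALS = ("RUNPOD_API_KEY", "HF_TOKEN")


def missing_credentials(env: Mapping[str, str]) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_CREDENTIALS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials(os.environ)
    if missing:
        print("missing credentials: " + ", ".join(missing))
    else:
        print("all control-plane credentials are set")
```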

Release channels

Two channels are published to PyPI from the same source, distinguished by one line in flash/_channel.py (CHANNEL):

| Channel | PyPI package | CLI | Default plane | Published from |
| --- | --- | --- | --- | --- |
| prod | `freesolo-flash` | `flash` | `flash.freesolo.co` | push to `main` that bumps `[project].version` (`.github/workflows/publish.yml`) |
| dev | `freesolo-flash-dev` | `flash-dev` | `flash-dev.freesolo.co` | push to `dev` whose `[tool.flash-dev].version` isn't on PyPI yet (`.github/workflows/publish-dev.yml`) |

Each Python environment holds exactly one channel: both packages ship the same import package (flash/) with one baked CHANNEL line, so installing both into the same environment makes the later install win for both CLIs. To run prod and dev side by side, install each channel in its own virtualenv (or via pipx, which isolates per tool).

The dev build is produced by scripts/build_dev_dist.py, which renames the package and CLI and flips CHANNEL to dev before uv build. Both channels ship at the same version: [project].version and [tool.flash-dev].version must match (CI enforces this via .github/workflows/version-parity.yml), so cutting a release means bumping both together.

Either CLI still honours an explicit FLASH_API_URL or the login --api-url flag; the channel only sets the default.

Serving From an API

flash chat is a CLI wrapper around the Flash control-plane chat endpoint. To call a deployed adapter from your own app, deploy the finished run once and then POST chat requests with your freesolo API key:

export FLASH_API_URL=https://flash.freesolo.co
export FREESOLO_API_KEY=fslo_...
export RUN_ID=flash-1782194170-ce1cfcff

curl -X POST "$FLASH_API_URL/v1/runs/$RUN_ID/deploy" \
  -H "Authorization: Bearer $FREESOLO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"dry_run": false}'

curl -X POST "$FLASH_API_URL/v1/runs/$RUN_ID/chat" \
  -H "Authorization: Bearer $FREESOLO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Write a two-sentence summary of the run."}
    ],
    "temperature": 0.0,
    "max_tokens": 256
  }'

The response uses the OpenAI chat-completions shape:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "..."
      }
    }
  ]
}

Use choices[0].message.content for the generated text. The run id is the adapter id for serving. If the run is not deployed yet, /v1/runs/<run_id>/chat returns 409 with a hint to deploy first.
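The same chat call can be made from Python with only the standard library. This is a minimal sketch that assumes the endpoint path, headers, request body, and 409 behaviour described above; the function names are illustrative and there is no retry or streaming support.

```python
import json
import os
import urllib.error
import urllib.request


def build_chat_request(api_url, api_key, run_id, prompt, max_tokens=256):
    """Build the POST request for /v1/runs/<run_id>/chat."""
    body = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{api_url.rstrip('/')}/v1/runs/{run_id}/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_text(payload: dict) -> str:
    """Pull the generated text out of an OpenAI-style chat-completions payload."""
    return payload["choices"][0]["message"]["content"]


def chat(api_url, api_key, run_id, prompt):
    """Send one user message and return the assistant's reply."""
    request = build_chat_request(api_url, api_key, run_id, prompt)
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            return extract_text(json.load(response))
    except urllib.error.HTTPError as err:
        if err.code == 409:  # the run exists but has not been deployed yet
            raise RuntimeError(f"run {run_id} is not deployed; deploy it first") from err
        raise


# Only talks to the network when the caller has exported real credentials.
if __name__ == "__main__" and {"FREESOLO_API_KEY", "RUN_ID"} <= set(os.environ):
    url = os.environ.get("FLASH_API_URL", "https://flash.freesolo.co")
    print(chat(url, os.environ["FREESOLO_API_KEY"], os.environ["RUN_ID"], "Hello"))
```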

Operators can also call the Modal serving app directly after the adapter is registered. The default serving app is https://clado-ai--freesolo-lora-serving.modal.run, and operators can point Flash at another serving app by setting FREESOLO_SERVING_URL. Use that same base URL when calling the app directly; pass the run id as model:

export FREESOLO_SERVING_URL=https://clado-ai--freesolo-lora-serving.modal.run

curl -X POST "$FREESOLO_SERVING_URL/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flash-1782194170-ce1cfcff",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.0,
    "max_tokens": 256
  }'

Prefer the Flash control-plane endpoint for user apps because it enforces run ownership and forwards per-run serving options such as thinking-mode parity.

Release history

This version: 0.2.26 (Jun 27, 2026)
