Skip to main content

CLI and SDK for submitting and managing GPU workloads

Project description

Flow CLI & SDK

Python → Petaflops in 15 seconds. Flow procures GPUs through Mithril, spins InfiniBand-connected instances, and runs your workloads—zero friction, no hassle.

PyPI - Version Public repo

Background

There's a paradox in GPU infrastructure today: Massive GPU capacity sits idle, even as AI teams wait in queues—starved for compute. Mithril, the AI-compute omnicloud, dynamically allocates GPU resources from a global pool (spanning Mithril's first-party resources and 3rd-party partner cloud capacity) using efficient two-sided auctions, maximizing surplus and reducing costs. Mithril seamlessly supports both reserved-in-advance and just-in-time workloads—maximizing utilization, ensuring availability, and significantly reducing costs.

Infrastructure mode

flow instance create -i 8xh100 -N 20
╭─ Instance Configuration ────────────────────────────────╮
│                                                         │
│  Name           multinode-run                           │
│  Command        sleep infinity                          │
│  Image          nvidia/cuda:12.1.0-runtime-ubuntu22.04  │
│  Working Dir    /workspace                              │
│  Instance Type  8xh100                                  │
│  Num Instances  20                                      │
│  Max price      $12.29/hr                               │
│                                                         │
╰─────────────────────────────────────────────────────────╯

flow instance list
╭───────────────────────────────  Flow ────────────────────────────────╮
│                                                                       │
│     #     Status     Instance               GPU       Owner     Age   │     1    running    multinode-run        8×H100·80G  alex       0m   │
│     2    running    interactive-77c31e   A100·80G    noam       5h   │
│     3    cancelled  dev-a100-test        A100·80G    alex       1d   │
│                                                                       │
╰───────────────────────────────────────────────────────────────────────╯

Research mode (in early preview)

flow submit "python train.py" # -i 8xh100 Bidding for best‑price GPU node (8×H100) with $12.29/h100-hr limit_price…
✓ Launching on NVIDIA H100-80GB for $1/h100-hr

Quick Start

# Optional: install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install via uv or pipx (or see installer scripts in scripts/)
uv tool install flow-compute
# or: pipx install flow-compute

flow setup  # Sets up your authentication and configuration
flow dev   # Interactive box. sub-5-second dev loop after initial VM config

Optional assistant integrations:

flow codex   # Install Flow skills for Codex
flow claude  # Install Flow skills for Claude Code

Why choose Flow

Status quo GPU provisioning involves quotas, complex setups, and queue delays, even as GPUs sit idle elsewhere or in recovery processes. Flow addresses this:

Dynamic Market Allocation – Efficient two-sided auctions ensure you pay the lowest market-driven prices rather than inflated rates.

Simplified Batch Execution – An intuitive interface designed for cost-effective, high-performance batch workloads without complex infrastructure management.

Provision from 1 to thousands of GPUs for long-term reservations, short-term "micro-reservations" (minutes to weeks), or spot/on-demand needs—all interconnected via InfiniBand. High-performance persistent storage and built-in Docker support further streamline workloads, ensuring rapid data access and reproducibility.


Why Flow + Mithril?

Pillar Outcome How
Iteration Velocity and Ease Fresh containers in seconds; from idea to training or serving instantly. flow dev for DevBox or flow submit to programmatically launch tasks
Best price-performance via market-based pricing Preemptible secure jobs for $1/h100-hr Blind two-sided second-price auction; client-side bid capping
Availability and Elasticity GPUs always available, self-serve; no haggling, no calls. Uncapped spot + overflow capacity from partner clouds
Abstraction and Simplification InfiniBand VMs, CUDA drivers, auto-managed healing buffer—all pre-arranged. Mithril virtualization and base images preconfigured + Mithril capacity management.

"The tremendous demand for AI compute and the large fraction of idle time makes sharing a perfect solution, and Mithril's innovative market is the right approach."Paul Milgrom, Nobel Laureate (Auction Theory and Mechanism Design)


Key Concepts to Get Started

Core Workflows

Infrastructure mode

  • flow instance create -i 8xh100 -N 20 → spin up a 20-node GPU cluster in seconds
  • flow volume create -s 10000 -i file → provision 10 TB of persistent, high-speed storage
  • flow ssh instance -- nvidia-smi → run across all nodes in parallel

In research preview

Research mode

  • flow dev → interactive loops in seconds.
  • flow submit → reproducible batch jobs.
  • Python API → easy pipelines and orchestration.

Examples

# Launch a batch job on discounted H100s
flow submit "python train.py" -i 8xh100

# Frictionlessly leverage an existing SLURM script
flow submit job.slurm

# Serverless‑style decorator
@app.function(gpu="a100")

Ideal Use Cases

  • Rapid Experimentation – Quick iterations for research sprints.
  • Instant Elasticity – Scale rapidly from one to thousands of GPUs.
  • Collaborative Research – Shared dev environments with per-task cost controls.

Flow is not yet ideal for: always‑on ≤100 ms inference, strictly on‑prem regulated data, or models that fit on laptop or consumer-grade GPUs.


Architecture (30‑s view)

Your intent ⟶ Flow Execution Layer ⟶ Global GPU Fabric

Flow SDK abstracts complex GPU auctions, InfiniBand clusters, and multi-cloud management into a single seamless and unified developer interface.


Installation

Requirements

  • Python 3.10 or later
  • Recommended: use uv to auto-manage a compatible Python when installing the CLI

1) Install uv — optional but recommended

Installation guide: docs.astral.sh/uv/getting-started/installation

  • macOS/Linux:
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  • Windows (PowerShell):
    powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
    

2) Install Flow

  • Global CLI (uv):

    uv tool install flow-compute
    flow setup
    
  • Global CLI (pipx):

    pipx install flow-compute
    flow setup
    

Under the Hood (Advanced)

  • Bid Caps – Protect budgets automatically.
  • Self-Healing – Spot nodes dynamically migrate tasks.
  • Docker/Conda – Pre-built images or dynamic install.
  • Multi-cloud Ready – Mithril (with Oracle, Nebius integrations internal to Mithril), and more coming
  • SLURM Compatible – Run #SBATCH scripts directly.

Python SDK (Research Preview)

Advanced Task Configuration

# Distributed training example (32 GPUs, Mithril groups for InfiniBand connectivity by default)
task = flow.run(
    command="torchrun --nproc_per_node=8 train.py",
    instance_type="8xa100",
    num_instances=4,  # Total of 32 GPUs (4 nodes × 8 GPUs each)
    env={"NCCL_DEBUG": "INFO"}
)

# Mount S3 data + persistent volumes
task = flow.run(
    "python analyze.py",
    gpu="a100",
    mounts={
        "/datasets": "s3://ml-bucket/imagenet",  # S3 via s3fs
        "/models": "volume://pretrained-models"   # Persistent storage
    }
)

Key Features Summary

  • Distributed Training – Multi-node InfiniBand clusters auto-configured
  • Code Upload – Automatic with .flowignore (or .gitignore fallback)
  • Live Debugging – SSH into running instances (flow ssh)
  • Cost Protection – Built-in max_price_per_hour safeguards
  • Jupyter Integration – Connect notebooks to GPU instances

Documentation: https://docs.mithril.ai/cli-and-sdk/quickstart

Further Reading

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

flow_compute-3.18.0.tar.gz (1.3 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

flow_compute-3.18.0-py3-none-any.whl (1.6 MB view details)

Uploaded Python 3

flow_compute-3.18.0-cp314-cp314-macosx_11_0_arm64.whl (4.4 MB view details)

Uploaded CPython 3.14macOS 11.0+ ARM64

flow_compute-3.18.0-cp311-cp311-manylinux_2_28_x86_64.whl (4.7 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ x86-64

flow_compute-3.18.0-cp311-cp311-manylinux_2_28_aarch64.whl (4.6 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ ARM64

File details

Details for the file flow_compute-3.18.0.tar.gz.

File metadata

  • Download URL: flow_compute-3.18.0.tar.gz
  • Upload date:
  • Size: 1.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.13.1

File hashes

Hashes for flow_compute-3.18.0.tar.gz
Algorithm Hash digest
SHA256 d67c0c34c65aa5cf570aeca14e16e4eb47242ef383b2af030c2451c95fa3bcba
MD5 aec184833eae8b8d2046b2e0c40a2e38
BLAKE2b-256 fab3a4854ab87576109e1a72bed555f6ddfa1723e319414b6d3a44369c37adba

See more details on using hashes here.

File details

Details for the file flow_compute-3.18.0-py3-none-any.whl.

File metadata

File hashes

Hashes for flow_compute-3.18.0-py3-none-any.whl
Algorithm Hash digest
SHA256 de348a7574b000d13299d39e24c053639ce3eca2d5a69ea61833a52c195ec585
MD5 1be4d1a80d722d0b14d89b52d4bee6f1
BLAKE2b-256 c7d556121e4cbe0375cfe6afc8f1dd3eb0a89a9293be6f87d1ca2bf9d4d0df9a

See more details on using hashes here.

File details

Details for the file flow_compute-3.18.0-cp314-cp314-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for flow_compute-3.18.0-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 0abdecffb8752b5fb1f3b5739587bbd7289d56c33fa6d8868e55c98931820ad4
MD5 0bc29e7d46a4a657524432603a26e25e
BLAKE2b-256 861d903ef7f7bbcb49a9e7ac62a61c5b4315c7e052405f1a1e60b81dbdf3de44

See more details on using hashes here.

File details

Details for the file flow_compute-3.18.0-cp311-cp311-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for flow_compute-3.18.0-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 964e39f1b63797b0879a6863ac4efed979d06966f79e47c6cc3071b1c724aaeb
MD5 9af8961574660e6daec1b834a9661d38
BLAKE2b-256 97ae0520fdfe2d46865091ce32f9f8335ee16e53927b68b3dc7572a4eb130831

See more details on using hashes here.

File details

Details for the file flow_compute-3.18.0-cp311-cp311-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for flow_compute-3.18.0-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 0c6aa22e16a57b9eeb395d433c6be3347e46da32372489e510f3a9e48d5abf64
MD5 04cd4b33c7a546f08babdd6a141ba251
BLAKE2b-256 2827a991e336ab19b873d1eaf93f2e21148c7fe3b61c02feab2f4c1d7f009bcc

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page