Research infrastructure management tooling.
Project description
❄️ brr ❄️
Opinionated research infrastructure tooling. Launch clusters, get SSH access, start building.
Features
- Shared filesystem — All nodes share
~/code/via EFS (AWS) or virtiofs (Nebius). - Coding tools — Install Claude Code, Codex, or Gemini. Connect with e.g.
brr attach dev claude - Autoscaling — Ray-based cluster scaling with cached instances.
- Project-based workflows — Per-repo cluster configs and project-specific dependencies.
- Auto-shutdown — Monitors CPU, GPU, and SSH activity. Shuts down idle instances to save costs.
- Dotfiles integration — Take your dev environment (vim, tmux, shell config) to every cluster node via GNU Stow.
Prerequisites
- Python 3.11+
- uv (for installation)
Quick Start
# Install (AWS only)
uv tool install brr-cli[aws]
# Install (both providers)
# uv tool install brr-cli[aws,nebius]
# Configure (interactive wizard)
brr configure # or: brr configure nebius
# Launch an H100
brr up aws:h100
# brr up nebius:h100
# Connect
brr attach aws:h100 # SSH
brr attach aws:h100 claude # Claude Code on the cluster
brr vscode aws:h100 # VS Code remote
Built-in templates use provider:name syntax (e.g. aws:h100). Inside a project, short names like brr up dev work automatically.
Supported clouds: AWS · Nebius
Projects
For per-repo cluster configs, initialize a project:
cd my-research-repo/
brr init
This creates:
.brr/
aws/
dev.yaml # Single GPU for development
cluster.yaml # CPU head + GPU workers
setup.sh # Project-specific dependencies
config.env # Project config (overrides global)
Templates are standard Ray YAML — edit them or add your own. Inside a project, use short names:
brr up # launches DEFAULT_TEMPLATE (set in .brr/config.env)
brr up dev # launches .brr/aws/dev.yaml
brr up cluster # launches .brr/aws/cluster.yaml
brr attach dev # SSH into dev cluster
brr down dev # tear down
If your project uses uv, brr init automatically adds brr-cli and ray to a brr dependency group. The cluster uses your project-locked versions — no manual setup needed.
Project config (.brr/config.env) overrides global settings (~/.brr/config.env). Use it for project-specific settings like idle timeouts or dotfiles.
Templates
Built-in templates
| Template | Instance | GPU | Workers |
|---|---|---|---|
aws:cpu |
t3.2xlarge | — | 0-2 |
aws:l4 |
gr6.4xlarge | 1x L4 | — |
aws:h100 |
p5.4xlarge | 8x H100 | — |
aws:cpu-l4 |
t3.2xlarge + g6.4xlarge | 1x L4 | 0-4 |
nebius:cpu |
8vcpu-32gb | — | 0-2 |
nebius:h100 |
1gpu-16vcpu-200gb | 1x H100 | — |
nebius:cpu-h100s |
8vcpu-32gb + 8gpu-128vcpu-1600gb | 8x H100 | 0-4 |
Overrides
Override template values inline:
brr up aws:cpu instance_type=t3.xlarge max_workers=4
brr up aws:h100 spot=true
brr up dev region=us-west-2
Preview the rendered config without launching:
brr up dev --dry-run
See available overrides for a template:
brr templates show dev
Multi-provider
Use the provider prefix for built-in templates:
brr up aws:h100
brr up nebius:h100
brr attach nebius:h100
brr down nebius:h100
Both providers can run simultaneously.
Customization
Node setup
~/.brr/setup.sh runs on every node boot. It installs packages, mounts shared storage, sets up Python/Ray, GitHub SSH keys, AI coding tools, dotfiles, and the idle shutdown daemon.
Edit it to customize:
vim ~/.brr/setup.sh
Project-specific dependencies go in .brr/{provider}/setup.sh (created by brr init), which runs after the global setup.
AI coding tools
Install AI coding assistants on every cluster node:
brr configure tools # select Claude Code, Codex, and/or Gemini CLI
Then connect and start coding:
brr attach dev claude
Dotfiles
Set a dotfiles repo to sync your dev environment to every node:
brr config set DOTFILES_REPO "https://github.com/user/dotfiles"
The repo is cloned to ~/dotfiles and installed via install.sh (if present) or GNU Stow.
Image baking
Bake the global setup into AMIs/images for fast boot:
brr bake aws # bake both CPU + GPU AMIs
brr bake status # check if baked images are up to date
After baking, clusters boot from the pre-built image. Only project-specific deps need to install. brr up warns when setup.sh has changed since the last bake.
Idle shutdown
A systemd daemon monitors CPU, GPU, and SSH activity. When all signals are idle for the configured timeout, the instance shuts down.
Configure in ~/.brr/config.env:
IDLE_SHUTDOWN_ENABLED="true"
IDLE_SHUTDOWN_TIMEOUT_MIN="30"
IDLE_SHUTDOWN_CPU_THRESHOLD="10"
IDLE_SHUTDOWN_GRACE_MIN="15"
The grace period prevents shutdown during initial setup. Monitor on a node with journalctl -u idle-shutdown -f.
Node caching (Nebius)
By default, Nebius nodes are deleted on scale-down. Unlike AWS, stopped Nebius instances still incur disk charges, so deleting is cheaper.
To keep nodes stopped instead (faster restart, but you pay for disks while idle), enable caching in your template's provider config:
provider:
cache_stopped_nodes: true
AWS nodes are cached (stopped) by default — this setting only affects Nebius.
Commands
| Command | Description |
|---|---|
brr up TEMPLATE [OVERRIDES...] |
Launch or update a cluster (aws:h100, dev, or path.yaml) |
brr up TEMPLATE --dry-run |
Preview rendered config without launching |
brr down TEMPLATE |
Stop a cluster (instances preserved for fast restart) |
brr down TEMPLATE --delete |
Terminate all instances and remove staging files |
brr attach TEMPLATE [COMMAND] |
SSH into head node, optionally run a command (e.g. claude) |
brr list [--all] |
List clusters (project-scoped by default, --all for everything) |
brr clean [TEMPLATE] |
Terminate stopped (cached) instances |
brr vscode TEMPLATE |
Open VS Code on a running cluster |
brr templates list |
List built-in templates |
brr templates show TEMPLATE |
Show template config and overrides |
brr init |
Initialize a project (interactive provider selection) |
brr configure [cloud|tools|general] |
Interactive setup (cloud provider, AI tools, settings) |
brr config [list|get|set|path] |
View and manage configuration |
brr bake [aws|nebius] |
Bake setup into cloud images |
brr bake status |
Check if baked images are up to date |
brr completion [bash|zsh|fish] |
Shell completion (--install to add to shell rc) |
brr nuke [aws|nebius] |
Tear down all cloud resources |
Cloud Setup
AWS Setup
- Attach the IAM policy to your IAM user
- Install the AWS CLI and run
aws configure - (Optional) For GitHub SSH access on clusters, authenticate the GitHub CLI:
gh auth login gh auth refresh -h github.com -s admin:public_key
- Run the setup wizard:
brr configure aws
Nebius Setup
- Install the Nebius CLI and run
nebius init - Create a service account with editor permissions:
TENANT_ID="<your-tenant-id>" # from console.nebius.com → Administration SA_ID=$(nebius iam service-account create \ --name brr-cluster --format json | jq -r '.metadata.id') EDITORS_GROUP_ID=$(nebius iam group get-by-name \ --name editors --parent-id $TENANT_ID --format json | jq -r '.metadata.id') nebius iam group-membership create \ --parent-id $EDITORS_GROUP_ID --member-id $SA_ID
- Generate credentials:
mkdir -p ~/.nebius nebius iam auth-public-key generate \ --service-account-id $SA_ID --output ~/.nebius/credentials.json
- Run the setup wizard:
brr configure nebius
Acknowledgments
This project started as a fork of aws_wiz by Bes and has been inspired by discussions with colleagues from the Encode: AI for Science Fellowship.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file brr_cli-0.2.0.tar.gz.
File metadata
- Download URL: brr_cli-0.2.0.tar.gz
- Upload date:
- Size: 188.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e6e72a6f8e1b4f4d3b468d43adb286dda01423b509743dfb858f367877856aff
|
|
| MD5 |
d5ae2763b7b45cdb768a8e7b6d424a5b
|
|
| BLAKE2b-256 |
b5ccf69cbdf149f032c7fccece50181ae7fa7c4c55cf2d84a6fee0f7c88d4361
|
Provenance
The following attestation bundles were made for brr_cli-0.2.0.tar.gz:
Publisher:
publish.yml on joncarter1/brr
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
brr_cli-0.2.0.tar.gz -
Subject digest:
e6e72a6f8e1b4f4d3b468d43adb286dda01423b509743dfb858f367877856aff - Sigstore transparency entry: 962936970
- Sigstore integration time:
-
Permalink:
joncarter1/brr@522ed702483cabbc0eef25ba5d123d8920a0f232 -
Branch / Tag:
refs/tags/v0.2.0 - Owner: https://github.com/joncarter1
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@522ed702483cabbc0eef25ba5d123d8920a0f232 -
Trigger Event:
release
-
Statement type:
File details
Details for the file brr_cli-0.2.0-py3-none-any.whl.
File metadata
- Download URL: brr_cli-0.2.0-py3-none-any.whl
- Upload date:
- Size: 74.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
767ff6f858d8eff4ea12c16851c398e1fbb3ee0cbf928d936b8df0a467618223
|
|
| MD5 |
8f94dfe0003510f05be9a1ee91178669
|
|
| BLAKE2b-256 |
b0b3a4376a8ffa02cfdf04a6fb6dea80713efcb9e479008cdb4bb6c20dc3dbf4
|
Provenance
The following attestation bundles were made for brr_cli-0.2.0-py3-none-any.whl:
Publisher:
publish.yml on joncarter1/brr
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
brr_cli-0.2.0-py3-none-any.whl -
Subject digest:
767ff6f858d8eff4ea12c16851c398e1fbb3ee0cbf928d936b8df0a467618223 - Sigstore transparency entry: 962937000
- Sigstore integration time:
-
Permalink:
joncarter1/brr@522ed702483cabbc0eef25ba5d123d8920a0f232 -
Branch / Tag:
refs/tags/v0.2.0 - Owner: https://github.com/joncarter1
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@522ed702483cabbc0eef25ba5d123d8920a0f232 -
Trigger Event:
release
-
Statement type: