poormanray

A minimal alternative to Ray for distributed data processing on EC2 instances. Manage clusters, run commands, and distribute jobs without the complexity of a full Ray deployment.

Installation

Requires Python 3.11+.

# Install as a CLI tool (recommended)
uv tool install poormanray

# Or install as a library (use one of the following)
uv pip install poormanray
pip install poormanray

Quick Start

# Create a cluster of 5 instances
pmr create --name mycluster --number 5 --instance-type i4i.2xlarge

# If you need an Ai2 project tag, use either --project...
pmr create --name mycluster --project my-ai2-project --number 5
# ...or name@project syntax:
pmr create --name mycluster@my-ai2-project --number 5

# List instances in the cluster
pmr list --name mycluster

# Run a command on all instances (single quotes keep $(hostname) from expanding locally)
pmr run --name mycluster --command 'echo "Hello from $(hostname)"'

# Terminate the cluster when done
pmr terminate --name mycluster

Prerequisites

  • AWS credentials configured via:
    • Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
    • AWS CLI (aws configure)
    • Credentials file (~/.aws/credentials)
  • SSH key pair in ~/.ssh/ (id_rsa, id_ed25519, etc.)

Project Selection (--project or name@project)

When creating instances, you can optionally set an Ai2 project tag in two equivalent ways:

# Explicit flag
pmr create --name mycluster --project my-ai2-project

# Inline syntax in --name (split on the first @)
pmr create --name mycluster@my-ai2-project

Rules enforced by the CLI:

  • --name is required and must be non-empty.
  • Do not combine --project with a --name value that already contains @; this raises a usage error.
  • If neither --project nor @project is provided, pmr logs a warning because some environments require project tagging to launch instances.
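
The splitting rule above can be pictured with plain shell parameter expansion. This is only a sketch of the described behavior, not pmr's actual code; the function name split_name_project and its output format are made up for illustration.

```shell
# Hypothetical helper (not part of pmr): split "name@project" on the
# first "@" and reject the combinations the rules above forbid.
# usage: split_name_project NAME [PROJECT_FLAG_VALUE]
split_name_project() {
  local raw="$1" project_flag="${2:-}"
  if [ -z "$raw" ]; then
    echo "error: --name is required" >&2
    return 2
  fi
  case "$raw" in
    *@*)
      if [ -n "$project_flag" ]; then
        echo "error: do not combine --project with name@project" >&2
        return 2
      fi
      # ${raw%%@*} keeps the text before the first "@";
      # ${raw#*@} keeps everything after it.
      printf 'name=%s project=%s\n' "${raw%%@*}" "${raw#*@}"
      ;;
    *)
      printf 'name=%s project=%s\n' "$raw" "$project_flag"
      ;;
  esac
}
```

For example, split_name_project mycluster@my-ai2-project prints name=mycluster project=my-ai2-project, the same result as passing mycluster and my-ai2-project as two separate arguments.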

For Ai2 users, valid project names are listed here:

Commands

Cluster Management

create - Launch EC2 instances

pmr create --name mycluster --number 5 --instance-type i4i.2xlarge

# With explicit project flag
pmr create --name mycluster --project my-ai2-project --number 5

# Equivalent inline project syntax
pmr create --name mycluster@my-ai2-project --number 5

# Options:
#   -n, --name          Cluster name (required)
#   -p, --project       Ai2 project name (or specify as name@project)
#   -N, --number        Number of instances (default: 1)
#   -t, --instance-type EC2 instance type (default: i4i.xlarge)
#   -r, --region        AWS region (default: us-east-1)
#   -a, --ami-id        Custom AMI ID (default: latest Amazon Linux 2023 AMI from SSM)
#   -d, --detach        Don't wait for instances to be ready
#   -j, --parallelism   Max concurrent instance creations
#   --zone              Availability zone
#   --storage-type      EBS volume type (gp3, gp2, io1, io2, io2e, st1, sc1)
#   --storage-size      Root volume size in GB
#   --storage-iops      IOPS for the root volume

list - Show cluster instances

pmr list --name mycluster

# Output includes: instance ID, name, type, state, IP, status checks, tags

terminate - Destroy instances

pmr terminate --name mycluster

# Terminate specific instances only:
pmr terminate --name mycluster -i i-abc123 -i i-def456

# Cap concurrent termination requests
pmr terminate --name mycluster --parallelism 4

pause / resume - Stop and start instances

pmr pause --name mycluster    # Stop instances (preserves EBS)
pmr resume --name mycluster   # Start stopped instances

# Control stop/start concurrency
pmr pause --name mycluster --parallelism 8
pmr resume --name mycluster --parallelism 8

wait - Wait for instances to be ready

Polls instance status until all instances are running and passing health checks. Optionally runs a readiness command via SSH on each instance.

# Wait for all instances to be healthy
pmr wait --name mycluster

# Wait with a custom readiness check
pmr wait --name mycluster --command "test -f /tmp/ready"

# Wait with a timeout and custom poll interval
pmr wait --name mycluster --timeout 300 --poll-interval 15

# Options:
#   --poll-interval     Seconds between status checks (default: 10)
#   -T, --timeout       Timeout in seconds (default: wait indefinitely)
#   -c, --command       Command that must exit 0 for instance to be considered ready
#   -s, --script        Script that must exit 0 for instance to be considered ready
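
The "must exit 0" readiness semantics, together with the timeout and poll interval, can be sketched locally as a small retry loop. This illustrates the idea only: pmr itself checks EC2 status and runs the command over SSH, and the helper name wait_until is hypothetical.

```shell
# Hypothetical helper (not part of pmr): retry a readiness command
# until it exits 0 or the timeout elapses.
# usage: wait_until TIMEOUT_SECONDS POLL_INTERVAL_SECONDS COMMAND [ARGS...]
# A TIMEOUT_SECONDS of 0 means wait indefinitely.
wait_until() {
  local timeout="$1" interval="$2" start now
  shift 2
  start=$(date +%s)
  while ! "$@"; do
    now=$(date +%s)
    if [ "$timeout" -gt 0 ] && [ $((now - start)) -ge "$timeout" ]; then
      echo "timed out after ${timeout}s" >&2
      return 1
    fi
    sleep "$interval"
  done
}
```

A call such as wait_until 300 15 test -f /tmp/ready mirrors the --timeout 300 --poll-interval 15 example above, but runs on the local machine.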

ssh - Connect to an instance

Opens an interactive SSH session to an instance. If multiple instances match, presents a selection menu.

# SSH into an instance (interactive selection if multiple match)
pmr ssh --name mycluster

# SSH into a specific instance
pmr ssh --name mycluster -i i-abc123

version - Print version

pmr version

Command Execution

run - Execute commands on instances

# Run a command
pmr run --name mycluster --command "df -h"

# Run a script
pmr run --name mycluster --script ./my-script.sh

# Run in background (detached)
pmr run --name mycluster --command "long-running-job.sh" --detach

# Auto-terminate after command completes
pmr run --name mycluster --command "./job.sh" --spindown

# Run with a timeout
pmr run --name mycluster --command "df -h" --timeout 60

# Run as a different user
pmr run --name mycluster --command "whoami" --instance-username ubuntu

# Cap concurrent command execution
pmr run --name mycluster --command "df -h" --parallelism 4

run requires exactly one of --command or --script.

map - Distribute scripts across instances

Distributes scripts evenly across all instances and runs them in parallel. map expects --script to point to a directory containing executable files.

# Create scripts directory with executable scripts
ls scripts/
# job_001.sh  job_002.sh  job_003.sh  job_004.sh  job_005.sh

# Distribute and run across cluster
pmr map --name mycluster --script scripts/

# Scripts are shuffled, distributed evenly, and executed in detached screen sessions.
# Each instance gets a run_all.sh that runs its assigned scripts sequentially,
# with progress logged to run_all.log.

# Stop instances after their scripts complete
pmr map --name mycluster --script scripts/ --spindown
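
To get a feel for what "distributed evenly" means, here is a minimal sketch that deals scripts out round-robin. Round-robin is an assumption about the assignment strategy, the shuffle step is left out so the result is deterministic, and assign_scripts is a hypothetical name rather than anything pmr exposes.

```shell
# Hypothetical sketch (not pmr's code): print "host_index<TAB>script"
# pairs, dealing scripts out round-robin so that per-host counts
# differ by at most one.
# usage: assign_scripts NUM_HOSTS SCRIPT...
assign_scripts() {
  local hosts="$1" i=0 script
  shift
  for script in "$@"; do
    printf '%s\t%s\n' $((i % hosts)) "$script"
    i=$((i + 1))
  done
}
```

With the five scripts listed above and two hosts, host 0 receives job_001.sh, job_003.sh, and job_005.sh, and host 1 receives job_002.sh and job_004.sh.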

S3 Bucket Management

create_bucket - Create an S3 bucket

Creates a bucket with private visibility, standard tags, intelligent tiering (after 7 days), and hard-delete lifecycle (after 7 days). Bucket names must follow AWS naming rules (lowercase, 3-63 chars, no consecutive periods).

pmr create_bucket --name my-data-bucket@my-ai2-project

# Customize lifecycle timing
pmr create_bucket --name my-data-bucket@my-ai2-project \
  --tier-after-days 14 --expire-after-days 30

# Options:
#   -n, --name              Bucket name (required)
#   -p, --project           Ai2 project name (or specify as name@project)
#   -r, --region            AWS region (default: us-east-1)
#   --tier-after-days       Days before INTELLIGENT_TIERING transition (default: 7)
#   --expire-after-days     Days before hard-delete expiration (default: 7)
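
The naming rules mentioned above can be checked before calling pmr. The sketch below covers the core S3 constraints (length, allowed characters, first and last character, no consecutive periods); it is not pmr's validator, and is_valid_bucket_name is a hypothetical name. Apply it to the part before @ if you use the name@project syntax.

```shell
# Hypothetical pre-flight check (not part of pmr) for the core S3
# bucket naming rules.
is_valid_bucket_name() {
  local name="$1"
  case "$name" in
    *..*) return 1 ;;  # consecutive periods are not allowed
  esac
  # 3-63 characters in total: a leading and a trailing lowercase
  # letter or digit around 1-61 characters drawn from lowercase
  # letters, digits, "." and "-". LC_ALL=C keeps a-z ASCII-only.
  printf '%s\n' "$name" |
    LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$'
}
```

A typical use is is_valid_bucket_name my-data-bucket && pmr create_bucket --name my-data-bucket@my-ai2-project.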

update_bucket - Backfill missing bucket settings

Adds missing default tags and lifecycle rules without overwriting existing values. Visibility settings are never changed.

pmr update_bucket --name my-data-bucket@my-ai2-project

delete_bucket - Delete an S3 bucket

Deletes a bucket. Fails if the bucket is not empty.

pmr delete_bucket --name my-data-bucket

# Skip confirmation prompt
pmr delete_bucket --name my-data-bucket --yes

If the bucket is not empty, pmr will suggest running s5cmd rm s3://<bucket>/* first.

Cluster Tag Management

update_cluster - Backfill missing cluster tags

Adds missing default tags (Project, Contact, Tool, ai2-project) to EC2 instances without overwriting existing tag values.

pmr update_cluster --name mycluster@my-ai2-project

# Update specific instances only
pmr update_cluster --name mycluster -i i-abc123 -i i-def456

Instance Setup

setup - Configure AWS credentials

Copies your AWS credentials to all instances in the cluster. Also installs GNU screen.

pmr setup --name mycluster

setup-d2tk - Install Dolma2 Toolkit

Sets up RAID drives, installs Rust, and builds datamap-rs and minhash-rs. Automatically runs setup first.

pmr setup-d2tk --name mycluster --detach

setup-dolma-python - Install Dolma Python

Installs Python 3.12, uv, and the dolma package. Automatically runs setup first.

pmr setup-dolma-python --name mycluster --detach

setup-decon - Install DECON toolkit

Sets up the DECON pipeline with Rust toolchain. Automatically runs setup first. Each instance receives its host index as PMR_HOST_INDEX for coordinated work.

pmr setup-decon --name mycluster --github-token ghp_xxx --detach

# Options:
#   -g, --github-token  GitHub personal access token for cloning private repos
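
One way a job script might use PMR_HOST_INDEX for coordinated work is to have each host pick every Nth input. The sketch below assumes the index is zero-based and that the total host count is supplied separately (NUM_HOSTS is a made-up variable, not something pmr is documented to set); my_shard is likewise a hypothetical helper.

```shell
# Hypothetical sketch: print the inputs that one host should handle.
# usage: my_shard HOST_INDEX NUM_HOSTS ITEM...
# HOST_INDEX is assumed to be zero-based.
my_shard() {
  local index="$1" hosts="$2" i=0 item
  shift 2
  for item in "$@"; do
    if [ $((i % hosts)) -eq "$index" ]; then
      printf '%s\n' "$item"
    fi
    i=$((i + 1))
  done
}
```

On an instance this could be called as my_shard "$PMR_HOST_INDEX" "$NUM_HOSTS" part-*.jsonl, where the file pattern is only a placeholder.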

Common Options

Base options (all commands)

Option     Short  Description
--name     -n     Resource name (required). You can encode project as name@project.
--project  -p     Ai2 project name (equivalent to using name@project)
--region   -r     AWS region (default: us-east-1)
--owner    -o     Owner tag for cost tracking (defaults to $USER)

Instance options (cluster/instance commands only)

Option                    Short    Description
--instance-id             -i       Target specific instance(s), repeatable
--ssh-key-path            -k       Path to SSH private key (auto-detected from ~/.ssh/)
--detach/--no-detach      -d/-nd   Run in background via screen
--parallelism             -j       Max concurrent workers for create, terminate, pause, resume, and run
--instance-type           -t       EC2 instance type (default: i4i.xlarge)
--number                  -N       Number of instances to create (default: 1)
--ami-id                  -a       Custom AMI ID
--timeout                 -T       Timeout in seconds for command execution
--spindown/--no-spindown  -S/-NS   Self-terminate instance after command completes
--command                 -c       Command to execute on instances
--script                  -s       Path to script file or directory to execute
--instance-username       -u       SSH username (default: ec2-user)

How It Works

  1. Instance Tagging: Instances are tagged with Project (cluster name), Contact (owner), and Tool (poormanray). If project is provided, ai2-project is also added.

  2. SSH Key Management: Your local SSH key is automatically imported to EC2 when creating instances.

  3. Remote Execution: Commands are executed over SSH using paramiko. Long-running commands use GNU screen for detached execution.

  4. Script Distribution: The map command base64-encodes scripts, transfers them to instances, and executes them in parallel.
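
The base64 step in item 4 can be tried locally. Encoding a script into a single line of safe characters avoids quoting problems when the payload travels inside an SSH command. The sketch below is an illustration under that assumption, not pmr's transfer code; encode_script and decode_and_run are hypothetical names.

```shell
# Hypothetical illustration of base64 transport (not pmr's code).
# encode_script FILE      prints FILE as a single base64 line
# decode_and_run PAYLOAD  decodes PAYLOAD to a temp file and runs it
encode_script() {
  base64 < "$1" | tr -d '\n'
}

decode_and_run() {
  local tmp status=0
  tmp=$(mktemp)
  printf '%s' "$1" | base64 --decode > "$tmp"
  # Capture the script's exit status without aborting under "set -e".
  bash "$tmp" || status=$?
  rm -f "$tmp"
  return "$status"
}
```

Running decode_and_run "$(encode_script scripts/job_001.sh)" executes the script from its encoded form, which is roughly what would happen on the remote side after transfer.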

Examples

Data Processing Pipeline

# 1. Create a cluster
pmr create --name dataproc --number 10 --instance-type i4i.4xlarge

# 2. Wait for all instances to be ready
pmr wait --name dataproc

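# 3. Set up the environment (--detach returns immediately; confirm setup has
#    finished on every instance before distributing jobs)
pmr setup-dolma-python --name dataproc --detach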

# 4. Distribute processing scripts
pmr map --name dataproc --script ./processing-jobs/

# 5. Monitor progress (SSH into an instance to check)
pmr ssh --name dataproc
# or check recent log output across all instances:
pmr run --name dataproc --command "tail -n 20 ~/*/run_all.log"

# 6. Clean up
pmr terminate --name dataproc

Quick One-Off Command

# Create, run, and terminate in one go
pmr create --name quickjob --number 1
pmr run --name quickjob --command "./my-job.sh" --spindown
# Instance auto-terminates after job completes
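
If you would rather clean up from the local side, for instance so the instance is terminated even when the job fails, the flow can be wrapped in a small function. This is a sketch that uses only flags shown in this README. It assumes pmr run reports failure through its exit status and that pmr terminate does not prompt, neither of which is confirmed here. The PMR variable and run_one_off are hypothetical; setting PMR=echo gives a dry run.

```shell
# Hypothetical wrapper (not part of pmr). PMR can be overridden, e.g.
# PMR=echo prints the commands instead of running them.
PMR="${PMR:-pmr}"

# usage: run_one_off CLUSTER_NAME COMMAND
run_one_off() {
  local name="$1" job="$2" status=0
  "$PMR" create --name "$name" --number 1 || return 1
  # Run attached so a failure is visible to this wrapper.
  "$PMR" run --name "$name" --command "$job" || status=$?
  # Always terminate, whether the job succeeded or failed.
  "$PMR" terminate --name "$name"
  return "$status"
}
```

Calling run_one_off quickjob ./my-job.sh then creates the instance, runs the job, and terminates the instance in sequence.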

License

Apache-2.0
