AWS GPU Capacity Management Agent with Pydantic AI

AWS AI Capacity - GPU Capacity Management Agent

A Pydantic AI-powered agent for managing and querying AWS GPU capacity, with a Chainlit chat interface and CLI for automated reporting.

Features

  • SageMaker Training Plans: Search offerings, list plans, check utilization
  • EC2 Capacity: Query reservations, check instance availability
  • Spot Capacity: Spot placement scores (1-10) and price history across regions
  • On-Demand Capacity: Launch-and-terminate availability check (opt-in; launches a real instance, so it incurs EC2 charges)
  • GPU Utilization: View running GPU instances with spot vs on-demand breakdown
  • GPU Specs: Compare instance types and their capabilities
  • Chat Interface: Interactive Chainlit UI for ad-hoc queries
  • CLI Reports: Automated report generation for cron jobs
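
To make the Spot placement score feature concrete, here is a minimal sketch of ranking regions by score with the EC2 `GetSpotPlacementScores` API. It is not taken from this project's `tools/` package: the function name `best_spot_regions` and its parameters are hypothetical, and the client is passed in so the logic can be tried without AWS credentials. Pagination is ignored for brevity.

```python
from typing import Any


def best_spot_regions(
    ec2_client: Any,
    instance_type: str,
    regions: list[str],
    target_capacity: int = 1,
) -> list[tuple[str, int]]:
    """Return (region, score) pairs sorted from highest to lowest score.

    Hypothetical helper: `ec2_client` is expected to behave like
    boto3.client("ec2"). Scores range from 1 (unlikely) to 10 (likely).
    """
    response = ec2_client.get_spot_placement_scores(
        InstanceTypes=[instance_type],
        TargetCapacity=target_capacity,
        RegionNames=regions,
    )
    # Each entry carries the region name and its placement score.
    scores = [
        (item["Region"], item["Score"])
        for item in response["SpotPlacementScores"]
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Usage with real credentials (assumes boto3 is installed):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   best_spot_regions(ec2, "p5.48xlarge", ["us-east-1", "us-west-2"])
```

A high score indicates that a Spot request of the given size is likely to succeed at the time of the query; it does not guarantee capacity.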

Quick Start (uvx)

Run directly without installing:

# With inline environment variables
AWS_PROFILE=myprofile uvx aws-ai-capacity chat "What GPU capacity is available?"

# Or export first
export AWS_PROFILE=myprofile
uvx aws-ai-capacity chat "What p5 training plans are available?"
uvx aws-ai-capacity report daily

Installation (Development)

# Clone and install
git clone https://github.com/drewdresser/aws-ai-capacity.git
cd aws-ai-capacity
uv sync

# Copy environment template
cp .env.example .env

Configuration

The agent uses AWS Bedrock with Claude Opus 4.5 (anthropic.claude-opus-4-5-20251101-v1:0) by default.

Required AWS permissions:

  • bedrock:InvokeModel - For Claude model access
  • sagemaker:SearchTrainingPlanOfferings - Query available plans
  • sagemaker:ListTrainingPlans - List existing plans
  • sagemaker:DescribeTrainingPlan - Get plan details
  • ec2:DescribeCapacityReservations - Query reservations
  • ec2:DescribeInstanceTypeOfferings - Check availability
  • ec2:DescribeInstances - List running GPU instances
  • ec2:DescribeSpotPriceHistory - Spot price trends
  • ec2:GetSpotPlacementScores - Spot capacity scores
  • ec2:RunInstances / ec2:TerminateInstances - On-demand capacity check (opt-in)
  • ssm:GetParameter - AMI lookup for on-demand check
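
The permissions above can be collected into an IAM policy document. The sketch below is one possible way to generate it, not a policy shipped with this project: the function name `build_policy`, the statement ID, and the wildcard `Resource` are assumptions, and the wildcard should be narrowed where your account setup allows. The on-demand check actions are kept separate so they are only granted when that opt-in feature is used.

```python
import json

# Actions listed in the README that the agent needs for normal operation.
BASE_ACTIONS = [
    "bedrock:InvokeModel",
    "sagemaker:SearchTrainingPlanOfferings",
    "sagemaker:ListTrainingPlans",
    "sagemaker:DescribeTrainingPlan",
    "ec2:DescribeCapacityReservations",
    "ec2:DescribeInstanceTypeOfferings",
    "ec2:DescribeInstances",
    "ec2:DescribeSpotPriceHistory",
    "ec2:GetSpotPlacementScores",
]

# Actions only needed for the opt-in on-demand capacity check.
ON_DEMAND_CHECK_ACTIONS = [
    "ec2:RunInstances",
    "ec2:TerminateInstances",
    "ssm:GetParameter",
]


def build_policy(include_on_demand_check: bool = False) -> dict:
    """Build an IAM policy document (hypothetical helper)."""
    actions = list(BASE_ACTIONS)
    if include_on_demand_check:
        actions += ON_DEMAND_CHECK_ACTIONS
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AwsAiCapacityAgent",  # assumed name
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": "*",  # assumption: tighten for production use
            }
        ],
    }


print(json.dumps(build_policy(), indent=2))
```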

Environment Variables

For development or to override defaults, create a .env file:

# AWS credentials
AWS_PROFILE=myprofile
AWS_REGION=us-east-1

# Override the default model
BEDROCK_MODEL_ID=us.anthropic.claude-opus-4-5-20251101-v1:0
BEDROCK_REGION=us-east-1

# Other options
AGENT_MAX_RETRIES=3
LOG_LEVEL=DEBUG

See .env.example for all available options.
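
As an illustration of how these variables might be consumed, here is a small standard-library sketch. It is not the project's `config.py`: the `Settings` class, the `load_settings` function, and every fallback value except the default model ID are assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default model ID as stated in the Configuration section.
DEFAULT_MODEL_ID = "anthropic.claude-opus-4-5-20251101-v1:0"


@dataclass(frozen=True)
class Settings:
    aws_profile: Optional[str]
    aws_region: str
    bedrock_model_id: str
    bedrock_region: str
    agent_max_retries: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping (hypothetical helper).

    Fallbacks such as "us-east-1", 3 retries, and "INFO" are assumptions
    for this sketch, not documented defaults of the project.
    """
    region = env.get("AWS_REGION", "us-east-1")
    return Settings(
        aws_profile=env.get("AWS_PROFILE"),
        aws_region=region,
        bedrock_model_id=env.get("BEDROCK_MODEL_ID", DEFAULT_MODEL_ID),
        # Fall back to the general AWS region when no Bedrock region is set.
        bedrock_region=env.get("BEDROCK_REGION", region),
        agent_max_retries=int(env.get("AGENT_MAX_RETRIES", "3")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Passing the environment as a mapping keeps the function easy to test: call `load_settings({"AWS_REGION": "us-west-2"})` to see how overrides and fallbacks interact.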

Usage

Chat Interface

Start the Chainlit UI:

uv run aws-ai-capacity serve

Then open http://localhost:8000
(8000 is Chainlit's default port.)

CLI Commands

# Single query
uv run aws-ai-capacity chat "What p5 training plans are available?"

# Debug mode - shows all tool calls made by the agent
uv run aws-ai-capacity chat "What p5 training plans are available?" --debug

# Generate reports
uv run aws-ai-capacity report daily
uv run aws-ai-capacity report availability
uv run aws-ai-capacity report training-plans
uv run aws-ai-capacity report spot

# Save report to file
uv run aws-ai-capacity report daily -o capacity-report.md

# Generate all reports (for cron)
uv run aws-ai-capacity cron-report -d ./reports

# List GPU instance types
uv run aws-ai-capacity list-instance-types

Cron Job Example

# Daily capacity report at 8am
0 8 * * * cd /path/to/aws-ai-capacity && uv run aws-ai-capacity cron-report -d ./reports
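
If you would rather keep one dated file per report type than rely on `cron-report`, a small wrapper script can call the `report` subcommand with `-o` for each kind. The sketch below uses only the subcommands and the `-o` flag shown above; the function names and the `<kind>-<date>.md` file naming scheme are assumptions for illustration.

```python
import subprocess
from datetime import date
from pathlib import Path
from typing import Optional

# Report kinds listed under "CLI Commands".
REPORT_KINDS = ("daily", "availability", "training-plans", "spot")


def report_command(kind: str, output_dir: Path, day: date) -> list[str]:
    """Build the CLI invocation for one report (hypothetical helper)."""
    if kind not in REPORT_KINDS:
        raise ValueError(f"unknown report kind: {kind!r}")
    # Assumed naming scheme: e.g. reports/spot-2025-01-02.md
    output_file = output_dir / f"{kind}-{day.isoformat()}.md"
    return ["uv", "run", "aws-ai-capacity", "report", kind, "-o", str(output_file)]


def run_all_reports(output_dir: Path, day: Optional[date] = None) -> None:
    """Run every report kind, stopping at the first failure."""
    day = day or date.today()
    output_dir.mkdir(parents=True, exist_ok=True)
    for kind in REPORT_KINDS:
        subprocess.run(report_command(kind, output_dir, day), check=True)


# Usage, from the project directory:
#   run_all_reports(Path("./reports"))
```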

Example Questions

  • "What p5.48xlarge training plans are available for the next week?"
  • "Show me all active capacity reservations"
  • "Which regions have p4d.24xlarge available?"
  • "Compare the specs of p5 vs p4d instances"
  • "What are the spot placement scores for p5.48xlarge across all regions?"
  • "Show me spot price history for p4d.24xlarge in us-east-1"
  • "Show me all running GPU instances in my account"
  • "Check on-demand capacity for g5.xlarge in us-east-1"

Project Structure

ai-capacity/
├── src/ai_capacity/
│   ├── agent/          # Pydantic AI agent definition
│   ├── tools/          # AWS API tools (SageMaker, EC2, Spot/On-Demand)
│   ├── cli/            # Typer CLI commands
│   ├── ui/             # Chainlit chat interface
│   └── config.py       # Settings management
├── chainlit.md         # Chat welcome message
└── .env.example        # Environment template

Development

# Install with dev dependencies
uv sync --all-extras

# Run tests
uv run pytest

# Lint
uv run ruff check .
