AWS GPU Capacity Management Agent with Pydantic AI

AWS AI Capacity - GPU Capacity Management Agent

A Pydantic AI-powered agent for managing and querying AWS GPU capacity, with a Chainlit chat interface and CLI for automated reporting.

Features

  • SageMaker Training Plans: Search offerings, list plans, check utilization
  • EC2 Capacity: Query reservations, check instance availability
  • GPU Specs: Compare instance types and their capabilities
  • Chat Interface: Interactive Chainlit UI for ad-hoc queries
  • CLI Reports: Automated report generation for cron jobs

Quick Start (uvx)

Run directly without installing:

# With inline environment variables
AWS_PROFILE=myprofile uvx aws-ai-capacity chat "What GPU capacity is available?"

# Or export first
export AWS_PROFILE=myprofile
uvx aws-ai-capacity chat "What p5 training plans are available?"
uvx aws-ai-capacity report daily

Installation (Development)

# Clone and install
git clone https://github.com/drewdresser/aws-ai-capacity.git
cd aws-ai-capacity
uv sync

# Copy environment template
cp .env.example .env

Configuration

The agent uses AWS Bedrock with Claude Opus 4.5 (anthropic.claude-opus-4-5-20251101-v1:0) by default.

Required AWS permissions:

  • bedrock:InvokeModel - For Claude model access
  • sagemaker:SearchTrainingPlanOfferings - Query available plans
  • sagemaker:ListTrainingPlans - List existing plans
  • sagemaker:DescribeTrainingPlan - Get plan details
  • ec2:DescribeCapacityReservations - Query reservations
  • ec2:DescribeInstanceTypeOfferings - Check availability
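The permissions above can be collected into a single IAM policy document. The following is a minimal sketch, not something shipped with the project: the `build_policy` helper, the `Sid` value, and the wildcard `Resource` are assumptions made for illustration, and in practice you would likely scope resources more tightly.

```python
import json

# Actions copied from the "Required AWS permissions" list above.
REQUIRED_ACTIONS = [
    "bedrock:InvokeModel",
    "sagemaker:SearchTrainingPlanOfferings",
    "sagemaker:ListTrainingPlans",
    "sagemaker:DescribeTrainingPlan",
    "ec2:DescribeCapacityReservations",
    "ec2:DescribeInstanceTypeOfferings",
]


def build_policy(actions):
    """Return an IAM policy document (as a dict) allowing the given actions.

    Hypothetical helper for illustration; not part of aws-ai-capacity.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AwsAiCapacityAgent",  # assumed name
                "Effect": "Allow",
                "Action": sorted(actions),
                # Assumption: "*" keeps the sketch short; narrow this in practice.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the IAM console or saved to a file.
    print(json.dumps(build_policy(REQUIRED_ACTIONS), indent=2))
```

Running the script prints a JSON policy that can be attached to the role or user behind `AWS_PROFILE`.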

Environment Variables

For development or to override defaults, create a .env file:

# AWS credentials
AWS_PROFILE=myprofile
AWS_REGION=us-east-1

# Override the default model (this example uses the "us." cross-region inference profile ID)
BEDROCK_MODEL_ID=us.anthropic.claude-opus-4-5-20251101-v1:0
BEDROCK_REGION=us-east-1

# Other options
AGENT_MAX_RETRIES=3
LOG_LEVEL=DEBUG

See .env.example for all available options.
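To show how these variables might be consumed, here is a rough sketch of a settings loader built only on the standard library. It is not the project's actual `config.py`: the `Settings` class, the `load_settings` function, and every default except the model ID (taken from the Configuration section) are assumptions, as is the choice to fall back from `BEDROCK_REGION` to `AWS_REGION`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default model ID as stated in the Configuration section.
DEFAULT_MODEL_ID = "anthropic.claude-opus-4-5-20251101-v1:0"


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; field names are illustrative."""

    aws_profile: Optional[str]
    aws_region: str
    bedrock_model_id: str
    bedrock_region: str
    agent_max_retries: int
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    # Assumed default region; the real project may differ.
    region = env.get("AWS_REGION", "us-east-1")
    return Settings(
        aws_profile=env.get("AWS_PROFILE"),
        aws_region=region,
        bedrock_model_id=env.get("BEDROCK_MODEL_ID", DEFAULT_MODEL_ID),
        # Assumption: fall back to AWS_REGION when BEDROCK_REGION is unset.
        bedrock_region=env.get("BEDROCK_REGION", region),
        # Assumed defaults for the remaining options.
        agent_max_retries=int(env.get("AGENT_MAX_RETRIES", "3")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"LOG_LEVEL": "debug"})`.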

Usage

Chat Interface

Start the Chainlit UI:

uv run aws-ai-capacity serve

Then open http://localhost:8000 in your browser.

CLI Commands

# Single query
uv run aws-ai-capacity chat "What p5 training plans are available?"

# Debug mode - shows all tool calls made by the agent
uv run aws-ai-capacity chat "What p5 training plans are available?" --debug

# Generate reports
uv run aws-ai-capacity report daily
uv run aws-ai-capacity report availability
uv run aws-ai-capacity report training-plans

# Save report to file
uv run aws-ai-capacity report daily -o capacity-report.md

# Generate all reports (for cron)
uv run aws-ai-capacity cron-report -d ./reports

# List GPU instance types
uv run aws-ai-capacity list-instance-types

Cron Job Example

# Daily capacity report at 8am
0 8 * * * cd /path/to/aws-ai-capacity && uv run aws-ai-capacity cron-report -d ./reports

Example Questions

  • "What p5.48xlarge training plans are available for the next week?"
  • "Show me all active capacity reservations"
  • "Which regions have p4d.24xlarge available?"
  • "Compare the specs of p5 vs p4d instances"
  • "Generate a daily capacity report"
  • "Search for available H100 capacity"

Project Structure

ai-capacity/
├── src/ai_capacity/
│   ├── agent/          # Pydantic AI agent definition
│   ├── tools/          # AWS API tools (SageMaker, EC2)
│   ├── cli/            # Typer CLI commands
│   ├── ui/             # Chainlit chat interface
│   └── config.py       # Settings management
├── chainlit.md         # Chat welcome message
└── .env.example        # Environment template

Development

# Install with dev dependencies
uv sync --all-extras

# Run tests
uv run pytest

# Lint
uv run ruff check .
