AWS GPU Capacity Management Agent with Pydantic AI

AWS AI Capacity - GPU Capacity Management Agent

A Pydantic AI-powered agent for managing and querying AWS GPU capacity, with a Chainlit chat interface and a CLI for automated reporting.

Features

  • SageMaker Training Plans: Search offerings, list plans, check utilization
  • EC2 Capacity: Query reservations, check instance availability
  • GPU Specs: Compare instance types and their capabilities
  • Chat Interface: Interactive Chainlit UI for ad-hoc queries
  • CLI Reports: Automated report generation for cron jobs
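To give a feel for the EC2 capacity feature, here is a minimal sketch that summarizes active capacity reservations through the EC2 DescribeCapacityReservations API. It is not taken from the project's source: `summarize_reservations` is a hypothetical helper, and `ec2_client` is assumed to behave like a boto3 EC2 client (for example `boto3.client("ec2", region_name="us-east-1")`).

```python
from typing import Any


def summarize_reservations(ec2_client: Any) -> list[dict[str, Any]]:
    """Return one summary row per active EC2 capacity reservation.

    Hypothetical helper: ec2_client is assumed to behave like a boto3 EC2 client.
    """
    rows: list[dict[str, Any]] = []
    kwargs: dict[str, Any] = {
        "Filters": [{"Name": "state", "Values": ["active"]}],
    }
    while True:
        page = ec2_client.describe_capacity_reservations(**kwargs)
        for res in page.get("CapacityReservations", []):
            total = res["TotalInstanceCount"]
            available = res["AvailableInstanceCount"]
            rows.append(
                {
                    "id": res["CapacityReservationId"],
                    "instance_type": res["InstanceType"],
                    "total": total,
                    # Instances currently consuming the reservation.
                    "in_use": total - available,
                }
            )
        # Follow NextToken until the API reports no further pages.
        token = page.get("NextToken")
        if not token:
            return rows
        kwargs["NextToken"] = token
```

Passing the client in as an argument keeps the helper free of credentials handling and makes it easy to exercise with a stub.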

Quick Start (uvx)

Run directly without installing:

# With inline environment variables
AWS_PROFILE=myprofile uvx aws-ai-capacity chat "What GPU capacity is available?"

# Or export first
export AWS_PROFILE=myprofile
uvx aws-ai-capacity chat "What p5 training plans are available?"
uvx aws-ai-capacity report daily

Installation (Development)

# Clone and install
git clone https://github.com/drewdresser/aws-ai-capacity.git
cd aws-ai-capacity
uv sync

# Copy environment template
cp .env.example .env

Configuration

The agent uses Amazon Bedrock for its LLM, so it authenticates with your existing AWS credentials.

Required AWS permissions:

  • bedrock:InvokeModel - For Claude model access
  • sagemaker:SearchTrainingPlanOfferings - Query available plans
  • sagemaker:ListTrainingPlans - List existing plans
  • sagemaker:DescribeTrainingPlan - Get plan details
  • ec2:DescribeCapacityReservations - Query reservations
  • ec2:DescribeInstanceTypeOfferings - Check availability
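The permissions above can be collected into a single IAM policy document. The sketch below builds one in Python and prints it as JSON. The helper name `build_policy`, the `Sid` value, and the use of `"Resource": "*"` are illustrative assumptions, not something the project prescribes; narrow the resource scope wherever the service supports resource-level permissions.

```python
import json

# Actions copied from the permissions list above.
REQUIRED_ACTIONS = [
    "bedrock:InvokeModel",
    "sagemaker:SearchTrainingPlanOfferings",
    "sagemaker:ListTrainingPlans",
    "sagemaker:DescribeTrainingPlan",
    "ec2:DescribeCapacityReservations",
    "ec2:DescribeInstanceTypeOfferings",
]


def build_policy(actions: list[str]) -> dict:
    """Build an IAM policy document that allows the given actions.

    Assumption: Resource "*" is used for brevity only.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GpuCapacityAgent",  # hypothetical statement id
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(REQUIRED_ACTIONS), indent=2))
```

The printed JSON can be pasted into the IAM console or attached to a role with your usual tooling.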

Configure in .env:

AWS_REGION=us-east-1
AWS_PROFILE=default  # Optional
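For readers who want to see how these two variables might be consumed, here is a minimal standard-library sketch. It is not the project's `config.py`, which may use a different mechanism; `AwsSettings` and `load_settings` are hypothetical names, the `us-east-1` fallback simply mirrors the example above, and the sketch assumes the `.env` values have already been exported into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AwsSettings:
    region: str
    profile: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> AwsSettings:
    """Read AWS_REGION and AWS_PROFILE from an environment-like mapping."""
    # Assumption: fall back to us-east-1, the region used in the .env example.
    region = env.get("AWS_REGION", "").strip() or "us-east-1"
    # AWS_PROFILE is optional; treat an empty value as unset.
    profile = env.get("AWS_PROFILE", "").strip() or None
    return AwsSettings(region=region, profile=profile)
```

Accepting any mapping rather than reading `os.environ` directly keeps the function easy to test with a plain dict.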

Usage

Chat Interface

Start the Chainlit UI:

uv run aws-ai-capacity serve

Then open http://localhost:8000 in your browser.

CLI Commands

# Single query
uv run aws-ai-capacity chat "What p5 training plans are available?"

# Generate reports
uv run aws-ai-capacity report daily
uv run aws-ai-capacity report availability
uv run aws-ai-capacity report training-plans

# Save report to file
uv run aws-ai-capacity report daily -o capacity-report.md

# Generate all reports (for cron)
uv run aws-ai-capacity cron-report -d ./reports

# List GPU instance types
uv run aws-ai-capacity list-instance-types

Cron Job Example

# Daily capacity report at 8am
0 8 * * * cd /path/to/aws-ai-capacity && uv run aws-ai-capacity cron-report -d ./reports
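If you would rather drive individual reports from a script than rely on `cron-report`, a wrapper along the following lines is one option. It only uses the `report <type> -o <file>` form shown in the CLI commands; the dated file naming and the `report_commands` helper are illustrative assumptions and do not describe the layout that `cron-report` itself produces.

```python
import datetime as dt
from pathlib import Path

# Report names taken from the CLI examples above.
REPORT_TYPES = ("daily", "availability", "training-plans")


def report_commands(out_dir: Path, day: dt.date) -> list[list[str]]:
    """Build one `report <type> -o <file>` command per report type.

    Hypothetical: the dated file names are an illustration only.
    """
    return [
        [
            "uv", "run", "aws-ai-capacity", "report", name,
            "-o", str(out_dir / f"{name}-{day.isoformat()}.md"),
        ]
        for name in REPORT_TYPES
    ]


if __name__ == "__main__":
    import subprocess

    target = Path("./reports")
    target.mkdir(parents=True, exist_ok=True)
    for cmd in report_commands(target, dt.date.today()):
        # check=True makes a failing report stop the run with an error.
        subprocess.run(cmd, check=True)
```

Building the argument lists separately from running them keeps the naming logic testable without AWS access.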

Example Questions

  • "What p5.48xlarge training plans are available for the next week?"
  • "Show me all active capacity reservations"
  • "Which regions have p4d.24xlarge available?"
  • "Compare the specs of p5 vs p4d instances"
  • "Generate a daily capacity report"
  • "Search for available H100 capacity"
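As an illustration of what a question like "Which regions have p4d.24xlarge available?" involves underneath, the sketch below checks EC2 DescribeInstanceTypeOfferings region by region. This is an assumption about one reasonable approach, not the agent's actual tool code; `regions_offering` and `client_for_region` are hypothetical names, and each client is assumed to behave like a boto3 EC2 client.

```python
from typing import Any, Callable, Iterable


def regions_offering(
    instance_type: str,
    regions: Iterable[str],
    client_for_region: Callable[[str], Any],
) -> list[str]:
    """Return the regions whose EC2 offerings include instance_type."""
    found: list[str] = []
    for region in regions:
        # Offerings are reported per region, so use a client bound to each one.
        ec2 = client_for_region(region)
        resp = ec2.describe_instance_type_offerings(
            LocationType="region",
            Filters=[{"Name": "instance-type", "Values": [instance_type]}],
        )
        if resp.get("InstanceTypeOfferings"):
            found.append(region)
    return found
```

With boto3 installed, a call might look like `regions_offering("p4d.24xlarge", ["us-east-1", "us-west-2"], lambda r: boto3.client("ec2", region_name=r))`. Note that an offering only means the instance type is sold in that region; it does not guarantee that capacity is free at the moment.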

Project Structure

aws-ai-capacity/
├── src/ai_capacity/
│   ├── agent/          # Pydantic AI agent definition
│   ├── tools/          # AWS API tools (SageMaker, EC2)
│   ├── cli/            # Typer CLI commands
│   ├── ui/             # Chainlit chat interface
│   └── config.py       # Settings management
├── chainlit.md         # Chat welcome message
└── .env.example        # Environment template

Development

# Install with dev dependencies
uv sync --all-extras

# Run tests
uv run pytest

# Lint
uv run ruff check .
