
AWS GPU Capacity Management Agent with Pydantic AI


AWS AI Capacity - GPU Capacity Management Agent

A Pydantic AI-powered agent for querying and managing AWS GPU capacity, with a Chainlit chat interface for interactive use and a CLI for automated reporting.

Features

  • SageMaker Training Plans: Search offerings, list plans, check utilization
  • EC2 Capacity: Query reservations, check instance availability
  • GPU Specs: Compare instance types and their capabilities
  • Chat Interface: Interactive Chainlit UI for ad-hoc queries
  • CLI Reports: Automated report generation for cron jobs

Installation

# Clone the repository, then install from its root
cd ai-capacity
uv sync

# Copy environment template
cp .env.example .env

Configuration

The agent calls its LLM through Amazon Bedrock, so it authenticates with your existing AWS credentials.

Required AWS permissions:

  • bedrock:InvokeModel - For Claude model access
  • sagemaker:SearchTrainingPlanOfferings - Query available plans
  • sagemaker:ListTrainingPlans - List existing plans
  • sagemaker:DescribeTrainingPlan - Get plan details
  • ec2:DescribeCapacityReservations - Query reservations
  • ec2:DescribeInstanceTypeOfferings - Check availability
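
The permissions above can be collected into a single IAM policy document. The following is a minimal sketch, not a policy shipped with the project: the `build_policy` helper, the statement ID, and the wildcard `Resource` are all assumptions made for illustration, and you may want to narrow the resource scope for your account.

```python
import json

# Actions copied from the permissions list above.
REQUIRED_ACTIONS = [
    "bedrock:InvokeModel",
    "sagemaker:SearchTrainingPlanOfferings",
    "sagemaker:ListTrainingPlans",
    "sagemaker:DescribeTrainingPlan",
    "ec2:DescribeCapacityReservations",
    "ec2:DescribeInstanceTypeOfferings",
]


def build_policy(actions=REQUIRED_ACTIONS, resource="*"):
    """Return an IAM policy document (as a dict) allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AiCapacityAgent",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": resource,  # assumption: narrow this where possible
            }
        ],
    }


# Serialize to JSON so it can be pasted into the IAM console or a template.
policy_json = json.dumps(build_policy(), indent=2)
print(policy_json)
```

The printed JSON can be attached to the role or user whose credentials the agent runs under.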

Configure in .env:

AWS_REGION=us-east-1
AWS_PROFILE=default  # Optional
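
To illustrate how these two values might be consumed, here is a standard-library sketch that parses `.env`-style text and treats `AWS_PROFILE` as optional. It is not the project's `config.py`, which may use a settings library instead; `parse_env` and `load_settings` are hypothetical names introduced here.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "default  # Optional".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def load_settings(text):
    """Map parsed values to settings; a missing or empty profile becomes None."""
    env = parse_env(text)
    return {
        "region": env.get("AWS_REGION"),
        "profile": env.get("AWS_PROFILE") or None,
    }


# Sample text mirroring the .env snippet above.
sample = "AWS_REGION=us-east-1\nAWS_PROFILE=default  # Optional\n"
settings = load_settings(sample)
print(settings)
```

When the profile is `None`, AWS SDKs generally fall back to the default credential chain, which is why the variable can be left out.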

Usage

Chat Interface

Start the Chainlit UI:

uv run aws-ai-capacity serve

Then open http://localhost:8000 in your browser.

CLI Commands

# Single query
uv run aws-ai-capacity chat "What p5 training plans are available?"

# Generate reports
uv run aws-ai-capacity report daily
uv run aws-ai-capacity report availability
uv run aws-ai-capacity report training-plans

# Save report to file
uv run aws-ai-capacity report daily -o capacity-report.md

# Generate all reports (for cron)
uv run aws-ai-capacity cron-report -d ./reports

# List GPU instance types
uv run aws-ai-capacity list-instance-types

Cron Job Example

# Daily capacity report at 8am
0 8 * * * cd /path/to/ai-capacity && uv run aws-ai-capacity cron-report -d ./reports

Example Questions

  • "What p5.48xlarge training plans are available for the next week?"
  • "Show me all active capacity reservations"
  • "Which regions have p4d.24xlarge available?"
  • "Compare the specs of p5 vs p4d instances"
  • "Generate a daily capacity report"
  • "Search for available H100 capacity"

Project Structure

ai-capacity/
├── src/ai_capacity/
│   ├── agent/          # Pydantic AI agent definition
│   ├── tools/          # AWS API tools (SageMaker, EC2)
│   ├── cli/            # Typer CLI commands
│   ├── ui/             # Chainlit chat interface
│   └── config.py       # Settings management
├── chainlit.md         # Chat welcome message
└── .env.example        # Environment template

Development

# Install with dev dependencies
uv sync --all-extras

# Run tests
uv run pytest

# Lint
uv run ruff check .
