AWS GPU Capacity Management Agent with Pydantic AI
AWS AI Capacity - GPU Capacity Management Agent
A Pydantic AI-powered agent for managing and querying AWS GPU capacity, with a Chainlit chat interface and CLI for automated reporting.
Features
- SageMaker Training Plans: Search offerings, list plans, check utilization
- EC2 Capacity: Query reservations, check instance availability
- GPU Specs: Compare instance types and their capabilities
- Chat Interface: Interactive Chainlit UI for ad-hoc queries
- CLI Reports: Automated report generation for cron jobs
Quick Start (uvx)
Run directly without installing:
# With inline environment variables
AWS_PROFILE=myprofile uvx aws-ai-capacity chat "What GPU capacity is available?"
# Or export first
export AWS_PROFILE=myprofile
uvx aws-ai-capacity chat "What p5 training plans are available?"
uvx aws-ai-capacity report daily
Installation (Development)
# Clone and install
git clone https://github.com/drewdresser/aws-ai-capacity.git
cd aws-ai-capacity
uv sync
# Copy environment template
cp .env.example .env
Configuration
The agent uses AWS Bedrock with Claude Opus 4.5 (anthropic.claude-opus-4-5-20251101-v1:0) by default.
Required AWS permissions:
- bedrock:InvokeModel - For Claude model access
- sagemaker:SearchTrainingPlanOfferings - Query available plans
- sagemaker:ListTrainingPlans - List existing plans
- sagemaker:DescribeTrainingPlan - Get plan details
- ec2:DescribeCapacityReservations - Query reservations
- ec2:DescribeInstanceTypeOfferings - Check availability
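The permissions above can be collected into a single IAM policy document. The following is a minimal sketch, not a policy shipped with the project: the `build_policy` helper, the statement ID, and the wildcard resource are assumptions made for illustration, and you will likely want to narrow `Resource` for your account.

```python
import json

# Actions copied from the required-permissions list above.
REQUIRED_ACTIONS = (
    "bedrock:InvokeModel",
    "sagemaker:SearchTrainingPlanOfferings",
    "sagemaker:ListTrainingPlans",
    "sagemaker:DescribeTrainingPlan",
    "ec2:DescribeCapacityReservations",
    "ec2:DescribeInstanceTypeOfferings",
)


def build_policy(actions=REQUIRED_ACTIONS, resource="*"):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AwsAiCapacityAgent",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": resource,  # assumption: "*" is a placeholder, narrow where possible
            }
        ],
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the IAM console or saved to a file.
    print(json.dumps(build_policy(), indent=2))
```

The printed JSON can be attached to the role or user behind the AWS profile that the agent runs with.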
Environment Variables
For development or to override defaults, create a .env file:
# AWS credentials
AWS_PROFILE=myprofile
AWS_REGION=us-east-1
# Override the default model (the "us." prefix selects a US cross-region inference profile)
BEDROCK_MODEL_ID=us.anthropic.claude-opus-4-5-20251101-v1:0
BEDROCK_REGION=us-east-1
# Other options
AGENT_MAX_RETRIES=3
LOG_LEVEL=DEBUG
See .env.example for all available options.
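To illustrate how these variables might be consumed, here is a minimal sketch of a settings loader built on the standard library. It is not the project's actual `config.py`, which may work differently. The `Settings` class and `load_settings` function are hypothetical; the default model ID is the one stated above, while the other fallback values (`us-east-1`, `3`, `INFO`) are assumptions. The sketch reads variables that are already exported and does not parse the .env file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default model ID as stated in the Configuration section.
DEFAULT_MODEL_ID = "anthropic.claude-opus-4-5-20251101-v1:0"


@dataclass(frozen=True)
class Settings:
    aws_profile: Optional[str]
    aws_region: str
    bedrock_model_id: str
    bedrock_region: str
    agent_max_retries: int
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    aws_region = env.get("AWS_REGION", "us-east-1")  # assumed fallback
    return Settings(
        aws_profile=env.get("AWS_PROFILE"),
        aws_region=aws_region,
        bedrock_model_id=env.get("BEDROCK_MODEL_ID", DEFAULT_MODEL_ID),
        # Assumption: Bedrock uses the general AWS region unless overridden.
        bedrock_region=env.get("BEDROCK_REGION", aws_region),
        agent_max_retries=int(env.get("AGENT_MAX_RETRIES", "3")),  # assumed fallback
        log_level=env.get("LOG_LEVEL", "INFO").upper(),  # assumed fallback
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"AWS_PROFILE": "myprofile"})`.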
Usage
Chat Interface
Start the Chainlit UI:
uv run aws-ai-capacity serve
Then open http://localhost:8000
CLI Commands
# Single query
uv run aws-ai-capacity chat "What p5 training plans are available?"
# Debug mode - shows all tool calls made by the agent
uv run aws-ai-capacity chat "What p5 training plans are available?" --debug
# Generate reports
uv run aws-ai-capacity report daily
uv run aws-ai-capacity report availability
uv run aws-ai-capacity report training-plans
# Save report to file
uv run aws-ai-capacity report daily -o capacity-report.md
# Generate all reports (for cron)
uv run aws-ai-capacity cron-report -d ./reports
# List GPU instance types
uv run aws-ai-capacity list-instance-types
Cron Job Example
# Daily capacity report at 8am
0 8 * * * cd /path/to/aws-ai-capacity && uv run aws-ai-capacity cron-report -d ./reports
Example Questions
- "What p5.48xlarge training plans are available for the next week?"
- "Show me all active capacity reservations"
- "Which regions have p4d.24xlarge available?"
- "Compare the specs of p5 vs p4d instances"
- "Generate a daily capacity report"
- "Search for available H100 capacity"
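A question such as "Which regions have p4d.24xlarge available?" is the kind of query the ec2:DescribeInstanceTypeOfferings permission exists for. The sketch below shows one way such a lookup could be written against the EC2 API; it is not the agent's actual tool code. The `zones_offering` helper is hypothetical, and it takes any object exposing boto3's `describe_instance_type_offerings` method. Note that an offering only means the instance type is sold in that zone, not that capacity is free right now.

```python
def zones_offering(ec2_client, instance_type):
    """Return the sorted Availability Zones (in the client's region)
    where the given instance type is offered."""
    zones = set()
    kwargs = {
        "LocationType": "availability-zone",
        "Filters": [{"Name": "instance-type", "Values": [instance_type]}],
    }
    while True:
        page = ec2_client.describe_instance_type_offerings(**kwargs)
        for offering in page.get("InstanceTypeOfferings", []):
            zones.add(offering["Location"])
        # Follow pagination until the API stops returning a token.
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    return sorted(zones)


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   print(zones_offering(ec2, "p4d.24xlarge"))
```

To answer the question across regions, the same function could be called once per region with a client created for that region.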
Project Structure
ai-capacity/
├── src/ai_capacity/
│ ├── agent/ # Pydantic AI agent definition
│ ├── tools/ # AWS API tools (SageMaker, EC2)
│ ├── cli/ # Typer CLI commands
│ ├── ui/ # Chainlit chat interface
│ └── config.py # Settings management
├── chainlit.md # Chat welcome message
└── .env.example # Environment template
Development
# Install with dev dependencies
uv sync --all-extras
# Run tests
uv run pytest
# Lint
uv run ruff check .