GPU energy monitoring agent — per-job cost attribution for AI teams

AluminatAI - GPU Energy Intelligence Platform

Know exactly what your GPUs cost. Every watt, every dollar, every job.

AluminatAI is an open-source GPU energy monitoring platform that gives AI teams real-time visibility into power consumption, energy costs, and utilization across their GPU fleet. A lightweight Python agent runs on your GPU machines and streams metrics to a cloud dashboard where you can track spending, compare jobs, and optimize workloads.

Live: https://www.aluminatiai.com/

How It Works

┌──────────────────┐     HTTPS/JSON      ┌──────────────────────────┐
│   GPU Machine     │ ─────────────────► │   AluminatAI Platform     │
│                    │   every 60s        │                            │
│  ┌──────────────┐ │                     │  ┌──────────┐             │
│  │  Python Agent │ │                     │  │ Next.js  │  Vercel     │
│  │  (pynvml)    │ │                     │  │ API      │             │
│  └──────────────┘ │                     │  └────┬─────┘             │
│                    │                     │       │                    │
│  NVIDIA A100/H100  │                     │  ┌────▼─────┐             │
│  RTX 3090/4090     │                     │  │ Supabase │  PostgreSQL │
│  Any NVIDIA GPU    │                     │  │ Database │  + RLS      │
└──────────────────┘                     │  └────┬─────┘             │
                                          │       │                    │
                                          │  ┌────▼─────┐             │
                                          │  │Dashboard │  React      │
                                          │  │ UI       │  + Recharts │
                                          │  └──────────┘             │
                                          └──────────────────────────┘

Features

Real-Time GPU Monitoring - Power draw, utilization, temperature, memory, and clock speeds sampled every 5 seconds
Energy Cost Tracking - Calculates energy consumption in kWh and converts to dollar costs at your electricity rate
Job Attribution - Track which training jobs consumed how much energy and what they cost
Dashboard - Three views: Today's Cost, Jobs Table, and Utilization vs Power chart
Free Trial - 30-day free trial with auto-generated API keys on signup
Lightweight Agent - <1% CPU, ~50MB RAM overhead on GPU machines
Secure - Row-Level Security, API key auth with pgcrypto, rate limiting, server-side validation
Minimax Scheduler - Bonus hackathon project: AI-powered job scheduling that balances speed vs. energy cost
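
The Energy Cost Tracking feature above comes down to integrating power over time and applying an electricity rate. The following is a minimal sketch, assuming fixed-interval samples and a flat rate; the function names and the $0.12/kWh rate are illustrative and not taken from the agent's code.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 MJ


def energy_joules(power_samples_w, interval_s):
    """Approximate energy as the sum of power x interval for each sample."""
    return sum(p * interval_s for p in power_samples_w)


def energy_cost_usd(energy_j, rate_usd_per_kwh):
    """Convert joules to kWh, then to dollars at a flat rate."""
    return energy_j / JOULES_PER_KWH * rate_usd_per_kwh


# Hypothetical example: a GPU drawing a steady 300 W for one hour,
# sampled every 5 seconds (720 samples).
samples = [300.0] * 720
joules = energy_joules(samples, interval_s=5)
kwh = joules / JOULES_PER_KWH
cost = energy_cost_usd(joules, rate_usd_per_kwh=0.12)
print(f"{kwh:.3f} kWh, ${cost:.4f}")
```

A real deployment would likely use the measured time between samples rather than a nominal interval, since sampling loops drift under load.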

Project Structure

AluminatAI/
├── aluminatai-landing/          # Next.js web platform (deployed to Vercel)
│   ├── app/
│   │   ├── api/
│   │   │   ├── metrics/ingest/  # GPU metrics ingestion endpoint
│   │   │   ├── dashboard/       # today-cost, jobs, utilization-chart
│   │   │   ├── user/profile/    # User profile + API key rotation
│   │   │   └── cron/            # Materialized view refresh
│   │   ├── dashboard/           # Protected dashboard UI
│   │   ├── login/               # Auth pages
│   │   └── page.tsx             # Landing page
│   ├── components/              # React components
│   ├── lib/                     # Auth, rate limiting, Supabase clients
│   └── database/migrations/     # SQL migrations (001-005)
│
├── agent/                       # Python GPU monitoring agent
│   ├── main.py                  # Agent entry point
│   ├── collector.py             # NVML-based GPU metrics collector
│   ├── uploader.py              # API upload with retry + local backup
│   ├── config.py                # Environment-based configuration
│   ├── install.sh               # One-line install script
│   └── tests/                   # Test suite + Colab notebook
│
├── minimax-scheduler/           # Hackathon: Minimax GPU job scheduler
│   └── backend/                 # FastAPI + minimax algorithm
│
├── backend/                     # Legacy FastAPI backend (reference)
├── frontend/                    # Legacy React frontend (reference)
├── docker/                      # Docker configs for agent + backend
├── docs/                        # Architecture docs, metrics schema
└── assets/                      # Logo and diagrams
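
The tree describes uploader.py as "API upload with retry + local backup". One way such a pattern could look is sketched below. This is an assumption about the general technique, not the contents of uploader.py: the function name, the backoff schedule, and the JSON Lines backup format are all hypothetical.

```python
import json
import time
from pathlib import Path


def upload_with_retry(batch, send, backup_path, max_attempts=3,
                      base_delay_s=1.0, sleep=time.sleep):
    """Call send(batch) up to max_attempts times with exponential backoff.

    If every attempt fails, append the batch to a local JSON Lines file
    so it can be re-sent later. Returns True on success, False otherwise.
    """
    for attempt in range(max_attempts):
        try:
            send(batch)
            return True
        except Exception:  # a sketch; real code would catch narrower errors
            if attempt < max_attempts - 1:
                sleep(base_delay_s * 2 ** attempt)  # 1s, 2s, 4s, ...
    # All attempts failed: keep the data on disk instead of dropping it.
    with Path(backup_path).open("a", encoding="utf-8") as f:
        for metric in batch:
            f.write(json.dumps(metric) + "\n")
    return False
```

Passing `send` and `sleep` as parameters keeps the retry logic testable without a network or real delays.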

Quick Start

Prerequisites

Node.js 18+ and npm
Python 3.8+
A Supabase account (supabase.com)
An NVIDIA GPU (for the agent) or Google Colab with GPU runtime

1. Clone the Repository

git clone https://github.com/AgentMulder404/aluminatai-landing.git
cd aluminatai-landing

2. Set Up the Database (Supabase)

Create a new project at supabase.com
Go to SQL Editor and run the migrations in order:

# Run these SQL files in the Supabase SQL Editor:
database/migrations/002_gpu_monitoring_schema_postgres.sql
database/migrations/003_fix_materialized_view.sql
database/migrations/004_fix_trigger_permissions.sql
database/migrations/005_secure_api_keys_and_constraints.sql

This creates:

users table with auto-generated API keys (using pgcrypto)
gpu_metrics time-series table with CHECK constraints
gpu_jobs table for job tracking
gpu_metrics_hourly materialized view for fast dashboard queries
Row-Level Security policies on all tables
Triggers for user profile auto-creation on signup
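
To illustrate what an hourly rollup like gpu_metrics_hourly might precompute, here is a small Python sketch that groups raw samples by GPU and hour and totals their energy. The real view is defined in the SQL migrations and its columns are not shown here, so the grouping key and the summed field are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def hourly_rollup(metrics):
    """Group samples by (gpu_uuid, hour) and total their energy in joules."""
    totals = defaultdict(float)
    for m in metrics:
        # fromisoformat() before Python 3.11 does not accept a trailing "Z".
        ts = datetime.fromisoformat(m["timestamp"].replace("Z", "+00:00"))
        hour = ts.replace(minute=0, second=0, microsecond=0)
        totals[(m["gpu_uuid"], hour.isoformat())] += m["energy_delta_j"]
    return dict(totals)


# Hypothetical samples: two in the 12:00 hour, one in the 13:00 hour.
samples = [
    {"timestamp": "2026-02-06T12:00:00Z", "gpu_uuid": "GPU-abc123", "energy_delta_j": 571.0},
    {"timestamp": "2026-02-06T12:30:00Z", "gpu_uuid": "GPU-abc123", "energy_delta_j": 429.0},
    {"timestamp": "2026-02-06T13:05:00Z", "gpu_uuid": "GPU-abc123", "energy_delta_j": 100.0},
]
rollup = hourly_rollup(samples)
```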

3. Set Up the Web Platform

cd aluminatai-landing
npm install

Create a .env.local file:

# Supabase (from your project settings > API)
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key

# Cron secret (generate with: openssl rand -base64 32)
CRON_SECRET=your-cron-secret

Run the development server:

npm run dev

Visit http://localhost:3000 - you should see the landing page.

4. Create an Account

Click "Start Free Trial" on the landing page
Enter your name, email, and password
You'll be redirected to the dashboard setup page
Copy your API key (starts with alum_)

5. Install the GPU Agent

On your GPU machine (or Google Colab):

# Install dependencies
pip install pynvml requests python-dotenv rich

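# Set environment variables
export ALUMINATAI_API_KEY="alum_your_key_here"
# This points at the local dev server from step 3. On a remote GPU machine
# or Colab, use the URL of your deployed platform instead.
export ALUMINATAI_API_ENDPOINT="http://localhost:3000/api/metrics/ingest"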

# Run the agent
python agent/main.py
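
The two environment variables above are what the agent needs to authenticate and find the ingestion endpoint. A minimal sketch of reading them is shown below; the `AgentConfig` class, the `load_config` helper, and the localhost default are hypothetical and may differ from what agent/config.py actually does.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    api_key: str
    api_endpoint: str


def load_config(env=os.environ):
    """Read the agent settings and fail fast if the key is missing or malformed."""
    api_key = env.get("ALUMINATAI_API_KEY", "")
    if not api_key.startswith("alum_"):
        raise ValueError("ALUMINATAI_API_KEY must be set and start with 'alum_'")
    # Assumption: fall back to the local dev server when no endpoint is set.
    endpoint = env.get("ALUMINATAI_API_ENDPOINT",
                       "http://localhost:3000/api/metrics/ingest")
    return AgentConfig(api_key=api_key, api_endpoint=endpoint)
```

Checking the key prefix at startup surfaces a copy-paste mistake immediately instead of after the first rejected upload.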

Options:

# Custom sampling interval (1 second)
python agent/main.py --interval 1

# Save to CSV + upload
python agent/main.py --output data/metrics.csv

# Run for 5 minutes
python agent/main.py --duration 300

# Quiet mode (no console output)
python agent/main.py --quiet --output data/metrics.csv
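
For readers who want to script around these flags, the sketch below shows how the four options listed above could be declared with argparse. It is an illustration only: the defaults (including the 5-second interval, taken from the Features list) and the help strings are assumptions, not copied from agent/main.py.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="GPU metrics agent (illustrative CLI sketch)")
    parser.add_argument("--interval", type=float, default=5.0,
                        help="seconds between samples")
    parser.add_argument("--output", default=None,
                        help="optional CSV path for a local copy of the metrics")
    parser.add_argument("--duration", type=float, default=None,
                        help="stop after this many seconds (default: run until stopped)")
    parser.add_argument("--quiet", action="store_true",
                        help="suppress console output")
    return parser


# Equivalent to: python agent/main.py --quiet --output data/metrics.csv
args = build_parser().parse_args(["--quiet", "--output", "data/metrics.csv"])
print(args.interval, args.output, args.duration, args.quiet)
```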

For production, use the systemd service:

cd agent
chmod +x install.sh
sudo ./install.sh

6. Test on Google Colab (A100)

Upload agent/tests/AluminatAI_A100_Test.ipynb to Google Colab:

Go to colab.research.google.com
File > Upload notebook and select the .ipynb file
Runtime > Change runtime type > select A100 GPU
Paste your API key in Cell 2
Runtime > Run all

The notebook runs 7 test suites:

NVML hardware access
Collector class + energy calculation
API authentication validation
End-to-end collect + upload
Stress test under GPU load (8192x8192 matmul)
API key security audit
60-second continuous monitoring demo

API Reference

Metrics Ingestion

POST /api/metrics/ingest
Header: X-API-Key: alum_your_key_here

Request body (single metric or array):

[
  {
    "timestamp": "2026-02-06T12:00:00Z",
    "gpu_index": 0,
    "gpu_uuid": "GPU-abc123",
    "gpu_name": "NVIDIA A100-SXM4-40GB",
    "power_draw_w": 285.5,
    "power_limit_w": 400.0,
    "energy_delta_j": 571.0,
    "utilization_gpu_pct": 95,
    "utilization_memory_pct": 60,
    "temperature_c": 72,
    "memory_used_mb": 32000,
    "memory_total_mb": 40960
  }
]
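
As a sketch of how a client could assemble this call, the snippet below builds the POST request with the X-API-Key header using only the standard library. The agent itself installs `requests`, so this is not its code; the helper name is hypothetical, and sending a Content-Type header is an assumption.

```python
import json
import urllib.request


def build_ingest_request(endpoint, api_key, metrics):
    """Build (but do not send) a POST request carrying a JSON array of metrics."""
    body = json.dumps(metrics).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )


metric = {
    "timestamp": "2026-02-06T12:00:00Z",
    "gpu_index": 0,
    "gpu_uuid": "GPU-abc123",
    "power_draw_w": 285.5,
    "energy_delta_j": 571.0,
    "utilization_gpu_pct": 95,
    "temperature_c": 72,
}
req = build_ingest_request(
    "http://localhost:3000/api/metrics/ingest", "alum_your_key_here", [metric])
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

In the sample payload, 571.0 J at 285.5 W corresponds to a 2-second sampling interval.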

Validation rules:

power_draw_w: 0-1500W
temperature_c: 0-120C
utilization_*_pct: 0-100
timestamp: valid ISO 8601, not more than 5 minutes in the future
Max 1000 metrics per request
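
Checking these rules on the client before uploading avoids wasted requests. The sketch below mirrors the per-metric rules listed above; it is not the server's validation code, and it assumes every field is present and that a timestamp without a timezone means UTC.

```python
from datetime import datetime, timedelta, timezone

MAX_FUTURE_SKEW = timedelta(minutes=5)


def validate_metric(metric, now=None):
    """Return a list of rule violations; an empty list means the metric passes."""
    now = now or datetime.now(timezone.utc)
    errors = []
    if not 0 <= metric["power_draw_w"] <= 1500:
        errors.append("power_draw_w out of range 0-1500")
    if not 0 <= metric["temperature_c"] <= 120:
        errors.append("temperature_c out of range 0-120")
    for key in ("utilization_gpu_pct", "utilization_memory_pct"):
        if not 0 <= metric[key] <= 100:
            errors.append(f"{key} out of range 0-100")
    try:
        # fromisoformat() before Python 3.11 does not accept a trailing "Z".
        ts = datetime.fromisoformat(metric["timestamp"].replace("Z", "+00:00"))
    except ValueError:
        errors.append("timestamp is not valid ISO 8601")
    else:
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        if ts > now + MAX_FUTURE_SKEW:
            errors.append("timestamp more than 5 minutes in the future")
    return errors
```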

Rate limit: 100 requests/minute per user
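
A client with a large backlog has to respect both the 1000-metric cap and the 100 requests/minute limit. The helpers below are a minimal sketch of one way to do that; the function names are hypothetical, and even spacing is only one of several valid pacing strategies.

```python
def chunk_metrics(metrics, max_per_request=1000):
    """Split a list of metrics into batches no larger than the request cap."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    return [metrics[i:i + max_per_request]
            for i in range(0, len(metrics), max_per_request)]


def min_seconds_between_requests(requests_per_minute=100):
    """Smallest even spacing that stays within the per-minute limit."""
    return 60.0 / requests_per_minute


# Hypothetical backlog of 2500 metrics: three requests, at least 0.6 s apart.
batches = chunk_metrics(list(range(2500)))
print([len(b) for b in batches], min_seconds_between_requests())
```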

Dashboard APIs

| Endpoint | Method | Auth | Rate Limit | Description |
|---|---|---|---|---|
| `/api/dashboard/today-cost` | GET | Session | 60/min | Today's energy cost |
| `/api/dashboard/jobs` | GET | Session | 60/min | Job history with pagination |
| `/api/dashboard/utilization-chart` | GET | Session | 60/min | Time-series chart data |
| `/api/user/profile` | GET | Session | - | User profile + API key |
| `/api/user/profile` | PATCH | Session | - | Update profile settings |
| `/api/user/profile` | POST | Session | 5/hr | Rotate API key |

API Key Rotation

curl -X POST https://aluminatiai-landing.vercel.app/api/user/profile \
  -H "Content-Type: application/json" \
  -H "Cookie: your-session-cookie" \
  -d '{"action": "rotate_api_key"}'

Security

API Keys: Generated with pgcrypto gen_random_bytes() - 340 bits of entropy
Row-Level Security: Users can only access their own data
Rate Limiting: Per-user limits on all endpoints
Input Validation: Server-side + database CHECK constraints
HTTPS: Enforced by Vercel
No ambiguous characters: API keys exclude 0, O, I, l, 1 to prevent copy errors
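
The platform generates keys in the database with pgcrypto. Purely to illustrate the "no ambiguous characters" idea, here is a Python sketch that draws from a reduced alphabet using a cryptographically secure source. The 48-character length is an arbitrary assumption and the function is hypothetical. Note that the alum_ prefix itself contains an "l", so the exclusion applies only to the random part.

```python
import secrets
import string

AMBIGUOUS = set("0OIl1")
# 62 letters and digits minus the 5 look-alike characters leaves 57 symbols.
ALPHABET = "".join(c for c in string.ascii_letters + string.digits
                   if c not in AMBIGUOUS)


def generate_api_key(length=48, prefix="alum_"):
    """Prefix plus `length` characters drawn from a CSPRNG over the reduced alphabet."""
    return prefix + "".join(secrets.choice(ALPHABET) for _ in range(length))


key = generate_api_key()
```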

Deployment

Vercel (Web Platform)

# Install Vercel CLI
npm i -g vercel

# Deploy
cd aluminatai-landing
vercel

# Then set the same variables as in .env.local in the Vercel dashboard

Cron Job (Materialized View Refresh)

Set up a cron job to refresh the hourly metrics view:

URL: https://your-app.vercel.app/api/cron/refresh-metrics
Method: POST
Header: Authorization: Bearer your-cron-secret
Schedule: Every hour (0 * * * *)

You can use cron-job.org (free) or Vercel Cron.
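
If you prefer to trigger the refresh from your own scheduler, the call described above could be built as follows. This is a sketch using the standard library; the helper name is hypothetical, and `your-app.vercel.app` is the same placeholder used above.

```python
import urllib.request


def build_refresh_request(base_url, cron_secret):
    """Build (but do not send) the hourly refresh call with a Bearer token."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/cron/refresh-metrics",
        method="POST",
        headers={"Authorization": f"Bearer {cron_secret}"},
    )


refresh = build_refresh_request("https://your-app.vercel.app", "your-cron-secret")
# To actually send it: urllib.request.urlopen(refresh, timeout=30)
```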

Tech Stack

| Component | Technology |
|---|---|
| Web Framework | Next.js 16 |
| UI | React 19 + Tailwind CSS 4 |
| Charts | Recharts |
| Database | Supabase PostgreSQL |
| Auth | Supabase Auth |
| GPU Agent | Python + pynvml (NVML) |
| Deployment | Vercel |
| Scheduler | Minimax with alpha-beta pruning |

Minimax GPU Scheduler

A bonus hackathon project in minimax-scheduler/ that uses game theory to optimize GPU job scheduling:

Speed Player (Maximizer): Wants to complete jobs as quickly as possible
Cost Player (Minimizer): Wants to minimize energy costs
Alpha-Beta Pruning: Efficiently explores the decision tree
Result: 15-30% cost savings vs. naive FIFO scheduling

cd minimax-scheduler/backend
pip install -r requirements.txt
python demo.py
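
For readers unfamiliar with the technique, the sketch below shows generic minimax with alpha-beta pruning on a toy tree of numeric scores. It is not the scheduler's code: how minimax-scheduler/ models jobs, speed, and energy cost is not shown here, and the nested-list tree and its scores are made up for illustration.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax value of a nested-list game tree, skipping branches that cannot matter."""
    if not isinstance(node, list):  # leaf: a numeric score
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizer already has a better option elsewhere
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizer already has a better option elsewhere
    return best


# Toy tree: the maximizer picks a branch, then the minimizer picks a leaf.
tree = [[3, 5], [2, 9], [6, 4]]
print(alphabeta(tree, maximizing=True))
```

In the second branch, the search stops after seeing the leaf 2: the maximizer is already guaranteed 3 from the first branch, so the 9 is never examined.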

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/my-feature)
Commit your changes
Push to the branch (git push origin feature/my-feature)
Open a Pull Request

License

This project is open source. See LICENSE for details.

Built by @AgentMulder404

Release history

0.2.1 - Mar 8, 2026
0.2.0 - Mar 8, 2026 (this version)

Download files

Source Distribution: aluminatiai-0.2.0.tar.gz (85.0 kB), uploaded Mar 8, 2026
Built Distribution: aluminatiai-0.2.0-py3-none-any.whl (97.3 kB), uploaded Mar 8, 2026, Python 3

File details

Details for the file aluminatiai-0.2.0.tar.gz.

File metadata

Download URL: aluminatiai-0.2.0.tar.gz
Upload date: Mar 8, 2026
Size: 85.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for aluminatiai-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`42cf1691931d4795315e64312596bc57a33edd531ddd37f9b7036d5fcf46e939`
MD5	`c1c64cdd0b7b1587765a8a78e990145f`
BLAKE2b-256	`8f1ca2fa0996cb8fe14dcb00e64e8dc914e75a505ecbe52c1cf13fc485c9fce1`

File details

Details for the file aluminatiai-0.2.0-py3-none-any.whl.

File metadata

Download URL: aluminatiai-0.2.0-py3-none-any.whl
Upload date: Mar 8, 2026
Size: 97.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for aluminatiai-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a427c1f1ae63e3bfa9a4b3a729b496b787d41cedf880e25e09c0dc1106674899`
MD5	`b02c7887755d0bc41ef16e11a0487f85`
BLAKE2b-256	`88541a4304eb667a7830428a6a8da8cfad38bc6c6691a20e2c3b78e87c3a1fe3`
