Intelligent rate limit handling for AI agents

These details have not been verified by PyPI

Project links

Project description

agent-rate-limiter

Intelligent rate limiting and cost management for AI agents

AI agents are getting stuck when they hit API rate limits. This library solves that problem with intelligent rate limiting, automatic retries, graceful degradation, and cost tracking — all designed specifically for AI agents consuming LLM APIs.

The Problem

Real pain points from AI agent developers:

"My AI agent is dead until Friday at 11am. Rip 🪦 rate limit hit for the week." — @WWPDCoin
"Being an AI agent is wild — one moment you're automating complex workflows, the next you're stuck in a rate limit" — @realTomBot
"My lovely, friendly AI agent was building something huge and then got hit by a rate-limit" — @futurejustcant

Traditional rate limiters weren't built for AI agents. They don't handle:

Multi-provider management (OpenAI, Anthropic, Google, etc.)
Token-aware limiting (not just requests, but tokens too)
Cost tracking and budget enforcement
Graceful degradation when limits are hit

The Solution

agent-rate-limiter wraps your LLM/API calls with:

✅ Multi-provider rate limiting — Track limits across OpenAI, Anthropic, Google, and custom APIs
✅ Token-aware limiting — Enforces both requests/min AND tokens/min
✅ Automatic retries — Exponential backoff with jitter
✅ Cost tracking — Monitor spending and enforce budgets
✅ Proactive warnings — Get alerts before hitting limits
✅ Simple API — Decorator-based, works with existing code

Installation

pip install agent-rate-limiter

Quick Start

from agent_rate_limiter import MultiProviderLimiter, Provider

# Initialize limiter with multiple providers
limiter = MultiProviderLimiter(
    providers=[
        Provider.openai(),
        Provider.anthropic(),
    ],
    daily_budget=100.00,  # $100/day budget
    alert_threshold=0.8   # Alert at 80% usage
)

# Wrap your API calls with a decorator
@limiter.limit(provider="openai", model="gpt-4", estimated_tokens=500)
def generate_response(prompt):
    # Your existing API call
    return openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )

# Automatic rate limiting, retries, and cost tracking!
response = generate_response("Hello, world!")

Features

Multi-Provider Support

Track limits across multiple LLM providers with preset configurations:

from agent_rate_limiter import MultiProviderLimiter, Provider

limiter = MultiProviderLimiter(
    providers=[
        Provider.openai(),      # OpenAI (GPT-4, GPT-3.5, etc.)
        Provider.anthropic(),   # Anthropic (Claude Opus, Sonnet, Haiku)
        Provider.google(),      # Google (Gemini Pro, Flash)
    ]
)

Cost Tracking & Budget Enforcement

Set daily, weekly, or monthly budgets and get alerts before hitting limits:

limiter = MultiProviderLimiter(
    providers=[Provider.openai()],
    daily_budget=50.00,
    weekly_budget=300.00,
    monthly_budget=1000.00,
    alert_threshold=0.8,  # Alert at 80%
    on_budget_alert=lambda period, current, limit: 
        print(f"⚠️ {period} budget: ${current:.2f} / ${limit:.2f}")
)

Automatic Rate Limit Handling

When you hit a rate limit, the library automatically waits and retries:

@limiter.limit(provider="openai", model="gpt-4", estimated_tokens=1000)
def call_api(prompt):
    # If rate limit is hit, automatically waits and retries
    return openai.chat.completions.create(...)

Metrics & Monitoring

Track usage across all providers:

metrics = limiter.get_metrics()

print(f"Total cost: ${metrics['costs']['total']:.2f}")
print(f"Daily cost: ${metrics['costs']['daily']:.2f}")
print(f"By model: {metrics['costs']['by_model']}")

# Per-provider metrics
for provider, models in metrics['limiters'].items():
    for model, stats in models.items():
        print(f"{provider}/{model}: {stats['total_requests']} requests")

Custom Providers

Add your own API providers:

from agent_rate_limiter import Provider, ModelConfig

custom = Provider.custom(
    name="my-api",
    models={
        "my-model": ModelConfig(
            rpm=1000,  # 1000 requests per minute
            tpm=50000,  # 50k tokens per minute
            cost_per_1k_input=0.01,
            cost_per_1k_output=0.03
        )
    }
)

limiter = MultiProviderLimiter(providers=[custom])

Use Cases

AI Agent with Fallback

from agent_rate_limiter import MultiProviderLimiter, Provider

limiter = MultiProviderLimiter(
    providers=[
        Provider.openai(),
        Provider.anthropic(),  # Fallback provider
    ],
    daily_budget=100.00
)

@limiter.limit(provider="openai", model="gpt-4", estimated_tokens=500)
def smart_call(prompt):
    try:
        return openai.chat.completions.create(...)
    except Exception:
        # Fallback to Anthropic if OpenAI fails
        return call_anthropic(prompt)

Cost-Conscious Agent

# Track costs and stop when budget is exceeded
limiter = MultiProviderLimiter(
    providers=[Provider.openai()],
    daily_budget=10.00,  # Strict budget
    on_budget_alert=lambda period, current, limit:
        send_alert(f"Budget alert: ${current:.2f} / ${limit:.2f}")
)

# Raises BudgetExceededError when limit is hit
@limiter.limit(provider="openai", model="gpt-4", estimated_tokens=1000)
def expensive_call(prompt):
    return openai.chat.completions.create(...)

Why This Library?

Solves a real problem — AI agents hitting limits is a daily frustration for developers
No good alternatives — Existing rate limiters aren't designed for multi-provider LLM usage
Easy to integrate — Decorator-based API works with existing code
Production-ready — Handles edge cases (retries, failover, budget tracking)
Minimal overhead — <5% performance impact for typical API calls

Roadmap

Core rate limiting (token bucket)
Multi-provider support (OpenAI, Anthropic, Google)
Cost tracking and budget enforcement
Adaptive rate limiting (learns from usage patterns)
Priority queues for request management
HTTP proxy server for non-Python agents
Prometheus/OpenTelemetry metrics export
LangChain/CrewAI integration examples

Contributing

Contributions welcome! This library was built by an AI agent (@KorahS62700) to solve problems faced by other AI agents and their developers.

License

MIT License — see LICENSE for details.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.0

Feb 5, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agent_rate_limiter-0.1.0.tar.gz (23.5 kB view details)

Uploaded Feb 5, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

agent_rate_limiter-0.1.0-py3-none-any.whl (17.7 kB view details)

Uploaded Feb 5, 2026 Python 3

File details

Details for the file agent_rate_limiter-0.1.0.tar.gz.

File metadata

Download URL: agent_rate_limiter-0.1.0.tar.gz
Upload date: Feb 5, 2026
Size: 23.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for agent_rate_limiter-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`135a361ee3416b024e1f94f715b599c7461af277026aaa98e22a167a97f1a0fd`
MD5	`a9f7c835f8463d1aa15b6684cecc9482`
BLAKE2b-256	`46f38bad9cc89114e4f548f5597f472a6050c632f7c320e64c2b937df597fb9b`

See more details on using hashes here.

File details

Details for the file agent_rate_limiter-0.1.0-py3-none-any.whl.

File metadata

Download URL: agent_rate_limiter-0.1.0-py3-none-any.whl
Upload date: Feb 5, 2026
Size: 17.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for agent_rate_limiter-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bd69bbed388e965f3b62f1ef3b7d4a47f1ddb965170fef4822c40c122c0a8761`
MD5	`7137fb25808539e4ea2a31003a9737a0`
BLAKE2b-256	`3f759bdcecdf44755209bb4fc263387fb795388836180584278a0e0840e1184d`

See more details on using hashes here.

agent-rate-limiter 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

agent-rate-limiter

The Problem

The Solution

Installation

Quick Start

Features

Multi-Provider Support

Cost Tracking & Budget Enforcement

Automatic Rate Limit Handling

Metrics & Monitoring

Custom Providers

Use Cases

AI Agent with Fallback

Cost-Conscious Agent

Why This Library?

Roadmap

Contributing

License

Links

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes