High-performance rate limiter engine for MCP Gateway
Rate Limiter Plugin
Author: ContextForge Contributors
Enforces rate limits per user, tenant, and tool across tool_pre_invoke and prompt_pre_fetch hooks. Supports pluggable counting algorithms (fixed window, sliding window, token bucket), an in-process memory backend (single-instance), and a Redis backend (shared across all gateway instances).
Hooks
| Hook | When it runs |
|---|---|
| `tool_pre_invoke` | Before every tool call: checks by_user, by_tenant, by_tool |
| `prompt_pre_fetch` | Before every prompt fetch: checks by_user, by_tenant, by_tool |
If any configured dimension is exceeded, the plugin returns a violation with HTTP 429. All responses include X-RateLimit-* headers. When several dimensions are active, the most restrictive one is surfaced (e.g. if both user and tenant limits are active, the one closest to exhaustion is reported).
Configuration
    - name: RateLimiterPlugin
      kind: cpex_rate_limiter.rate_limiter.RateLimiterPlugin
      hooks:
        - prompt_pre_fetch
        - tool_pre_invoke
      mode: enforce            # enforce | permissive | disabled
      config:
        by_user: "30/m"        # per-user limit across all tools
        by_tenant: "300/m"     # shared limit across all users in a tenant
        by_tool:               # per-tool overrides (applied on top of by_user)
          search: "10/m"
          summarise: "5/m"
        # Algorithm: choose one (default: fixed_window)
        algorithm: "fixed_window"   # fixed_window | sliding_window | token_bucket
        # Backend: choose one
        backend: "memory"      # default: single-process, resets on restart
        # backend: "redis"     # shared across all gateway instances
        # Redis options (required when backend: redis)
        redis_url: "redis://redis:6379/0"
        redis_key_prefix: "rl"
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
| `by_user` | string | `null` | Per-user rate limit, e.g. `"60/m"` |
| `by_tenant` | string | `null` | Per-tenant rate limit, e.g. `"600/m"` |
| `by_tool` | dict | `{}` | Per-tool overrides, e.g. `{"search": "10/m"}` |
| `algorithm` | string | `"fixed_window"` | Counting algorithm: `"fixed_window"`, `"sliding_window"`, or `"token_bucket"` |
| `backend` | string | `"memory"` | `"memory"` or `"redis"` |
| `redis_url` | string | `null` | Redis connection URL (required when `backend: redis`) |
| `redis_key_prefix` | string | `"rl"` | Prefix for all Redis keys |
Rate string format: "<count>/<unit>" where unit is s/sec/second, m/min/minute, or h/hr/hour. Malformed strings raise ValueError at startup.
Omitting a dimension (e.g. no by_tenant) means that dimension is unlimited — no counter is tracked for it.
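The rate string format can be illustrated with a small parser. This is a minimal sketch, not the plugin's actual code: the function name `parse_rate`, the unit table, and the choice to reject surrounding whitespace and upper-case units are all assumptions made for illustration.

```python
import re

# Hypothetical unit table mirroring the aliases listed above.
_UNIT_SECONDS = {
    "s": 1, "sec": 1, "second": 1,
    "m": 60, "min": 60, "minute": 60,
    "h": 3600, "hr": 3600, "hour": 3600,
}

# "<count>/<unit>" with no extra whitespace (strictness is an assumption).
_RATE_RE = re.compile(r"^(\d+)/([a-z]+)$")


def parse_rate(rate: str) -> tuple[int, int]:
    """Return (count, window_seconds) for a string such as "30/m"."""
    match = _RATE_RE.match(rate)
    if match is None:
        raise ValueError(f"malformed rate string: {rate!r}")
    count = int(match.group(1))
    unit = match.group(2)
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"unknown rate unit: {unit!r}")
    return count, _UNIT_SECONDS[unit]
```

Under these assumptions, `parse_rate("30/m")` yields a count of 30 over a 60-second window, and a string such as `"30 per minute"` raises `ValueError`.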
Response headers
Every response (allowed or blocked) includes:
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Configured limit for the most restrictive active dimension |
| `X-RateLimit-Remaining` | Requests remaining in the current window |
| `X-RateLimit-Reset` | Unix timestamp when the current window resets |
| `Retry-After` | Seconds until the window resets (blocked requests only) |
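To make the header semantics concrete, here is a hedged sketch of how the headers could be derived from per-dimension results. The `DimensionResult` structure and `build_headers` function are hypothetical names, and "closest to exhaustion" is interpreted here as "fewest requests remaining, with blocked dimensions first", which is an assumption about the selection rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionResult:
    """Outcome of checking one dimension (hypothetical structure)."""
    name: str
    limit: int
    remaining: int
    reset_at: int  # Unix timestamp when the current window resets
    allowed: bool


def build_headers(results: list[DimensionResult], now: int) -> dict[str, str]:
    """Build rate-limit headers from the most restrictive dimension."""
    if not results:
        return {}  # no dimension configured, so nothing is tracked
    # Blocked dimensions sort first (False < True); ties break on remaining.
    worst = min(results, key=lambda r: (r.allowed, r.remaining))
    headers = {
        "X-RateLimit-Limit": str(worst.limit),
        "X-RateLimit-Remaining": str(max(worst.remaining, 0)),
        "X-RateLimit-Reset": str(worst.reset_at),
    }
    if not worst.allowed:
        # Retry-After is only attached to blocked requests.
        headers["Retry-After"] = str(max(worst.reset_at - now, 0))
    return headers
```

With a user dimension at 4 of 30 remaining and a tenant dimension at 120 of 300 remaining, this sketch reports the user limit.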
Algorithms
Three counting algorithms are available, selected via the algorithm config field.
| Algorithm | Config value | Best for | Trade-off |
|---|---|---|---|
| Fixed window | `fixed_window` | General use, lowest overhead | Up to 2× the limit at window boundaries |
| Sliding window | `sliding_window` | Smooth enforcement, no boundary burst | Higher memory: stores one timestamp per request per key |
| Token bucket | `token_bucket` | Bursty workloads: allows short spikes up to capacity | Slightly higher Redis overhead: stores a `{tokens, last_refill}` hash per key |
Fixed window (default)
Counts requests in a fixed time slot (e.g. "minute 14:03") and resets at the slot boundary. Simple and fast. The known trade-off is a burst of up to 2× the limit at a boundary (N requests at the end of slot T, then N more at the start of T+1). If this matters, set by_user with enough headroom that a 2× burst is still acceptable, or use sliding_window.
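The boundary burst is easiest to see in code. The following is a minimal single-key sketch of the fixed-window technique in pure Python; the class name `FixedWindowCounter` is hypothetical, and the plugin's Rust and Redis implementations may differ in detail.

```python
class FixedWindowCounter:
    """Single-key fixed-window counter (illustrative sketch only)."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window = window_seconds
        self._slot = None   # index of the slot currently being counted
        self._count = 0

    def allow(self, now: float) -> bool:
        slot = int(now // self.window)
        if slot != self._slot:
            # A new slot has started: the counter resets at the boundary.
            self._slot = slot
            self._count = 0
        if self._count >= self.limit:
            return False
        self._count += 1
        return True


counter = FixedWindowCounter(limit=5, window_seconds=60)
late = [counter.allow(59.0) for _ in range(5)]    # end of slot 0
early = [counter.allow(60.0) for _ in range(5)]   # start of slot 1
```

Here all ten requests are admitted within about one second, even though the limit is 5 per 60 seconds.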
Sliding window
Stores a timestamp for every request in the current window. At each check, expired timestamps are discarded and the remaining count is compared against the limit. Prevents boundary bursts entirely. Memory usage grows with request volume — roughly one float per request per active key.
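A sliding-window log can be sketched with a deque of timestamps. This is an illustration of the technique described above, not the plugin's code; the class name `SlidingWindowLog` is hypothetical, and the choice not to record rejected requests is an assumption.

```python
from collections import deque


class SlidingWindowLog:
    """Single-key sliding-window log (illustrative sketch only)."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._stamps: deque[float] = deque()  # one entry per admitted request

    def allow(self, now: float) -> bool:
        cutoff = now - self.window
        # Discard timestamps that have fallen out of the window.
        while self._stamps and self._stamps[0] <= cutoff:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True
```

With a limit of 5 per 60 seconds, five requests at t=59 block a request at t=60, because the window always covers the most recent 60 seconds rather than a fixed slot.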
Token bucket
Each identity (user, tenant, tool) has a bucket that holds up to count tokens. Tokens refill at a steady rate of count/window. A request consumes one token. Bursts up to the bucket capacity are allowed; sustained rate above count/window is rejected. Useful for APIs where short spikes are acceptable but sustained overload is not.
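The refill-and-consume step can be sketched as follows. This is a minimal in-process illustration; the class name `TokenBucket` is hypothetical, and starting each bucket full is an assumption.

```python
class TokenBucket:
    """Single-key token bucket (illustrative sketch only)."""

    def __init__(self, count: int, window_seconds: float, now: float) -> None:
        self.capacity = float(count)
        self.refill_rate = count / window_seconds  # tokens per second
        self.tokens = float(count)                 # assumption: bucket starts full
        self.last_refill = now

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(now - self.last_refill, 0.0)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens < 1.0:
            return False
        self.tokens -= 1.0  # each request consumes one token
        return True
```

For `"30/m"`, the bucket admits a burst of 30 requests at once, then one further request every two seconds, since 30 tokens per 60 seconds is 0.5 tokens per second.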
Redis support: token_bucket with backend: redis is fully supported. The plugin stores {tokens, last_refill} in a Redis hash per key and uses an atomic Lua script to refill and consume tokens in a single round-trip — the same pattern as the other two algorithms. This means token_bucket enforces a true cluster-wide limit in multi-instance deployments.
Backends
Memory backend (default, single-instance only)
- Counters are stored in a process-local `MemoryStore` (Rust, per-key `RwLock`, no single global lock)
- An amortized sweep evicts expired keys every ~128 calls: for `fixed_window`, keys are evicted once the window elapses; for `sliding_window`, keys with empty timestamp deques are evicted; for `token_bucket`, keys inactive for >1 hour are evicted
- Limitation: state is not shared across processes or hosts. In a multi-instance deployment (e.g. 3 gateway instances behind nginx), each instance tracks its own counter, so the effective limit is `N × configured_limit`
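The multi-instance limitation can be checked with a few lines of arithmetic. The sketch below assumes round-robin load balancing and a single window; the function name `admitted_across_instances` is hypothetical.

```python
def admitted_across_instances(limit: int, instances: int, requests: int) -> int:
    """Count requests admitted in one window when each instance counts alone."""
    counters = [0] * instances
    admitted = 0
    for i in range(requests):
        idx = i % instances  # round-robin load balancing (assumption)
        if counters[idx] < limit:
            counters[idx] += 1
            admitted += 1
    return admitted
```

With a configured limit of 30 and 3 instances, 200 requests from one user result in 90 admitted requests instead of 30.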
Redis backend
- `fixed_window`: atomic Lua `INCR` + `EXPIRE`; one Redis round-trip per check, no race condition
- `sliding_window`: atomic Lua `ZADD` + `ZREMRANGEBYSCORE` + `ZCARD` + `EXPIRE`; one round-trip, no race condition
- `token_bucket`: atomic Lua script that reads the `{tokens, last_refill}` hash, refills proportionally, consumes 1 token, and writes back; one round-trip, no race condition
- All gateway instances share the same counter, so the configured limit is the true cluster-wide limit
- Requires `redis_url` to be set
- If Redis is unavailable, the plugin fails open: the request is allowed through without rate limiting. This is a deliberate design choice so that an infrastructure failure never blocks legitimate traffic. Operators should monitor for rate-limiter error logs and treat them as high-priority alerts
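The fail-open behaviour can be sketched as a thin wrapper around a backend check. This is an illustration under stated assumptions: `check_fail_open` and the `check` callable are hypothetical, and the sketch catches Python's built-in `ConnectionError`, whereas a real Redis client library may raise its own exception types.

```python
import logging
from typing import Callable

logger = logging.getLogger("rate_limiter_sketch")  # hypothetical logger name


def check_fail_open(check: Callable[[str], bool], key: str) -> bool:
    """Return the backend's decision, or allow the request if the backend fails."""
    try:
        return check(key)
    except ConnectionError as exc:  # assumption: an outage surfaces this way
        # Log loudly so operators can alert on it, then let the request through.
        logger.error("rate limiter backend unavailable, failing open: %s", exc)
        return True
```

The trade-off is that limits are not enforced for the duration of the outage, which is why the error log is the signal to alert on.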
Multi-instance deployment (important): The memory backend is local to a single gateway instance — rate limit counters are not shared across replicas. For multi-instance deployments (e.g., behind nginx or on OpenShift with multiple gateway pods), always use backend: redis to ensure rate limits are enforced correctly across all instances.
Examples
Single-instance (default config)
    config:
      by_user: "60/m"
      by_tenant: "600/m"
Multi-instance with Redis
    config:
      backend: "redis"
      redis_url: "redis://redis:6379/0"
      by_user: "30/m"
      by_tenant: "3000/m"
      by_tool:
        search: "10/m"
Sliding window (no boundary bursts)
    config:
      algorithm: "sliding_window"
      by_user: "30/m"
      by_tenant: "300/m"
Token bucket — memory backend (default)
    config:
      algorithm: "token_bucket"
      by_user: "30/m"   # bucket holds 30 tokens, refills at 30/min
Token bucket — Redis backend (multi-instance)
    config:
      algorithm: "token_bucket"
      backend: "redis"
      redis_url: "redis://redis:6379/0"
      by_user: "30/m"
Permissive mode (observe without blocking)
    mode: permissive
    config:
      by_user: "60/m"
In permissive mode the plugin records violations and emits X-RateLimit-* headers but does not block requests. Useful for baselining traffic before switching to enforce.
Limitations
| Limitation | Severity | Status |
|---|---|---|
| Memory backend not shared across processes | HIGH | Use Redis backend for multi-instance deployments |
| Fixed window allows up to 2× limit at window boundary | LOW | Use sliding_window algorithm, or use by_user with headroom |
| `by_tool` matching is case-sensitive | LOW | Fixed: tool names are normalised with `.strip().lower()` |
| Whitespace-only user identity bypasses anonymous bucket | LOW | Fixed: `_extract_user_identity` strips whitespace and falls back to 'anonymous' |
| No per-server limits (`server_id` dimension missing) | LOW | Not implemented |
| No config hot-reload — rate string changes require restart | LOW | Not implemented |