Python ASGI server for production apps, streaming responses, and free-threaded Python
=^..^= Pounce
Pure Python ASGI server. 7x faster HTTP parsing. True thread parallelism on Python 3.14t.
import pounce
pounce.run("myapp:app")
What is Pounce?
Pounce is a Python ASGI server for Python 3.14+, with a worker model designed for free-threaded Python 3.14t. It runs standard ASGI applications, supports streaming responses, and gives you a clear upgrade path from process-based servers such as Uvicorn.
Pounce's built-in HTTP/1.1 parser runs at ~3 µs per request (about 7x faster than h11's ~22 µs), its frozen configuration needs no locks, and its rolling reload spawns new workers while draining old ones, so no requests are dropped.
On Python 3.14t, worker threads share one interpreter and one copy of your app. On GIL builds, Pounce falls back to multi-process workers automatically.
Why people pick it:
- ASGI-first — Runs standard ASGI apps with CLI and programmatic entry points
- Free-threading native — True thread parallelism with frozen immutable config (zero locks)
- 7x faster parsing — Built-in HTTP/1.1 parser (~3 us/req) with full request smuggling protection
- Four protocols — HTTP/1.1, HTTP/2, HTTP/3 (QUIC), and WebSocket (including WS over H2)
- Zero-downtime reload — Rolling restart with generational worker swap, no dropped requests
- Observable — Typed lifecycle events, Prometheus metrics, OpenTelemetry, Server-Timing headers
- Batteries included — TLS, compression, static files, middleware, rate limiting, observability
- Migration path — Familiar CLI for teams moving from Uvicorn-style deployments
Use Pounce For
- Serving ASGI apps — Tunable workers, TLS, graceful shutdown, and deployment controls
- Free-threaded Python deployments — Shared-memory worker threads on Python 3.14t
- Streaming workloads — Server-sent events, streamed HTML, and token-by-token responses
- Teams migrating from Uvicorn — Similar CLI shape with a different worker model
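As a rough illustration of the streaming workloads listed above, here is a minimal sketch of a plain ASGI app that emits server-sent events. It uses only the standard ASGI message types, so nothing in it is Pounce-specific; the `collect` helper is a hypothetical in-process driver added here for demonstration, not part of any library.

```python
import asyncio


async def app(scope, receive, send):
    """Stream three server-sent events, one ASGI body message per event."""
    if scope["type"] != "http":
        return  # this sketch handles plain HTTP requests only
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"text/event-stream"),
            (b"cache-control", b"no-cache"),
        ],
    })
    for i in range(3):
        # more_body=True tells the server to keep the response open.
        await send({
            "type": "http.response.body",
            "body": f"data: tick {i}\n\n".encode(),
            "more_body": True,
        })
        await asyncio.sleep(0)  # stand-in for waiting on real work
    # An empty final chunk closes the stream.
    await send({"type": "http.response.body", "body": b"", "more_body": False})


async def collect(asgi_app):
    """Hypothetical helper: run the app in-process and capture sent messages."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await asgi_app({"type": "http", "method": "GET", "path": "/"}, receive, send)
    return sent
```

Saved as a module such as myapp.py, the same `app` object could be served with the `pounce myapp:app` command shown in Quick Start.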
Performance
Pounce roughly matches Uvicorn on throughput while staying pure Python, with no C extensions.
| Scenario | Pounce | Uvicorn | Notes |
|---|---|---|---|
| 1 worker | ~7.2k req/s | ~6.5k req/s | Async event loop, h11 parser |
| 4 workers | ~16k req/s | ~17k req/s | Threads (pounce) vs processes (uvicorn) |
Measured with wrk -t4 -c100 -d10s on macOS Apple Silicon, plain-text "hello world" ASGI app, Python 3.14t.
Run pounce bench --workers 4 --compare to reproduce on your machine.
Key optimizations in the sync worker path:
- Fast HTTP/1.1 parser — Direct bytes parsing (~3 µs/req) replaces h11 (~22 µs/req) with full safety checks (method validation, header size limits, request smuggling detection)
- Keep-alive connections — Connection reuse eliminates TCP handshake overhead
- Shared socket distribution — Single accept queue for thread workers avoids macOS SO_REUSEPORT limitations
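To make the safety checks named above more concrete, here is a simplified sketch of direct bytes parsing with method validation, a header size limit, and two common request smuggling checks. It is not Pounce's parser: the names `parse_head`, `BadRequest`, and `MAX_HEADER_BYTES` and the 8 KiB limit are assumptions made for illustration.

```python
MAX_HEADER_BYTES = 8192  # illustrative limit, not Pounce's actual default
TOKEN_CHARS = frozenset(
    b"!#$%&'*+-.^_`|~0123456789"
    b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
)


class BadRequest(ValueError):
    """Raised when the request head should be rejected with a 400."""


def parse_head(raw: bytes):
    """Return (method, target, headers) parsed from a raw request head."""
    head, sep, _ = raw.partition(b"\r\n\r\n")
    if not sep:
        raise BadRequest("incomplete request head")
    if len(head) > MAX_HEADER_BYTES:
        raise BadRequest("header section too large")

    request_line, *header_lines = head.split(b"\r\n")
    parts = request_line.split(b" ")
    if len(parts) != 3:
        raise BadRequest("malformed request line")
    method, target, version = parts
    if not method or not set(method) <= TOKEN_CHARS:
        raise BadRequest("invalid method")
    if version not in (b"HTTP/1.1", b"HTTP/1.0"):
        raise BadRequest("unsupported HTTP version")

    headers = {}
    for line in header_lines:
        name, colon, value = line.partition(b":")
        # Whitespace around the name or a folded line is a known smuggling vector.
        if not colon or not name or not set(name) <= TOKEN_CHARS:
            raise BadRequest("malformed header line")
        key = name.lower()
        value = value.strip(b" \t")
        if key == b"content-length":
            if not value.isdigit():
                raise BadRequest("invalid Content-Length")
            if headers.get(key, value) != value:
                raise BadRequest("conflicting Content-Length values")
        headers[key] = value  # simplification: last value wins for repeats

    # Two competing body-length signals let proxies and servers disagree.
    if b"content-length" in headers and b"transfer-encoding" in headers:
        raise BadRequest("both Content-Length and Transfer-Encoding present")
    return method, target, headers
```

The sketch works on `bytes` throughout and rejects ambiguous input instead of trying to repair it, which is the usual stance for smuggling protection.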
Installation
pip install bengal-pounce
Requires Python 3.14+
Optional extras:
pip install bengal-pounce[h2] # HTTP/2 stream multiplexing
pip install bengal-pounce[ws] # WebSocket via wsproto
pip install bengal-pounce[tls] # TLS with truststore
pip install bengal-pounce[h3] # HTTP/3 (QUIC/UDP, requires TLS)
pip install bengal-pounce[full] # All protocol extras
Quick Start
| Usage | Command |
|---|---|
| Programmatic | pounce.run("myapp:app") |
| CLI | pounce myapp:app |
| Multi-worker | pounce myapp:app --workers 4 |
| TLS | pounce myapp:app --ssl-certfile cert.pem --ssl-keyfile key.pem |
| HTTP/3 | pounce myapp:app --http3 --ssl-certfile cert.pem --ssl-keyfile key.pem |
| Dev reload | pounce myapp:app --reload |
| App factory | pounce myapp:create_app() |
| Testing | with TestServer(app) as server: ... |
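The commands in the table assume a module named myapp that exposes an ASGI callable `app` and, for the factory form, a `create_app()` function. A minimal sketch of such a module is shown below; the response text is arbitrary and the code uses only the standard ASGI interface.

```python
# myapp.py: a minimal ASGI application (illustrative sketch)


async def app(scope, receive, send):
    """Answer every HTTP request with a short plain-text body."""
    if scope["type"] != "http":
        return  # this sketch ignores lifespan and WebSocket scopes
    body = b"hello from " + scope["path"].encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"text/plain; charset=utf-8"),
            (b"content-length", str(len(body)).encode()),
        ],
    })
    await send({"type": "http.response.body", "body": body})


def create_app():
    """Factory form, for the myapp:create_app() entry in the table."""
    return app
```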
Features
| Feature | Description | Docs |
|---|---|---|
| Deployment | Production workers, compression, observability, and shutdown behavior | Deployment → |
| Migration | Move from Uvicorn with similar CLI concepts | Migrate from Uvicorn → |
| HTTP/1.1 | h11 (async) + fast built-in parser (sync) | HTTP/1.1 → |
| HTTP/2 | Stream multiplexing via h2 | HTTP/2 → |
| HTTP/3 | QUIC/UDP via bengal-zoomies (requires TLS) | HTTP/3 → |
| WebSocket | Full RFC 6455 via wsproto (including WS over H2) | WebSocket → |
| Static Files | Pre-compressed files, ETags, range requests | Static Files → |
| Middleware | ASGI3 middleware stack support | Middleware → |
| OpenTelemetry | Native distributed tracing (OTLP) | OpenTelemetry → |
| Lifecycle Logging | Structured JSON event logging | Logging → |
| Graceful Shutdown | Kubernetes-ready connection draining | Shutdown → |
| Dev Error Pages | Rich tracebacks with syntax highlighting | Errors → |
| TLS | SSL with truststore integration | TLS → |
| Compression | zstd (stdlib PEP 784) + gzip + WS compression | Compression → |
| Workers | Auto-detect: threads (3.14t) or processes (GIL) | Workers → |
| Auto Reload | Graceful restart on file changes | Reload → |
| Rate Limiting | Per-IP token bucket with 429 responses | Rate Limiting → |
| Request Queueing | Bounded queue with 503 load shedding | Request Queueing → |
| Prometheus | Built-in /metrics endpoint | Metrics → |
| Sentry | Error tracking and performance monitoring | Sentry → |
| Testing | TestServer + pytest fixture for integration tests | Testing → |
| Benchmarking | Built-in pounce bench command with comparative analysis | Bench → |
| Lifecycle Events | Public API for typed connection/request events | API → |
📚 Full documentation: lbliii.github.io/pounce | Complete Feature List →
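The Rate Limiting row above mentions a per-IP token bucket that answers with 429 once a client runs out of tokens. The sketch below shows the general technique only; `PerIPLimiter`, its parameters, and the injectable clock are hypothetical names chosen for this example and do not describe Pounce's implementation.

```python
import time


class PerIPLimiter:
    """Token bucket per client IP: `rate` tokens/second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so tests can control time
        self.buckets = {}   # ip -> (tokens, last_refill_time)

    def check(self, ip):
        """Return the HTTP status this request should get: 200 or 429."""
        now = self.clock()
        tokens, last = self.buckets.get(ip, (self.capacity, now))
        # Refill in proportion to elapsed time, never beyond the burst capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[ip] = (tokens - 1.0, now)
            return 200
        self.buckets[ip] = (tokens, now)
        return 429


# Note: with free-threaded workers sharing memory, a real limiter would need
# a lock or per-worker state; this sketch is single-threaded for clarity.
```

Each client gets an independent bucket, so one noisy IP exhausts only its own tokens.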
Usage
Programmatic Configuration — Full control from Python
import pounce

pounce.run(
    "myapp:app",
    host="0.0.0.0",
    port=8000,
    workers=4,
)
How It Works — Adaptive worker model
On Python 3.14t (free-threading): workers are threads. One process, N threads, each with its own asyncio event loop. Shared memory, no fork overhead, no IPC.
On GIL builds: workers are processes. Same API, same config. The supervisor detects the
runtime via sys._is_gil_enabled() and adapts automatically.
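The runtime check described here can be sketched in a few lines. `sys._is_gil_enabled()` is a real CPython function (available since 3.13), while `gil_enabled` and `pick_worker_mode` are hypothetical helper names used only to illustrate the decision, not Pounce's supervisor code.

```python
import sys


def gil_enabled() -> bool:
    """Return True when the interpreter is running with the GIL."""
    # sys._is_gil_enabled() exists from Python 3.13; older builds always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


def pick_worker_mode() -> str:
    """Choose threads on free-threaded builds, processes otherwise."""
    return "process" if gil_enabled() else "thread"
```

Note that a free-threaded build can still run with the GIL re-enabled (for example when an incompatible extension module is loaded), which is why the check asks about the running interpreter and not the build flags.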
A request flows through: socket accept -> protocol parser -> ASGI scope
construction -> app(scope, receive, send) -> response serialization -> socket write.
Async workers use h11; sync workers use a fast built-in parser for lower latency.
Protocol Extras — Install only what you need
| Protocol | Backend | Install |
|---|---|---|
| HTTP/1.1 | h11 (async) / fast built-in parser (sync) | built-in |
| HTTP/2 | h2 (stream multiplexing, priority signals) | bengal-pounce[h2] |
| HTTP/3 | bengal-zoomies (QUIC/UDP, requires TLS) | bengal-pounce[h3] |
| WebSocket | wsproto (including WS over H2) | bengal-pounce[ws] |
| TLS | stdlib ssl + truststore | bengal-pounce[tls] |
| All | Everything above | bengal-pounce[full] |
Compression uses Python 3.14's stdlib compression.zstd — zero external dependencies.
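As a sketch of how a server might choose between zstd and gzip for a response, the example below reads the Accept-Encoding header and compresses with the standard library. The function names are hypothetical, q-values are ignored for brevity, and the gzip fallback when `compression.zstd` is missing exists only so the sketch also runs on older interpreters.

```python
import gzip
from typing import Optional, Tuple

try:  # compression.zstd is in the standard library from Python 3.14 (PEP 784)
    from compression import zstd
except ImportError:
    zstd = None


def pick_encoding(accept_encoding: str) -> Optional[str]:
    """Pick a supported content coding, preferring zstd over gzip."""
    offered = {
        part.split(";")[0].strip().lower()
        for part in accept_encoding.split(",")
    }
    if zstd is not None and "zstd" in offered:
        return "zstd"
    if "gzip" in offered:
        return "gzip"
    return None


def compress_body(body: bytes, accept_encoding: str) -> Tuple[bytes, Optional[str]]:
    """Return (payload, content_encoding); the body is unchanged if nothing matches."""
    encoding = pick_encoding(accept_encoding)
    if encoding == "zstd":
        return zstd.compress(body), encoding
    if encoding == "gzip":
        return gzip.compress(body), encoding
    return body, None
```

The returned encoding name is what would go into the Content-Encoding response header.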
Testing — Real server for integration tests
from pounce.testing import TestServer
import httpx

def test_homepage(my_app):
    with TestServer(my_app) as server:
        resp = httpx.get(f"{server.url}/")
        assert resp.status_code == 200
The pounce_server pytest fixture is auto-registered when pounce is installed:
def test_api(pounce_server, my_app):
    server = pounce_server(my_app)
    resp = httpx.get(f"{server.url}/health")
    assert resp.status_code == 200
Key Ideas
- Free-threading first. Threads, not processes. One interpreter, N event loops, shared immutable state. On GIL builds, falls back to multi-process automatically.
- Pure Python. No Rust, no C extensions in the server core. Debuggable, hackable, readable.
- Typed end-to-end. Frozen config, typed ASGI definitions, zero type: ignore comments.
- One dependency. h11 for HTTP/1.1 parsing. Everything else is optional.
- Observable by design. Lifecycle events are public API (from pounce import BufferedCollector, ResponseCompleted). Frameworks build dashboards on typed events, not log parsing.
- Chirp companion. Built to serve Chirp apps natively, but works with any ASGI framework.
- Batteries included. Static files, middleware, rate limiting, request queueing, Prometheus metrics, Sentry, and OpenTelemetry — all built in, all optional.
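The "frozen config" idea in the list above can be illustrated with a frozen dataclass: because the object never changes after construction, any number of worker threads can read it without locks. `ServerConfig` and its fields are hypothetical and only mirror the keyword arguments shown in the Usage example; they are not Pounce's actual config class.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical config object; field names are illustrative only."""
    host: str = "127.0.0.1"
    port: int = 8000
    workers: int = 1


config = ServerConfig(workers=4)

# Changing a setting means building a new object; the one in use is untouched.
reloaded = replace(config, port=9000)

try:
    config.port = 1  # any attempt to mutate raises instead of racing
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
```

Swapping in a whole new config object in place of mutating the old one also fits the generational reload model described earlier, where new workers start with new state while old ones drain.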
Documentation
| Section | Description |
|---|---|
| Get Started | Installation and quickstart |
| Protocols | HTTP/1.1, HTTP/2, WebSocket |
| Configuration | Server config, TLS, CLI |
| Deployment | Workers, compression, production |
| Extending | ASGI bridge, custom protocols |
| Tutorials | Uvicorn migration guide |
| Troubleshooting | Common issues and fixes |
| Reference | API documentation |
| About | Architecture, performance, FAQ |
Development
git clone https://github.com/lbliii/pounce.git
cd pounce
uv sync --group dev
pytest
The Bengal Ecosystem
A structured reactive stack — every layer written in pure Python for 3.14t free-threading.
| ᓚᘏᗢ | Bengal | Static site generator | Docs |
| ∿∿ | Purr | Content runtime | — |
| ⌁⌁ | Chirp | Web framework | Docs |
| =^..^= | Pounce | ASGI server ← You are here | Docs |
| )彡 | Kida | Template engine | Docs |
| ฅᨐฅ | Patitas | Markdown parser | Docs |
| ⌾⌾⌾ | Rosettes | Syntax highlighter | Docs |
Python-native. Free-threading ready. No npm required.
License
MIT