OpenAI-compatible client + worker that bridge inference requests over a Redis queue (Redis Streams), for running LLMs across heavily restricted networks.
openai-rq
📊 Overview deck: https://allen2c.github.io/openai-rq/
Use the OpenAI SDK from behind a locked-down network where the only reachable
outbound endpoint is Redis. openai-rq ships each OpenAI HTTP request over Redis
Streams to a worker that replays it against a local OpenAI-compatible server (e.g. vLLM)
and streams the response back — your client code stays identical to normal OpenAI usage.
your client  ──(Redis Streams)──▶  openai-rq worker  ──▶  http://localhost:8000/v1
 OpenAIRQ    ◀─(Redis Streams)──                          (vLLM / OpenAI-compatible)
Both sides connect only to Redis. No direct HTTP between client and the inference box.
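To make the relay idea concrete, here is a minimal sketch of how an HTTP request could be flattened into the string field map that a Redis Streams entry carries, and rebuilt on the worker side. The envelope layout (`job_id`, `method`, `path`, `headers`, `body`) and the names `RequestEnvelope` and `backend_url` are assumptions made for illustration; the wire format openai-rq actually uses is not described here.

```python
import json
import uuid
from dataclasses import dataclass, field


@dataclass
class RequestEnvelope:
    """Hypothetical job payload; the field names are illustrative only."""

    method: str
    path: str
    headers: dict[str, str]
    body: dict
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_fields(self) -> dict[str, str]:
        # Stream entries are flat field/value maps, so nested data is JSON-encoded.
        return {
            "job_id": self.job_id,
            "method": self.method,
            "path": self.path,
            "headers": json.dumps(self.headers),
            "body": json.dumps(self.body),
        }

    @classmethod
    def from_fields(cls, fields: dict[str, str]) -> "RequestEnvelope":
        # Inverse of to_fields(): decode the JSON-encoded members again.
        return cls(
            method=fields["method"],
            path=fields["path"],
            headers=json.loads(fields["headers"]),
            body=json.loads(fields["body"]),
            job_id=fields["job_id"],
        )


def backend_url(base_url: str, path: str) -> str:
    # Join the worker's base URL and the request path without doubling slashes.
    return base_url.rstrip("/") + "/" + path.lstrip("/")


envelope = RequestEnvelope(
    method="POST",
    path="/chat/completions",
    headers={"content-type": "application/json"},
    body={
        "model": "openai/gpt-oss-120b",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
restored = RequestEnvelope.from_fields(envelope.to_fields())
```

In this model the client would append `envelope.to_fields()` to a request stream, and the worker would rebuild the envelope and replay it against `backend_url(...)`.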
Install
pip install openai-rq
Client — a drop-in openai.OpenAI
Swap openai.OpenAI for openai_rq.OpenAIRQ and point it at Redis. Everything else —
parameters, response objects, streaming, error handling — works unchanged.
from openai_rq import OpenAIRQ
client = OpenAIRQ(redis_url="redis://localhost:6379/0")
resp = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
Streaming
stream = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[{"role": "user", "content": "Write a haiku."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
Async
import asyncio
from openai_rq import AsyncOpenAIRQ

async def main() -> None:
    client = AsyncOpenAIRQ(redis_url="redis://localhost:6379/0")
    resp = await client.chat.completions.create(
        model="openai/gpt-oss-120b",
        messages=[{"role": "user", "content": "Hello!"}],
    )

asyncio.run(main())
extra_headers and extra_body pass through verbatim, so server-specific options
(guided decoding, etc.) just work:
client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[...],
    extra_body={"guided_json": schema},
)
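The `schema` passed in `extra_body` is an ordinary JSON Schema written as a Python dict. The sketch below shows one possible shape; the field names (`city`, `temperature_c`) and the `parse_reply` helper are hypothetical, and whether `guided_json` is honored depends on the backend server.

```python
import json

# Hypothetical JSON Schema for a structured reply; adjust to your use case.
schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "temperature_c": {"type": "number"},
    },
    "required": ["city", "temperature_c"],
}


def parse_reply(content: str) -> dict:
    """Parse the message content and check that the required keys are present."""
    data = json.loads(content)
    missing = [key for key in schema["required"] if key not in data]
    if missing:
        raise ValueError(f"reply is missing required keys: {missing}")
    return data


# Stand-in for resp.choices[0].message.content from a guided request.
example_content = '{"city": "Taipei", "temperature_c": 27.5}'
reply = parse_reply(example_content)
```

Checking the reply on the client side is a cheap safeguard in case the backend ignores the guided-decoding option.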
Worker — run it next to the inference server
On the inference box, run a worker that relays jobs to your local server:
openai-rq worker \
  --redis-url redis://localhost:6379/0 \
  --openai-base-url http://localhost:8000/v1 \
  --concurrency 16
Run as many workers as you like against the same Redis — jobs are load-balanced across them via a Redis consumer group.
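The load-balancing behavior can be pictured with a small in-memory model. With a Redis consumer group, each stream entry is delivered to exactly one consumer in the group; the sketch below imitates that with a shared `asyncio.Queue` and caps each simulated worker at a fixed number of in-flight jobs. It does not talk to Redis, and the names `worker` and `simulate` are illustrative rather than part of openai-rq.

```python
import asyncio


async def worker(
    name: str,
    jobs: asyncio.Queue,
    concurrency: int,
    handled: dict[str, list[int]],
) -> None:
    """Pull jobs from the shared queue, keeping at most `concurrency` in flight."""
    slots = asyncio.Semaphore(concurrency)
    in_flight: set[asyncio.Task] = set()

    async def handle(job_id: int) -> None:
        try:
            await asyncio.sleep(0.01)  # stand-in for the backend HTTP call
            handled[name].append(job_id)
        finally:
            slots.release()
            jobs.task_done()

    while True:
        await slots.acquire()      # wait for a free slot before claiming work
        job_id = await jobs.get()  # each job goes to exactly one worker
        task = asyncio.create_task(handle(job_id))
        in_flight.add(task)        # keep a reference so the task is not collected
        task.add_done_callback(in_flight.discard)


async def simulate(
    num_jobs: int = 20, num_workers: int = 3, concurrency: int = 2
) -> dict[str, list[int]]:
    jobs: asyncio.Queue = asyncio.Queue()
    for job_id in range(num_jobs):
        jobs.put_nowait(job_id)
    handled: dict[str, list[int]] = {f"worker-{i}": [] for i in range(num_workers)}
    tasks = [
        asyncio.create_task(worker(name, jobs, concurrency, handled))
        for name in handled
    ]
    await jobs.join()              # returns once every job has been processed
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return handled


handled = asyncio.run(simulate())
```

Every job ID ends up in exactly one worker's list, which is the property that lets you add workers without coordinating them.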
Backend needs an API key?
The credential lives only on the worker — it never transits Redis or the client.
# Bearer style → Authorization: Bearer <key>
export OPENAI_API_KEY=<key>
openai-rq worker --redis-url redis://localhost:6379/0 --openai-base-url http://localhost:8000/v1
# Server that expects a custom auth header instead of Bearer (repeatable)
openai-rq worker --redis-url redis://localhost:6379/0 \
  --openai-base-url http://localhost:8000/v1 \
  --openai-header api-key=<key>
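As a rough illustration of the two auth styles, the sketch below turns an optional API key and a list of `KEY=VALUE` strings into the header set sent to the backend. The helper names `parse_header_flag` and `build_backend_headers` are hypothetical, and the worker's real parsing rules may differ.

```python
from typing import Optional


def parse_header_flag(value: str) -> tuple[str, str]:
    """Split one KEY=VALUE string at the first '=' only, since values may contain '='."""
    key, sep, val = value.partition("=")
    if not sep or not key.strip():
        raise ValueError(f"expected KEY=VALUE, got {value!r}")
    return key.strip(), val


def build_backend_headers(
    api_key: Optional[str], header_flags: list[str]
) -> dict[str, str]:
    """Assemble the headers added on the worker side, never by the client."""
    headers: dict[str, str] = {}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    for flag in header_flags:
        key, val = parse_header_flag(flag)
        headers[key] = val
    return headers


bearer_headers = build_backend_headers("sk-test", [])
custom_headers = build_backend_headers(None, ["api-key=abc=="])
```

Because the headers are built only where the worker runs, the secret has no reason to appear in any job payload.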
Embedding the worker
import asyncio
from openai_rq.worker import Worker

async def main() -> None:
    worker = Worker(
        redis_url="redis://localhost:6379/0",
        openai_base_url="http://localhost:8000/v1",
        openai_api_key="<key>",  # optional; → Authorization: Bearer
        concurrency=16,
    )
    await worker.run()

asyncio.run(main())
Worker options
| Option | Default | Description |
|---|---|---|
| --redis-url | (required) | Redis URL; use rediss:// for TLS |
| --openai-base-url | http://localhost:8000/v1 | local OpenAI-compatible server |
| --openai-api-key | env OPENAI_API_KEY | injected as Authorization: Bearer |
| --openai-header | (none) | extra backend header KEY=VALUE (repeatable) |
| --concurrency | 16 | in-flight jobs per worker |
| --stream-flush-ms | 50 | streaming coalesce window (milliseconds) |
| --result-ttl-s | 600 | TTL on result/stream keys (seconds) |
| --max-retries | 3 | queue retries before dead-letter |
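To show what a streaming coalesce window does, here is a minimal offline model: chunks that arrive within `flush_ms` of the first chunk in the current batch are merged into one message. This is a sketch under simplifying assumptions; a live implementation would typically flush on a timer rather than on the next arrival, and the `coalesce` function is illustrative, not the worker's code.

```python
def coalesce(chunks: list[tuple[float, str]], flush_ms: float = 50.0) -> list[str]:
    """Merge chunks that arrive within `flush_ms` of the first chunk in a batch."""
    batches: list[str] = []
    buffer: list[str] = []
    window_start = 0.0
    for arrived_ms, text in chunks:
        if buffer and arrived_ms - window_start >= flush_ms:
            batches.append("".join(buffer))  # window elapsed: flush as one message
            buffer = []
        if not buffer:
            window_start = arrived_ms        # first chunk opens a new window
        buffer.append(text)
    if buffer:
        batches.append("".join(buffer))      # flush what remains at end of stream
    return batches


# (arrival time in ms, token text) pairs for a short streamed reply.
tokens = [(0, "Hel"), (10, "lo"), (40, ","), (60, " wor"), (95, "ld"), (200, "!")]
batches = coalesce(tokens, flush_ms=50)
```

Here six chunks collapse into three messages, which trades a small amount of latency for fewer round trips through Redis.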
License
Apache-2.0