Caching and replaying man-in-the-middle proxy for OpenAI APIs
Project description
Rechat
Rechat is a caching and replaying man-in-the-middle proxy for OpenAI's APIs, it provides inspection and debugging layer, particularly useful for quick inspection of interactions of existing clients, developing multi-request workflows, and benchmarks.
Rechat is for you, if you ever wanted to:
- speed up your code that makes repeated calls to OpenAI APIs
- quickly inspect what is being sent to OpenAI APIs
- emulate an endpoint with pre-recorded (or pre-defined) responses
Quickstart
pip install rechat(dev:pip install git+https://gitlab-master.nvidia.com/dchichkov/rechat.git)- Run
rechat, it will listen on eight-nine-ten port (http://localhost:8910/v1) and use OpenAI's endpoint by default as upstream. - Configure your OpenAI client to use it
export OPENAI_BASE_URL=http://localhost:8910/v1and run your requests as usual.
You can specify a different upstream endpoint by providing it as an argument, e.g. rechat https://api.openai.com/v1.
By default, rechat outputs intercepted chat content onto the console:
And it records the session to flows_<timestamp>.dump file in the current directory. During subsequent runs, if a -f/--flow [dump_file] argument is provided, rechat would attempt to load flows_[timestamp].dump files, or the specified dump file. It always tries to use cached responses for any matching requests.
Inspection
Rechat provides http://localhost:8910 web UI for inspecting the current session, with search and filtering capabilities. By default, rechat will output chat content to the console. Use --quiet flag to reduce verbosity.
Additionally, a --diff mode would search for similar requests on cache misses and print difference between the current request and the closest cached request. This is useful for ensuring that different clients or implementations of the same functionality produce the same requests.
Recording and Replaying
Any markdown editor, for example VSCode or GitHub/GitLab web UI, can be used to view and edit the logs, and these modified logs can be loaded into rechat, to emulate model's responses.
Architecture
subgraph Rechat["Rechat Proxy (MITM)"]
direction TB
PROXY["Proxy Server"]
INTERCEPT["Request Interceptor"]
subgraph CacheLookup["Cache Lookup"]
direction TB
EXACT["Exact Match?"]
FUZZY["Fuzzy Match\n(similarity search)"]
CACHE[("Cache Storage")]
end
REPLAY["Replay Cached Response"]
DIFF["Print Diff 📋"]
STORE["Record Response"]
end
subgraph OpenAI["OpenAI API"]
direction TB
API[["api.openai.com"]]
end
%% Force vertical ordering
Client ~~~ Rechat
Rechat ~~~ OpenAI
%% Main flow - top to bottom
APP --> PROXY
PROXY --> INTERCEPT
INTERCEPT --> EXACT
%% Cache lookup flow
EXACT -->|"yes"| REPLAY
EXACT -->|"no"| FUZZY
FUZZY -->|"similar"| DIFF
FUZZY -->|"no match"| FORWARD
DIFF --> FORWARD
%% Cache connections
CACHE <--> EXACT
CACHE <--> FUZZY
%% Forward to API
FORWARD["Forward Request"] --> API
%% Response flow
API --> STORE
STORE --> CACHE
STORE --> RETURN["Return Response"]
REPLAY --> RETURN
RETURN --> APP
%% Styling
style Rechat fill:#1a1a2e,stroke:#16213e,color:#eee
style CacheLookup fill:#0f3460,stroke:#16213e,color:#eee
style CACHE fill:#e94560,stroke:#16213e,color:#fff
style PROXY fill:#533483,stroke:#16213e,color:#fff
style DIFF fill:#f9a825,stroke:#16213e,color:#000
style REPLAY fill:#4caf50,stroke:#16213e,color:#fff
## Rechat Markdown Format
Example markdown snippet, in markdown format. Note `<blockquote>` tags. See more details in [sample.md](docs/sample.md).
```markdown
### user
<blockquote>
What is the capital of France?
</blockquote>
### assistant
<blockquote>
Paris.
</blockquote>
Intercepting traffic to existing OpenAI endpoints
Rechat can intercept traffic to existing endpoints, without changing the client code or configuring OPENAI_BASE_URL, by using mitmproxy as a transparent proxy. For example, to use mitmproxy local proxy mode and intercept traffic from a python script (for example python scripts/query.py), use the following command to intercept the traffic from python:
rechat --mode local:python
And run the python script as follows, specifying mitmproxy's CA certificate for SSL interception:
SSL_CERT_FILE=~/.mitmproxy/mitmproxy-ca-cert.pem python scripts/query.py
Note: SSL_CERT_FILE environment variable is required for Python clients to trust mitmproxy's root CA certificate. Please refer to mitmproxy's documentation for more details on installing and trusting mitmproxy's root CA certificate on your system.
Routing and Load Balancing
TODO: Rechat supports multiple endpoints, using <endpoint>:[local_port]:[model_name] arguments, e.g. https://api.openai.com/v1:8910:gpt-5. It would route the requests to the appropriate endpoint based on the model name in the request, and it'd balance the load between endpoints for the same model.
Miscellaneous
- Replaying queries against an endpoint
- Multiple responses for the same request
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file rechat-0.1.7.tar.gz.
File metadata
- Download URL: rechat-0.1.7.tar.gz
- Upload date:
- Size: 13.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c0c69f963955fa324ba044cea3c072f59afdf3270291c67426fd8aad8ba5b1d9
|
|
| MD5 |
ae87a3c2800569fea4620eaf785495b4
|
|
| BLAKE2b-256 |
5ad5b53c751b6e49e60048e15816a3e8c1bb32ec9ab3980f67617fcd34464d7f
|
File details
Details for the file rechat-0.1.7-py3-none-any.whl.
File metadata
- Download URL: rechat-0.1.7-py3-none-any.whl
- Upload date:
- Size: 12.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
eadddcd21627999068d6e50645e82153ec151f0babae95c6df74181d641b5d17
|
|
| MD5 |
b77971806bf0252d6e1a6a78b083afe3
|
|
| BLAKE2b-256 |
ef6f97b5f6ca3cb5aa45dfc80cdf63599061aec08264e0e224208f481c53bd62
|