GPTMock
This is a fork of RayBytes/chatmock. The original Flask + synchronous `requests` stack has been replaced with FastAPI + async `httpx`, a layered architecture (router / service / infra), `pydantic-settings` configuration, and `uv` as the build system.
OpenAI & Ollama compatible API powered by your ChatGPT account.
gptmock runs a local server that proxies requests to the ChatGPT Codex backend, exposing an OpenAI/Ollama compatible API. Use GPT-5, GPT-5-Codex, and other models directly from your ChatGPT Plus/Pro subscription — no API key required.
Requirements
- Python 3.13+
- Paid ChatGPT account (Plus / Pro / Team / Enterprise)
- `uv` (for `uvx` usage)
Quick Start (uvx)
The fastest way to run gptmock. No clone, no install — just uvx.
1. Login
```
uvx gptmock login
```
A browser window will open for ChatGPT OAuth. After login, tokens are saved to ~/.config/gptmock/auth.json.
2. Start the server
```
uvx gptmock serve
```
The server starts at http://127.0.0.1:8000. Use http://127.0.0.1:8000/v1 as your OpenAI base URL.
3. Verify
```
uvx gptmock info
```
Tip: Shell Alias
```
alias gptmock='uvx gptmock'

gptmock login
gptmock serve --port 9000
gptmock info
```
Note: To install directly from the GitHub repository instead of PyPI:
```
uvx --from "git+https://github.com/rapidrabbit76/GPTMock" gptmock login
uvx --from "git+https://github.com/rapidrabbit76/GPTMock" gptmock serve
```
Quick Start (Docker)
No build required — pull the pre-built image and run.
1. Create docker-compose.yml
```yaml
services:
  serve:
    image: rapidrabbit76/gptmock:latest
    container_name: gptmock
    command: ["serve", "--verbose", "--host", "0.0.0.0"]
    ports:
      - "8000:8000"
      - "1455:1455" # OAuth callback port (needed during first-time login)
    volumes:
      - gptmock-data:/data
    environment:
      - GPTMOCK_HOME=/data
      - GPTMOCK_LOGIN_BIND=0.0.0.0
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://127.0.0.1:8000/health').status==200 else 1)\""]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 120s # Allows time for first-time login before health checks begin

volumes:
  gptmock-data:
```
2. Start (first run — login + serve in one step)
Run the container interactively. If no credentials are found, the login flow starts automatically:
```
docker compose run --rm --service-ports serve
```
A URL will be printed in the terminal:

```
No credentials found. Starting login flow...
Starting local login server on http://localhost:1455
If your browser did not open, navigate to:
https://auth.openai.com/oauth/authorize?...
If the browser can't reach this machine, paste the full redirect URL here and press Enter:
```
Two ways to complete login:
- Browser on the same machine — the URL opens automatically and the OAuth callback is caught on port 1455.
- Browser on a different machine — open the URL, complete login, then copy the full redirect URL from the browser address bar (starts with `http://localhost:1455/auth/callback?code=...`) and paste it into the terminal.
Once login succeeds, the server starts automatically.
3. Subsequent starts
Once credentials are saved in the volume, just run in the background:
```
docker compose up -d serve
```
4. Verify
```
curl -s http://localhost:8000/health | jq .
```
Docker Environment Variables
All server options below are also available as environment variables. Use the GPTMOCK_* canonical names (see Server Options).
Additional Docker-specific variables:
| Variable | Default | Description |
|---|---|---|
| `GPTMOCK_HOME` | `/data` | Auth file directory — mount a volume here |
| `GPTMOCK_LOGIN_BIND` | `0.0.0.0` | OAuth callback server bind address |
| `GPTMOCK_OLLAMA_VERSION` | `0.12.10` | Ollama API compatibility header version |
Usage Examples
Python (OpenAI SDK)
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="anything",  # ignored by gptmock
)

resp = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "hello world"}],
)
print(resp.choices[0].message.content)
```
Python (LangChain)
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="anything",
    model="gpt-5",
)

response = llm.invoke("hello world")
print(response.content)
```
curl
```
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "hello world"}]
  }'
```
Supported Models
| Model | Reasoning Efforts | Status |
|---|---|---|
| `gpt-5` | minimal / low / medium / high | ✅ Supported |
| `gpt-5.1` | low / medium / high | ✅ Supported |
| `gpt-5.2` | low / medium / high / xhigh | ✅ Supported |
| `gpt-5-codex` | low / medium / high | ✅ Supported |
| `gpt-5.1-codex` | low / medium / high | ✅ Supported |
| `gpt-5.1-codex-max` | low / medium / high / xhigh | ✅ Supported |
| `gpt-5.2-codex` | low / medium / high / xhigh | ✅ Supported |
| `gpt-5.3-codex` | low / medium / high / xhigh | ✅ Supported |
| `gpt-5.3-codex-spark` | low / medium / high / xhigh | ✅ Supported |
Deprecated / Unsupported Models
| Model | Reason |
|---|---|
| `codex-mini` / `gpt-5.1-codex-mini` | ❌ Discontinued by Codex backend — removed |
API Endpoints
| Method | Path | Description |
|---|---|---|
| POST | `/v1/chat/completions` | OpenAI Chat Completions (stream / non-stream) |
| POST | `/v1/completions` | OpenAI Text Completions |
| POST | `/v1/responses` | OpenAI Responses API (for LangChain codex routing) |
| GET | `/v1/models` | List available models |
| POST | `/api/chat` | Ollama-compatible chat |
| GET | `/api/tags` | Ollama model list |
| GET | `/health` | Health check |
Features
- Streaming & Non-streaming — real-time SSE and buffered JSON responses
- Structured Output — `response_format` with `json_schema` / `json_object` support
- Tool / Function Calling — including web search with URL citation annotations via `responses_tools`
- Thinking Summaries — `<think>` tags, `o3` reasoning format, or legacy mode
- Responses API — `POST /v1/responses` for LangChain and other clients that auto-route codex models
- Ollama Compatibility — drop-in replacement for Ollama API consumers
- Auto Token Refresh — JWT tokens are refreshed automatically before expiry
Server Options
```
gptmock serve [OPTIONS]
```
Each option can also be set via environment variable. Precedence: CLI flag > GPTMOCK_* env > CHATGPT_LOCAL_* legacy env > default.
| Option | Env var | Default | Description |
|---|---|---|---|
| `--host` | `GPTMOCK_HOST` | `127.0.0.1` | Bind address |
| `--port` | `GPTMOCK_PORT` | `8000` | Bind port |
| `--verbose` | `GPTMOCK_VERBOSE` | off | Log request/response payloads |
| `--verbose-obfuscation` | `GPTMOCK_VERBOSE_OBFUSCATION` | off | Also dump raw SSE/obfuscation events |
| `--debug-model` | `GPTMOCK_DEBUG_MODEL` | — | Force all requests to use this model name |
| `--reasoning-effort` | `GPTMOCK_REASONING_EFFORT` | `medium` | minimal / low / medium / high / xhigh |
| `--reasoning-summary` | `GPTMOCK_REASONING_SUMMARY` | `auto` | auto / concise / detailed / none |
| `--reasoning-compat` | `GPTMOCK_REASONING_COMPAT` | `think-tags` | How reasoning is exposed: think-tags / o3 / legacy |
| `--expose-reasoning-models` | `GPTMOCK_EXPOSE_REASONING_MODELS` | off | Show effort variants as separate models in `/v1/models` |
| `--enable-web-search` | `GPTMOCK_DEFAULT_WEB_SEARCH` | off | Enable web search by default when `responses_tools` is omitted |
Legacy aliases: `CHATGPT_LOCAL_REASONING_EFFORT`, `CHATGPT_LOCAL_REASONING_SUMMARY`, `CHATGPT_LOCAL_REASONING_COMPAT`, `CHATGPT_LOCAL_EXPOSE_REASONING_MODELS`, `CHATGPT_LOCAL_ENABLE_WEB_SEARCH`, and `CHATGPT_LOCAL_DEBUG_MODEL` are still accepted as fallbacks.
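The precedence chain can be sketched as a small resolver. This is an illustration of the lookup order only; `resolve_option` is a hypothetical helper, not gptmock's actual code.

```python
import os

def resolve_option(cli_value, canonical_env, legacy_env, default):
    """Illustrative precedence: CLI flag > GPTMOCK_* env > CHATGPT_LOCAL_* legacy env > default."""
    if cli_value is not None:
        return cli_value
    if canonical_env in os.environ:
        return os.environ[canonical_env]
    if legacy_env in os.environ:
        return os.environ[legacy_env]
    return default

# Demo: only the legacy variable is set, so it wins over the default
os.environ.pop("GPTMOCK_REASONING_EFFORT", None)
os.environ["CHATGPT_LOCAL_REASONING_EFFORT"] = "high"
print(resolve_option(None, "GPTMOCK_REASONING_EFFORT", "CHATGPT_LOCAL_REASONING_EFFORT", "medium"))  # high
print(resolve_option("low", "GPTMOCK_REASONING_EFFORT", "CHATGPT_LOCAL_REASONING_EFFORT", "medium"))  # low (CLI wins)
```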
Web Search
Use --enable-web-search to enable the web search tool by default for all requests. When enabled, the model decides autonomously whether a query needs a web search. You can also enable web search per-request without the server flag by passing the parameters below.
Request Parameters
| Parameter | Values | Description |
|---|---|---|
| `responses_tools` | `[{"type":"web_search"}]` | Enable web search for this request |
| `responses_tool_choice` | `"auto"` / `"none"` | Let the model decide, or disable |
Annotations (URL Citations)
When web search is active, the model may return annotations containing source URLs. These are included automatically in responses:
Non-streaming (stream: false) — annotations are attached to the message:
```json
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "SpaceX launched 29 Starlink satellites...",
        "annotations": [
          {
            "type": "url_citation",
            "start_index": 0,
            "end_index": 150,
            "url": "https://spaceflightnow.com/...",
            "title": "SpaceX Falcon 9 launch"
          }
        ]
      }
    }
  ]
}
```
Streaming (stream: true) — annotations arrive as a dedicated chunk before the final stop chunk:
```
data: {"choices": [{"delta": {"annotations": [{"type": "url_citation", "start_index": 0, "end_index": 150, "url": "https://...", "title": "..."}]}, "finish_reason": null}]}

data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}
```
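A streaming client can gather those citation chunks as they arrive. The sketch below assumes the chunk shape shown above; `collect_annotations` is an illustrative helper, not part of gptmock.

```python
import json

def collect_annotations(sse_lines):
    """Collect url_citation annotations from Chat Completions SSE data lines."""
    annotations = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        annotations.extend(delta.get("annotations", []))
    return annotations

# Two chunks in the shape shown above: one annotation chunk, then the stop chunk
stream = [
    'data: {"choices": [{"delta": {"annotations": [{"type": "url_citation", "url": "https://example.com", "title": "Example"}]}, "finish_reason": null}]}',
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}',
]
print(collect_annotations(stream))  # one url_citation annotation
```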
Responses API (POST /v1/responses, non-streaming) — annotations are nested inside the output content:
```json
{
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "SpaceX launched 29 Starlink satellites...",
          "annotations": [
            {
              "type": "url_citation",
              "start_index": 0,
              "end_index": 150,
              "url": "https://spaceflightnow.com/...",
              "title": "SpaceX Falcon 9 launch"
            }
          ]
        }
      ]
    }
  ]
}
```
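For the Responses API shape, citations sit inside the nested output content. A sketch under the shape shown above; `extract_citation_urls` is an illustrative helper, not part of gptmock.

```python
def extract_citation_urls(response):
    """Walk a /v1/responses payload and return the cited URLs."""
    urls = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") != "output_text":
                continue
            for ann in part.get("annotations", []):
                if ann.get("type") == "url_citation":
                    urls.append(ann["url"])
    return urls

# Payload in the shape shown above, trimmed to the relevant fields
response = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {
                    "type": "output_text",
                    "text": "SpaceX launched 29 Starlink satellites...",
                    "annotations": [
                        {"type": "url_citation", "url": "https://spaceflightnow.com/", "title": "SpaceX Falcon 9 launch"}
                    ],
                }
            ],
        }
    ]
}
print(extract_citation_urls(response))  # ['https://spaceflightnow.com/']
```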
Example Request
```
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [{"role":"user","content":"Find current METAR rules"}],
    "stream": true,
    "responses_tools": [{"type": "web_search"}],
    "responses_tool_choice": "auto"
  }'
```
Notes & Limits
- Requires an active, paid ChatGPT account.
- Context length may be partially used by internal system instructions.
- For the fastest responses, set `--reasoning-effort` to `low` and `--reasoning-summary` to `none`.
- The context size of this route is larger than what you get in the regular ChatGPT app.
- When the model returns a thinking summary, it sends back thinking tags for compatibility with chat apps. Set `--reasoning-compat` to `legacy` to use the reasoning tag instead of inline text.
- This project is not affiliated with OpenAI. Use responsibly and at your own risk.
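In the default `think-tags` mode, the reasoning summary arrives inline inside `<think>` tags, so a client that wants it separated from the answer can split it out. A minimal sketch assuming that convention; `split_thinking` is an illustrative helper, not part of gptmock.

```python
import re

def split_thinking(text):
    """Split a think-tags completion into (reasoning, answer)."""
    m = re.match(r"\s*<think>(.*?)</think>\s*(.*)", text, re.DOTALL)
    if m:
        return m.group(1).strip(), m.group(2).strip()
    return "", text  # no think tags: treat the whole text as the answer

reasoning, answer = split_thinking("<think>Check the docs first.</think>The answer is 42.")
print(reasoning)  # Check the docs first.
print(answer)     # The answer is 42.
```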
Credits
- Original project: RayBytes/chatmock
- This fork: rapidrabbit76/GPTMock