GPTMock

This is a fork of RayBytes/chatmock. The original Flask + synchronous requests stack has been replaced with FastAPI + async httpx, a layered architecture (router / service / infra), pydantic-settings configuration, and uv as the build system.

OpenAI & Ollama compatible API powered by your ChatGPT account.

gptmock runs a local server that proxies requests to the ChatGPT Codex backend, exposing an OpenAI/Ollama compatible API. Use GPT-5, GPT-5-Codex, and other models directly from your ChatGPT Plus/Pro subscription — no API key required.

Requirements

  • Python 3.13+
  • Paid ChatGPT account (Plus / Pro / Team / Enterprise)
  • uv (for uvx usage)

Quick Start (uvx)

The fastest way to run gptmock. No clone, no install — just uvx.

1. Login

uvx gptmock login

A browser window will open for ChatGPT OAuth. After login, tokens are saved to ~/.config/gptmock/auth.json.

2. Start the server

uvx gptmock serve

The server starts at http://127.0.0.1:8000. Use http://127.0.0.1:8000/v1 as your OpenAI base URL.

3. Verify

uvx gptmock info

Tip: Shell Alias

alias gptmock='uvx gptmock'

gptmock login
gptmock serve --port 9000
gptmock info

Note: To install directly from the GitHub repository instead of PyPI:

uvx --from "git+https://github.com/rapidrabbit76/GPTMock" gptmock login
uvx --from "git+https://github.com/rapidrabbit76/GPTMock" gptmock serve

Quick Start (Docker)

No build required — pull the pre-built image and run.

1. Create docker-compose.yml

services:
  serve:
    image: rapidrabbit76/gptmock:latest
    container_name: gptmock
    command: ["serve", "--verbose", "--host", "0.0.0.0"]
    ports:
      - "8000:8000"
      - "1455:1455"
    volumes:
      - gptmock-data:/data
    environment:
      - GPTMOCK_HOME=/data
      - CHATGPT_LOCAL_LOGIN_BIND=0.0.0.0
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://127.0.0.1:8000/health').status==200 else 1)\""]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 5s

  login:
    image: rapidrabbit76/gptmock:latest
    command: ["login"]
    ports:
      - "1455:1455"
    volumes:
      - gptmock-data:/data
    environment:
      - GPTMOCK_HOME=/data
      - CHATGPT_LOCAL_LOGIN_BIND=0.0.0.0

volumes:
  gptmock-data:

2. Login

docker compose run --rm --service-ports login login

A URL will be printed. Open it in your browser and complete the OAuth flow. If your browser can't reach the container, copy the full redirect URL from the browser address bar and paste it into the terminal.

3. Start the server

docker compose up -d serve

4. Verify

curl -s http://localhost:8000/health | jq .
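The same check can be scripted, for example to wait for the container before sending requests. The sketch below only relies on GET /health returning HTTP 200, which is what the compose healthcheck above tests. The helper names (`health_url`, `is_healthy`, `wait_until_healthy`) are hypothetical and not part of gptmock.

```python
import time
import urllib.request


def health_url(base_url: str) -> str:
    """Build the /health URL from a server base URL."""
    return base_url.rstrip("/") + "/health"


def is_healthy(base_url: str = "http://127.0.0.1:8000", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError both derive from OSError).
        return False


def wait_until_healthy(base_url: str = "http://127.0.0.1:8000",
                       attempts: int = 5, delay: float = 1.0) -> bool:
    """Poll /health up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if is_healthy(base_url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` right after `docker compose up -d serve` gives the server a few seconds to come up before the first request.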

Docker Environment Variables

Configure via .env file or docker-compose environment:

Variable                         Default     Description
GPTMOCK_PORT                     8000        Server port
GPTMOCK_VERBOSE                  false       Enable request/response logging
GPTMOCK_REASONING_EFFORT         medium      minimal / low / medium / high / xhigh
GPTMOCK_REASONING_SUMMARY        auto        auto / concise / detailed / none
GPTMOCK_REASONING_COMPAT         think-tags  think-tags / o3 / legacy
GPTMOCK_EXPOSE_REASONING_MODELS  false       Expose reasoning levels as separate models
GPTMOCK_DEFAULT_WEB_SEARCH       false       Enable web search tool by default
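A typo in one of these values is easy to miss in a .env file. The sketch below is a standalone pre-flight check built only from the table above. It is not gptmock's own settings loader, and it assumes booleans are spelled true / false as shown in the Default column (the real loader may accept more spellings). `check_env` is a hypothetical helper name.

```python
import os

# Defaults and allowed values copied from the environment variable table.
DEFAULTS = {
    "GPTMOCK_PORT": "8000",
    "GPTMOCK_VERBOSE": "false",
    "GPTMOCK_REASONING_EFFORT": "medium",
    "GPTMOCK_REASONING_SUMMARY": "auto",
    "GPTMOCK_REASONING_COMPAT": "think-tags",
    "GPTMOCK_EXPOSE_REASONING_MODELS": "false",
    "GPTMOCK_DEFAULT_WEB_SEARCH": "false",
}
CHOICES = {
    "GPTMOCK_REASONING_EFFORT": {"minimal", "low", "medium", "high", "xhigh"},
    "GPTMOCK_REASONING_SUMMARY": {"auto", "concise", "detailed", "none"},
    "GPTMOCK_REASONING_COMPAT": {"think-tags", "o3", "legacy"},
}
BOOLEANS = {
    "GPTMOCK_VERBOSE",
    "GPTMOCK_EXPOSE_REASONING_MODELS",
    "GPTMOCK_DEFAULT_WEB_SEARCH",
}


def check_env(env=None):
    """Return the effective settings, raising ValueError on an undocumented value."""
    env = os.environ if env is None else env
    settings = {}
    for name, default in DEFAULTS.items():
        value = env.get(name, default)
        if name in CHOICES and value not in CHOICES[name]:
            raise ValueError(f"{name}={value!r}; expected one of {sorted(CHOICES[name])}")
        if name in BOOLEANS and value.lower() not in {"true", "false"}:
            raise ValueError(f"{name}={value!r}; expected true or false")
        settings[name] = value
    if not settings["GPTMOCK_PORT"].isdigit():
        raise ValueError(f"GPTMOCK_PORT={settings['GPTMOCK_PORT']!r}; expected an integer")
    return settings
```

Running `check_env()` with no argument validates the current process environment. Passing a dict validates values parsed from a file.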

Usage Examples

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="anything"  # ignored by gptmock
)

resp = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "hello world"}]
)
print(resp.choices[0].message.content)

Python (LangChain)

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="anything",
    model="gpt-5",
)
response = llm.invoke("hello world")
print(response.content)

curl

curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "hello world"}]
  }'
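
Python (standard library only)

If installing the OpenAI SDK is not an option, the curl call above could be reproduced with urllib. This is a minimal sketch: it assumes the response follows the usual Chat Completions shape (choices[0].message.content), as the SDK example implies. `build_chat_request` and `send_chat` are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(prompt, model="gpt-5", base_url="http://127.0.0.1:8000/v1"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(prompt, **kwargs):
    """Send the request to a running server and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```

With the server running, `print(send_chat("hello world"))` should behave like the curl example.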

Supported Models

Model                Reasoning Efforts              Status
gpt-5                minimal / low / medium / high  ✅ Supported
gpt-5.1              low / medium / high            ✅ Supported
gpt-5.2              low / medium / high / xhigh    ✅ Supported
gpt-5-codex          low / medium / high            ✅ Supported
gpt-5.1-codex        low / medium / high            ✅ Supported
gpt-5.1-codex-max    low / medium / high / xhigh    ✅ Supported
gpt-5.2-codex        low / medium / high / xhigh    ✅ Supported
gpt-5.3-codex        low / medium / high / xhigh    ✅ Supported
gpt-5.3-codex-spark  low / medium / high / xhigh    ✅ Supported

Deprecated / Unsupported Models

Model                            Reason
codex-mini / gpt-5.1-codex-mini  ❌ Discontinued by the Codex backend (removed)

API Endpoints

Method  Path                  Description
POST    /v1/chat/completions  OpenAI Chat Completions (stream / non-stream)
POST    /v1/completions       OpenAI Text Completions
POST    /v1/responses         OpenAI Responses API (for LangChain codex routing)
GET     /v1/models            List available models
POST    /api/chat             Ollama-compatible chat
GET     /api/tags             Ollama model list
GET     /health               Health check
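
The two model-list endpoints serve different client families. The sketch below reads model names from either response body. It assumes gptmock mirrors the upstream shapes: {"data": [{"id": ...}]} for /v1/models and {"models": [{"name": ...}]} for /api/tags. That is how the OpenAI and Ollama APIs respond, but it is not confirmed here for gptmock. `model_ids` is a hypothetical helper name.

```python
def model_ids(payload):
    """Return model names from a /v1/models or /api/tags response body."""
    if "data" in payload:
        # OpenAI-style list: {"object": "list", "data": [{"id": "gpt-5"}, ...]}
        return [model["id"] for model in payload["data"]]
    if "models" in payload:
        # Ollama-style list: {"models": [{"name": "gpt-5"}, ...]}
        return [model["name"] for model in payload["models"]]
    raise ValueError("unrecognized model list payload")
```

Feeding it the parsed JSON from `GET /v1/models` or `GET /api/tags` gives a plain list of names to compare against the Supported Models table.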

Features

  • Streaming & Non-streaming: real-time SSE and buffered JSON responses
  • Structured Output: response_format with json_schema / json_object support
  • Tool / Function Calling: including web search with URL citation annotations via responses_tools
  • Thinking Summaries: <think> tags, o3 reasoning format, or legacy mode
  • Responses API: POST /v1/responses for LangChain and other clients that auto-route codex models
  • Ollama Compatibility: drop-in replacement for Ollama API consumers
  • Auto Token Refresh: JWT tokens are refreshed automatically before expiry
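
As an illustration of the structured output feature, the sketch below builds a request body with response_format set to json_schema and parses the reply. It assumes gptmock accepts the same response_format shape as OpenAI's Chat Completions API and returns the JSON document as a string in message.content. The helper names and the schema name "result" are hypothetical.

```python
import json


def json_schema_request(prompt, schema, name="result", model="gpt-5"):
    """Build a chat completions body that asks for output matching `schema`."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": name, "schema": schema},
        },
    }


def parse_structured(response):
    """Decode the JSON document carried in the first choice's message content."""
    return json.loads(response["choices"][0]["message"]["content"])


# Example schema: an object with a single required string field.
CITY_SCHEMA = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}
body = json_schema_request("Name one city in France.", CITY_SCHEMA)
```

The resulting `body` can be POSTed to /v1/chat/completions like any other request. For json_object mode, response_format would instead be {"type": "json_object"}.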

Server Options

gptmock serve [OPTIONS]
Option                     Default     Description
--host                     127.0.0.1   Bind address
--port                     8000        Bind port
--verbose                  off         Log request/response payloads
--reasoning-effort         medium      Default reasoning effort level
--reasoning-summary        auto        Reasoning summary verbosity
--reasoning-compat         think-tags  How reasoning is exposed (think-tags / o3 / legacy)
--expose-reasoning-models  off         Show each reasoning level as a separate model in /v1/models
--enable-web-search        off         Enable web search tool by default
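
With the default think-tags mode, a client that wants to show the answer without the reasoning has to separate the two. The sketch below assumes the summary arrives inline in the message content, wrapped in literal <think> and </think> tags. The exact placement is an assumption, and `split_thinking` is a hypothetical helper name.

```python
import re

# Non-greedy match so several <think> blocks are handled independently.
_THINK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(content):
    """Return (reasoning, answer) from text that may contain <think>...</think> blocks."""
    reasoning = "\n".join(part.strip() for part in _THINK.findall(content))
    answer = _THINK.sub("", content).strip()
    return reasoning, answer
```

For content without any tags, the reasoning part is an empty string and the answer is returned unchanged apart from surrounding whitespace.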

Web Search

Use --enable-web-search to enable the web search tool by default for all requests. When enabled, the model decides autonomously whether a query needs a web search. You can also enable web search per-request without the server flag by passing the parameters below.

Request Parameters

Parameter              Values                   Description
responses_tools        [{"type":"web_search"}]  Enable web search for this request
responses_tool_choice  "auto" / "none"          Let the model decide, or disable

Annotations (URL Citations)

When web search is active, the model may return annotations containing source URLs. These are included automatically in responses:

Non-streaming (stream: false) — annotations are attached to the message:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "SpaceX launched 29 Starlink satellites...",
        "annotations": [
          {
            "type": "url_citation",
            "start_index": 0,
            "end_index": 150,
            "url": "https://spaceflightnow.com/...",
            "title": "SpaceX Falcon 9 launch"
          }
        ]
      }
    }
  ]
}
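
To turn these annotations into a source list, a client can walk the choices and slice the cited span out of the content. The sketch below assumes start_index and end_index are character offsets into message.content, as in OpenAI's url_citation format. `chat_citations` is a hypothetical helper name.

```python
def chat_citations(response):
    """Collect url_citation annotations from a non-streaming chat completion."""
    citations = []
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        text = message.get("content") or ""
        for ann in message.get("annotations") or []:
            if ann.get("type") != "url_citation":
                continue
            citations.append({
                "url": ann["url"],
                "title": ann.get("title", ""),
                # Assumed to be character offsets into the message content.
                "quote": text[ann["start_index"]:ann["end_index"]],
            })
    return citations
```

A response without annotations yields an empty list, so the helper can be applied whether or not web search was used.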

Streaming (stream: true) — annotations arrive as a dedicated chunk before the final stop chunk:

data: {"choices": [{"delta": {"annotations": [{"type": "url_citation", "start_index": 0, "end_index": 150, "url": "https://...", "title": "..."}]}, "finish_reason": null}]}
data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}
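
A streaming client therefore has to watch for annotations in the deltas as well as for content. The sketch below consumes an iterable of SSE lines and accumulates both. It assumes text arrives in delta.content (the usual Chat Completions chunk shape) and that the stream may end with a `data: [DONE]` sentinel, which is the OpenAI convention and is not shown in the sample above. `collect_stream` is a hypothetical helper name.

```python
import json


def collect_stream(lines):
    """Accumulate (text, annotations, finish_reason) from SSE `data:` lines."""
    text_parts, annotations, finish_reason = [], [], None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                text_parts.append(delta["content"])
            annotations.extend(delta.get("annotations") or [])
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(text_parts), annotations, finish_reason
```

The function accepts any iterable of strings, for instance the decoded lines of a streamed HTTP response body.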

Responses API (POST /v1/responses, non-streaming) — annotations are nested inside the output content:

{
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "SpaceX launched 29 Starlink satellites...",
          "annotations": [
            {
              "type": "url_citation",
              "start_index": 0,
              "end_index": 150,
              "url": "https://spaceflightnow.com/...",
              "title": "SpaceX Falcon 9 launch"
            }
          ]
        }
      ]
    }
  ]
}
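
Because the annotations sit one level deeper here, extracting them takes an extra loop. The sketch below concatenates the output_text parts and returns the cited sources without duplicates, based only on the structure shown above. `response_text_and_sources` is a hypothetical helper name.

```python
def response_text_and_sources(response):
    """Return (text, sources) from a non-streaming /v1/responses body.

    `sources` is a list of unique (url, title) pairs in order of appearance.
    """
    texts, sources = [], []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") != "output_text":
                continue
            texts.append(part.get("text", ""))
            for ann in part.get("annotations") or []:
                source = (ann.get("url"), ann.get("title", ""))
                if ann.get("type") == "url_citation" and source not in sources:
                    sources.append(source)
    return "".join(texts), sources
```

Output items of other types are skipped, so the helper only looks at assistant messages.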

Example Request

curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [{"role":"user","content":"Find current METAR rules"}],
    "stream": true,
    "responses_tools": [{"type": "web_search"}],
    "responses_tool_choice": "auto"
  }'
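
The same request body can be assembled in Python. The sketch below adds the two documented parameters to an existing chat completions body and rejects tool choices other than the documented "auto" and "none". `with_web_search` is a hypothetical helper name.

```python
def with_web_search(body, tool_choice="auto"):
    """Return a copy of `body` with per-request web search parameters added."""
    if tool_choice not in ("auto", "none"):
        raise ValueError(f"unsupported responses_tool_choice: {tool_choice!r}")
    return {
        **body,
        "responses_tools": [{"type": "web_search"}],
        "responses_tool_choice": tool_choice,
    }


base = {
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "Find current METAR rules"}],
    "stream": True,
}
payload = with_web_search(base)
```

With the OpenAI Python SDK, non-standard fields such as these are normally passed through the `extra_body` argument of `client.chat.completions.create`. Whether gptmock treats them identically on that path has not been checked here.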

Notes & Limits

  • Requires an active, paid ChatGPT account.
  • Context length may be partially used by internal system instructions.
  • For the fastest responses, set --reasoning-effort to low and --reasoning-summary to none.
  • The context window available through this route is larger than the one in the regular ChatGPT app.
  • When the model returns a thinking summary, gptmock sends it back inside thinking tags by default, for compatibility with chat apps. Set --reasoning-compat to legacy to return it in the reasoning tag instead of inline in the response text.
  • This project is not affiliated with OpenAI. Use responsibly and at your own risk.

Credits
