A mock server that mimics OpenAI and Anthropic API formats for testing
Mock LLM Server
A FastAPI-based mock LLM server that mimics OpenAI and Anthropic API formats. Instead of calling actual language models, it uses predefined responses from a YAML configuration file.
It is designed for situations where you need deterministic responses for testing or development.
Check out CodeGate when you're done here!
Project Structure
mockllm/
├── src/
│ └── mockllm/
│ ├── __init__.py
│ ├── config.py # Response configuration handling
│ ├── models.py # Pydantic models for API
│ └── server.py # FastAPI server implementation
├── tests/
│ └── test_server.py # Test suite
├── example.responses.yml # Example response configuration
├── LICENSE # MIT License
├── MANIFEST.in # Package manifest
├── README.md # This file
├── pyproject.toml # Project configuration
└── requirements.txt # Dependencies
Features
- OpenAI and Anthropic compatible API endpoints
- Streaming support (character-by-character response streaming)
- Configurable responses via YAML file
- Hot-reloading of response configurations
- JSON logging
- Error handling
- Mock token counting
Installation
From PyPI
pip install mockllm
From Source
- Clone the repository:
git clone https://github.com/lukehinds/mockllm.git
cd mockllm
- Create a virtual environment and activate it:
python -m venv venv
source venv/bin/activate # On Windows, use: venv\Scripts\activate
- Install dependencies:
pip install -e ".[dev]" # Install with development dependencies
# or
pip install -e . # Install without development dependencies
Usage
- Set up the responses.yml file:
cp example.responses.yml responses.yml
- Start the server:
python -m mockllm
Or using uvicorn directly:
uvicorn mockllm.server:app --reload
The server will start on http://localhost:8000
- Send requests to the API endpoints:
OpenAI Format
Regular request:
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "mock-llm",
"messages": [
{"role": "user", "content": "what colour is the sky?"}
]
}'
Streaming request:
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "mock-llm",
"messages": [
{"role": "user", "content": "what colour is the sky?"}
],
"stream": true
}'
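The same endpoint can also be exercised from Python with the official openai client by pointing its base URL at the mock server. A minimal sketch (the API key is an arbitrary placeholder, since the curl examples above send no key at all):

```python
from openai import OpenAI

# Point the client at the mock server; the key is a placeholder.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="mock-llm",
    messages=[{"role": "user", "content": "what colour is the sky?"}],
)
print(response.choices[0].message.content)
```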
Anthropic Format
Regular request:
curl -X POST http://localhost:8000/v1/messages \
-H "Content-Type: application/json" \
-d '{
"model": "claude-3-sonnet-20240229",
"messages": [
{"role": "user", "content": "what colour is the sky?"}
]
}'
Streaming request:
curl -X POST http://localhost:8000/v1/messages \
-H "Content-Type: application/json" \
-d '{
"model": "claude-3-sonnet-20240229",
"messages": [
{"role": "user", "content": "what colour is the sky?"}
],
"stream": true
}'
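The Anthropic endpoint can be driven the same way with the official anthropic client. A minimal sketch (again with a placeholder key; note that the client validates responses against the real Anthropic schema, so this works only to the extent that the mock's payloads match it):

```python
from anthropic import Anthropic

# Point the client at the mock server; the key is a placeholder.
client = Anthropic(base_url="http://localhost:8000", api_key="not-needed")

message = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=1024,
    messages=[{"role": "user", "content": "what colour is the sky?"}],
)
print(message.content[0].text)
```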
Configuration
Response Configuration
Responses are configured in responses.yml. The file has two main sections:
- responses: Maps input prompts to predefined responses
- defaults: Contains default configurations, such as the unknown response message
Example responses.yml:
responses:
  "what colour is the sky?": "The sky is blue during a clear day due to a phenomenon called Rayleigh scattering."
  "what is 2+2?": "2+2 equals 4."

defaults:
  unknown_response: "I don't know the answer to that. This is a mock response."
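The lookup model is deliberately simple: an exact match on the prompt, falling back to the default. A minimal sketch of the idea (illustrative only, not the project's actual code):

```python
import yaml

def load_responses(path: str = "responses.yml") -> dict:
    """Load the response configuration from the YAML file."""
    with open(path) as f:
        return yaml.safe_load(f)

def respond(prompt: str, config: dict) -> str:
    """Return the canned response for a prompt, or the default fallback."""
    responses = config.get("responses", {})
    defaults = config.get("defaults", {})
    return responses.get(prompt, defaults.get("unknown_response", ""))

config = load_responses()
print(respond("what colour is the sky?", config))  # -> the Rayleigh scattering answer
```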
Hot Reloading
The server automatically detects changes to responses.yml and reloads the configuration without requiring a restart.
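One common way to implement this kind of file watching is the watchdog package; a sketch of the pattern (an assumption about the mechanism, not necessarily mockllm's own implementation):

```python
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

class ReloadHandler(FileSystemEventHandler):
    """Invoke a reload callback whenever responses.yml is modified."""

    def __init__(self, reload_fn):
        self.reload_fn = reload_fn

    def on_modified(self, event):
        if event.src_path.endswith("responses.yml"):
            self.reload_fn()

observer = Observer()
observer.schedule(ReloadHandler(lambda: print("config reloaded")), path=".")
observer.start()
```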
API Format
OpenAI Format
Request Format
{
"model": "mock-llm",
"messages": [
{"role": "user", "content": "what colour is the sky?"}
],
"temperature": 0.7,
"max_tokens": 150,
"stream": false
}
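models.py defines Pydantic models for these payloads. An illustrative sketch of what the request model could look like, with field names taken from the JSON above (the defaults are assumptions, not the project's actual code):

```python
from typing import Optional

from pydantic import BaseModel

class Message(BaseModel):
    role: str
    content: str

class ChatCompletionRequest(BaseModel):
    model: str
    messages: list[Message]
    temperature: Optional[float] = 0.7
    max_tokens: Optional[int] = None
    stream: bool = False
```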
Response Format
Regular response:
{
"id": "mock-123",
"object": "chat.completion",
"created": 1700000000,
"model": "mock-llm",
"choices": [
{
"message": {
"role": "assistant",
"content": "The sky is blue during a clear day due to a phenomenon called Rayleigh scattering."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 5,
"total_tokens": 15
}
}
Streaming response (Server-Sent Events format):
data: {"id":"mock-123","object":"chat.completion.chunk","created":1700000000,"model":"mock-llm","choices":[{"delta":{"role":"assistant"},"index":0}]}
data: {"id":"mock-124","object":"chat.completion.chunk","created":1700000000,"model":"mock-llm","choices":[{"delta":{"content":"T"},"index":0}]}
data: {"id":"mock-125","object":"chat.completion.chunk","created":1700000000,"model":"mock-llm","choices":[{"delta":{"content":"h"},"index":0}]}
... (character by character)
data: {"id":"mock-999","object":"chat.completion.chunk","created":1700000000,"model":"mock-llm","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}
data: [DONE]
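A stream in this format can be consumed from Python with plain requests; the same data:-prefixed pattern applies to the Anthropic endpoint below. A minimal sketch:

```python
import json

import requests

payload = {
    "model": "mock-llm",
    "messages": [{"role": "user", "content": "what colour is the sky?"}],
    "stream": True,
}

with requests.post(
    "http://localhost:8000/v1/chat/completions", json=payload, stream=True
) as resp:
    for line in resp.iter_lines():
        # SSE lines look like b"data: {...}"; skip blanks and keep-alives.
        if not line.startswith(b"data: "):
            continue
        data = line[len(b"data: "):]
        if data == b"[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        print(delta.get("content", ""), end="", flush=True)
```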
Anthropic Format
Request Format
{
"model": "claude-3-sonnet-20240229",
"messages": [
{"role": "user", "content": "what colour is the sky?"}
],
"max_tokens": 1024,
"stream": false
}
Response Format
Regular response:
{
"id": "mock-123",
"type": "message",
"role": "assistant",
"model": "claude-3-sonnet-20240229",
"content": [
{
"type": "text",
"text": "The sky is blue during a clear day due to a phenomenon called Rayleigh scattering."
}
],
"usage": {
"input_tokens": 10,
"output_tokens": 5,
"total_tokens": 15
}
}
Streaming response (Server-Sent Events format):
data: {"type":"message_delta","id":"mock-123","delta":{"type":"content_block_delta","index":0,"delta":{"text":"T"}}}
data: {"type":"message_delta","id":"mock-123","delta":{"type":"content_block_delta","index":0,"delta":{"text":"h"}}}
... (character by character)
data: [DONE]
Development
Running Tests
pip install -e ".[dev]" # Install development dependencies
pytest tests/
Code Quality
# Format code
black .
isort .
# Type checking
mypy src/
# Linting
ruff check .
Error Handling
The server includes comprehensive error handling:
- Invalid requests return 400 status codes with descriptive messages
- Server errors return 500 status codes with error details
- All errors are logged using JSON format
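This behaviour is straightforward to cover in tests. A sketch using FastAPI's TestClient (the exact status code for a malformed body depends on how the server maps validation failures; 400 is what the docs above describe, 422 is FastAPI's default):

```python
from fastapi.testclient import TestClient

from mockllm.server import app

client = TestClient(app)

def test_malformed_request_is_rejected():
    # A request missing the required messages field should not return 200.
    resp = client.post("/v1/chat/completions", json={"model": "mock-llm"})
    assert resp.status_code in (400, 422)
```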
Logging
The server uses JSON-formatted logging for:
- Incoming request details
- Response configuration loading
- Error messages and stack traces
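One common way to emit JSON-formatted logs in Python is the python-json-logger package; a sketch of such a setup (an assumption about the approach, not necessarily mockllm's own code):

```python
import logging

from pythonjsonlogger import jsonlogger

# Route log records through a JSON formatter.
handler = logging.StreamHandler()
handler.setFormatter(
    jsonlogger.JsonFormatter("%(asctime)s %(levelname)s %(name)s %(message)s")
)

logger = logging.getLogger("mockllm")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("incoming request", extra={"path": "/v1/chat/completions"})
```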
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Download files
File details
Details for the file mockllm-0.0.1.tar.gz.
File metadata
- Download URL: mockllm-0.0.1.tar.gz
- Size: 15.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b215e8adde77127e91ed9676510f18d169b9324857a638cc06426bdc7998d6d3 |
| MD5 | 122311382316fd7623081982c1b3ac79 |
| BLAKE2b-256 | 48fa7a71658340a99f09fce64ea2f150cc88f80655485bbaf98e21e02007271f |
Provenance
The following attestation bundles were made for mockllm-0.0.1.tar.gz:
Publisher: publish.yml on stacklok/mockllm
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mockllm-0.0.1.tar.gz
- Subject digest: b215e8adde77127e91ed9676510f18d169b9324857a638cc06426bdc7998d6d3
- Sigstore transparency entry: 171316969
- Permalink: stacklok/mockllm@a934fc97de72e6866d8ecf7a00b06fd0a5ffd73a
- Branch / Tag: refs/tags/v0.0.1
- Owner: https://github.com/stacklok
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@a934fc97de72e6866d8ecf7a00b06fd0a5ffd73a
- Trigger Event: release
File details
Details for the file mockllm-0.0.1-py3-none-any.whl.
File metadata
- Download URL: mockllm-0.0.1-py3-none-any.whl
- Size: 12.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 676ac4ad046a93ff6818c5eecf1952b2dbe2ceffdbb2c271d7ef499ca818b429 |
| MD5 | 258487bcb9cf7ad6cb146cd94bafc31f |
| BLAKE2b-256 | a29be4efa6f46aca8b3df3e46cc79ac9b75b1e757860f2e326abd92f1b315294 |
Provenance
The following attestation bundles were made for mockllm-0.0.1-py3-none-any.whl:
Publisher: publish.yml on stacklok/mockllm
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mockllm-0.0.1-py3-none-any.whl
- Subject digest: 676ac4ad046a93ff6818c5eecf1952b2dbe2ceffdbb2c271d7ef499ca818b429
- Sigstore transparency entry: 171316970
- Permalink: stacklok/mockllm@a934fc97de72e6866d8ecf7a00b06fd0a5ffd73a
- Branch / Tag: refs/tags/v0.0.1
- Owner: https://github.com/stacklok
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@a934fc97de72e6866d8ecf7a00b06fd0a5ffd73a
- Trigger Event: release