🐙 Mocktopus
Multi-armed mocks for LLM apps
Mocktopus is a drop-in replacement for OpenAI/Anthropic APIs, designed to make your LLM application tests fast, deterministic, and cost-free.
Why Mocktopus?
Testing LLM applications is challenging:
- Non-deterministic: Same prompt, different responses
- Expensive: Every test run costs API credits
- Slow: API calls add latency to test suites
- Network-dependent: Can't run tests offline
- Complex workflows: Tool calls and streaming complicate testing
Mocktopus solves these problems with a local mock server that mimics the OpenAI and Anthropic APIs.
Features
- ✅ Drop-in replacement - Just change your base URL
- ✅ Deterministic responses - Same input → same output
- ✅ Tool/function calling - Full support for complex workflows
- ✅ Streaming - Server-sent events (SSE) support
- ✅ Multiple providers - OpenAI and Anthropic compatible
- ✅ Zero cost - No API charges for tests
- ✅ Fast - No network latency
- ✅ Offline - Run tests without internet
Installation
```bash
pip install mocktopus
```
Quick Start
1. Create a scenario file (scenario.yaml):
```yaml
version: 1
rules:
  - type: llm.openai
    when:
      model: "gpt-4*"
      messages_contains: "hello"
    respond:
      content: "Hello! How can I help you today?"
```
2. Start the mock server:
```bash
mocktopus serve -s scenario.yaml
```
3. Point your app to Mocktopus:
```python
from openai import OpenAI

# Instead of the real API:
# client = OpenAI(api_key="sk-...")

# Use Mocktopus:
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="mock-key",  # Any string works
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)
# Output: "Hello! How can I help you today?"
```
Usage Modes
Mock Mode (Default)
Use predefined YAML scenarios for deterministic responses:
```bash
mocktopus serve -s examples/chat-basic.yaml
```
Record Mode (Coming Soon)
Proxy and record real API calls for later replay:
```bash
mocktopus serve --mode record --recordings-dir ./recordings
```
Replay Mode (Coming Soon)
Replay previously recorded API interactions:
```bash
mocktopus serve --mode replay --recordings-dir ./recordings
```
Scenario Examples
Basic Chat Response
```yaml
version: 1
rules:
  - type: llm.openai
    when:
      messages_contains: "weather"
    respond:
      content: "It's sunny today!"
```
Function Calling
```yaml
version: 1
rules:
  - type: llm.openai
    when:
      messages_contains: "weather"
    respond:
      tool_calls:
        - id: "call_123"
          type: "function"
          function:
            name: "get_weather"
            arguments: '{"location": "San Francisco"}'
```
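An application under test receives this tool call in the standard OpenAI chat-completions shape, with `arguments` as a JSON string that must be decoded. A minimal sketch of that parsing step (the `mock_message` dict below is hand-built to mirror the rule above, not produced by the server):

```python
import json

# Hand-built stand-in for the message the rule above would return,
# laid out like an OpenAI chat-completions assistant message.
mock_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_123",
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": '{"location": "San Francisco"}',
            },
        }
    ],
}

# `arguments` arrives as a JSON string, so the app decodes it before use.
call = mock_message["tool_calls"][0]
args = json.loads(call["function"]["arguments"])
print(call["function"]["name"], args["location"])
# get_weather San Francisco
```

Because the rule fixes the arguments string, a test can assert on the decoded payload exactly.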
Streaming Response
```yaml
version: 1
rules:
  - type: llm.openai
    when:
      model: "*"
    respond:
      content: "This will be streamed..."
      delay_ms: 50    # Delay between chunks
      chunk_size: 5   # Characters per chunk
```
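With `chunk_size: 5`, the content is emitted as SSE chunks of five characters each. A quick local sketch of that split (plain slicing, which mirrors the documented chunking; this is an illustration, not the server's code):

```python
def chunk(text: str, size: int) -> list[str]:
    """Split text into fixed-size pieces, as the mock streams them."""
    return [text[i:i + size] for i in range(0, len(text), size)]

chunks = chunk("This will be streamed...", 5)
print(chunks)
# ['This ', 'will ', 'be st', 'reame', 'd...']
```

Concatenating the chunks always reproduces the original content, so streaming and non-streaming tests can share the same scenario.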
Limited Usage
```yaml
version: 1
rules:
  - type: llm.openai
    when:
      messages_contains: "test"
    times: 3  # Only responds 3 times
    respond:
      content: "Limited response"
```
CLI Commands
Start Server
```bash
# Basic usage
mocktopus serve -s scenario.yaml

# Custom port
mocktopus serve -s scenario.yaml -p 9000

# Verbose logging
mocktopus serve -s scenario.yaml -v
```
Test Scenarios
```bash
# Validate a scenario file
mocktopus validate scenario.yaml

# Simulate a request without starting the server
mocktopus simulate -s scenario.yaml --prompt "Hello"

# Generate example scenarios
mocktopus example --type basic > my-scenario.yaml
mocktopus example --type tools > tools-scenario.yaml
```
Testing with Mocktopus
Pytest Integration
```python
import pytest
from mocktopus import use_mocktopus

def test_my_llm_app(use_mocktopus):
    # Load scenario
    use_mocktopus.load_yaml("tests/scenarios/test.yaml")

    # Get a client
    client = use_mocktopus.openai_client()

    # Test your app
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "test"}],
    )
    assert "expected" in response.choices[0].message.content
```
Continuous Integration
```yaml
# .github/workflows/test.yml
name: Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
      - run: pip install -e .
      - run: mocktopus serve -s tests/scenarios.yaml &
      - run: pytest  # Your tests hit localhost:8080
```
Advanced Features
Pattern Matching
Mocktopus supports multiple matching strategies:
- Substring match: `messages_contains: "exact phrase"`
- Regex: `messages_regex: "\\d+ items?"`
- Glob: `model: "gpt-4*"`
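These strategies map onto familiar stdlib behavior; the sketch below shows how each pattern would classify inputs, assuming Mocktopus uses substring, unanchored-regex, and shell-glob semantics as the field names suggest:

```python
import re
from fnmatch import fnmatch

# Substring: messages_contains
assert "exact phrase" in "an exact phrase appears here"

# Regex: messages_regex (unanchored search; "items?" makes the s optional)
assert re.search(r"\d+ items?", "cart has 3 items")
assert not re.search(r"\d+ items?", "no numbers here")

# Glob: model patterns like "gpt-4*" ("*" also matches the empty string)
assert fnmatch("gpt-4o-mini", "gpt-4*")
assert not fnmatch("gpt-3.5-turbo", "gpt-4*")
```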
Response Configuration
```yaml
respond:
  content: "Response text"
  delay_ms: 100   # Simulate latency
  usage:
    input_tokens: 10
    output_tokens: 20
  # For streaming
  chunk_size: 10  # Characters per chunk
```
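When a response is streamed, the simulated latency compounds per chunk. A back-of-envelope helper (my own arithmetic from the fields above, assuming `delay_ms` applies between chunks as the streaming example's comment says):

```python
from math import ceil

def simulated_stream_ms(text_len: int, chunk_size: int, delay_ms: int) -> int:
    """Approximate total delay the mock adds to a streamed response."""
    return ceil(text_len / chunk_size) * delay_ms

# A 100-character response at chunk_size=10, delay_ms=100:
print(simulated_stream_ms(100, 10, 100))
# 1000
```

Keep `delay_ms` small in CI scenarios unless you are deliberately testing timeout handling.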
Roadmap
- OpenAI chat completions API
- Streaming support (SSE)
- Function/tool calling
- Anthropic messages API
- Recording & replay
- Embeddings API
- Assistants API
- Image generation
- Semantic similarity matching
- Response templating
- Load testing mode
Contributing
We welcome contributions! See our Contributing Guide for details.
License
MIT - See LICENSE for details.
Made with 🐙 by EvalOps