fakellm-recorder

Record real OpenAI/Anthropic API traffic and automatically emit fakellm.yaml rules for fakellm.

VCR-style zero-effort capture, combined with fakellm's editable, error-injectable YAML rules. Run your real code against the real API once through the proxy, get a fakellm.yaml out, hand-edit it to add the error paths recordings can't capture, and commit it.

Install

pip install fakellm-recorder    # once published
# or, from source:
pip install -e .

The loop

1. Record. Start the proxy and point your SDK's base_url at it, then run your existing test or script once.

fakellm-recorder proxy --upstream auto --out sessions/run1.jsonl

from openai import OpenAI
client = OpenAI(base_url="http://127.0.0.1:8888/v1", api_key="sk-...real-key...")
# run_my_agent(client, ...)  # real traffic flows through and is recorded

The proxy forwards byte-faithfully to the real upstream, so your code behaves exactly as in production. Credentials are stripped before anything is written to disk; prompts/completions are scrubbed for emails and obvious keys by default.

2. Emit. Turn the recorded session into rules.

fakellm-recorder emit sessions/run1.jsonl --out fakellm.yaml --match-strictness balanced

3. Edit + commit. Add the 429s/500s/malformed responses recordings can't capture, then commit fakellm.yaml alongside your tests. At replay time it's plain fakellm — no recorder needed.

fakellm serve   # from the fakellm package

Match strictness

The hard part is choosing a messages_contain substring specific enough to fire on the right turn, yet loose enough to match more than that one exact transcript. The emitter ranks candidate n-grams by inverse frequency across the whole session, so shared boilerplate (e.g. "You are a helpful assistant") is never chosen.
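To make the ranking idea concrete, here is a minimal sketch of inverse-frequency anchor selection. It is not the emitter's actual code: the function names (`ngrams`, `pick_anchor`), the whitespace tokenization, and the fixed n-gram length of 3 are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(text, n):
    """Return the set of word n-grams in text (whitespace-split, case kept)."""
    words = text.split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def pick_anchor(turns, target_index, n=3):
    """Pick the n-gram of turns[target_index] that occurs in the fewest turns.

    Hypothetical helper: the real emitter may tokenize and score differently.
    """
    # Document frequency: in how many turns of the session does each n-gram occur?
    doc_freq = Counter()
    for text in turns:
        doc_freq.update(ngrams(text, n))

    candidates = ngrams(turns[target_index], n)
    if not candidates:
        return None
    # Rarest first (highest inverse frequency); ties broken by string order
    # so the result is deterministic.
    return min(candidates, key=lambda g: (doc_freq[g], g))


turns = [
    "You are a helpful assistant. Summarize the quarterly sales report.",
    "You are a helpful assistant. Translate the invoice into French.",
]
anchor = pick_anchor(turns, 0)
print(anchor)
```

N-grams drawn from the shared system prompt appear in both turns, so they score worse than any phrase unique to the first turn and lose the ranking.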

| Mode | What it emits |
| --- | --- |
| loose | mostly turn: + a model glob (gpt-4*); "any response will do" tests |
| balanced (default) | turn: + one distinctive substring (or a tool_result_contains anchor on post-tool turns) |
| strict | turn + substring + exact model_matches + tools_include; closest to a faithful replay |

Lint

A standalone check for unreachable rules (shadowed by an earlier rule under first-match semantics), unknown condition/response keys, and responses that, probably unintentionally, set neither content nor tool_calls.

fakellm-recorder lint fakellm.yaml
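The three checks can be sketched over an in-memory rule list as follows. This is an illustration under stated assumptions, not the linter's implementation: the `when` / `respond` rule layout and the two known-key sets are hypothetical (the real fakellm schema has its own layout and almost certainly more keys), and conditions are assumed to be AND-ed.

```python
# Hypothetical key sets for illustration; only the names come from this README.
KNOWN_CONDITIONS = {
    "turn", "messages_contain", "tool_result_contains",
    "model_matches", "tools_include",
}
KNOWN_RESPONSE_KEYS = {"content", "tool_calls"}


def lint(rules):
    """Return warnings for a rule list evaluated with first-match semantics."""
    warnings = []
    for i, rule in enumerate(rules):
        when = rule.get("when", {})
        respond = rule.get("respond", {})

        for key in sorted(when.keys() - KNOWN_CONDITIONS):
            warnings.append(f"rule {i}: unknown condition key {key!r}")
        for key in sorted(respond.keys() - KNOWN_RESPONSE_KEYS):
            warnings.append(f"rule {i}: unknown response key {key!r}")
        if "content" not in respond and "tool_calls" not in respond:
            warnings.append(f"rule {i}: response sets neither content nor tool_calls")

        # Conservative shadowing test: if every condition of an earlier rule
        # appears with the same value in this rule, any request matching this
        # rule also matches the earlier one, so this rule can never fire.
        for j in range(i):
            earlier = rules[j].get("when", {})
            if all(k in when and when[k] == v for k, v in earlier.items()):
                warnings.append(f"rule {i}: unreachable, shadowed by rule {j}")
                break
    return warnings


rules = [
    {"when": {"turn": 1}, "respond": {"content": "hi"}},
    {"when": {"turn": 1, "messages_contain": "refund"}, "respond": {"content": "ok"}},
    {"when": {"turn": 2, "mesages_contain": "typo"}, "respond": {}},
]
for warning in lint(rules):
    print(warning)
```

The shadowing test only catches exact-value subsets. A looser substring shadowing a longer one (for example "ref" before "refund") would need a per-condition implication check.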

Security

  • Credentials are never persisted. Header capture is allowlist-based, so Authorization / x-api-key / auth headers are dropped at capture time even if PII scrubbing is disabled.
  • PII scrubbing is on by default (emails, OpenAI/Anthropic keys, bearer tokens) because the generated YAML is a committable artifact. Disable with --no-scrub (credentials are still stripped). Add your own regexes in code via Scrubber(custom_patterns=[...]).
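The two layers described above, allowlist-based header capture and regex scrubbing of bodies, can be sketched like this. Everything here is an assumption for illustration: the allowlist contents, the regexes, the placeholder strings, and the function names `capture_headers` and `scrub` are not taken from the package, and the sketch deliberately does not use the package's own Scrubber class.

```python
import re

# Hypothetical allowlist: only headers named here are ever written to disk.
ALLOWED_HEADERS = {"content-type", "accept", "user-agent"}


def capture_headers(headers):
    """Keep allowlisted headers only; everything else is dropped, credentials included."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() in ALLOWED_HEADERS
    }


# Illustrative patterns; real key formats vary and these regexes are approximations.
DEFAULT_PATTERNS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "<EMAIL>"),
    (re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"), "<API_KEY>"),
    (re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*"), "Bearer <TOKEN>"),
]


def scrub(text, extra_patterns=()):
    """Apply the default patterns, then any caller-supplied (regex, replacement) pairs."""
    for pattern, replacement in [*DEFAULT_PATTERNS, *extra_patterns]:
        text = pattern.sub(replacement, text)
    return text


headers = {
    "Authorization": "Bearer abc.def.ghi",
    "x-api-key": "sk-ant-0123456789abcdefXYZ",
    "Content-Type": "application/json",
}
print(capture_headers(headers))
print(scrub("Contact alice@example.com, key sk-ant-0123456789abcdefXYZ"))
```

An allowlist fails closed: a new or unusual credential header is dropped without anyone having to anticipate its name, which is why header stripping can stay on even when body scrubbing is disabled.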

Caveats

  • fakellm is a new, single-author beta (0.3.x). Emitted files are stamped with the targeted config version; pin a fakellm version and expect schema churn.
  • Streaming reconstruction handles both SSE dialects and keeps raw events for a future chunk-fidelity replay mode, but assembly of exotic event orderings may need tuning — the raw events are retained so you can fix assembly without re-recording.
  • Single worker. fakellm stores state in process memory; keep both it and this proxy single-worker.
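As a rough picture of what streaming reconstruction involves, the sketch below reassembles plain text from a raw SSE body in either dialect. It is a simplified illustration, not the recorder's assembler: the function names are hypothetical, only text deltas are handled (tool-call deltas and other event types are ignored), and only the first OpenAI choice is assembled.

```python
import json


def parse_sse(raw):
    """Split a raw SSE body into (event_name, data) pairs."""
    events = []
    for block in raw.replace("\r\n", "\n").split("\n\n"):
        name, data_lines = None, []
        for line in block.split("\n"):
            if line.startswith("event:"):
                name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            events.append((name, "\n".join(data_lines)))
    return events


def assemble_text(raw):
    """Concatenate text deltas from an OpenAI- or Anthropic-style stream."""
    parts = []
    for _name, data in parse_sse(raw):
        if data == "[DONE]":  # OpenAI end-of-stream sentinel
            break
        payload = json.loads(data)
        if "choices" in payload:  # OpenAI chat completions chunk
            for choice in payload["choices"]:
                if choice.get("index", 0) == 0:
                    parts.append(choice.get("delta", {}).get("content") or "")
        elif payload.get("type") == "content_block_delta":  # Anthropic Messages
            delta = payload.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
    return "".join(parts)


openai_raw = (
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}\n\n'
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}\n\n'
    "data: [DONE]\n\n"
)
print(assemble_text(openai_raw))
```

Because the raw events are kept alongside the assembled result, a function like this can be rerun over an old recording after the assembly logic changes.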

License

MIT
