openai-tests
Quick CLI to test OpenAI-compatible API endpoints.
Stop hand-building one-off cURL probes for every OpenAI-compatible endpoint.
Quickstart | See It Work | Checks | Recipes | Documentation
openai-tests is a small CLI for proving whether an API really behaves like an OpenAI endpoint. It sends known-good requests, checks
the response shape and content, compares related API surfaces, and can print the exact redacted HTTP exchange when something looks
wrong.
If you only need one raw request, curl is still perfect. If you are validating a gateway, proxy, hosted model, local server, or
OpenAI-compatible deployment more than once, this gives you repeatable smoke tests instead of a folder full of hand-edited JSON bodies.
Quickstart
git clone https://github.com/donadiosolutions/openai-tests.git
cd openai-tests
uv sync --all-groups
Run the fastest useful check against OpenAI:
export OPENAI_API_KEY="sk-..."
uv run openai-tests text-simple --model gpt-4.1-mini
Or point the same check at any compatible endpoint:
export OPENAI_TESTS_API_KEY="your-token"
uv run openai-tests text-simple \
--base-url https://your-openai-compatible-service.example \
--model your-model
Base URLs may include /v1 or omit it. Both https://example.test and https://example.test/v1 work.
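One plausible way to handle that, shown purely as a sketch (the helper name is illustrative, not taken from the project):

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_base_url(base_url: str) -> str:
    # Append the /v1 prefix when the caller omitted it.
    parts = urlsplit(base_url)
    path = parts.path.rstrip("/")
    if not path.endswith("/v1"):
        path += "/v1"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))

assert normalize_base_url("https://example.test") == "https://example.test/v1"
assert normalize_base_url("https://example.test/v1") == "https://example.test/v1"
```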
[!IMPORTANT] Trust surface: endpoint tests read API keys from CLI flags or environment variables, send HTTP requests only to the
configured `--base-url`, and redact `Authorization` in verbose output. `asr-simple` uses checked-in MP3 fixtures by default and runs
`espeak-ng` only when you supply custom text through `--expected-transcript` without `--audio-file`; synthesized WAV files are written
to a temporary directory and removed after the run. `uv sync` installs project dependencies into the local environment;
`uv run poe socket` also runs `npm ci` from the checked-in lockfile for the pinned Socket CLI. To remove a local checkout, delete the
repository directory and any generated `.venv` or `node_modules` directories.
See It Work
$ uv run openai-tests text-simple --model gpt-4.1-mini
/v1/chat/completions: PASSED
Question: What is the capital of France?
Response: Paris is the capital of France.
/v1/responses: PASSED
Question: What is the capital of France?
Response: Paris is the capital of France.
Overall: PASSED
That run did more than check for HTTP 200. It asked the same simple question through both /v1/chat/completions and /v1/responses,
extracted text from each response, verified the text was usable, and would have warned if the responses endpoint echoed important
parameters differently.
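The comparison relies on shape-aware text extraction, because the two endpoints return differently structured JSON. A rough sketch of that step, using field paths from the public OpenAI API documentation (the helper functions are illustrative, not the project's own):

```python
def extract_chat_text(body: dict) -> str:
    # /v1/chat/completions puts the text at choices[0].message.content.
    return body["choices"][0]["message"]["content"] or ""

def extract_responses_text(body: dict) -> str:
    # /v1/responses returns a list of output items; collect output_text parts.
    parts = []
    for item in body.get("output", []):
        if item.get("type") == "message":
            for content in item.get("content", []):
                if content.get("type") == "output_text":
                    parts.append(content.get("text", ""))
    return "".join(parts)
```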
When you need to inspect the actual payloads, add --verbose:
uv run openai-tests text-simple \
--base-url https://your-openai-compatible-service.example \
--model your-model \
--verbose
Verbose mode prints the request URL, headers, JSON body, response status, response headers, and raw response body. Bearer tokens are redacted.
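The redaction step amounts to masking the Authorization value before anything is printed; a minimal sketch of the idea (not the project's actual implementation):

```python
def redact_headers(headers: dict[str, str]) -> dict[str, str]:
    # Mask bearer tokens so verbose logs never leak credentials.
    return {
        name: "Bearer [REDACTED]" if name.lower() == "authorization" else value
        for name, value in headers.items()
    }
```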
Checks
| Module | What it exercises | What it catches |
|---|---|---|
| `list-models` | GET /v1/models | malformed model-list responses, missing required fields, non-JSON responses, HTTP failures |
| `text-simple` | /v1/chat/completions and /v1/responses | empty text, incompatible response shapes, parameter mismatches, unexpected tool-call-like output |
| `asr-simple` | /v1/chat/completions with audio input and /v1/audio/transcriptions | missing transcripts, wrong transcript content, streaming/non-streaming shape issues, metadata mismatches |
Each module is intentionally small. The point is not to benchmark model quality. The point is to answer: "Can this endpoint accept the same request shape my OpenAI client will send, and can I trust the response shape I get back?"
Recipes
List available models
uv run openai-tests list-models \
--base-url https://api.openai.com
Output is a schema check plus the returned model IDs:
/v1/models: PASSED
Models:
- gpt-4.1-mini
- gpt-4.1
- gpt-4o-transcribe
Overall: PASSED
Compare chat completions and responses
uv run openai-tests text-simple \
--base-url https://api.openai.com \
--model gpt-4.1-mini
Use separate models when a provider routes the two APIs differently:
uv run openai-tests text-simple \
--model gpt-4.1-mini \
--responses-model gpt-4.1
Test speech recognition
uv run openai-tests asr-simple \
--base-url https://api.openai.com \
--model gpt-4o-audio-preview
If the transcriptions endpoint needs a different model than chat completions,
pass --transcriptions-model explicitly.
By default, asr-simple sends two checked-in MP3 fixtures:
1. The NATO spelling alphabet, Alpha through Zulu
2. The quick brown fox jumps over the lazy dog
To test your own fixture:
uv run openai-tests asr-simple \
--audio-file ./speech.wav \
--audio-format wav \
--expected-transcript \
"Alpha Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliet"
To synthesize custom spoken text on demand with espeak-ng, omit --audio-file and provide only the transcript text:
uv run openai-tests asr-simple \
--expected-transcript "Please transcribe this sentence exactly."
Pass provider-specific knobs
Optional API parameters stay unset until you pass them. JSON values can be inline or loaded from a file with @path.
uv run openai-tests text-simple \
--responses-metadata-json '{"suite":"compatibility-smoke"}' \
--responses-temperature 0
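The @path convention is a common CLI pattern; resolving it might look like this sketch (illustrative, not the project's parser):

```python
import json
from pathlib import Path

def parse_json_arg(raw: str):
    # "@params.json" loads JSON from a file; anything else parses inline.
    if raw.startswith("@"):
        raw = Path(raw[1:]).read_text(encoding="utf-8")
    return json.loads(raw)
```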
Boolean parameters use paired flags, so you can distinguish "unset" from explicit true or false:
uv run openai-tests text-simple --responses-store
uv run openai-tests text-simple --no-responses-store
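Python's argparse supports this tri-state pattern directly; a sketch assuming the flags are wired up that way (the project may implement it differently):

```python
import argparse

parser = argparse.ArgumentParser()
# BooleanOptionalAction generates --responses-store and --no-responses-store;
# default=None keeps the parameter unset until one of the flags is passed.
parser.add_argument("--responses-store", action=argparse.BooleanOptionalAction, default=None)

assert parser.parse_args([]).responses_store is None
assert parser.parse_args(["--responses-store"]).responses_store is True
assert parser.parse_args(["--no-responses-store"]).responses_store is False
```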
Status Labels
| Status | Meaning |
|---|---|
| PASSED | The endpoint returned a usable response and no warnings were produced. |
| PARTIAL SUCCESS | The endpoint returned usable content, but a warning suggests compatibility drift. |
| FAILED | The request failed, the response shape was invalid, or the content check did not pass. |
The CLI exits with 0 only when all checked endpoints pass. It exits with 1 for failures or partial successes, and 2 for local
configuration errors such as invalid JSON arguments.
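Those exit codes make the CLI easy to drive from scripts or CI; a minimal sketch from Python (the command mirrors the Quickstart example):

```python
import subprocess

result = subprocess.run(
    ["uv", "run", "openai-tests", "text-simple", "--model", "gpt-4.1-mini"]
)
if result.returncode == 0:
    print("all checked endpoints passed")
elif result.returncode == 1:
    print("failure or partial success: re-run with --verbose to inspect")
else:  # 2
    print("local configuration error, e.g. invalid JSON arguments")
```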
For ASR checks, each endpoint result also prints a simple word error rate counter as WER: <percent> (<errors>/<reference words>).
The default acceptance rule allows the transcript to pass when either the expected-word threshold is met or the WER stays below 15%.
Common NATO-style spelling variants such as viktor, whisky, charly, romeu, uniforme, yanke, and zooloo are normalized
before scoring.
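Word error rate here is the standard word-level Levenshtein distance divided by the reference word count. A compact sketch of the metric (the spelling-variant normalization described above would run before this step):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    # Levenshtein distance over words: substitutions, insertions, deletions.
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = cur
    return prev[-1] / max(len(ref), 1)

# One substituted word out of three gives WER = 1/3, over the 15% threshold.
assert abs(word_error_rate("alpha bravo charlie", "alpha brafo charlie") - 1 / 3) < 1e-9
```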
Configuration
Common options:
| Option | Environment fallback | Default |
|---|---|---|
| `--api-key` | OPENAI_API_KEY, then OPENAI_TESTS_API_KEY | no authorization header |
| `--base-url` | OPENAI_BASE_URL, then OPENAI_TESTS_BASE_URL | https://api.openai.com |
| `--model` | OPENAI_MODEL, then OPENAI_TESTS_MODEL | module-specific |
| `--timeout` | none | 30 seconds |
| `--verbose` | none | off |
The live integration runner also loads OPENAI_API_KEY from a repository-root .env file before falling back to the inherited
environment.
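That lookup order is easy to picture; a hedged sketch, assuming a simple KEY=value .env format (the parsing here is illustrative, not the runner's actual code):

```python
import os
from pathlib import Path

def load_api_key(repo_root: Path) -> str | None:
    # Prefer a repository-root .env entry, then the inherited environment.
    env_file = repo_root / ".env"
    if env_file.exists():
        for line in env_file.read_text(encoding="utf-8").splitlines():
            if line.startswith("OPENAI_API_KEY="):
                return line.split("=", 1)[1].strip().strip('"')
    return os.environ.get("OPENAI_API_KEY")
```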
How It Works
The CLI keeps request construction explicit and inspectable. Modules use direct HTTP requests from the Python standard library rather than an SDK, so the payloads stay close to the API surface being tested.
- Required endpoint fields receive conservative defaults.
- Optional endpoint fields remain `None` until the user passes them. `None` values are pruned before JSON or multipart requests are sent (see the sketch after this list).
- String-or-object API parameters expose both plain string flags and `-json` flags.
- Full HTTP exchanges are captured for verbose output.
- Secrets are redacted before printing.
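Put together, a request built in this style looks roughly like the following, using only the standard library (the payload fields are the public chat-completions API; the pruning helper is illustrative):

```python
import json
import urllib.request

def prune_none(obj: dict) -> dict:
    # Drop optional parameters the user never set.
    return {k: v for k, v in obj.items() if v is not None}

payload = prune_none({
    "model": "gpt-4.1-mini",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "temperature": None,  # unset: pruned, so the server applies its own default
})
req = urllib.request.Request(
    "https://api.openai.com/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json", "Authorization": "Bearer sk-..."},
    method="POST",
)
with urllib.request.urlopen(req, timeout=30) as resp:
    body = json.loads(resp.read())
```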
The module registry lives in src/openai_tests/registry.py. New endpoint checks belong under src/openai_tests/test_modules/ and are
documented under docs/.
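Purely as an illustration of the shape a new check might take (every name below is hypothetical; see src/openai_tests/registry.py for the real interface):

```python
# Hypothetical sketch only, not the project's actual registry API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class CheckModule:
    name: str                 # CLI subcommand, e.g. "text-simple"
    endpoints: list[str]      # API paths the module exercises
    run: Callable[..., bool]  # returns True when every endpoint check passes

REGISTRY: dict[str, CheckModule] = {}

def register(module: CheckModule) -> None:
    REGISTRY[module.name] = module
```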
Development and CI
Run the standard local checks before merging changes:
uv run poe fmt
uv run poe check
uv run poe socket
uv run poe check runs formatting checks, Ruff linting, type checking, actionlint, unit tests, live OpenAI integration tests, coverage
validation, and pre-commit hooks. The repository requires 100% line and branch coverage.
uv run poe socket installs the pinned Socket CLI from package-lock.json, generates CycloneDX manifests, and runs an authenticated
read-only Socket scan preflight. It requires SOCKET_API_KEY, SOCKET_API_TOKEN, or SOCKET_CLI_API_TOKEN.
GitHub Actions runs unit and integration tests in parallel; a validate job succeeds only when both pass. Socket's GitHub App
publishes separate required dependency-security checks.
Documentation
- Documentation index
- Installation and configuration
- CLI usage
- Live OpenAI integration tests
- text-simple module
- asr-simple module
- list-models module
- Development and verification
FAQ
Can I use it against a local service with no auth?
Yes. If no API key is provided through --api-key, OPENAI_API_KEY, or OPENAI_TESTS_API_KEY, no Authorization header is sent.
Is this a replacement for a full API conformance suite?
No. It is a focused smoke-test tool. It is meant to catch obvious request/response incompatibilities quickly and repeatedly.
Why not use the OpenAI SDK?
The tests deliberately use direct HTTP requests so the request body, endpoint URL, response status, and raw response are easy to inspect.
License
MIT. See LICENSE.