
OpenRouter Inspector


A lightweight CLI for exploring OpenRouter AI models, listing provider endpoints with supported model parameters, and benchmarking endpoint latency and throughput.

Installation

Requirements

  • Python 3.10+

From source

To install from a local checkout of the repository:

  • With pipx (recommended for CLIs):
    pipx install .
    
  • Or with pip into your active environment:
    pip install .
    

To try the latest in editable mode (for experimentation), use:

pip install -e .

Contributing / Development

If you want to hack on the project (dev setup, tests, QA, pre-commit, etc.), see the dedicated contributor guide:

CONTRIBUTING.md

Features

  • Explore available models and provider-specific endpoints from OpenRouter.
  • Rich table output with pricing per 1M tokens and optional provider counts.
  • Change detection for new models and pricing changes between runs.
  • JSON output for easy scripting.
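The JSON output lends itself to scripting. A minimal Python sketch, assuming each model object carries `id`, `name`, and `context` fields (the real schema may differ, so inspect your own output first):

```python
import json

# Hypothetical sample of `openrouter-inspector list --format json` output;
# field names are assumptions, not guaranteed by the CLI.
sample = """
[
  {"id": "openai/gpt-4o-mini", "name": "GPT-4o Mini", "context": 128000},
  {"id": "deepseek/deepseek-r1", "name": "DeepSeek R1", "context": 131072}
]
"""

models = json.loads(sample)
# Keep only models with at least a 131072-token context window.
big_context = [m["id"] for m in models if m["context"] >= 131072]
print(big_context)
```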

Usage

The CLI supports both subcommands and lightweight global flags.

Authentication

Set your OpenRouter API key via environment variable (required):

export OPENROUTER_API_KEY=sk-or-...

For security, the CLI does not accept API keys via command-line flags. It reads the key only from the OPENROUTER_API_KEY environment variable. If the key is missing or invalid, the CLI shows a friendly error and exits.
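If you wrap the CLI in your own tooling, you can mirror the same rule and read the key only from the environment. A minimal sketch (the function name is illustrative, not part of the package):

```python
import os
import sys

def get_api_key() -> str:
    """Read the OpenRouter key from the environment, as the CLI does."""
    key = os.environ.get("OPENROUTER_API_KEY")
    if not key:
        # Fail fast with a readable message instead of a stack trace.
        sys.exit("Error: OPENROUTER_API_KEY is not set.")
    return key
```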

Quick starts

Subcommands:

# List all models
openrouter-inspector list

# List models filtered by substring (matches id or display name)
openrouter-inspector list openai

# List models with multiple filters (AND logic)
openrouter-inspector list meta free

# Or just type your search terms; default action is `list`
openrouter-inspector gemini-2.0 free

# Detailed provider endpoints (exact model id)
openrouter-inspector endpoints deepseek/deepseek-r1

# To check the endpoint health and latency
openrouter-inspector ping google/gemini-2.0-flash-exp:free

Commands

list

openrouter-inspector list [filters...] [--with-providers] [--sort-by id|name|context|providers] [--desc] [--format table|json|yaml]
  • Displays all available models with enhanced table output (Name, ID, Context, Input/Output pricing).
  • Optional positional filters perform case-insensitive substring matches against the model id and name, combined with AND logic.
  • Context values are displayed with K suffix (e.g., 128K).
  • Input/Output prices are shown per million tokens in USD.
  • Change Detection: Automatically detects new models and pricing changes compared to previous runs with the same parameters. New models are shown in a separate table, and pricing changes are highlighted in yellow.

Options:

  • --format [table|json|yaml] (default: table)
  • --with-providers add a Providers column (makes extra API calls per model)
  • --sort-by [id|name|context|providers] (default: id)
  • --desc sort descending
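The filtering and display rules above are easy to reproduce in your own scripts. A sketch of the AND-logic substring match and the K-suffix context formatting, assuming K denotes 1024 tokens (these helpers are illustrative, not the package's internals):

```python
def matches(model_id: str, name: str, filters: list[str]) -> bool:
    """Case-insensitive substring match against id and name; all filters must hit."""
    haystack = f"{model_id} {name}".lower()
    return all(f.lower() in haystack for f in filters)

def fmt_context(tokens: int) -> str:
    """Render a context window with the K suffix used in the table."""
    return f"{tokens // 1024}K"

print(matches("meta-llama/llama-3-8b:free", "Meta: Llama 3 8B (free)", ["meta", "free"]))  # True
print(fmt_context(131072))  # 128K
```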

endpoints

openrouter-inspector endpoints MODEL_ID [--min-quant VALUE] [--min-context VALUE] [--sort-by provider|model|quant|context|maxout|price_in|price_out] [--desc] [--per-1m] [--format table|json|yaml]

Shows detailed provider offers for an exact model id (author/slug), with:

  • Provider, Model (provider endpoint name), Reason (+/-), Quant, Context (K), Max Out (K), Input/Output price (USD/1M)

Behavior:

  • Fails if the model id does not exactly match an existing model, or if the model returns no provider offers.

Filters and sorting:

  • --min-quant VALUE minimum quantization (e.g., fp8). Unspecified quant (“—”) is included as best.
  • --min-context VALUE minimum context window (e.g., 128K or 131072).
  • --sort-by [provider|model|quant|context|maxout|price_in|price_out] (default: provider)
  • --desc sort descending
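Since --min-context accepts either form (128K or 131072), a parser for that convention might look like the sketch below. Whether K means 1024 tokens here is an assumption; verify against the CLI's actual behavior:

```python
def parse_context(value: str) -> int:
    """Accept a raw token count ('131072') or a K-suffixed value ('128K')."""
    v = value.strip().upper()
    if v.endswith("K"):
        return int(float(v[:-1]) * 1024)  # assumes K = 1024 tokens
    return int(v)

print(parse_context("128K"))    # 131072
print(parse_context("131072"))  # 131072
```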

check

openrouter-inspector check MODEL_ID PROVIDER_NAME ENDPOINT_NAME

Checks a specific provider endpoint's health using OpenRouter API status. Web-scraped metrics have been removed.

Behavior:

  • Returns one of: Functional, Disabled.
  • If API indicates provider is offline/disabled or not available → Disabled.
  • Otherwise → Functional.

Options:

  • --log-level [CRITICAL|ERROR|WARNING|INFO|DEBUG|NOTSET] set logging level

ping

openrouter-inspector ping MODEL_ID [PROVIDER_NAME]
openrouter-inspector ping MODEL_ID@PROVIDER_NAME

# Examples
openrouter-inspector ping openai/o4-mini
openrouter-inspector ping deepseek/deepseek-chat-v3-0324:free Chutes
openrouter-inspector ping deepseek/deepseek-chat-v3-0324:free@Chutes
  • Performs an end-to-end chat completion call to verify the functional state of a model or a specific provider endpoint.
  • Uses a tiny “Ping/Pong” prompt and minimizes completion size for a fast and inexpensive check.
  • When a provider is specified (positional or @ shorthand), the request pins routing order to that provider and disables fallbacks.
  • Prints the provider that served the request, token usage, USD cost (unrounded when provided by the API), measured latency, and effective TTL.
  • Returns OS exit code 0 on 100% success (zero packet loss) and 1 otherwise, making it suitable for scripting.

Behavior:

  • Default timeout: 60s. Change via --timeout <seconds>.
  • Default ping count: 3. Change via -n <count> or -c <count>.
  • Reasoning minimized by default for low-cost pings (reasoning.effort=low, exclude=true; legacy include_reasoning=false).
  • Caps max_tokens to 4 for expected “Pong” reply.
  • Dynamically formats latency: <1000ms prints in ms; >=1s prints in seconds with two decimals (e.g., 1.63s).

Options:

  • --timeout <seconds>: Per-request timeout override (defaults to 60 if missing or invalid).
  • -n <count>, -c <count>: Number of pings to send (defaults to 3).
  • --filthy-rich: Required when sending more than 10 pings, acknowledging the potential API cost.
  • --log-level [CRITICAL|ERROR|WARNING|INFO|DEBUG|NOTSET]: Set logging level.

Example output:


Pinging https://openrouter.ai/api/v1/chat/completions/tngtech/deepseek-r1t2-chimera:free@Chutes with 26 input tokens:
Reply from: https://openrouter.ai/api/v1/chat/completions/tngtech/deepseek-r1t2-chimera:free@Chutes tokens: 4 cost: $0.00 time=2.50s TTL=60s

Pinging https://openrouter.ai/api/v1/chat/completions/tngtech/deepseek-r1t2-chimera:free@Chutes with 26 input tokens:
Reply from: https://openrouter.ai/api/v1/chat/completions/tngtech/deepseek-r1t2-chimera:free@Chutes tokens: 4 cost: $0.00 time=2.30s TTL=60s

Ping statistics for tngtech/deepseek-r1t2-chimera:free@Chutes:
    Packets: Sent = 2, Received = 2, Lost = 0 (0% loss),
Approximate round trip times in seconds:
    Minimum = 2.30s, Maximum = 2.50s, Average = 2.40s
Total API cost for this run: $0.000000
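The summary lines above are straightforward to recompute if you capture per-ping latencies yourself, e.g. when parsing the CLI's output in a monitoring script. A sketch (not the package's own code):

```python
def ping_stats(times_s: list[float], sent: int) -> dict:
    """Summarize round-trip times the way the ping report does."""
    received = len(times_s)
    lost = sent - received
    stats = {"sent": sent, "received": received, "lost": lost,
             "loss_pct": round(100 * lost / sent)}
    if times_s:
        stats["minimum"] = min(times_s)
        stats["maximum"] = max(times_s)
        stats["average"] = round(sum(times_s) / received, 2)
    return stats

# The two replies from the example output above: 2.50s and 2.30s.
print(ping_stats([2.50, 2.30], sent=2))
```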

Notes:

  • Provider pinning uses the OpenRouter provider routing preferences (order, allow_fallbacks=false when a provider is specified). See provider routing docs for details.

⚠️ Warning

Running ping against paid endpoints will make a real completion call and can consume your API credits. It is not a simulated or “no-op” health check. Use with care on metered providers.

Additionally, even when using "free" models, each ping counts against the daily request limit of OpenRouter's free tier. Use with caution, especially if incorporating the command into monitoring scripts or frequent, automated checks.

benchmark

openrouter-inspector benchmark MODEL_ID [PROVIDER_NAME] \
  [--timeout <seconds>] [--max-tokens <limit>] [--format table|json|text] [--min-tps <threshold>] [--debug-response]
  • Measures model or provider-specific throughput (tokens per second, TPS) by streaming a long response.
  • When PROVIDER_NAME is specified (either as a second positional argument or using the shorthand MODEL_ID@PROVIDER_NAME), routing is pinned to that provider and fallbacks are disabled. If omitted, OpenRouter automatically selects the best provider.
  • Supports multiple output modes so you can use it in scripts:
    • table (default): Rich table with metrics (Status, Duration, Input/Output/Total tokens, Throughput, Cost). Includes a short “Benchmarking …” preface.
    • json: Emits a JSON object with the same metrics as the table.
    • text: Emits a single line: TPS: <value>.

Options:

  • --timeout <seconds>: Request timeout (default: 120).
  • --max-tokens <limit>: Safety cap for generated tokens (default: 3000).
  • --format [table|json|text]: Output format (default: table).
  • --min-tps <threshold>: Enforce a minimum TPS threshold (range 1–10000) in text mode. Exit code is 1 when measured TPS is lower than threshold, otherwise 0.
  • --debug-response: Print streaming chunk JSON for debugging (noisy).

Examples:

# Human-friendly table for auto-selected provider
openrouter-inspector benchmark google/gemini-2.0-flash-exp:free

# Pin benchmark to a specific provider (positional argument)
openrouter-inspector benchmark google/gemini-2.0-flash-exp:free Chutes

# Same, using @ shorthand
openrouter-inspector benchmark google/gemini-2.0-flash-exp:free@Chutes

# JSON for automation
openrouter-inspector benchmark google/gemini-2.0-flash-exp:free --format json

# Text-only TPS with threshold suitable for CI/monitoring (non-zero exit code on breach)
openrouter-inspector benchmark google/gemini-2.0-flash-exp:free --format text --min-tps 200

Scripting/monitoring notes:

  • In text format with --min-tps, the command exits with code 1 if TPS is below the threshold (else 0). Use this in CI/CD, cron, or health checks.
  • In table/json formats, the exit code reflects execution success, not a threshold check.

Examples

# Top-level listing filtered by vendor substring
openrouter-inspector list "google"

# List models with multiple filters (AND logic)
openrouter-inspector list "meta" "free"

# Endpoints with filters and sorting: min quant fp8, min context 128K, sort by price_out desc
openrouter-inspector endpoints deepseek/deepseek-r1 --min-quant fp8 --min-context 128K --sort-by price_out --desc

# Lightweight mode with sorting
openrouter-inspector --list --sort-by name

Notes

  • Models are retrieved from /api/v1/models. Provider offers per model are retrieved from /api/v1/models/:author/:slug/endpoints.
  • Supported parameters listed on /models are a union across providers. Use /endpoints for per-provider truth.
  • Some fields may vary by provider (context, pricing, features); the CLI reflects these differences.

License

MIT License - see LICENSE file for details.
