CLI for the Scorable API

Measurement & Control for LLM Automations

The scorable CLI is a command-line tool for the Scorable API. It lets you manage and execute your Judges directly from the terminal.

Installation

You can install the scorable CLI using the following command, which downloads and installs the script to /usr/local/bin:

curl -sSL https://scorable.ai/cli/install.sh | sh

Alternatively, you can run the CLI with uvx, which fetches the scorable-cli package on demand instead of installing it to /usr/local/bin:

uvx scorable-cli judge list

Authentication

Before using the CLI, set your Scorable API key as an environment variable:

# Sign up for a free account at https://scorable.ai/register
export SCORABLE_API_KEY="your-api-key"
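
In a non-interactive script it can help to fail early when the key is missing. The following is a minimal sketch of such a guard; the function name require_scorable_key is hypothetical and not part of the CLI.

```shell
# Hypothetical helper (not part of the scorable CLI): fail early in
# non-interactive scripts when SCORABLE_API_KEY is unset or empty.
require_scorable_key() {
  if [ -z "${SCORABLE_API_KEY:-}" ]; then
    echo "SCORABLE_API_KEY is not set" >&2
    return 1
  fi
  return 0
}
```

A script could then run, for example, require_scorable_key && scorable judge list.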

Temporary API keys

If no API key is set, the CLI can create a temporary key interactively and save it to ~/.scorable/settings.json as temporary_api_key. Permanent keys should be set via the SCORABLE_API_KEY environment variable, which takes precedence over the temporary key.
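
The precedence rule can be sketched as a small check that reports which key source would apply. This is an illustration only: the function name scorable_key_source is hypothetical, and the sketch assumes settings.json is a JSON file containing a "temporary_api_key" field. It only looks for the field name and does not parse or validate the file.

```shell
# Hypothetical helper: report which key source the precedence rule would pick.
# Assumes settings.json stores the key under a "temporary_api_key" field;
# the check only looks for that field name in the file.
scorable_key_source() {
  settings_file="${1:-$HOME/.scorable/settings.json}"
  if [ -n "${SCORABLE_API_KEY:-}" ]; then
    echo "environment"
  elif [ -f "$settings_file" ] && grep -q '"temporary_api_key"' "$settings_file"; then
    echo "temporary"
  else
    echo "none"
  fi
}
```

The optional argument overrides the settings path, which is mainly useful for testing.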

Usage

The CLI is organized into a main command, scorable, with subcommands for each area of functionality. The primary resource you interact with is the Judge.

Judge Management

All Judge-related commands are available under the scorable judge subcommand.

list

List all available Judges, with options for filtering and pagination.

scorable judge list

Options:

  • --page-size: Number of results to return per page.
  • --cursor: The pagination cursor value.
  • --search: A search term to filter by.
  • --name: Filter by exact judge name.
  • --ordering: Which field to use for ordering the results.
  • --is-preset / --not-is-preset: Filter by preset status.
  • --is-public / --not-is-public: Filter by public status.
  • --show-global / --not-show-global: Filter by global status.

get

Retrieve a specific Judge by its ID.

scorable judge get <judge_id>

create

Create a new Judge.

scorable judge create --name "My New Judge" --intent "To evaluate the quality of LLM responses."

Options:

  • --name: The name for the new judge (required).
  • --intent: The intent for the new judge (required).
  • --stage: The stage for the new judge.
  • --evaluator-references: JSON string for evaluator references. E.g., '[{"id": "eval-id"}]'
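
Writing the JSON for --evaluator-references by hand is error-prone when there are several evaluators. Below is a minimal sketch that builds the string from a list of IDs. The helper name evaluator_refs_json is hypothetical, and it assumes the IDs contain no double quotes or backslashes.

```shell
# Hypothetical helper: turn evaluator IDs into the JSON shape shown for
# --evaluator-references. Assumes IDs contain no quotes or backslashes.
evaluator_refs_json() {
  out=""
  for id in "$@"; do
    if [ -z "$out" ]; then
      out="{\"id\": \"$id\"}"
    else
      out="$out, {\"id\": \"$id\"}"
    fi
  done
  printf '[%s]\n' "$out"
}
```

The result can be passed as --evaluator-references "$(evaluator_refs_json eval-id)". Calling the helper with no arguments prints [].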

update

Update an existing Judge.

scorable judge update <judge_id> --name "My Updated Judge Name"

Options:

  • --name: The new name for the judge.
  • --stage: The new stage for the judge.
  • --evaluator-references: JSON string to update evaluator references. Use "[]" to clear.

delete

Delete a Judge by its ID. You will be prompted for confirmation.

scorable judge delete <judge_id>

duplicate

Duplicate an existing Judge.

scorable judge duplicate <judge_id>

Judge Execution

execute

Execute a Judge with specific inputs.

scorable judge execute <judge_id> --request "What is the capital of France?" --response "Paris"

Options:

  • --request: Request text.
  • --response: Response text to evaluate.
  • --contexts: JSON list of context strings. E.g., '["Retrieved document from a knowledge base"]'
  • --expected-output: Expected output text.
  • --tag: Add one or more tags.
  • --user-id: User identifier for tracking purposes.
  • --session-id: Session identifier for tracking purposes.
  • --system-prompt: System prompt that was used for the LLM call.
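
Because --contexts takes a JSON list, context strings that contain double quotes need escaping. The sketch below builds the list from plain shell arguments. The helper name contexts_json is hypothetical, and it escapes only backslashes and double quotes; strings with newlines or other control characters need a real JSON encoder.

```shell
# Hypothetical helper: build a JSON array of strings for --contexts.
# Escapes backslashes and double quotes only; newlines and other control
# characters are not handled.
contexts_json() {
  out=""
  for item in "$@"; do
    escaped=$(printf '%s' "$item" | sed 's/\\/\\\\/g; s/"/\\"/g')
    if [ -z "$out" ]; then
      out="\"$escaped\""
    else
      out="$out, \"$escaped\""
    fi
  done
  printf '[%s]\n' "$out"
}
```

The result can be passed as --contexts "$(contexts_json "first document" "second document")".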

Using stdin input:

You can pipe input directly to the --response parameter:

echo "Paris" | scorable judge execute <judge_id> --request "What is the capital of France?"
cat response.txt | scorable judge execute <judge_id>

With tracking parameters:

scorable judge execute <judge_id> \
  --response "Paris" \
  --user-id "user-123" \
  --session-id "session-456" \
  --system-prompt "You are a helpful assistant."

execute-by-name

Execute a Judge by its name.

scorable judge execute-by-name "My New Judge" --request "What is the capital of France?" --response "Paris"

Options:

  • --request: Request text.
  • --response: Response text to evaluate.
  • --contexts: JSON list of context strings. E.g., '["ctx1"]'
  • --expected-output: Expected output text.
  • --tag: Add one or more tags.
  • --user-id: User identifier for tracking purposes.
  • --session-id: Session identifier for tracking purposes.
  • --system-prompt: System prompt that was used for the LLM call.

Input can also be piped in the same way as with execute.

Prompt testing

Initialize a prompt testing experiment config and run it.

scorable prompt-test init
scorable prompt-test run

Development

This project uses uv for dependency management. To set up the development environment, run:

# Create the virtual environment first if .venv does not exist yet
uv venv
. .venv/bin/activate
uv pip sync pyproject.toml

Download files

Source Distribution

scorable_cli-0.1.6.tar.gz (18.9 kB)

Uploaded Source

Built Distribution

scorable_cli-0.1.6-py3-none-any.whl (12.7 kB)

Uploaded Python 3

File details

Details for the file scorable_cli-0.1.6.tar.gz.

File metadata

  • Download URL: scorable_cli-0.1.6.tar.gz
  • Upload date:
  • Size: 18.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.2

File hashes

Hashes for scorable_cli-0.1.6.tar.gz
Algorithm Hash digest
SHA256 82627d1f6561f77120ed46ac5f79e61813baabb1f06705617cd70fe566d4ab71
MD5 418c35e3b291f5882e8dfdf2707f6e74
BLAKE2b-256 2cf1df619439dc4e4f4a541c2b76dc33e64d25ec96f05486d763ebfb297890b0
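
A downloaded file can be checked against the SHA256 digest listed here. The following is a minimal sketch; the function name verify_sha256 is hypothetical, and it assumes either sha256sum or shasum is available on the system.

```shell
# Hypothetical helper: compare a file's SHA256 digest with an expected value.
# Uses sha256sum when available, otherwise falls back to shasum -a 256.
verify_sha256() {
  file="$1"
  expected="$2"
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
  else
    actual=$(shasum -a 256 "$file" | cut -d ' ' -f 1)
  fi
  [ "$actual" = "$expected" ]
}
```

The function returns success only when the digests match, so it can gate an install step: verify_sha256 scorable_cli-0.1.6.tar.gz "<SHA256 digest>" && echo "digest matches".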

File details

Details for the file scorable_cli-0.1.6-py3-none-any.whl.

File hashes

Hashes for scorable_cli-0.1.6-py3-none-any.whl
Algorithm Hash digest
SHA256 3c36fa7e663465cfab6ca3d7195825cec119c32331ccc237ed190282761bde51
MD5 469b3d5a815e2b4629ef3137863c9873
BLAKE2b-256 78677705b8a986ea6b805c43404bec24f8126a3949ed2a02ead055bfaa79cc22
