RAKAM SYSTEMS CLI

Rakam Eval CLI

A CLI for running LLM evaluations and tracking quality over time.

Quick Start

A typical workflow is:

1. Write an eval function: edit `eval/my_eval.py`
2. Run the evaluation: `rakam eval run`
3. View the results: `rakam eval show`
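The run and show steps can also be driven from a script. The following is a minimal sketch, assuming the `rakam` executable is on your PATH; the helper name `run_workflow` and the `WORKFLOW` constant are illustrative and not part of the package.

```python
import shutil
import subprocess

# The two CLI invocations from the workflow above, as argument lists.
WORKFLOW = [
    ["rakam", "eval", "run"],
    ["rakam", "eval", "show"],
]


def run_workflow(commands=None):
    """Run each command in order and return the ones that were executed.

    Returns an empty list when the executable is not installed.
    Raises subprocess.CalledProcessError if a command exits non-zero.
    """
    commands = WORKFLOW if commands is None else commands
    if not commands or shutil.which(commands[0][0]) is None:
        # CLI not found on PATH: nothing is executed.
        return []
    executed = []
    for argv in commands:
        subprocess.run(argv, check=True)
        executed.append(argv)
    return executed
```

Calling `run_workflow()` from the project root runs the evaluations and then prints the most recent run, stopping with an exception if either command fails.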

Installation

pip install rakam-systems-cli

Writing Evaluations

Create an eval/ directory in your project. Each evaluation function must:

- Be decorated with @eval_run
- Return an EvalConfig object

# eval/examples.py
from rakam_systems_cli.decorators import eval_run
from rakam_systems_tools.evaluation.schema import (
    EvalConfig,
    TextInputItem,
    ClientSideMetricConfig,
    ToxicityConfig,
)

@eval_run
def test_simple_text_eval():
    """A simple text evaluation showcasing a basic client-side metric."""
    return EvalConfig(
        component="text_component_1",
        label="demo_simple_text",
        data=[
            TextInputItem(
                id="txt_001",
                input="Hello world",
                output="Hello world",
                expected_output="Hello world",
                metrics=[ClientSideMetricConfig(name="relevance", score=1)],
            )
        ],
        metrics=[ToxicityConfig(name="toxicity_demo", include_reason=False)],
    )

User Guide

Listing evaluations

rakam eval list evals

This lists every function decorated with @eval_run in the eval/ directory (the default). Pass a different DIRECTORY to scan another location, or -r to search recursively.

Listing runs

List all runs stored on the evaluation server, newest first:

rakam eval list runs

Comparing runs

Compare two runs to see what changed:

# Compare by IDs
rakam eval compare --id 42 --id 45

# Save comparison to file (-o is not listed in the command reference; confirm with `rakam eval compare --help`)
rakam eval compare --id 42 --id 45 -o comparison.json
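For intuition about what a unified diff between two runs conveys, the sketch below diffs two JSON payloads with the standard library. The payloads and the `unified_run_diff` helper are made up for illustration; the real run schema and the CLI's own diff output may look different.

```python
import difflib
import json


def unified_run_diff(run_a, run_b, name_a="run_a", name_b="run_b"):
    """Return a unified diff (list of lines) between two run payloads."""
    # Sorted keys and fixed indentation keep the diff stable across runs.
    text_a = json.dumps(run_a, indent=2, sort_keys=True).splitlines()
    text_b = json.dumps(run_b, indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(text_a, text_b, name_a, name_b, lineterm=""))


# Hypothetical payloads: only the score differs between the two runs.
old = {"label": "demo_simple_text", "score": 0.8}
new = {"label": "demo_simple_text", "score": 0.9}
diff = unified_run_diff(old, new, "run 42", "run 45")
print("\n".join(diff))
```

Lines prefixed with `-` come from the first run and lines prefixed with `+` from the second; identical payloads produce an empty diff.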

Command Reference

`rakam eval list evals`

Usage: rakam eval list evals [OPTIONS] [DIRECTORY]

 List evaluations (functions decorated with @eval_run).

╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│   directory      [DIRECTORY]  Directory to scan (default: ./eval)            │
│                               [default: eval]                                │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --recursive  -r        Recursively search for Python files                   │
│ --help                 Show this message and exit.                           │
╰──────────────────────────────────────────────────────────────────────────────╯

`rakam eval list runs`

Usage: rakam eval list runs [OPTIONS]

 List runs (newest first).

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --limit   -l      INTEGER  Max number of runs [default: 20]                  │
│ --offset          INTEGER  Pagination offset [default: 0]                    │
│ --help                     Show this message and exit.                       │
╰──────────────────────────────────────────────────────────────────────────────╯
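When there are more runs than fit in one listing, `--limit` and `--offset` can be combined to page through them. The sketch below builds one command per page; it assumes `--offset` counts runs rather than pages, and the helper name `list_runs_commands` is illustrative.

```python
def list_runs_commands(total, limit=20):
    """Build one `rakam eval list runs` argument list per page.

    `total` is the number of runs you expect to page through.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [
        ["rakam", "eval", "list", "runs",
         "--limit", str(limit), "--offset", str(offset)]
        for offset in range(0, total, limit)
    ]


# 45 runs at the default page size of 20 need three pages.
for argv in list_runs_commands(45):
    print(" ".join(argv))
```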

`rakam eval run`

Usage: rakam eval run [OPTIONS] [DIRECTORY]

 Execute evaluations (functions decorated with @eval_run).

╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│   directory      [DIRECTORY]  Directory to scan (default: ./eval)            │
│                               [default: eval]                                │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --recursive   -r            Recursively search for Python files              │
│ --dry-run                   Only list functions without executing them       │
│ --save-runs                 Save each run result to a JSON file              │
│ --output-dir          PATH  Directory where run results are saved            │
│                             [default: eval_runs]                             │
│ --help                      Show this message and exit.                      │
╰──────────────────────────────────────────────────────────────────────────────╯
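After a run with `--save-runs`, the saved results can be read back for further analysis. This is a minimal sketch that assumes the files carry a `.json` extension and sit directly in the output directory; the file naming and the JSON schema are not documented here, so the helper `load_saved_runs` makes no assumption about either.

```python
import json
from pathlib import Path


def load_saved_runs(output_dir="eval_runs"):
    """Map each JSON file name in output_dir to its parsed content."""
    runs = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            runs[path.name] = json.load(fh)
    return runs
```

A missing or empty directory simply yields an empty dictionary.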

`rakam eval show`

Usage: rakam eval show [OPTIONS]

 Show a run by ID or tag. Without arguments, shows the most recent run.

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --id    -i      INTEGER  Run ID                                              │
│ --tag   -t      TEXT     Run tag                                             │
│ --raw                    Print raw JSON instead of formatted output          │
│ --help                   Show this message and exit.                         │
╰──────────────────────────────────────────────────────────────────────────────╯

`rakam eval compare`

Usage: rakam eval compare [OPTIONS]

 Compare two evaluation runs.

 Default: unified git diff

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --tag           -t      TEXT     Run tag                                     │
│ --id            -i      INTEGER  Run ID                                      │
│ --summary                        Show summary diff only                      │
│ --side-by-side                   Show side-by-side diff (git)                │
│ --help                           Show this message and exit.                 │
╰──────────────────────────────────────────────────────────────────────────────╯

`rakam eval tag`

Usage: rakam eval tag [OPTIONS]

 Assign a tag to a run or delete a tag.

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --id      -i      INTEGER  Run ID                                            │
│ --tag     -t      TEXT     Tag to assign to the run                          │
│ --delete          TEXT     Delete a tag                                      │
│ --help                     Show this message and exit.                       │
╰──────────────────────────────────────────────────────────────────────────────╯
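The options above suggest two modes: assigning a tag to a run, or deleting a tag. The sketch below builds the argument list for each. It assumes that assigning needs both `--id` and `--tag` and that `--delete` is used on its own, which the help text does not state explicitly; `tag_command` is a hypothetical helper.

```python
def tag_command(run_id=None, tag=None, delete=None):
    """Build a `rakam eval tag` argument list for assign or delete."""
    base = ["rakam", "eval", "tag"]
    if delete is not None:
        # Assumed: deletion does not combine with --id or --tag.
        if run_id is not None or tag is not None:
            raise ValueError("use delete on its own")
        return base + ["--delete", delete]
    if run_id is None or tag is None:
        raise ValueError("assigning a tag needs both run_id and tag")
    return base + ["--id", str(run_id), "--tag", tag]


print(" ".join(tag_command(run_id=42, tag="baseline")))
print(" ".join(tag_command(delete="baseline")))
```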

`rakam eval metrics list`

Usage: rakam eval metrics list [OPTIONS] [DIRECTORY]

 List all metric types used by loaded eval configs.

╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│   directory      [DIRECTORY]  Directory to scan (default: ./eval)            │
│                               [default: eval]                                │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --recursive  -r        Recursively search for Python files                   │
│ --help                 Show this message and exit.                           │
╰──────────────────────────────────────────────────────────────────────────────╯

Documentation

For the full user guide, see the official documentation.

License

See the main project's LICENSE file.
