RAKAM SYSTEMS CLI
Rakam Eval CLI
A CLI for running LLM evaluations and tracking quality over time.
Quick Start
A typical workflow is:
- Write an eval function:
    edit eval/my_eval.py
- Run the evaluation:
    rakam eval run
- View the results:
    rakam eval show
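The run and view steps above can also be scripted, for example in a CI job. The following is a minimal sketch, assuming the `rakam` executable is on the PATH after installation. The helper names `workflow_commands` and `run_workflow` are hypothetical and not part of the package; only the `rakam eval run` and `rakam eval show` commands come from this README.

```python
import shutil
import subprocess


def workflow_commands(directory="eval"):
    """Return the argv lists for the run-then-show workflow.

    `directory` is passed as the optional DIRECTORY argument of `rakam eval run`.
    """
    return [
        ["rakam", "eval", "run", directory],
        ["rakam", "eval", "show"],
    ]


def run_workflow(directory="eval"):
    """Run each command in order, stopping at the first failure."""
    if shutil.which("rakam") is None:
        raise RuntimeError("rakam executable not found; install rakam-systems-cli first")
    for argv in workflow_commands(directory):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(argv, check=True)
```

Calling `run_workflow()` from the project root would run the evaluations in `eval/` and then display the most recent run.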
Installation
pip install rakam-systems-cli
Writing Evaluations
Create an eval/ directory in your project. Each evaluation function must:
- Be decorated with @eval_run
- Return an EvalConfig object
# eval/examples.py
from rakam_systems_cli.decorators import eval_run
from rakam_systems_tools.evaluation.schema import (
    EvalConfig,
    TextInputItem,
    ClientSideMetricConfig,
    ToxicityConfig,
)


@eval_run
def test_simple_text_eval():
    """A simple text evaluation showcasing a basic client-side metric."""
    return EvalConfig(
        component="text_component_1",
        label="demo_simple_text",
        data=[
            TextInputItem(
                id="txt_001",
                input="Hello world",
                output="Hello world",
                expected_output="Hello world",
                metrics=[ClientSideMetricConfig(name="relevance", score=1)],
            )
        ],
        metrics=[ToxicityConfig(name="toxicity_demo", include_reason=False)],
    )
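The two requirements above (a decorated function that returns a config object) can be illustrated with a self-contained sketch. It does not use or reproduce the real `eval_run` decorator or `EvalConfig` class: `mark_eval`, `SketchConfig`, and `call_and_check` are hypothetical stand-ins, and the marker-attribute approach is only an assumption about one way such a decorator could work.

```python
from dataclasses import dataclass, field


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a config object; not the real EvalConfig."""
    component: str
    label: str
    data: list = field(default_factory=list)


def mark_eval(func):
    """Hypothetical stand-in for a marker decorator such as @eval_run."""
    func.is_eval = True  # flag the function so a runner can recognise it
    return func


def call_and_check(func):
    """Call a marked function and verify that it returns a SketchConfig."""
    if not getattr(func, "is_eval", False):
        raise ValueError(f"{func.__name__} is not marked as an evaluation")
    result = func()
    if not isinstance(result, SketchConfig):
        raise TypeError(f"{func.__name__} must return a SketchConfig")
    return result


@mark_eval
def demo_eval():
    return SketchConfig(component="text_component_1", label="demo_simple_text")
```

Here `call_and_check(demo_eval)` returns the config, while an undecorated function or one returning something else is rejected with an explicit error.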
User Guide
Listing evaluations
rakam eval list evals
This shows all functions decorated with @eval_run in the eval/ directory.
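To make the idea of scanning a directory for decorated functions concrete, here is a rough sketch based on Python's `ast` module. It is not the CLI's actual implementation, which this README does not describe; `find_eval_functions` is a hypothetical helper, and matching on the decorator name `eval_run` is an assumption.

```python
import ast
from pathlib import Path


def find_eval_functions(directory="eval", recursive=False):
    """Return (file name, function name) pairs for functions decorated with eval_run."""
    root = Path(directory)
    pattern = "**/*.py" if recursive else "*.py"
    found = []
    for path in sorted(root.glob(pattern)):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            for deco in node.decorator_list:
                # Handle both @eval_run and @eval_run(...).
                target = deco.func if isinstance(deco, ast.Call) else deco
                # ast.Name carries .id, ast.Attribute (e.g. module.eval_run) carries .attr.
                name = getattr(target, "id", None) or getattr(target, "attr", None)
                if name == "eval_run":
                    found.append((path.name, node.name))
                    break
    return found
```

Because the files are parsed rather than imported, this kind of scan does not execute any evaluation code.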
Listing runs
rakam eval list runs
This shows all runs hosted on the evaluation server.
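When the server holds more runs than fit on one page, the `--limit` and `--offset` options of `rakam eval list runs` (see the command reference) can be combined to page through them. The sketch below only builds the command lines; `list_runs_commands` is a hypothetical helper, and it assumes that `--offset` counts runs to skip rather than pages.

```python
def list_runs_commands(total, page_size=20):
    """Build one `rakam eval list runs` argv per page needed to cover `total` runs."""
    if total < 0 or page_size <= 0:
        raise ValueError("total must be >= 0 and page_size must be > 0")
    return [
        ["rakam", "eval", "list", "runs",
         "--limit", str(page_size), "--offset", str(offset)]
        for offset in range(0, total, page_size)
    ]


# Print the commands needed to cover 45 runs in pages of 20.
for argv in list_runs_commands(45, page_size=20):
    print(" ".join(argv))
```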
Comparing runs
Compare two runs to see what changed:
# Compare by IDs
rakam eval compare --id 42 --id 45
# Save comparison to file
rakam eval compare --id 42 --id 45 -o comparison.json
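To give a feel for what a unified diff between two runs looks like, here is a small sketch using Python's standard `difflib`. It is not how `rakam eval compare` is implemented; `unified_run_diff` is a hypothetical helper, and the two run payloads are made-up examples, since the real run schema is not described in this README.

```python
import difflib
import json


def unified_run_diff(run_a, run_b, name_a="run_a", name_b="run_b"):
    """Return a unified diff of two JSON-serialisable run payloads as one string."""
    # Sorted keys and indentation give a stable, line-oriented rendering.
    lines_a = json.dumps(run_a, indent=2, sort_keys=True).splitlines()
    lines_b = json.dumps(run_b, indent=2, sort_keys=True).splitlines()
    diff = difflib.unified_diff(
        lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm=""
    )
    return "\n".join(diff)


# Hypothetical payloads; the real run schema may differ.
old = {"label": "demo_simple_text", "scores": {"toxicity_demo": 0.10}}
new = {"label": "demo_simple_text", "scores": {"toxicity_demo": 0.02}}
print(unified_run_diff(old, new, "run 42", "run 45"))
```

Identical payloads produce an empty string, which makes the helper easy to use as a "did anything change" check.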
Command Reference
rakam eval list evals
Usage: rakam eval list evals [OPTIONS] [DIRECTORY]
List evaluations (functions decorated with @eval_run).
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ directory [DIRECTORY] Directory to scan (default: ./eval) │
│ [default: eval] │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --recursive -r Recursively search for Python files │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
rakam eval list runs
Usage: rakam eval list runs [OPTIONS]
List runs (newest first).
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --limit -l INTEGER Max number of runs [default: 20] │
│ --offset INTEGER Pagination offset [default: 0] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
rakam eval run
Usage: rakam eval run [OPTIONS] [DIRECTORY]
Execute evaluations (functions decorated with @eval_run).
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ directory [DIRECTORY] Directory to scan (default: ./eval) │
│ [default: eval] │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --recursive -r Recursively search for Python files │
│ --dry-run Only list functions without executing them │
│ --save-runs Save each run result to a JSON file │
│ --output-dir PATH Directory where run results are saved │
│ [default: eval_runs] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
rakam eval show
Usage: rakam eval show [OPTIONS]
Show a run by ID or tag. Without arguments, shows the most recent run.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --id -i INTEGER Run ID │
│ --tag -t TEXT Run tag │
│ --raw Print raw JSON instead of formatted output │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
rakam eval compare
Usage: rakam eval compare [OPTIONS]
Compare two evaluation runs.
Default: unified git diff
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --tag -t TEXT Run tag │
│ --id -i INTEGER Run ID │
│ --summary Show summary diff only │
│ --side-by-side Show side-by-side diff (git) │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
rakam eval tag
Usage: rakam eval tag [OPTIONS]
Assign a tag to a run or delete a tag.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --id -i INTEGER Run ID │
│ --tag -t TEXT Tag to assign to the run │
│ --delete TEXT Delete a tag │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
rakam eval metrics list
Usage: rakam eval metrics list [OPTIONS] [DIRECTORY]
List all metric types used by loaded eval configs.
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ directory [DIRECTORY] Directory to scan (default: ./eval) │
│ [default: eval] │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --recursive -r Recursively search for Python files │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
Documentation
For the full user guide, see the official documentation.
License
See main project LICENSE file.