CLI for running and managing Connexity agent evaluations

connexity-cli

Command-line client for Connexity — drive eval runs, manage agents and test cases, and gate CI on regressions, all from the terminal.

connexity-cli is a thin wrapper over the Connexity REST API. It covers the public surface used to drive eval workflows from CI: auth, agents, eval configs, test cases, runs (with SSE streaming), custom metrics, prompt editor, integrations, environments (including deploy + deployment history), calls, config, and health. Account self-service (signup, password reset) stays in the web UI.

Installation

pip install connexity-cli

The wheel pulls in only click, httpx, and httpx-sse — no FastAPI, no SQLModel, no LLM SDKs.

Authentication

The CLI authenticates against a Connexity API server using a Bearer JWT.

Source	When used
`--token` / `--api-url` flags	Highest precedence — explicit per-invocation
`CONNEXITY_CLI_API_TOKEN` / `CONNEXITY_CLI_API_URL` env vars	Typical CI usage
`~/.config/connexity-cli/credentials.json` (mode `0600`)	Written by `connexity-cli login --save`

# One-time interactive login (writes credentials file)
connexity-cli login --email me@example.com --save

# Or set env vars in CI
export CONNEXITY_CLI_API_URL=https://evals.example.com
export CONNEXITY_CLI_API_TOKEN="$CI_EVALS_TOKEN"
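
The lookup order in the table above can be sketched in Python as follows. This is an illustration only, not the CLI's actual code: it assumes the table rows are listed in descending precedence, and the JSON key names `token` and `api_url` inside the credentials file are hypothetical.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple

# Path taken from the table above; the JSON key names used below are assumptions.
DEFAULT_CREDENTIALS = Path.home() / ".config" / "connexity-cli" / "credentials.json"


def resolve_credentials(
    flag_token: Optional[str] = None,
    flag_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Path = DEFAULT_CREDENTIALS,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (token, api_url), preferring flags, then env vars, then the file."""
    env = os.environ if env is None else env

    saved = {}
    if credentials_path.is_file():
        # Assumed layout: {"token": "...", "api_url": "..."}
        saved = json.loads(credentials_path.read_text())

    token = flag_token or env.get("CONNEXITY_CLI_API_TOKEN") or saved.get("token")
    url = flag_url or env.get("CONNEXITY_CLI_API_URL") or saved.get("api_url")
    return token, url
```

Each value is resolved independently, so a token passed as a flag can be combined with an API URL from the environment or the saved file.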

Quick start

# Inspect resources
connexity-cli agents list
connexity-cli eval-configs list
connexity-cli test-cases list --tag smoke

# Author resources from JSON files (use "-" to read stdin)
connexity-cli agents create --from-file ./agent.json
connexity-cli eval-configs members replace <eval-config-id> --from-file ./members.json

# End-to-end: trigger a run, wait for completion, mark as baseline
connexity-cli run \
  --agent my-agent \
  --eval-config smoke-suite \
  --stream \
  --set-baseline

# CI gate: trigger a run AND fail if it doesn't clear the eval-config thresholds
# (exits 1 when metrics_passed=false or cases_passed=false; --no-fail-on-thresholds opts out)
connexity-cli run \
  --agent my-agent \
  --eval-config smoke-suite \
  --metrics-pass-threshold 80 \
  --cases-pass-threshold 100

# CI gate: regression check against the baseline (exits 1 on regression
# OR when the candidate fails its own CS-127 thresholds)
connexity-cli compare --candidate <run-id> --against-baseline

# Deploy a pre-validated agent version to Retell via a configured environment
# (eval-gated environments reject the deploy when thresholds fail)
connexity-cli environments deploy <env-id> --agent-version 7

# Stream agent execution events live
connexity-cli runs stream <run-id>

# AI-assisted prompt editing — SSE events go to stderr, final assistant
# message + edited_prompt to stdout (drops to non-streaming when piping)
connexity-cli prompt-editor chat <session-id> --message "tighten the refusal prose"

# JSON output for piping into jq
connexity-cli --output json agents list | jq '.data[].name'

Authoring patterns

Every command that creates or updates a resource takes a single --from-file PATH (or --from-file - for stdin) whose JSON body matches the backend Pydantic schema (e.g. AgentCreate, RunCreate, EvalConfigCreate, CustomMetricCreate). The CLI does not duplicate these schemas: the server validates the body and returns clear errors.

# Create an agent from a JSON body piped on stdin
echo '{"name": "support-bot", "endpoint_url": "https://my-agent.example/api"}' \
  | connexity-cli agents create --from-file -

# Patch an eval config
connexity-cli eval-configs update smoke-suite --from-file ./patch.json

# Run with a full RunConfig (judge_config, simulator_config, metrics_selection, ...)
connexity-cli runs create --from-file ./run.json --auto-execute
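
When the JSON body is built by a script rather than kept in a file, the stdin form can be driven from Python. The sketch below is one way to do it, assuming connexity-cli is on PATH. The `create_agent` helper and its injectable `run` parameter are hypothetical conveniences, and the two body fields are the ones used in the echo example above.

```python
import json
import subprocess
from typing import Callable, Sequence


def create_agent(
    name: str,
    endpoint_url: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> subprocess.CompletedProcess:
    """Pipe a JSON body into `connexity-cli agents create --from-file -`."""
    body = json.dumps({"name": name, "endpoint_url": endpoint_url})
    argv: Sequence[str] = ["connexity-cli", "agents", "create", "--from-file", "-"]
    # check=False: the caller inspects returncode instead of catching exceptions.
    return run(argv, input=body, text=True, capture_output=True, check=False)
```

Because the server performs the validation, a schema error shows up in the returned process's exit code and output rather than as a Python exception.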

Pass/fail thresholds (CS-127)

Every run carries two run-level pass/fail dimensions, snapshotted from the eval config and overridable per run:

Threshold	Meaning	Default
`metrics_pass_threshold`	Weighted average of the judge `overall_score` across cases that produced a verdict (0-100)	80
`cases_pass_threshold`	Percentage of passing cases out of total executions, with errored cases counted as not passed (0-100)	100

connexity-cli run and connexity-cli compare gate their exit code on these by default. Override per invocation:

connexity-cli run \
  --agent my-agent \
  --eval-config smoke-suite \
  --metrics-pass-threshold 75 \
  --cases-pass-threshold 95

Pass --no-fail-on-thresholds to print the verdict but exit 0 regardless. Full formula and rationale: docs/scoring-and-thresholds.md.
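
To make the two dimensions concrete, here is a rough Python sketch of how they could be computed from per-case results. It is not the formula from docs/scoring-and-thresholds.md: the `CaseResult` type, the per-case `weight` field, and the "greater than or equal" comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class CaseResult:
    passed: bool                    # errored cases are recorded as passed=False
    overall_score: Optional[float]  # None when the judge produced no verdict
    weight: float = 1.0             # hypothetical per-case weight


def metrics_score(results: Sequence[CaseResult]) -> Optional[float]:
    """Weighted average of overall_score over cases that have a verdict."""
    judged = [r for r in results if r.overall_score is not None]
    total_weight = sum(r.weight for r in judged)
    if total_weight == 0:
        return None
    return sum(r.overall_score * r.weight for r in judged) / total_weight


def cases_pass_rate(results: Sequence[CaseResult]) -> float:
    """Percentage of executions that passed (0-100)."""
    if not results:
        return 0.0
    return 100.0 * sum(r.passed for r in results) / len(results)


def verdict(
    results: Sequence[CaseResult],
    metrics_pass_threshold: float = 80.0,
    cases_pass_threshold: float = 100.0,
) -> Tuple[bool, bool]:
    """Return (metrics_passed, cases_passed) for one run."""
    score = metrics_score(results)
    # Assumption: a dimension passes when its value is >= the threshold.
    metrics_passed = score is not None and score >= metrics_pass_threshold
    cases_passed = cases_pass_rate(results) >= cases_pass_threshold
    return metrics_passed, cases_passed
```

Under these assumptions, a run with one errored case out of three can clear the metrics threshold and still fail the default cases threshold of 100.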

Output formats

Two formats are supported. Select one with --output, either on an individual command or on the root group to apply it globally:

table (default) — human-readable tables with auto-detected column widths
json — pretty-printed JSON, friendly to jq / gron / scripting
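
For scripts that prefer Python over jq, the JSON format can be consumed with the standard library. The sketch below mirrors the `.data[].name` filter from the quick start. The `agent_names` helper and the sample payload are illustrative, and only the `data[].name` shape is taken from that example.

```python
import json
from typing import List


def agent_names(json_text: str) -> List[str]:
    """Extract agent names from `connexity-cli --output json agents list`."""
    payload = json.loads(json_text)
    # Mirrors the jq filter '.data[].name' from the quick start.
    return [item["name"] for item in payload["data"]]


# Illustrative payload; real responses likely carry more fields per agent.
sample = '{"data": [{"name": "support-bot"}, {"name": "billing-bot"}]}'
print(agent_names(sample))
```

In practice the input would be the captured stdout of the CLI rather than a literal string.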

Command tree

Each top-level group mirrors a backend router:

Group	Purpose
`login` / `logout` / `whoami`	Auth & session
`agents`	CRUD, draft/publish/rollback, versions, version diff, guidelines
`eval-configs`	CRUD, member (test-case) management
`test-cases`	CRUD, bulk import/export, generate, AI editor
`test-case-results`	Per-test-case run result CRUD
`runs`	CRUD, execute, cancel, stream (SSE), baselines, compare, suggestions
`custom-metrics`	CRUD plus LLM-backed metric preview generation
`prompt-editor`	Sessions, messages, presets, streaming chat
`integrations`	Third-party providers (Retell), connection test, list provider-side agents
`environments`	Bindings + `deploy`, `retell-versions`, `deployments list` (history)
`calls`	Observed external calls (Retell), refresh / mark-seen
`config`	Read-only API metadata, available metrics, LLM models
`health`	Server health probe
`run` / `compare` / `baseline`	Top-level convenience wrappers for common one-shot CI workflows

Run connexity-cli <group> --help (or connexity-cli <group> <subcommand> --help) to see flags and arguments.

Subcommand reference (selected)

Not exhaustive — run --help for the full set. These are the commands you'll reach for in CI and day-to-day work:

agents      list | show <ref> | create | update <id> | delete <id>
            versions list <ref> | versions show <ref> <n> | versions diff <ref> <a> <b>
            draft get <ref> | draft set <ref> | draft discard <ref>
            publish <ref> | rollback <ref> --to-version <n>
            guidelines get <ref> | guidelines update <ref>
runs        list | show <id> | create | update <id> | delete <id>
            execute <id> | cancel <id> | stream <id>
            baseline get --agent <ref> --eval-config <ref> | baseline set <id>
            compare --baseline <id> --candidate <id> [--include-analysis] [--fail-on-thresholds]
            compare-suggestions --baseline <id> --candidate <id>
environments list --agent <ref> | create | delete <id>
            deploy <env-id> --agent-version <n>
            retell-versions <env-id>
            deployments list (--agent <ref> | --env-id <id>)
prompt-editor sessions list | sessions show <id> | sessions create | sessions delete <id>
            messages list <session-id> | chat <session-id> --message "..."
            presets list
custom-metrics list | show <id> | create | update <id> | delete <id> | preview | generate
test-cases  list | show <id> | create | update <id> | delete <id>
            import <file> [--overwrite] | export | generate | ai create

# Top-level convenience wrappers for the most common CI flows:
run         --agent <ref> --eval-config <ref> [--metrics-pass-threshold N] [--cases-pass-threshold N] [--stream] [--set-baseline]
compare     --candidate <id> (--baseline <id> | --against-baseline)
baseline    get | set <id>

Exit codes

0 — success
1 — operation completed but indicates failure: run failed / cancelled, regression detected, candidate failed its CS-127 thresholds (default-on, opt out with --no-fail-on-thresholds), deploy returned status=failed, or import returned errors
2 — argument / configuration error, timeout, network failure
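
A CI wrapper can translate these codes into a readable log line before propagating them. The following Python sketch shows one possible shape. The `describe_exit` and `gate` helpers are hypothetical, and the descriptions are paraphrased from the list above.

```python
import subprocess
import sys
from typing import Sequence

# Meanings paraphrased from the exit code list above.
EXIT_MEANINGS = {
    0: "success",
    1: "completed but failed (failed run, regression, thresholds, deploy, import)",
    2: "argument/configuration error, timeout, or network failure",
}


def describe_exit(code: int) -> str:
    """Map an exit code to a short human-readable description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def gate(argv: Sequence[str]) -> int:
    """Run a command, log what its exit code means, and return the code."""
    code = subprocess.run(list(argv), check=False).returncode
    print(f"{argv[0]} exited {code}: {describe_exit(code)}", file=sys.stderr)
    return code
```

A pipeline step could then end with `sys.exit(gate([...]))` so the job status still follows the CLI's own exit code.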

License

MIT — see LICENSE.

Project details

Maintainers

connexity

Release history

Version	Released
0.2.0 (this version)	May 6, 2026
0.1.1	May 6, 2026
0.1.0	Apr 30, 2026

Download files

Source Distribution

connexity_cli-0.2.0.tar.gz (34.0 kB)

Uploaded May 6, 2026 Source

Built Distribution

connexity_cli-0.2.0-py3-none-any.whl (52.6 kB)

Uploaded May 6, 2026 Python 3

File details

Details for the file connexity_cli-0.2.0.tar.gz.

File metadata

Download URL: connexity_cli-0.2.0.tar.gz
Upload date: May 6, 2026
Size: 34.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for connexity_cli-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`3f374d1abb55dd210dc9cbca3d95ddde5abedd29a52c457e9c9fd70a3938022c`
MD5	`d17750321631f0749b433da65b6fb13b`
BLAKE2b-256	`ff04322c40516cabac46cd5d5d9fb6c140550d635427045d060e47bfceb4731f`

Provenance

The following attestation bundles were made for connexity_cli-0.2.0.tar.gz:

Publisher: publish-pypi.yml on Connexity-AI/connexity

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: connexity_cli-0.2.0.tar.gz
- Subject digest: 3f374d1abb55dd210dc9cbca3d95ddde5abedd29a52c457e9c9fd70a3938022c
- Sigstore transparency entry: 1449728786
- Sigstore integration time: May 6, 2026
Source repository:
- Permalink: Connexity-AI/connexity@1314d246a0a13306e20d4922282832b5212b33ee
- Branch / Tag: refs/tags/cli-v0.2.0
- Owner: https://github.com/Connexity-AI
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@1314d246a0a13306e20d4922282832b5212b33ee
- Trigger Event: push

File details

Details for the file connexity_cli-0.2.0-py3-none-any.whl.

File metadata

Download URL: connexity_cli-0.2.0-py3-none-any.whl
Upload date: May 6, 2026
Size: 52.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for connexity_cli-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5a0829c9d49cbd3453ff9783d109c6f7fd1aeb5eff4959fc6c57078e7921d2f1`
MD5	`680b5288112060d7812c0cf873522c6b`
BLAKE2b-256	`9969c2c171632408ee8a5785d07b6e2800ca626981815d287f5330a28a1b4699`

Provenance

The following attestation bundles were made for connexity_cli-0.2.0-py3-none-any.whl:

Publisher: publish-pypi.yml on Connexity-AI/connexity

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: connexity_cli-0.2.0-py3-none-any.whl
- Subject digest: 5a0829c9d49cbd3453ff9783d109c6f7fd1aeb5eff4959fc6c57078e7921d2f1
- Sigstore transparency entry: 1449728800
- Sigstore integration time: May 6, 2026
Source repository:
- Permalink: Connexity-AI/connexity@1314d246a0a13306e20d4922282832b5212b33ee
- Branch / Tag: refs/tags/cli-v0.2.0
- Owner: https://github.com/Connexity-AI
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@1314d246a0a13306e20d4922282832b5212b33ee
- Trigger Event: push
