FluxLoop CLI — Agent evaluation framework

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Project description

FluxLoop CLI

Command-line interface for the FluxLoop agent evaluation framework.

Installation

pip install fluxloop-cli

Quick Start

# Authenticate
fluxloop auth login

# Create a project and scenario
fluxloop projects create --name my-project
fluxloop init scenario smoke-test
fluxloop scenarios create --name smoke-test --goal "Validate agent accuracy"

# Validate a skill, then test it
fluxloop skill validate ./SKILL.md
fluxloop skill test ./SKILL.md --input "Generate a summary" --copy-files src/

# Full test workflow (pull inputs → run → push results)
fluxloop test --scenario smoke-test

Commands

Core

Command	Description
`fluxloop run`	Run agent over configured inputs using the selected executor
`fluxloop skill validate`	Validate SKILL.md against static contracts
`fluxloop skill test`	Execute a skill in sandbox with behavior contract evaluation
`fluxloop skill benchmark`	Run N benchmark iterations and report stats
`fluxloop init scenario <name>`	Scaffold a new scenario directory
`fluxloop context show`	Display current project, scenario, and workspace state
`fluxloop test`	Full workflow: pull → run → push
`fluxloop test results`	View local or remote test results
`fluxloop evaluate`	Trigger server-side evaluation and wait for completion

Authentication & Projects

Command	Description
`fluxloop auth login`	Authenticate via device code flow
`fluxloop auth logout`	Remove stored credentials
`fluxloop auth status`	Show login state and token expiry
`fluxloop projects list`	List available projects
`fluxloop projects create`	Create a new project
`fluxloop projects select`	Set active project
`fluxloop apikeys create`	Generate an API key (saved to `.fluxloop/.env`)
`fluxloop apikeys list`	List existing API keys

Scenarios & Data Pipeline

Command	Description
`fluxloop scenarios create`	Create a scenario on the server
`fluxloop scenarios select`	Set active scenario locally
`fluxloop scenarios refine`	Refine scenario contracts
`fluxloop personas suggest`	Generate user personas via LLM
`fluxloop inputs synthesize`	Generate test inputs from personas
`fluxloop inputs list`	List available input sets
`fluxloop inputs qc`	Quality-check generated inputs
`fluxloop inputs refine`	Refine inputs iteratively
`fluxloop bundles publish`	Publish input sets as versioned bundles
`fluxloop bundles list`	List published bundles
`fluxloop manifests show`	Display current manifest
`fluxloop manifests publish`	Publish manifest to server
`fluxloop data push`	Upload knowledge or ground-truth data
`fluxloop data bind`	Bind uploaded data to a scenario
`fluxloop data gt status`	Check ground-truth materialization status
`fluxloop intent refine`	Refine agent profile and test intent

Sync

Command	Description
`fluxloop sync pull`	Pull bundle (inputs, personas, criteria) from server
`fluxloop sync push`	Upload test results to server

Configuration

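Each scenario lives in its own directory under scenarios/<name>/. YAML configuration files go in configs/, contract definitions in contracts/, and data fetched by sync pull in pulled/: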

scenarios/
  smoke-test/
    configs/
      simulation.yaml    # Runner, iterations, conversation settings
      input.yaml         # Input source and items
      scenario.yaml      # Scenario metadata
    contracts/
      static.yaml        # SKILL.md structure rules
      behavior.yaml      # Execution assertions
    pulled/              # Data from sync pull

Runner Types

Configure the executor in simulation.yaml:

Function — call a Python handler directly:

runner:
  type: function
  target: "my_agent:handler"
  timeout_seconds: 30
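
The target value above follows the common "module:attribute" convention. As a rough sketch of how such a string can be turned into a callable, the snippet below uses importlib. This illustrates the convention only; the function name resolve_target is hypothetical and nothing here is confirmed to match FluxLoop's own loader.

```python
import importlib
from typing import Any, Callable


def resolve_target(target: str) -> Callable[..., Any]:
    """Resolve a "module:attribute" string to a callable (illustrative only)."""
    module_name, sep, attr_path = target.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {target!r}")
    obj: Any = importlib.import_module(module_name)
    # Walk dotted attribute paths such as "pkg.mod:Class.method".
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    if not callable(obj):
        raise TypeError(f"{target!r} does not resolve to a callable")
    return obj
```

With a my_agent.py module on the import path, resolve_target("my_agent:handler") would return its handler function. The arguments the runner passes to that function are not documented here, so check the project documentation before relying on a particular signature.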

Skill — run a SKILL.md in Claude Agent SDK sandbox:

runner:
  type: skill
  skill_path: ./SKILL.md
  harness: claude
  allowed_tools: ["Read", "Write", "Shell"]
  skill_max_turns: 10
  budget: 0.50

Process — invoke a subprocess via NDJSON protocol:

runner:
  type: process
  command: ["python", "agent.py"]
  protocol: ndjson
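
For orientation, here is a minimal sketch of what an agent.py speaking newline-delimited JSON over stdin and stdout might look like. The message fields "input", "output", and "error" are assumptions made for illustration; the actual message schema FluxLoop expects is not described in this document.

```python
import json
from typing import IO, Any, Dict


def handle_message(message: Dict[str, Any]) -> Dict[str, Any]:
    """Build a reply for one request. Field names here are hypothetical."""
    text = str(message.get("input", ""))
    return {"output": f"echo: {text}"}


def serve(stdin: IO[str], stdout: IO[str]) -> None:
    """Read one JSON object per line and write one JSON reply per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # skip blank lines between messages
        try:
            message = json.loads(line)
            if not isinstance(message, dict):
                raise ValueError("message must be a JSON object")
            reply = handle_message(message)
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            reply = {"error": str(exc)}
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()  # flush so the parent process sees each reply promptly
```

To use this as a script, call serve(sys.stdin, sys.stdout) at the bottom of the file. Flushing after every reply matters because a parent process reading a pipe would otherwise wait on buffered output.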

Input Sources

input:
  source: inline          # inline | generated | bundle | pulled
  items:
    - text: "Hello, summarize this document"
    - text: "What are the key takeaways?"

When source is pulled, inputs are loaded from pulled/inputs.json, so run fluxloop sync pull first.

Environment Variable Substitution

YAML config values support ${VAR} syntax, resolved from environment variables.
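
The substitution can be pictured roughly as follows. This is a sketch of the general technique applied to an already-parsed config, not FluxLoop's implementation; in particular, raising an error for an unset variable is an assumption, and the real CLI may behave differently.

```python
import os
import re
from typing import Any, Mapping

# Matches ${NAME} where NAME is a conventional environment variable name.
_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(value: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Recursively replace ${VAR} in strings nested inside dicts and lists."""
    if isinstance(value, str):
        def _lookup(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in env:
                # Assumption: fail loudly instead of leaving the placeholder.
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _VAR_PATTERN.sub(_lookup, value)
    if isinstance(value, dict):
        return {key: substitute(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute(item, env) for item in value]
    return value  # numbers, booleans, and None pass through unchanged
```

For example, substitute({"target": "${AGENT_MODULE}:handler"}, {"AGENT_MODULE": "my_agent"}) yields {"target": "my_agent:handler"}, where AGENT_MODULE is a made-up variable name.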

Contracts

Static Contract

Validates SKILL.md structure before execution:

- Required sections (e.g., # Purpose, # Instructions)
- File size limits
- Encoding checks
- Forbidden pattern detection
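
To make the four checks concrete, here is a small stand-alone sketch that applies them to the raw bytes of a SKILL.md file. The size limit, the forbidden pattern, and the function name check_skill_text are hypothetical; the real rules come from contracts/static.yaml, whose schema is not shown in this document.

```python
import re
from typing import List, Sequence


def check_skill_text(
    raw: bytes,
    required_sections: Sequence[str] = ("# Purpose", "# Instructions"),
    max_bytes: int = 64_000,  # hypothetical limit
    forbidden_patterns: Sequence[str] = (r"(?i)api[_-]?key\s*=",),  # hypothetical
) -> List[str]:
    """Return human-readable violations; an empty list means the file passes."""
    problems: List[str] = []
    if len(raw) > max_bytes:
        problems.append(f"file is {len(raw)} bytes, limit is {max_bytes}")
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Without decodable text the remaining checks cannot run.
        return problems + ["file is not valid UTF-8"]
    headings = {line.strip() for line in text.splitlines() if line.startswith("#")}
    for section in required_sections:
        if section not in headings:
            problems.append(f"missing required section: {section}")
    for pattern in forbidden_patterns:
        if re.search(pattern, text):
            problems.append(f"forbidden pattern matched: {pattern}")
    return problems
```

Passing Path("SKILL.md").read_bytes() to check_skill_text gives a quick local approximation of this kind of check; fluxloop skill validate remains the authoritative one.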

Behavior Contract

Asserts conditions on execution results:

- tool_called / tool_not_called
- turn_count (min/max)
- output_contains / output_matches
- file_exists
- cost_below / duration_below
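
One way to read these assertions is as predicates over a recorded run. The sketch below evaluates them against a hypothetical RunResult record; the field names, the dict-shaped contract, and the treatment of output_matches as a regular expression are all assumptions, since the behavior.yaml schema is not shown in this document.

```python
import os
import re
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class RunResult:
    """Hypothetical shape of one execution result (not FluxLoop's schema)."""
    output: str
    tools_called: List[str] = field(default_factory=list)
    turn_count: int = 0
    cost: float = 0.0
    duration_seconds: float = 0.0


def _turns_in_range(result: RunResult, bounds: Dict[str, int]) -> bool:
    low = bounds.get("min", 0)
    high = bounds.get("max", float("inf"))
    return low <= result.turn_count <= high


def evaluate(result: RunResult, contract: Dict[str, Any]) -> List[str]:
    """Return the names of failed assertions; an empty list means a pass."""
    checks: Dict[str, Callable[[Any], bool]] = {
        "tool_called": lambda name: name in result.tools_called,
        "tool_not_called": lambda name: name not in result.tools_called,
        "turn_count": lambda bounds: _turns_in_range(result, bounds),
        "output_contains": lambda text: text in result.output,
        "output_matches": lambda rx: re.search(rx, result.output) is not None,
        "file_exists": lambda path: os.path.exists(path),
        "cost_below": lambda limit: result.cost < limit,
        "duration_below": lambda limit: result.duration_seconds < limit,
    }
    failed: List[str] = []
    for name, expected in contract.items():
        if name not in checks:
            raise ValueError(f"unknown assertion: {name}")
        if not checks[name](expected):
            failed.append(name)
    return failed
```

Returning the failed assertion names, instead of a single boolean, keeps the report useful when several conditions fail in the same run.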

Authentication

FluxLoop uses OAuth device code flow for interactive login:

fluxloop auth login              # Opens browser for approval
fluxloop auth login --no-browser # Manual code entry
fluxloop auth login --no-wait    # Save pending, resume later
fluxloop auth login --resume     # Resume pending login

Tokens are stored in ~/.fluxloop/auth.json. For CI environments, use FLUXLOOP_API_KEY instead.
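
For a CI job, a thin wrapper can fail fast when the key is missing before handing off to the CLI. The sketch below is one possible shape for such a wrapper; the helper names are hypothetical, and it uses only the fluxloop test --scenario command shown in Quick Start.

```python
import os
import subprocess
from typing import List, Mapping


def build_test_command(scenario: str) -> List[str]:
    """Command line for the full test workflow of one scenario."""
    return ["fluxloop", "test", "--scenario", scenario]


def require_api_key(env: Mapping[str, str]) -> str:
    """Return the API key from the environment, or raise if it is absent."""
    key = env.get("FLUXLOOP_API_KEY", "").strip()
    if not key:
        raise RuntimeError("FLUXLOOP_API_KEY is not set; interactive login is unavailable in CI")
    return key


def run_ci(scenario: str, env: Mapping[str, str] = os.environ) -> int:
    """Run the workflow non-interactively and return the CLI's exit code."""
    require_api_key(env)
    # The child process receives the same environment, including the key.
    completed = subprocess.run(build_test_command(scenario), env=dict(env))
    return completed.returncode
```

A CI step could then end with sys.exit(run_ci("smoke-test")) so that a failing test run fails the job.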

Environment Variables

Variable	Purpose
`FLUXLOOP_API_URL`	Backend API base URL
`FLUXLOOP_API_KEY`	API key for authenticated requests
`FLUXLOOP_SYNC_API_KEY`	API key specifically for sync operations
`ANTHROPIC_API_KEY`	Required for multi-turn UserSimulator
`OPENAI_API_KEY`	Alternative provider for UserSimulator

Workspace-level variables can also be set in .fluxloop/.env.

Output

Test runs produce standardized output in .fluxloop/results/<experiment>-<timestamp>/:

File	Content
`trace_summary.jsonl`	Per-run execution traces (tool calls, tokens, cost)
`summary.json`	Aggregated statistics (success rate, duration, cost)
`errors.json`	Failure inventory with diagnostics
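
A short script can load these files for further analysis. The sketch below assumes all three files are present in a run directory and deliberately makes no assumption about their inner fields, which are not specified in this document; the helper names are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, Optional


def latest_run_dir(results_root: Path) -> Optional[Path]:
    """Return the most recently modified run directory, or None if there is none."""
    if not results_root.is_dir():
        return None
    run_dirs = [p for p in results_root.iterdir() if p.is_dir()]
    return max(run_dirs, key=lambda p: p.stat().st_mtime, default=None)


def read_jsonl(path: Path) -> Iterator[Any]:
    """Yield one parsed value per non-empty line of a JSON Lines file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


def load_run(run_dir: Path) -> Dict[str, Any]:
    """Load the three documented files without interpreting their contents."""
    return {
        "summary": json.loads((run_dir / "summary.json").read_text(encoding="utf-8")),
        "errors": json.loads((run_dir / "errors.json").read_text(encoding="utf-8")),
        "traces": list(read_jsonl(run_dir / "trace_summary.jsonl")),
    }
```

For example, load_run(latest_run_dir(Path(".fluxloop/results"))) returns the parsed summary, errors, and per-run traces of the newest run, provided at least one run directory exists.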

Developing

cd cli/python

# Install dependencies
uv sync --group dev

# Run in development mode
uv run fluxloop --help

# Run tests
uv run pytest

# Lint
uv run ruff check .

Building & Publishing

# Build
uv build

# Publish to PyPI
uv publish
# or
twine upload dist/*

Tech Stack

Library	Purpose
Typer	CLI framework
Pydantic	Data validation
ruamel.yaml	YAML parsing
httpx	HTTP client
Rich	Terminal output formatting
claude-agent-sdk	Skill execution in Claude sandbox

License

Apache-2.0

Release history

Version	Release date
0.4.0 (this version)	Mar 9, 2026
0.3.11	Mar 2, 2026
0.3.10	Feb 28, 2026
0.3.9	Feb 13, 2026
0.3.8	Feb 13, 2026
0.3.7	Feb 13, 2026
0.3.6	Feb 13, 2026
0.3.5	Feb 13, 2026
0.3.4	Feb 12, 2026
0.3.3	Jan 27, 2026
0.3.2	Jan 26, 2026
0.3.1	Jan 22, 2026
0.3.0	Dec 21, 2025
0.2.36	Dec 2, 2025
0.2.35	Dec 2, 2025
0.2.34	Dec 2, 2025
0.2.33	Nov 30, 2025
0.2.32	Nov 27, 2025
0.2.31	Nov 26, 2025
0.2.30	Nov 24, 2025
0.2.29	Nov 20, 2025
0.2.28	Nov 20, 2025
0.2.27	Nov 17, 2025
0.2.26	Nov 16, 2025
0.2.25	Nov 16, 2025
0.2.24	Nov 15, 2025
0.2.23	Nov 15, 2025
0.2.22	Nov 15, 2025
0.2.21	Nov 14, 2025
0.2.20	Nov 14, 2025
0.2.19	Nov 14, 2025
0.2.18	Nov 13, 2025
0.2.17	Nov 11, 2025
0.2.16	Nov 4, 2025
0.2.15	Nov 3, 2025
0.2.14	Nov 3, 2025
0.2.13	Nov 3, 2025
0.2.12	Nov 3, 2025
0.2.11	Nov 2, 2025
0.2.10	Nov 2, 2025
0.2.9	Nov 2, 2025
0.2.8	Nov 2, 2025
0.2.7	Nov 2, 2025
0.2.6	Nov 2, 2025
0.2.5	Nov 2, 2025
0.2.4	Nov 2, 2025
0.2.3	Nov 2, 2025
0.2.2	Nov 2, 2025
0.2.1	Nov 1, 2025
0.2.0	Nov 1, 2025
0.1.0	Oct 22, 2025

Download files

Source Distribution

fluxloop_cli-0.4.0.tar.gz (153.5 kB)

Uploaded Mar 9, 2026 Source

Built Distribution

fluxloop_cli-0.4.0-py3-none-any.whl (86.0 kB)

Uploaded Mar 9, 2026 Python 3

File details

Details for the file fluxloop_cli-0.4.0.tar.gz.

File metadata

Download URL: fluxloop_cli-0.4.0.tar.gz
Upload date: Mar 9, 2026
Size: 153.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.2 (uv publish, macOS)

File hashes

Hashes for fluxloop_cli-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`6c0af643051d39e88614e874895290410157bbea7ecac1edfd114e61db66afb9`
MD5	`27fc82b4b766b245c19caae5b6e97bc8`
BLAKE2b-256	`fab79bf0df6d07d4cf348fd70ac3aeb209a3b64fad6802cb59e63b02f85ebb2f`
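
A downloaded archive can be checked against the SHA256 digest in the table above with Python's standard hashlib module. This is a minimal sketch; the local file path is whatever location the archive was saved to, and the helper names are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for fluxloop_cli-0.4.0.tar.gz.
EXPECTED_SHA256 = "6c0af643051d39e88614e874895290410157bbea7ecac1edfd114e61db66afb9"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

Calling verify(Path("fluxloop_cli-0.4.0.tar.gz")) returns True only if the file on disk matches the published digest.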

File details

Details for the file fluxloop_cli-0.4.0-py3-none-any.whl.

File metadata

Download URL: fluxloop_cli-0.4.0-py3-none-any.whl
Upload date: Mar 9, 2026
Size: 86.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.2 (uv publish, macOS)

File hashes

Hashes for fluxloop_cli-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`aea1d56f80f66233a04f0a125ed14ff0613430149dc85eb00567565305d9c2bd`
MD5	`3ac5e2e8a15adf6ec8b0c01e8499145a`
BLAKE2b-256	`83f6ef0e4b4d7bca731eee9f88abbc17855cb51c4ba9e5f721ee1346679f3ca3`
