Seer - Multi-Agent System for Evaluating AI Agents

Seer Agents (seeragents)

Seer Agents is a LangGraph-based evaluation orchestrator for testing autonomous agents end-to-end (align → generate tests → run tests), with optional telemetry (Langfuse) and persistence (Neo4j/Postgres).

Install (recommended: no git clone)

Install with pip

python -m venv .venv
source .venv/bin/activate

pip install -U seeragents

Install with uv

uv venv
source .venv/bin/activate

uv pip install -U seeragents

Configuration (env vars or .env)

Both CLIs (seer, seer-eval) automatically load a local .env file (via python-dotenv). You can also export environment variables directly.

Create a .env in your working directory:

OPENAI_API_KEY=...
GITHUB_TOKEN=...
COMPOSIO_API_KEY=...
E2B_API_KEY=...

You can inspect what Seer sees with:

seer-eval config

API keys by stage (what’s required when)

Seer validates many keys up-front, and seer-eval will prompt you interactively if something is missing.

| Stage | Required keys | Notes |
| --- | --- | --- |
| alignment | OPENAI_API_KEY | Used to turn a human request into a concrete spec (functional requirements + required integrations). |
| plan | OPENAI_API_KEY | Generates dataset-style test cases. |
| testing | OPENAI_API_KEY, GITHUB_TOKEN | Always required for executing tests and provisioning target repos. |
| testing (MCP) | COMPOSIO_API_KEY | Required only when the aligned spec includes MCP services (e.g. GitHub/Asana tools). |
| sandbox provisioning | E2B_API_KEY | Required when Seer provisions an E2B sandbox to clone/build/run the target agent. |
| Asana scenarios | ASANA_WORKSPACE_ID or ASANA_TEAM_GID (or ASANA_PROJECT_ID) | Required when running tests that create/update Asana entities. |
| optional tracing | LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL | Enables Langfuse traces (recommended for debugging). |
| optional persistence | DATABASE_URI | Enables Postgres checkpointer for pause/resume across runs and richer history. |
| optional memory | NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD | Enables reflection/tool indexing in Neo4j. |
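
For CI or other non-interactive runs, it can help to check the environment against the table above before starting a run. The following is a minimal sketch, not part of Seer: the `REQUIRED_KEYS` mapping and the `missing_keys` helper are hypothetical names, and the stage labels used as dictionary keys are illustrative rather than identifiers Seer recognizes. The Asana and optional rows are left out for brevity.

```python
import os

# Stage -> required environment variables, transcribed from the table above.
# The dictionary keys are illustrative labels, not Seer identifiers.
REQUIRED_KEYS = {
    "alignment": ["OPENAI_API_KEY"],
    "plan": ["OPENAI_API_KEY"],
    "testing": ["OPENAI_API_KEY", "GITHUB_TOKEN"],
    "testing_mcp": ["COMPOSIO_API_KEY"],
    "sandbox": ["E2B_API_KEY"],
}


def missing_keys(stages, env=None):
    """Return the sorted names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    needed = {key for stage in stages for key in REQUIRED_KEYS[stage]}
    return sorted(key for key in needed if not env.get(key))
```

Calling `missing_keys(["testing", "sandbox"])` with no `env` argument checks the current process environment, so it should be run after any `.env` file has been loaded.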

Usage: CLI (fastest path)

Start the interactive eval agent

seer-eval run

This launches an interactive loop where you select steps (alignment, plan, testing, finalize) and provide inputs when prompted. The CLI will ask for:

  • Description – what does your agent do?
  • GitHub Repository – in owner/repo format
  • User ID – optional, for authentication context

To resume an existing session:

seer-eval run --thread-id <uuid>
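
If thread IDs are stored or passed around by scripts, a small pre-check can catch typos before resuming. This sketch assumes the thread ID is a standard UUID string, as the `<uuid>` placeholder suggests; `new_thread_id` and `is_valid_thread_id` are hypothetical helpers, not Seer functions.

```python
import uuid


def new_thread_id():
    """Create a fresh thread ID as a random UUID string."""
    return str(uuid.uuid4())


def is_valid_thread_id(value):
    """Return True if value parses as a UUID (hypothetical pre-check, not part of Seer)."""
    try:
        uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return True
```

Note that `uuid.UUID` also accepts a few alternative spellings (for example, without hyphens), so this is a loose check rather than a strict format validator.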

Start the supervisor agent (database operations)

seer-eval new-supervisor

The Supervisor agent helps with PostgreSQL database operations, schema exploration, and related tasks. You can optionally provide a connection string:

seer-eval new-supervisor --db-uri "postgresql://user:pass@host/db"
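
Because the connection string carries a password, wrapper scripts that log the command they run may want to mask it first. The sketch below uses only the standard library; `redact_db_uri` is a hypothetical helper and is not something Seer provides. It is simplified: the host is lowercased by `urlsplit`, and IPv6 literals are not handled.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_db_uri(uri):
    """Return the URI with any password replaced by '***' so it is safer to log."""
    parts = urlsplit(uri)
    if parts.password is None:
        return uri  # nothing to hide
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username or ''}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```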

Other commands

# Show current configuration
seer-eval config

# Export results from a previous run (requires DATABASE_URI for persistence)
seer-eval export <thread-id>

# Enable verbose mode (full tracebacks for debugging)
seer-eval -v run

Interactive session commands

While in a session (run or new-supervisor), you can use:

  • exit, quit, bye – Exit the session
  • clear – Clear the screen

About interactive prompts

  • If a required key is missing, seer-eval will ask you for the value and continue.
  • To avoid prompts (CI / automation), set keys via .env or exported env vars ahead of time.

Usage: start the LangGraph dev server

This starts the eval agent as a LangGraph dev server (useful with LangGraph Studio / HTTP invocation):

seer

By default it serves the Eval Agent on http://127.0.0.1:8002 and writes logs under seer-logs/.
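
When the server is started from a script, a client may need to wait until the port accepts connections. The sketch below only checks TCP reachability of the default address mentioned above; it does not call any LangGraph HTTP endpoint, and `wait_for_port` is a hypothetical helper rather than part of Seer.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=8002, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Opening and immediately closing a connection is enough to prove
            # that something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A successful check means only that something is listening on the port, not that the Eval Agent has finished initializing.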

Usage: import and run from Python (notebook-style)

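This mirrors the flow in examples/github_asana_bot.ipynb. The snippet uses top-level await, so run it in a notebook or IPython session; in a plain script, wrap the calls in an async function and run it with asyncio.run().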

import uuid
from dotenv import load_dotenv
from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver

load_dotenv()  # loads .env in the current directory (optional but recommended)

from agents.eval_agent.graph import build_graph

graph = build_graph()
memory = MemorySaver()
eval_agent = graph.compile(checkpointer=memory)

thread_id = str(uuid.uuid4())

# 1) alignment
aligned = await eval_agent.ainvoke(
    {
        "messages": [{"type": "human", "content": "Evaluate my agent..."}],
        "step": "alignment",
        "input_context": {
            "integrations": {"github": {"name": "owner/repo"}},
            "user_id": "you@example.com",
        },
    },
    config=RunnableConfig(configurable={"thread_id": thread_id}),
)

# 2) plan
planned = await eval_agent.ainvoke(
    {"step": "plan"},
    config=RunnableConfig(configurable={"thread_id": thread_id}),
)

# 3) testing
tested = await eval_agent.ainvoke(
    {"step": "testing"},
    config=RunnableConfig(configurable={"thread_id": thread_id}),
)
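
The three calls above share one thread ID and differ only in their input, so they can be folded into a loop. The following is a sketch under assumptions: `run_steps` is a hypothetical helper, it works with any object exposing an async `ainvoke(input, config=...)` method, and it passes the config as a plain dict (`RunnableConfig` is a typed dict, so the two are interchangeable here).

```python
import asyncio
import uuid


async def run_steps(agent, first_input, steps=("plan", "testing"), thread_id=None):
    """Send the alignment input, then each later step, on one shared thread."""
    thread_id = thread_id or str(uuid.uuid4())
    # Every call reuses the same thread_id so the checkpointer can carry state forward.
    config = {"configurable": {"thread_id": thread_id}}
    results = {"alignment": await agent.ainvoke(first_input, config=config)}
    for step in steps:
        results[step] = await agent.ainvoke({"step": step}, config=config)
    return thread_id, results


# In a plain script (outside a notebook) this could be driven with, for example:
#   thread_id, results = asyncio.run(run_steps(eval_agent, alignment_input))
# where eval_agent is the compiled graph and alignment_input is the first dict above.
```

Returning the thread ID alongside the results makes it easy to resume the same session later.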

Developing locally (optional: git clone)

If you’re contributing to Seer itself:

git clone <your-fork>
cd seer

uv venv
source .venv/bin/activate
uv pip install -e .
