Seer - Multi-Agent System for Evaluating AI Agents
Seer Agents (seeragents)
Seer Agents is a LangGraph-based evaluation orchestrator for testing autonomous agents end-to-end (align → generate tests → run tests), with optional telemetry (Langfuse) and persistence (Neo4j/Postgres).
Install (recommended: no git clone)
Install with pip
python -m venv .venv
source .venv/bin/activate
pip install -U seeragents
Install with uv
uv venv
source .venv/bin/activate
uv pip install -U seeragents
Configuration (env vars or .env)
Both CLIs (seer, seer-eval) automatically load a local .env file (via python-dotenv). You can also export environment variables directly.
Create a .env in your working directory:
OPENAI_API_KEY=...
GITHUB_TOKEN=...
COMPOSIO_API_KEY=...
E2B_API_KEY=...
You can inspect what Seer sees with:
seer-eval config
API keys by stage (what’s required when)
Seer validates many keys up-front, and seer-eval will prompt you interactively if something is missing.
| Stage | Required keys | Notes |
|---|---|---|
| alignment | OPENAI_API_KEY | Used to turn a human request into a concrete spec (functional requirements + required integrations). |
| plan | OPENAI_API_KEY | Generates dataset-style test cases. |
| testing | OPENAI_API_KEY, GITHUB_TOKEN | Always required for executing tests and provisioning target repos. |
| testing (MCP) | COMPOSIO_API_KEY | Required only when the aligned spec includes MCP services (e.g. GitHub/Asana tools). |
| sandbox provisioning | E2B_API_KEY | Required when Seer provisions an E2B sandbox to clone/build/run the target agent. |
| Asana scenarios | ASANA_WORKSPACE_ID or ASANA_TEAM_GID (or ASANA_PROJECT_ID) | Required when running tests that create/update Asana entities. |
| optional tracing | LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL | Enables Langfuse traces (recommended for debugging). |
| optional persistence | DATABASE_URI | Enables Postgres checkpointer for pause/resume across runs and richer history. |
| optional memory | NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD | Enables reflection/tool indexing in Neo4j. |
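To see ahead of time which keys a given stage would still need, the required rows of the table can be transcribed into a small lookup. This is a minimal sketch, not part of Seer: the `REQUIRED_KEYS` dictionary and the `missing_keys` helper are hypothetical names, and the conditional and optional rows (Asana, tracing, persistence, memory) are left out for brevity.

```python
import os
from typing import Mapping

# Stage -> required keys, transcribed from the table above.
# The names here are illustrative and are NOT part of Seer's API.
REQUIRED_KEYS = {
    "alignment": ["OPENAI_API_KEY"],
    "plan": ["OPENAI_API_KEY"],
    "testing": ["OPENAI_API_KEY", "GITHUB_TOKEN"],
    "testing (MCP)": ["COMPOSIO_API_KEY"],
    "sandbox provisioning": ["E2B_API_KEY"],
}


def missing_keys(stage: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required keys for `stage` that are unset or empty in `env`."""
    return [key for key in REQUIRED_KEYS[stage] if not env.get(key)]


if __name__ == "__main__":
    # Report the status of every stage against the current environment.
    for stage in REQUIRED_KEYS:
        missing = missing_keys(stage)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{stage}: {status}")
```

Running this before `seer-eval run` gives a quick, non-interactive view of what is missing; `seer-eval config` remains the authoritative check of what Seer itself sees.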
Usage: CLI (fastest path)
Start the interactive eval agent
seer-eval run
This launches an interactive loop where you select steps (alignment, plan, testing, finalize) and provide inputs when prompted. The CLI will ask for:
- Description – what does your agent do?
- GitHub Repository – in owner/repo format
- User ID – optional, for authentication context
To resume an existing session:
seer-eval run --thread-id <uuid>
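When resuming from a script, it can help to validate the thread ID before calling the CLI. The sketch below assumes thread IDs are standard UUID strings (the `<uuid>` placeholder suggests this, but the source does not state the exact format); `resume_command` is a hypothetical helper that only builds the argument list.

```python
import uuid


def resume_command(thread_id: str) -> list[str]:
    """Build the argv for resuming a session, rejecting non-UUID thread IDs."""
    # uuid.UUID raises ValueError on malformed input and lets us
    # normalise the ID to its canonical lowercase, hyphenated form.
    canonical = str(uuid.UUID(thread_id))
    return ["seer-eval", "run", "--thread-id", canonical]


if __name__ == "__main__":
    # A freshly generated ID is used here purely as sample input.
    print(" ".join(resume_command(str(uuid.uuid4()))))
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting issues.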
Start the supervisor agent (database operations)
seer-eval new-supervisor
The Supervisor agent helps with PostgreSQL database operations, schema exploration, and related tasks. You can optionally provide a connection string:
seer-eval new-supervisor --db-uri "postgresql://user:pass@host/db"
Other commands
# Show current configuration
seer-eval config
# Export results from a previous run (requires DATABASE_URI for persistence)
seer-eval export <thread-id>
# Enable verbose mode (full tracebacks for debugging)
seer-eval -v run
Interactive session commands
While in a session (run or new-supervisor), you can use:
- exit, quit, bye – Exit the session
- clear – Clear the screen
About interactive prompts
- If a required key is missing, seer-eval will ask you for the value and continue.
- To avoid prompts (CI / automation), set keys via .env or exported env vars ahead of time.
Usage: start the LangGraph dev server
This starts the eval agent as a LangGraph dev server (useful with LangGraph Studio / HTTP invocation):
seer
By default it serves the Eval Agent on http://127.0.0.1:8002 and writes logs under seer-logs/.
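A quick way to confirm the dev server is listening is a plain TCP connection attempt against the default address. This is a minimal sketch using only the standard library; `server_is_up` is a hypothetical helper, and it checks only that the port accepts connections, not that the LangGraph API behind it is healthy.

```python
import socket


def server_is_up(host: str = "127.0.0.1", port: int = 8002, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections at host:port."""
    try:
        # create_connection raises OSError on refusal, timeout, or an unreachable host.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("dev server reachable:", server_is_up())
```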
Usage: import and run from Python (notebook-style)
This mirrors the flow in examples/github_asana_bot.ipynb.
import uuid

from dotenv import load_dotenv
from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver

load_dotenv()  # loads .env in the current directory (optional but recommended)

from agents.eval_agent.graph import build_graph

graph = build_graph()
memory = MemorySaver()
eval_agent = graph.compile(checkpointer=memory)
thread_id = str(uuid.uuid4())

# 1) alignment
aligned = await eval_agent.ainvoke(
    {
        "messages": [{"type": "human", "content": "Evaluate my agent..."}],
        "step": "alignment",
        "input_context": {
            "integrations": {"github": {"name": "owner/repo"}},
            "user_id": "you@example.com",
        },
    },
    config=RunnableConfig(configurable={"thread_id": thread_id}),
)

# 2) plan
planned = await eval_agent.ainvoke(
    {"step": "plan"},
    config=RunnableConfig(configurable={"thread_id": thread_id}),
)

# 3) testing
tested = await eval_agent.ainvoke(
    {"step": "testing"},
    config=RunnableConfig(configurable={"thread_id": thread_id}),
)
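The snippet above relies on top-level `await`, which works in notebooks but not in a plain Python script. One way to run the same three-step flow from a script is to wrap the calls in a coroutine and drive it with `asyncio.run`. The sketch below is an illustration under assumptions: `run_steps` is a hypothetical helper, and `EchoAgent` is a stand-in that only mimics the `ainvoke(payload, config=...)` call shape so the example runs without Seer installed. In real use you would pass the compiled `eval_agent` instead.

```python
import asyncio
import uuid
from typing import Any


async def run_steps(agent: Any, first_input: dict, steps: list[str], thread_id: str) -> dict[str, Any]:
    """Invoke `agent` once per step on a shared thread and collect the results."""
    # A plain dict with a "configurable" key mirrors the RunnableConfig used above.
    config = {"configurable": {"thread_id": thread_id}}
    results: dict[str, Any] = {}
    for index, step in enumerate(steps):
        # Only the first call carries the full input; later calls just name the step.
        payload = {**first_input, "step": step} if index == 0 else {"step": step}
        results[step] = await agent.ainvoke(payload, config=config)
    return results


class EchoAgent:
    """Hypothetical stand-in for the compiled eval agent; echoes its input."""

    async def ainvoke(self, payload: dict, config: dict) -> dict:
        return {"payload": payload, "thread_id": config["configurable"]["thread_id"]}


if __name__ == "__main__":
    first = {"messages": [{"type": "human", "content": "Evaluate my agent..."}]}
    results = asyncio.run(
        run_steps(EchoAgent(), first, ["alignment", "plan", "testing"], str(uuid.uuid4()))
    )
    print(list(results))
```

Reusing one `thread_id` across all calls is what lets the checkpointer carry state from alignment through to testing.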
Developing locally (optional: git clone)
If you’re contributing to Seer itself:
git clone <your-fork>
cd seer
uv venv
source .venv/bin/activate
uv pip install -e .