Local-first, MCP-native multi-agent platform — job tracker + daily briefing for software engineers, with tracing, evals, and budgets.

Aegis

A local-first, MCP-native multi-agent platform you control. Aegis ships as a working personal job-tracker + daily briefing for software engineers — but the substrate underneath (agents, tracing, evals, MCP) is built to be lifted into any domain where you want LLM workflows you can debug, eval, and budget.

Built to be read, forked, and bent to your use case. Everything happens on your machine — your data, your API key, your prompts.

What you can do today

Out of the box — two workflows wired end-to-end:

Workflow	What it does	Trigger
Job tracker	Paste a JD (text or URL). A Researcher → Analyst → Critic pipeline extracts the role, scores fit against your `profile.yaml`, and loops if the critic flags issues. Persists with cost, refinement count, and a trace link.	`add_job` MCP tool, `POST /jobs`, or curl
Daily briefing	Fans out to GitHub (your activity + notifications), Hacker News, and your job pipeline. Prioritises items, synthesises a markdown briefing, then self-scores and refines. Fails open if a source fails: the briefing still runs and skips that section.	`run_briefing` MCP tool, `POST /briefing/run`, or curl
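The fail-open fan-out in the Daily briefing row can be sketched with `asyncio.gather(..., return_exceptions=True)`. This is an illustration only, not the Aegis source: the three `fetch_*` coroutines and `gather_sections` are hypothetical stand-ins for the real sources.

```python
import asyncio


async def fetch_github() -> str:
    # Hypothetical source that fails, e.g. a missing or expired token.
    raise RuntimeError("github unavailable")


async def fetch_hacker_news() -> str:
    return "HN: 3 stories"


async def fetch_job_pipeline() -> str:
    return "Jobs: 2 follow-ups due"


async def gather_sections() -> tuple[dict[str, str], dict[str, str]]:
    """Run all sources concurrently; collect failures instead of raising."""
    sources = {
        "github": fetch_github,
        "hacker_news": fetch_hacker_news,
        "jobs": fetch_job_pipeline,
    }
    results = await asyncio.gather(
        *(fetch() for fetch in sources.values()),
        return_exceptions=True,  # a failing source becomes a value, not a crash
    )
    sections: dict[str, str] = {}
    skipped: dict[str, str] = {}
    for name, result in zip(sources, results):
        if isinstance(result, Exception):
            skipped[name] = str(result)  # section is dropped, reason is kept
        else:
            sections[name] = result
    return sections, skipped


sections, skipped = asyncio.run(gather_sections())
```

Here `sections` holds the two sources that succeeded and `skipped` records why the GitHub section was left out, so the briefing can still be synthesised from what is available.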

Inspect anything — every LLM call emits a span. Run aegis api and open http://localhost:8000/trace/<run_id> for a tree of every prompt, response, cost, and latency, or pull the JSON at /trace/<run_id>/json.

Catch regressions — YAML golden cases under evals/<eval_name>/cases/*.yaml run through the real pipeline; the runner writes a row to eval_runs with git SHA, score, and per-case detail.

Budgets that bite — per-run cap (AEGIS_LLM_MAX_COST_USD_PER_RUN) and daily aggregate cap (AEGIS_LLM_MAX_COST_USD_PER_DAY) checked before every LLM call. Tenacity retries handle transient Anthropic / Postgres failures.
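As a rough sketch of what "checked before every LLM call" could look like, the guard below tracks per-run spend and a rolling 24-hour total. `BudgetGuard`, `BudgetExceeded`, and the in-memory `history` list are hypothetical names for illustration; the actual gateway may store and check spend differently.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


class BudgetExceeded(RuntimeError):
    """Raised when a call would start with a cap already reached."""


@dataclass
class BudgetGuard:
    max_per_run_usd: float = 0.50  # documented default of the per-run cap
    max_per_day_usd: float = 5.0   # documented default; 0 disables the daily cap
    run_spend_usd: float = 0.0
    # (timestamp, cost) pairs; assumption: a real system would persist these.
    history: list[tuple[float, float]] = field(default_factory=list)

    def spend_last_24h(self, now: float) -> float:
        cutoff = now - 24 * 60 * 60
        return sum(cost for ts, cost in self.history if ts >= cutoff)

    def check(self, now: Optional[float] = None) -> None:
        """Call before every LLM request; raises if either cap is reached."""
        now = time.time() if now is None else now
        if self.run_spend_usd >= self.max_per_run_usd:
            raise BudgetExceeded("per-run cap reached")
        if self.max_per_day_usd > 0 and self.spend_last_24h(now) >= self.max_per_day_usd:
            raise BudgetExceeded("daily cap reached")

    def record(self, cost_usd: float, now: Optional[float] = None) -> None:
        """Call after a request completes, with its measured cost."""
        now = time.time() if now is None else now
        self.run_spend_usd += cost_usd
        self.history.append((now, cost_usd))
```

The intended call pattern is `guard.check()`, then the LLM request, then `guard.record(cost)`. Because the check runs before the call, a single request can still overshoot a cap slightly; the next call is the one that gets blocked.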

What's actually reusable — the building blocks

Aegis is small (~3k LOC). Pick what you need:

MCP server scaffold (src/aegis/mcp_server.py) — FastMCP wiring with tools, resources, prompts, and a working Claude Desktop integration. Add a tool by writing one @mcp.tool() function.
Multi-agent framework (src/aegis/agents/) — AgentBase gives every agent automatic tracing, cost attribution, and uniform AgentResult. Drop in a new agent by subclassing and writing _run.
LangGraph workflow pattern (src/aegis/workflows/job_analysis/graph.py) — Researcher → Analyst → Critic with conditional refinement. Easy to copy and rewire into a different shape.
Tracing infra (src/aegis/tracing/) — spans persisted to Postgres (agent_traces table) with a built-in HTML viewer. No Datadog needed.
Eval harness (scripts/run_eval.py, evals/*/cases/*.yaml) — golden cases as YAML, assertion-based scoring (not exact-match), regression rows in eval_runs. Each case still produces a trace you can open.
LLM gateway (src/aegis/llm/gateway.py) — Anthropic direct (no LangChain), with cost tracking, per-run + per-day budgets, retry on 5xx / timeouts / rate-limits.
FastAPI surface (src/aegis/app.py, src/aegis/api/) — REST routers for jobs, briefing, traces, health. Exposed via aegis api. Health check returns 503 when degraded.
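To make the "subclass and write _run" idea concrete, here is a self-contained sketch of the pattern. The `AgentBase` and `AgentResult` below are simplified stand-ins written for this example, not the classes in src/aegis/agents/; the field names, the synchronous `run`, the in-memory `spans` list, and `KeywordAgent` are all assumptions.

```python
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class AgentResult:
    agent: str
    output: Any
    cost_usd: float
    latency_ms: float


class AgentBase(ABC):
    """Stand-in base class: wraps _run with timing and a recorded span."""

    name = "agent"

    def __init__(self) -> None:
        # Assumption: an in-memory list stands in for a persistent recorder.
        self.spans: list[dict[str, Any]] = []

    def run(self, payload: Any) -> AgentResult:
        started = time.perf_counter()
        output, cost_usd = self._run(payload)
        latency_ms = (time.perf_counter() - started) * 1000
        self.spans.append(
            {"agent": self.name, "cost_usd": cost_usd, "latency_ms": latency_ms}
        )
        return AgentResult(self.name, output, cost_usd, latency_ms)

    @abstractmethod
    def _run(self, payload: Any) -> tuple[Any, float]:
        """Return (output, cost in USD). Subclasses implement only this."""


class KeywordAgent(AgentBase):
    """Hypothetical agent: no LLM call, so its cost is zero."""

    name = "keyword"

    def _run(self, payload: str) -> tuple[list[str], float]:
        known = {"python", "golang", "postgresql"}
        found = sorted(w for w in payload.lower().split() if w in known)
        return found, 0.0


result = KeywordAgent().run("Senior Python engineer with PostgreSQL experience")
```

The point of the pattern is that the subclass never touches tracing or cost bookkeeping: every agent returns the same result shape and leaves a span behind simply by going through `run`.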

Quick start

Prerequisites

Docker Desktop — Aegis runs Postgres + Redis locally.
uv — curl -LsSf https://astral.sh/uv/install.sh | sh
Anthropic API key — required. Generate at https://console.anthropic.com/settings/keys.
GitHub personal access token — optional, only for the daily briefing. Generate at https://github.com/settings/tokens. Classic token needs the notifications scope; fine-grained needs Notifications: Read. Without it, briefings still run but skip the GitHub section.

Install

uv tool install aegis-agents
aegis init          # prompts for ANTHROPIC_API_KEY + GITHUB_TOKEN (optional)
aegis up            # boots Postgres + Redis, applies migrations

aegis init writes ~/.config/aegis/{.env,compose.yml,profile.yaml}. Re-run with --force to regenerate. Override the location with AEGIS_HOME=/path.

Edit ~/.config/aegis/profile.yaml to personalise fit scoring (skills, target roles, deal-breakers).

Wire to Claude

Claude Code:

claude mcp add aegis -- aegis serve

Claude Desktop — add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "aegis": {
      "command": "aegis",
      "args": ["serve"]
    }
  }
}

Restart Claude. The aegis tools show up under the connector menu.

Daily ops

Command	What it does
`aegis up`	Start Postgres + Redis (data preserved across restarts)
`aegis down`	Stop the stack
`aegis migrate`	Apply new migrations after an `aegis-agents` upgrade
`aegis doctor`	Diagnose prerequisites, config, and infra
`aegis serve`	Run the MCP server (what Claude calls)
`aegis api`	Run the REST API + trace viewer on http://localhost:8000 (API docs at /docs, traces at /trace/<run_id>)
`aegis init --force`	Regenerate `~/.config/aegis/` files

Daily workflow

Morning briefing

In Claude: "Run my daily briefing" — fires run_briefing, returns prioritised markdown with trace_url. There's no background scheduler; briefings only fire when you (or Claude) ask.

Adding a job

Copy a job posting (text or URL).
In Claude: "Add this job: [paste]" → add_job runs Researcher → Analyst → Critic, loops up to AEGIS_MAX_REFINEMENTS rounds if the critic flags issues, then persists.
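The bounded critic loop can be sketched in plain Python as follows. Aegis wires this with LangGraph; the version below only illustrates the control flow, and `analyse_with_refinement`, `toy_analyst`, and `toy_critic` are hypothetical names.

```python
from typing import Callable

MAX_REFINEMENTS = 2  # mirrors the documented AEGIS_MAX_REFINEMENTS default


def analyse_with_refinement(
    jd_text: str,
    analyst: Callable[[str, list[str]], dict],
    critic: Callable[[dict], list[str]],
    max_refinements: int = MAX_REFINEMENTS,
) -> dict:
    """Run the analyst, then let the critic request up to max_refinements redos."""
    feedback: list[str] = []
    analysis = analyst(jd_text, feedback)
    refinements = 0
    while refinements < max_refinements:
        feedback = critic(analysis)
        if not feedback:
            break  # critic is satisfied
        analysis = analyst(jd_text, feedback)
        refinements += 1
    analysis["refinement_count"] = refinements
    return analysis


# Toy stand-ins for the LLM-backed agents.
def toy_analyst(jd_text: str, feedback: list[str]) -> dict:
    return {"fit_score": round(0.5 + 0.2 * len(feedback), 2)}


def toy_critic(analysis: dict) -> list[str]:
    return ["score looks too low"] if analysis["fit_score"] < 0.7 else []


result = analyse_with_refinement("Senior Backend Engineer ...", toy_analyst, toy_critic)
```

With these toy agents the critic objects once and accepts the second analysis, so `result["refinement_count"]` is 1. A critic that never accepts stops the loop after two refinements, which is what keeps cost bounded.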

Or use curl (requires aegis api to be running):

curl -X POST http://localhost:8000/jobs \
  -H "Content-Type: application/json" \
  -d '{"text": "Senior Backend Engineer at Stripe..."}'
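The same request can be built from Python with only the standard library. This sketch assumes nothing beyond the endpoint and payload shown in the curl command; `build_add_job_request` is a hypothetical helper, and the response shape is not documented here, so it is simply printed.

```python
import json
from urllib.request import Request


def build_add_job_request(text: str, base_url: str = "http://localhost:8000") -> Request:
    """Build (but do not send) the POST /jobs request."""
    body = json.dumps({"text": text}).encode("utf-8")
    return Request(
        f"{base_url}/jobs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_add_job_request("Senior Backend Engineer at Stripe...")

# To send it (requires `aegis api` to be running):
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       print(json.load(response))
```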

Inspecting a run

"List my recent Aegis runs" in Claude → list_recent_runs shows newest-first runs with run_id and trace links.
"Get the trace for <run_id>" → flat span summary.
With aegis api running, http://localhost:8000/trace/<run_id> → full HTML tree with every prompt and response.

MCP surface

Tools

Tool	What it does
`run_briefing`	Run a full briefing, return markdown + cost + trace pointer
`get_latest_briefing`	Most recent successful briefing
`add_job`	Run the multi-agent pipeline on a JD, persist + return analysis
`list_jobs`	List tracked jobs, optionally filtered by status
`update_job_status`	Move a job through the pipeline (saved → applied → …)
`get_follow_ups`	Stale-for-N-days jobs that need a nudge
`get_top_job_fits`	Saved jobs with `fit_score >= min_score`
`get_trace`	Span summary for a `run_id`
`list_recent_runs`	Newest-first run summaries with trace URLs

Resources: aegis://briefing/latest, aegis://jobs/summary

Prompts: daily_briefing_prompt

Evals — keep prompts honest

The eval harness is maintainer tooling — it ships in the repo, not the package. Clone the repo and run:

uv run python scripts/seed_evals.py             # mirror evals/*/cases/*.yaml → eval_golden_cases
uv run python scripts/run_eval.py job_fit_v1    # run the suite

Each YAML case declares assertions against the analysis (not exact-match output, since LLMs aren't deterministic):

case_id: stripe_senior_backend_apply
eval_name: job_fit_v1
input:
  url_or_text: |
    Senior Backend Engineer — Stripe (Remote, US) ...
expected:
  recommendation: apply
  fit_score: {min: 0.70, max: 1.00}
  must_include_skills: [python, postgresql]
  must_not_have_deal_breakers: true

The runner writes a row to eval_runs per invocation (git SHA, score, pass count, per-case detail). Diff successive runs to track prompt regressions.
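Assertion-based scoring of one case might look roughly like this. It is a sketch, not scripts/run_eval.py: the `expected` keys come from the YAML case above, while `check_case` and the analysis field names (`recommendation`, `fit_score`, `skills`, `deal_breakers`) are assumptions about what the pipeline returns.

```python
from typing import Any


def check_case(expected: dict[str, Any], analysis: dict[str, Any]) -> list[str]:
    """Return the names of failed assertions; an empty list means the case passed."""
    failures: list[str] = []

    if "recommendation" in expected:
        if analysis.get("recommendation") != expected["recommendation"]:
            failures.append("recommendation")

    if "fit_score" in expected:
        bounds = expected["fit_score"]
        score = analysis.get("fit_score")
        if score is None or not (bounds["min"] <= score <= bounds["max"]):
            failures.append("fit_score")

    # Skill matching is case-insensitive so "Python" satisfies "python".
    found = {s.lower() for s in analysis.get("skills", [])}
    missing = [s for s in expected.get("must_include_skills", []) if s.lower() not in found]
    if missing:
        failures.append("must_include_skills: " + ", ".join(missing))

    if expected.get("must_not_have_deal_breakers") and analysis.get("deal_breakers"):
        failures.append("must_not_have_deal_breakers")

    return failures


expected = {
    "recommendation": "apply",
    "fit_score": {"min": 0.70, "max": 1.00},
    "must_include_skills": ["python", "postgresql"],
    "must_not_have_deal_breakers": True,
}
analysis = {
    "recommendation": "apply",
    "fit_score": 0.82,
    "skills": ["Python", "PostgreSQL", "Kafka"],
    "deal_breakers": [],
}
failures = check_case(expected, analysis)
```

Range and membership checks like these tolerate run-to-run variation in the model's wording and exact score while still catching a flipped recommendation or a missed skill.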

When you hit a bad output in agent_traces, the recipe is: copy the JD into a new case file with the correct expected block, re-seed, re-run, and fix the prompt until the case passes.

Profile configuration

aegis init writes ~/.config/aegis/profile.yaml from the bundled template. Edit it to personalise fit-scoring and briefing output:

skills: [python, golang, distributed-systems]
target_roles: [backend-engineer, ai-engineer]
experience_years: 3
preferred_locations: [singapore, remote]
deal_breakers: [php, wordpress]
summary: |
  Backend engineer with 3 years of experience building distributed
  systems in Python and Go. Interested in AI / agent platforms.

Set AEGIS_PROFILE_PATH=/some/other/path.yaml to point Aegis at a profile outside the default location.

Tuning

Env var	Default	Purpose
`AEGIS_LLM_MODEL`	`claude-sonnet-4-5`	Anthropic model for every agent
`AEGIS_LLM_MAX_COST_USD_PER_RUN`	`0.50`	Hard cap per gateway instance
`AEGIS_LLM_MAX_COST_USD_PER_DAY`	`5.0`	Aggregate cap across the last 24h (0 disables)
`AEGIS_MAX_REFINEMENTS`	`2`	Max critic ⇄ analyst loops
`AEGIS_PROFILE_PATH`	`~/.config/aegis/profile.yaml`	Where the candidate profile lives
`LOG_LEVEL`	`INFO`	Loguru level — `DEBUG` for verbose tracing
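For reference, the table above maps onto a settings loader along these lines. The variable names and defaults are taken from the table; the `Settings` dataclass and `load_settings` function are hypothetical and may not match how Aegis reads its configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    llm_model: str
    max_cost_per_run_usd: float
    max_cost_per_day_usd: float  # 0 disables the daily cap
    max_refinements: int
    profile_path: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable from env, falling back to the documented default."""
    return Settings(
        llm_model=env.get("AEGIS_LLM_MODEL", "claude-sonnet-4-5"),
        max_cost_per_run_usd=float(env.get("AEGIS_LLM_MAX_COST_USD_PER_RUN", "0.50")),
        max_cost_per_day_usd=float(env.get("AEGIS_LLM_MAX_COST_USD_PER_DAY", "5.0")),
        max_refinements=int(env.get("AEGIS_MAX_REFINEMENTS", "2")),
        profile_path=os.path.expanduser(
            env.get("AEGIS_PROFILE_PATH", "~/.config/aegis/profile.yaml")
        ),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


defaults = load_settings({})
tuned = load_settings({"AEGIS_MAX_REFINEMENTS": "4", "AEGIS_LLM_MAX_COST_USD_PER_DAY": "0"})
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test; `tuned` above allows four refinement rounds and turns the daily cap off.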

Project structure

aegis/
├── src/aegis/
│   ├── agents/          # Researcher, Analyst, Critic + auto-tracing base class
│   ├── workflows/
│   │   └── job_analysis/   # LangGraph wiring + service
│   ├── briefing/        # LangGraph daily-briefing nodes + graph + service
│   ├── jobs/            # Job service facade exposed by API and MCP
│   ├── sources/         # GitHub, HN, internal jobs source for the briefing
│   ├── llm/             # Anthropic gateway (cost-tracked, budget-capped, retrying)
│   ├── tracing/         # Span recorder + repository + HTML viewer
│   ├── db/              # asyncpg engine + repositories
│   ├── api/             # FastAPI routers (jobs, briefing, traces, health)
│   ├── app.py           # FastAPI factory
│   └── mcp_server.py    # FastMCP server (tools, resources, prompts)
├── migrations/          # Numbered SQL migrations
├── evals/               # YAML golden cases per eval_name
├── scripts/
│   ├── seed_evals.py    # YAML → eval_golden_cases
│   └── run_eval.py      # Run a suite, write eval_runs
├── tests/
│   ├── unit/            # Mocked LLM, no infra
│   └── integration/     # testcontainers, real Postgres
└── pyproject.toml

Development

Working on Aegis itself (rather than using it):

git clone https://github.com/nareus/aegis
cd aegis
uv sync                                # install deps + dev tools
aegis init                             # one-time, sets up ~/.config/aegis/
aegis up                               # start Postgres + Redis

uv run python -m aegis.cli serve       # run MCP server from source
uv run pytest                          # full suite, all network mocked
uv run pytest tests/unit/              # fast, no Docker
uv run pytest tests/integration/ -v    # spins up Postgres via testcontainers

uv run python scripts/seed_evals.py    # mirror eval cases to Postgres
uv run python scripts/run_eval.py job_fit_v1

scripts/ and evals/ ship in the repo for regression testing and aren't bundled into the published package.

What's deliberately not here (yet)

So you know what you're signing up for:

No background scheduler — briefings run only when triggered.
No auth on the FastAPI surface — assume loopback / firewalled. Add an API-key middleware before exposing publicly.
No SSRF allow-list in the researcher fetch — fine for trusted local use; restrict before exposing to untrusted JD URLs.
No LangGraph checkpointing — a pod restart mid-workflow loses the run.
No multi-tenancy — single profile, single user, single DB.

These are intentional cuts for a local-first tool.

License

MIT. Fork it, rewire it, ship it.

Release history

0.1.2 (Jun 4, 2026)
0.1.1 (Jun 2, 2026)
0.1.0 (Jun 2, 2026)
