AI multi-agent framework — five agents that read your codebase, write real code files, and run your test suite

NexusForge

An AI multi-agent framework that runs a full software development team on your machine. Five specialized agents — Planner, Developer, Reviewer, QA, and Memory — read your existing codebase, write real code files, run your actual test suite, and coordinate through an asyncio orchestrator. State is stored in plain YAML. No database. No server. No cloud dependency.

What Is NexusForge?

NexusForge turns a backlog of tasks into committed, reviewed, tested code by running five specialized AI agents in a loop. Unlike AI assistants that generate code snippets in a chat window, NexusForge agents read your existing codebase, write files directly to disk, and run your real test suite to validate their work.

Agent	What it actually does
Planner	Scans the project file tree, reads README and key files, selects relevant existing source files, queries past lessons, then writes a concrete plan naming real files to create or modify
Developer	Reads the same project context, generates implementation + tests in a structured format, writes files to disk, saves a manifest of what changed
QA	Reads the written files, then either runs `test_command` (your real test suite) or asks the LLM to validate code quality; the pass/fail result gates the review → done transition
Reviewer	Reads the actual written files from the manifest, reviews real code, writes `.nexus/reviews/<task>.md` with files reviewed listed
Memory	Logs every decision and reflection; surfaces past lessons to Planner before each planning cycle

You create tasks (or let nexus plan do it from a requirements file). Agents do the work. You approve features and deliveries. State survives restarts and crashes because every write is atomic.

How It Is Different

Capability	NexusForge	AI coding assistants	Multi-agent platforms (AutoGen, CrewAI)
Writes real code to disk	Yes — files created/modified on disk	Chat only	No (orchestration only)
Reads existing codebase	Yes — scans tree, reads relevant files	Per-session only	No
Runs real test suite	Yes — configurable `test_command`	No	No
State persistence	YAML files — survive crash and restart	Session memory	Usually in-memory or DB
Human approval gates	Built-in (`nexus approve`)	Ad-hoc	Optional
Audit trail	Every transition, decision, reflection logged	None	Varies
Crash recovery	Tested with `kill -9`; atomic writes, no partial state	None	None
Provider agnostic	OpenAI, Anthropic, any Ollama local model	Usually one vendor	Varies
Fully offline	Yes, with Ollama	No	No
CLI-first	Full `nexus` CLI, scriptable	GUI / IDE plugin	Python API

The core differentiator: NexusForge treats software development as a stateful, recoverable process with real file I/O — not a chat conversation. Code lands on disk, tests run against real files, and the entire history is auditable in YAML.

Requirements

Python 3.11 or 3.12
An API key for at least one LLM provider, or a running Ollama instance for fully local operation

Installation

Option A — pip (standard)

# Core package — add the provider you use
pip install nexusforge-ai                     # core only
pip install "nexusforge-ai[anthropic]"        # + Anthropic Claude
pip install "nexusforge-ai[openai]"           # + OpenAI GPT-4o and variants
pip install "nexusforge-ai[local]"            # + Ollama (any local model)
pip install "nexusforge-ai[all]"              # all three providers

Option B — pipx (isolated, nexus available globally)

pipx install "nexusforge-ai[anthropic]"
# nexus is now on your PATH in an isolated environment

Option C — uv (fastest, project-level)

uv add "nexusforge-ai[anthropic]"
# or globally:
uv tool install "nexusforge-ai[anthropic]"

From source (contributors)

git clone https://github.com/your-org/nexusforge
cd nexusforge
uv sync --all-extras
# Run via: uv run nexus <command>

Verify

nexus version
# nexusforge 0.1.0

Getting Started

All examples below assume nexus is on your PATH, as it is after pip install nexusforge-ai or pipx install nexusforge-ai. If you run NexusForge through uv (from a source checkout, or after uv add inside a project), prefix every command with uv run.

1. Initialize inside your project directory

cd my-project          # your existing or new project
nexus init

NexusForge creates .nexus/ alongside your existing code. It does not touch your project files during init.

nexus doctor

                    NexusForge Doctor
┌──────────────────────┬────────┬───────────────────────┐
│ Check                │ Status │ Detail                │
├──────────────────────┼────────┼───────────────────────┤
│ .nexus/ directory    │ OK     │                       │
│ config.yaml          │ OK     │ phase=1, provider=..  │
│ API key              │ OK     │ ANTHROPIC_API_KEY set │
│ Provider probe       │ OK     │ claude-opus-4-7       │
└──────────────────────┴────────┴───────────────────────┘

2. Configure your test command

Edit .nexus/config.yaml to point at your real test suite:

# Set to your project's test command. Empty = LLM-simulated QA.
test_command: "uv run pytest -q"   # Python/uv
# test_command: "npm test"         # Node.js
# test_command: "cargo test"       # Rust
# test_command: ""                 # no real tests yet

This is the most important config value. When set, QA runs the real tests after each Developer delivery and uses the exit code to gate the review → done transition.

3. Create your plan from requirements

cat > REQUIREMENTS.md << 'EOF'
# Auth System
Users need to log in and register. Passwords must be hashed.
Sessions expire after 30 minutes of inactivity.

# User Profile
Users can update their display name and avatar.
Profile changes must be audited.
EOF

nexus plan REQUIREMENTS.md

Output:

Decomposing REQUIREMENTS.md with claude-opus-4-7...

Plan: 2 features, 5 tasks

▶ Auth System
  Login, registration and session management
  • Implement JWT login and registration endpoints
    ✓ Returns 200 with token on valid credentials
    ✓ Rejects duplicate emails with 409
  • Add password hashing with bcrypt
    ✓ Passwords never stored in plain text
  • Implement 30-minute session expiry
    ✓ Sessions auto-extend on request; expire after 30 min idle

▶ User Profile
  Profile editing with audit trail
  • Add profile update endpoint with validation
    ✓ Display name and avatar URL validated
  • Implement profile change audit log
    ✓ Every change recorded with timestamp and actor

Created 2 features and 5 tasks. Run nexus start to begin.

Preview without writing: nexus plan REQUIREMENTS.md --dry-run

4. Start the orchestrator

nexus start
# Orchestrator started (PID 12345)

Agents work concurrently — up to 3 tasks per agent simultaneously. Watch what happens in another terminal:

nexus logs --follow

[planner.log]   Planner received task NF-1
[planner.log]   Scanned 47 project files, reading 5 relevant files
[planner.log]   LLM for NF-1: Create src/auth/login.py, src/auth/models.py, tests/test_login.py
[developer.log] Developer received task NF-1
[developer.log] Reading 8 context files from existing codebase
[developer.log] Developer wrote 3 file(s) for NF-1: src/auth/login.py, src/auth/models.py, tests/test_login.py
[qa.log]        QA running 'uv run pytest -q' for NF-1
[qa.log]        QA test run for NF-1: PASS (3 tests, 0 failures)
[reviewer.log]  Reviewer reading 3 files for NF-1
[reviewer.log]  Reviewer LLM for NF-1: approved

5. Review and approve

cat .nexus/reviews/NF-1.md

# Review: NF-1

**Verdict:** approved

**Files reviewed:**
- `src/auth/login.py`
- `src/auth/models.py`
- `tests/test_login.py`

**Comments:**
Login endpoint correctly validates credentials against hashed passwords.
Tests cover valid login, invalid password, and unknown user cases.
Session token expiry is tested with a mocked clock.

nexus approve feature NF-F1 --reason "All tests pass, code reviewed"

6. Query what the agents learned

nexus memory query --kind reflection --limit 5

How Code Generation Works

Developer output format

The Developer prompts the LLM with the project context and requires a structured response. Every file is output as:

## FILE: relative/path/to/file.py
```python
# complete file content here
```

NexusForge parses this format, validates each path against the project root (preventing path traversal), and writes files to disk. If an existing file is being modified, the LLM includes the complete updated file; partial patches are not used.

Manifest system

After writing files, the Developer saves a manifest to `.nexus/task_files/<task-id>.yaml`:

task_id: NF-1
files:
  - src/auth/login.py
  - src/auth/models.py
  - tests/test_login.py

QA and Reviewer read this manifest to know exactly which files to examine. Review documents list the reviewed files explicitly, creating a complete audit trail.
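The parse, validate, write, and manifest steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `codeops.py`: the regular expression, the function names (`parse_files`, `safe_target`, `write_files`, `manifest_text`), and the hand-written YAML output are all hypothetical.

```python
import re
from pathlib import Path

FENCE = "`" * 3  # three backticks, built this way so the example embeds cleanly

# "## FILE: <path>" followed by one fenced block, as in the format shown above.
FILE_BLOCK = re.compile(
    rf"^## FILE: (?P<path>[^\n]+)\n{FENCE}[^\n]*\n(?P<body>.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def parse_files(llm_output: str) -> dict[str, str]:
    """Map each relative path to its complete file content."""
    return {m["path"].strip(): m["body"] for m in FILE_BLOCK.finditer(llm_output)}


def safe_target(project_root: Path, relative: str) -> Path:
    """Resolve a path and reject anything that lands outside the project root."""
    root = project_root.resolve()
    target = (root / relative).resolve()
    if not target.is_relative_to(root):
        raise ValueError(f"path escapes project root: {relative}")
    return target


def write_files(project_root: Path, files: dict[str, str]) -> list[str]:
    """Write every parsed file and return the relative paths that were written."""
    written = []
    for relative, body in files.items():
        target = safe_target(project_root, relative)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(body, encoding="utf-8")
        written.append(relative)
    return written


def manifest_text(task_id: str, files: list[str]) -> str:
    """Render the manifest layout shown above (simple paths only, no YAML quoting)."""
    lines = [f"task_id: {task_id}", "files:"] + [f"  - {name}" for name in files]
    return "\n".join(lines) + "\n"
```

One limitation of this sketch: a generated file that itself contains a line consisting only of three backticks would be cut short at that line.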

Project context selection

Before each LLM call, Planner and Developer:

Scan the project tree (skipping .venv, .git, __pycache__, node_modules, build artifacts)
Read key files: README.md, pyproject.toml, Cargo.toml, package.json, etc.
Select the most relevant existing source files by matching keywords from the task title and description
Include up to max_context_files (default: 8) of actual file content in the prompt

This means the LLM sees real code patterns, real naming conventions, and real project structure before generating anything.
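The scan and keyword-matching steps listed above could look something like the sketch below. The scoring rule (count of shared keywords), the extra skipped directories (`build`, `dist`, `.nexus`), and the function names are assumptions for illustration; the real selection logic may differ.

```python
import os
import re
from pathlib import Path

# .venv, .git, __pycache__, node_modules come from the list above; the rest are assumed.
SKIP_DIRS = {".venv", ".git", "__pycache__", "node_modules", "build", "dist", ".nexus"}


def scan_tree(root: Path) -> list[Path]:
    """Return project files as paths relative to root, pruning skipped directories."""
    found: list[Path] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        found.extend(Path(dirpath, name).relative_to(root) for name in sorted(filenames))
    return found


def keywords(text: str) -> set[str]:
    """Lowercased word tokens longer than two characters."""
    return {w for w in re.findall(r"[a-z0-9_]+", text.lower()) if len(w) > 2}


def select_context(root: Path, task_text: str, max_context_files: int = 8) -> list[Path]:
    """Pick the files sharing the most keywords with the task title and description."""
    wanted = keywords(task_text)
    scored: list[tuple[int, Path]] = []
    for rel in scan_tree(root):
        try:
            content = (root / rel).read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        score = len(wanted & (keywords(rel.as_posix()) | keywords(content)))
        if score:
            scored.append((score, rel))
    # Highest score first; ties are broken by path so the result is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1].as_posix()))
    return [rel for _, rel in scored[:max_context_files]]
```

This sketch has no stop-word list, so common words in a task title (such as "the") match many files; a real implementation would likely filter or weight them.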

For Existing Projects

Drop NexusForge into any codebase — it reads before it writes.

cd /path/to/existing-project
nexus init

Edit .nexus/config.yaml to match your project:

model_provider: anthropic
model_name: claude-opus-4-7

# Your actual test command
test_command: "uv run pytest -q"

# How many existing source files to include in agent prompts
max_context_files: 8

Then create a task:

nexus task create --title "Add rate limiting to the /api/login endpoint"
nexus start

The Planner will scan your project, find your existing auth code, read your README and project manifest, and produce a plan that references your actual file paths and follows your existing patterns. The Developer will read those same files and generate code that extends your existing implementation.

What NexusForge reads (never modifies during init):

README.md, pyproject.toml, package.json, Cargo.toml, go.mod, Makefile
Source files matching task keywords (up to max_context_files)
The project file tree (all non-binary, non-venv files)

What NexusForge writes:

Source and test files generated by the Developer (into your project directory)
State files in .nexus/ (owned exclusively by NexusForge)

Configuration

All configuration lives in .nexus/config.yaml.

# Phase of development (1–6, controls feature gating)
phase: 1

# LLM provider: anthropic | openai | local | fake
model_provider: anthropic

# Model name passed to the provider API
model_name: claude-opus-4-7

# Maximum tokens the provider may return per call
max_context_tokens: 200000

# HTTP timeout for each LLM request (seconds)
request_timeout_seconds: 120

# Retry policy for rate limits and transient errors
retry_max_attempts: 3
retry_backoff_base: 2.0      # seconds before first retry
retry_backoff_max: 30.0      # cap on any single backoff delay

# Local provider (Ollama-compatible)
local_base_url: http://localhost:11434

# Test command run by QA after each Developer delivery.
# Empty string = LLM-simulated QA (for projects without a real test suite).
# Examples: "uv run pytest -q" | "npm test" | "cargo test" | "make test"
test_command: ""

# Maximum existing source files included in agent prompts.
# Higher = more context, more tokens. Range: 1–50.
max_context_files: 8
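As a rough illustration of the constraints stated in the comments above (phase 1 to 6, four provider names, `max_context_files` between 1 and 50), the sketch below validates a subset of the settings with a standard-library dataclass. The project layout mentions Pydantic v2 models, so the real validation presumably looks different; the class name `NexusConfig`, the `uses_real_tests` property, and the lower bound on `retry_max_attempts` are assumptions.

```python
from dataclasses import dataclass

PROVIDERS = ("anthropic", "openai", "local", "fake")


@dataclass(frozen=True)
class NexusConfig:
    """A subset of the config.yaml keys shown above, with their documented defaults."""

    phase: int = 1
    model_provider: str = "anthropic"
    model_name: str = "claude-opus-4-7"
    retry_max_attempts: int = 3
    test_command: str = ""
    max_context_files: int = 8

    def __post_init__(self) -> None:
        if self.model_provider not in PROVIDERS:
            raise ValueError(f"model_provider must be one of {PROVIDERS}")
        if not 1 <= self.phase <= 6:
            raise ValueError("phase must be between 1 and 6")
        if not 1 <= self.max_context_files <= 50:
            raise ValueError("max_context_files must be between 1 and 50")
        if self.retry_max_attempts < 1:  # assumed lower bound, not documented
            raise ValueError("retry_max_attempts must be at least 1")

    @property
    def uses_real_tests(self) -> bool:
        """True when QA would run a real command instead of LLM-simulated QA."""
        return bool(self.test_command.strip())
```

For example, `NexusConfig(test_command="uv run pytest -q").uses_real_tests` is `True`, while `NexusConfig(max_context_files=51)` raises `ValueError`.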

Provider options

Anthropic (default)

model_provider: anthropic
model_name: claude-opus-4-7   # or claude-sonnet-4-6

export ANTHROPIC_API_KEY=sk-ant-...

OpenAI

model_provider: openai
model_name: gpt-4o

export OPENAI_API_KEY=sk-...

Local (Ollama — fully offline)

model_provider: local
model_name: llama3.2
local_base_url: http://localhost:11434

ollama serve                 # skip if Ollama already runs as a background service
ollama pull llama3.2         # in a second terminal, once the server is up

OS keyring (optional; keys are kept in the OS credential store rather than in plain-text files)

python -c "import keyring; keyring.set_password('nexusforge', 'ANTHROPIC_API_KEY', 'sk-ant-...')"

Concurrency and Dependencies

Parallel agent execution

Each agent processes up to 3 messages simultaneously. Independent tasks flow through the full pipeline in parallel:

Tasks NF-1, NF-2, NF-3 (no depends_on):

  Planner:   [NF-1][NF-2][NF-3]
  Developer: [NF-1][NF-2][NF-3]   ← three files being written simultaneously
  QA:        [NF-1][NF-2][NF-3]   ← three test runs in parallel
  Reviewer:  [NF-1][NF-2][NF-3]
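One common way to cap an agent at three in-flight messages with asyncio is a semaphore, sketched below. This illustrates the idea only; `run_agent` and the `handler` callback are hypothetical names, and the actual dispatch code in `agents/base.py` may be structured differently.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def run_agent(
    name: str,
    task_ids: list[str],
    handler: Callable[[str, str], Awaitable[str]],
    limit: int = 3,
) -> list[str]:
    """Process every task, but let at most `limit` handlers run at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(task_id: str) -> str:
        async with semaphore:  # waits here while `limit` handlers are already active
            return await handler(name, task_id)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(task_id) for task_id in task_ids))
```

Calling `asyncio.run(run_agent("developer", ["NF-1", "NF-2", "NF-3", "NF-4"], handler))` would start three handlers at once and begin the fourth as soon as one of them finishes.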

Task dependencies

Each task can declare depends_on, a list of task IDs that must reach done before the task is dispatched. nexus plan sets it automatically for sequential requirements; you can also edit it directly in tasks.yaml.

- id: NF-1
  title: Add database schema migration
  depends_on: []

- id: NF-2
  title: Implement user model using new schema
  depends_on: [NF-1]   # waits for NF-1 to finish
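The dispatch rule can be expressed in a few lines. The sketch below assumes a simplified `Task` record and a hypothetical `ready_tasks` helper; it only shows the "all dependencies are done" check, not the orchestrator's real scheduling.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    id: str
    state: str = "defined"
    depends_on: list[str] = field(default_factory=list)


def ready_tasks(tasks: list[Task]) -> list[str]:
    """IDs of tasks still in `defined` whose dependencies have all reached `done`."""
    done = {task.id for task in tasks if task.state == "done"}
    return [
        task.id
        for task in tasks
        if task.state == "defined" and all(dep in done for dep in task.depends_on)
    ]
```

With the two tasks from the YAML above, only NF-1 is ready at first; NF-2 becomes ready once NF-1 is marked done. In this sketch a dependency on an unknown ID never resolves, so such a task would wait indefinitely.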

LLM error handling

All provider errors are caught per task, so a failing LLM call blocks that task instead of crashing the orchestrator.

Error	Behaviour
Rate limit (429)	Exponential backoff with jitter, up to `retry_max_attempts`
Timeout	Same retry policy
Connection error	Same retry policy
Auth failure	Immediate → task `blocked`, no retries
All retries exhausted	Task → `blocked` with reason

nexus blockers
# NF-3 — Implement password reset: LLM unavailable after retries: 503 Service Unavailable

nexus why NF-3
# Task NF-3 — Implement password reset
#   State:    blocked
#   ✗ Blocked: LLM unavailable after retries: 503 Service Unavailable

Fix the provider issue, edit .nexus/tasks.yaml to set the task's state back to defined, then restart the orchestrator.
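The retry policy in the table above can be sketched as capped exponential backoff with jitter. Several details here are assumptions rather than documented behaviour: the jitter is additive (up to 25% on top of the exponential delay), `retry_max_attempts` is treated as the total number of attempts, and the exception classes `RetryableError` and `AuthError` are hypothetical stand-ins for provider errors.

```python
import random
import time
from collections.abc import Callable


class RetryableError(Exception):
    """Stand-in for rate limits, timeouts, and connection errors."""


class AuthError(Exception):
    """Stand-in for an authentication failure."""


def backoff_delay(
    attempt: int,
    base: float = 2.0,   # retry_backoff_base
    cap: float = 30.0,   # retry_backoff_max
    jitter: float = 0.25,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait after failed attempt number `attempt` (1-based)."""
    exponential = base * 2 ** (attempt - 1)
    return min(cap, exponential * (1 + jitter * rng()))


def call_with_retries(
    call: Callable[[], str],
    max_attempts: int = 3,  # retry_max_attempts
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[str, str]:
    """Return ("ok", result) or ("blocked", reason); provider errors never propagate."""
    last: Exception | None = None
    for attempt in range(1, max_attempts + 1):
        try:
            return "ok", call()
        except AuthError as exc:
            return "blocked", f"auth failure: {exc}"  # no retries for auth failures
        except RetryableError as exc:
            last = exc
            if attempt < max_attempts:
                sleep(backoff_delay(attempt))
    return "blocked", f"LLM unavailable after retries: {last}"
```

Returning a status tuple instead of raising mirrors the per-task error handling described above: the caller can move the task to `blocked` with the reason string and carry on with other tasks.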

Full Command Reference

Planning

Command	Description
`nexus plan [FILE] [--dry-run]`	Decompose requirements file into features + tasks

Lifecycle

Command	Description
`nexus init [--force] [--yes]`	Scaffold `.nexus/` state directory
`nexus start`	Start orchestrator and all five agents
`nexus stop`	Graceful shutdown via SIGTERM
`nexus status`	Tasks by state and in-progress list
`nexus version`	Print version and exit
`nexus doctor`	Health check: config, keys, provider

Tasks

Command	Description
`nexus task list [--state STATE] [--agent AGENT]`	List tasks with optional filters
`nexus task show <id>`	Detail view
`nexus task create --title TEXT [--feature ID]`	Create task in `defined` state

Features and Deliveries

Command	Description
`nexus feature list`	Features with rollup state
`nexus feature show <id>`	Feature detail
`nexus feature create --title TEXT [--description TEXT]`	Create feature
`nexus delivery list`	Deliveries and state
`nexus delivery create --title TEXT`	Create delivery

Reviews and Approvals

Command	Description
`nexus reviews [--pending]`	List review files
`nexus approve feature|delivery <id> [--reason TEXT]`	Record approval
`nexus reject feature|delivery <id> --reason TEXT`	Record rejection
`nexus blockers`	Blocked tasks with reasons
`nexus why <task-id>`	Dependency chain and delay explanation

Observability

Command	Description
`nexus logs [--agent NAME] [--tail N] [--follow]`	Stream agent logs
`nexus memory query [--tag T] [--substring S] [--kind K] [--limit N]`	Query memory

State Files

Everything is human-readable and editable. Restart the orchestrator after manual edits.

.nexus/
├── config.yaml          Configuration (provider, model, test_command)
├── tasks.yaml           Task list — state machine owner: orchestrator
├── features.yaml        Feature groups with child task IDs
├── deliveries.yaml      Delivery groups with child feature IDs
├── approvals.yaml       Approval records (feature and delivery)
├── progress.yaml        Append-only state transition log
├── memory.log           Append-only agent decision log
├── reflection.log       Append-only lessons-learned log
├── task_files/          File manifest per task (written by Developer)
│   └── NF-1.yaml        Lists files written for task NF-1
├── logs/                Per-agent structured log files
│   ├── planner.log
│   ├── developer.log
│   ├── qa.log
│   ├── reviewer.log
│   └── memory.log
└── reviews/             Per-task reviewer write-ups
    └── NF-1.md
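The README states that every state write is atomic. A widely used way to get that property for a single file is to write to a temporary file in the same directory and then rename it over the target, as sketched below. This shows the general technique, not the code in `persistence/`; `atomic_write_text` is a hypothetical name, and the advisory locking mentioned in the project layout is not shown.

```python
import contextlib
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Replace `path` with `text` so readers see the old or the new file, never a mix."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before the rename
        os.replace(tmp_name, path)  # atomic rename over the existing file
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)  # do not leave a stray temp file behind
        raise
```

If the process is killed before `os.replace`, the original file is untouched and only a leftover `.tmp` file may remain.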

Task lifecycle

defined → in_progress → review → done
    ↓           ↓
 blocked     blocked

done requires QA pass (real test suite exit 0, or LLM validation if no test_command). All transitions logged to progress.yaml.
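The diagram and the QA rule above amount to a small transition table. The sketch below encodes them as data; the names `ALLOWED`, `IllegalTransition`, and `transition` are hypothetical, and resetting a blocked task to `defined` is left out because the README describes that as a manual edit of tasks.yaml.

```python
ALLOWED: dict[str, set[str]] = {
    "defined": {"in_progress", "blocked"},
    "in_progress": {"review", "blocked"},
    "review": {"done"},
    "blocked": set(),
    "done": set(),
}


class IllegalTransition(ValueError):
    """Raised when a requested state change is not in the lifecycle diagram."""


def transition(current: str, target: str, qa_passed: bool = False) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise IllegalTransition(f"{current} -> {target} is not allowed")
    if target == "done" and not qa_passed:
        raise IllegalTransition("done requires a QA pass")
    return target
```

For instance, `transition("review", "done", qa_passed=True)` returns `"done"`, while `transition("defined", "done")` raises `IllegalTransition`.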

Feature lifecycle

planned → dev_complete → ready_for_test → completed

completed = all tasks done + nexus approve feature.

Delivery lifecycle

planned → in_progress → pending_approval → released

released = all features completed + nexus approve delivery.

Running Offline with a Local Model

ollama serve                 # skip if Ollama already runs as a background service
ollama pull deepseek-coder   # in a second terminal; or llama3.2, codellama, mistral

# .nexus/config.yaml:
model_provider: local
model_name: deepseek-coder
local_base_url: http://localhost:11434

nexus start

Smaller models generally produce lower-quality code and reviews. Of the models listed above, deepseek-coder is the one recommended here for code generation tasks. The provider interface is the same for every backend, so switching models requires no code changes.

Project Layout

NexusForge/
├── src/nexusforge/
│   ├── cli.py              Typer CLI entry point (nexus plan, start, approve, ...)
│   ├── orchestrator.py     asyncio event loop + state machine + dependency dispatch
│   ├── codeops.py          Code I/O: project scan, LLM output parsing, file writing, test runner
│   ├── agents/
│   │   ├── base.py         AgentBase: concurrent dispatch, _complete_or_block, path guard
│   │   ├── planner.py      Reads project context, plans tasks
│   │   ├── developer.py    Writes real code files to disk
│   │   ├── reviewer.py     Reads written files, reviews real code
│   │   ├── qa.py           Runs test_command or LLM validation
│   │   └── memory.py       Logs decisions and reflections, answers queries
│   ├── providers/          LLM backends: OpenAI, Anthropic, Local (Ollama), Fake
│   ├── persistence/        Atomic writes + advisory locking
│   └── models/             Pydantic v2 domain models (Task, Feature, Delivery, ...)
├── tests/
│   ├── unit/               Fast tests using tmp_path + FakeProvider + no real LLM
│   └── integration/        Subprocess-based crash recovery and contention tests
└── .nexus/                 Runtime state (created by nexus init)

Development

uv run ruff format .         # format
uv run ruff check .          # lint
uv run mypy src --strict     # type check
uv run pytest -q             # tests (85% coverage gate)
uv run bandit -r src/ -q     # security scan
uv run pip-audit             # dependency audit

CI matrix: Linux / macOS / Windows × Python 3.11 / 3.12. See .github/workflows/ci.yml.

Exit Codes

Code	Meaning
`0`	Success
`1`	Runtime error — config, provider, persistence
`2`	User error — not found, illegal transition, bad input
`130`	Interrupted by Ctrl+C
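Because the CLI is scriptable, a wrapper script can branch on these codes. The sketch below is one way to do that; `classify_exit` and `run_nexus` are hypothetical helpers, and only the code-to-meaning mapping comes from the table above.

```python
import subprocess

EXIT_MEANINGS = {
    0: "success",
    1: "runtime error",
    2: "user error",
    130: "interrupted by Ctrl+C",
}


def classify_exit(returncode: int) -> str:
    """Translate a nexus exit code into the meaning documented above."""
    return EXIT_MEANINGS.get(returncode, f"unexpected exit code {returncode}")


def run_nexus(*args: str) -> str:
    """Run `nexus <args>` and report how it ended (requires nexus on PATH)."""
    completed = subprocess.run(["nexus", *args], check=False)
    return classify_exit(completed.returncode)
```

A CI job could, for example, treat `"user error"` as a configuration problem to fix and retry only on `"runtime error"`.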

License

NexusForge is licensed under the Business Source License 1.1 (BUSL-1.1).

Free use: Non-production use — personal projects, evaluation, research, and development — is permitted without restriction.

Commercial production use requires a commercial license. Contact [YOUR_EMAIL] for pricing.

Converts to open source: On 2030-05-12, this license automatically converts to the Apache License 2.0, a permissive open source license.

See LICENSE for the full terms. See mariadb.com/bsl11 for the BSL 1.1 specification.

This version

0.1.0

May 13, 2026
