Multi-model autonomous coding engine. Local + Cloud. 24/7.
ForgeGod
The coding agent that runs 24/7, learns from its mistakes, and costs $0 when you want it to.
19 built-in tools • 5 LLM providers • 5-tier memory • 24/7 autonomous • $0 local mode
ForgeGod orchestrates multiple LLMs (OpenAI, Anthropic, Google Gemini, Ollama, OpenRouter) into a single autonomous coding engine. It routes tasks to the right model, runs 24/7 from a PRD, learns from every outcome, and self-improves its own strategy. Run it locally for $0 with Ollama, or use cloud models when you need them.
pip install forgegod
What Makes ForgeGod Different
Most coding CLIs use one model at a time and start each session with little or no memory of the last one. ForgeGod doesn't.
| Capability | Claude Code | Codex CLI | Aider | Cursor | ForgeGod |
|---|---|---|---|---|---|
| Multi-model auto-routing | - | - | manual | - | yes |
| Local + cloud hybrid | - | basic | basic | - | native |
| 24/7 autonomous loops | - | - | - | - | yes |
| Cross-session memory | basic | - | - | removed | 5-tier |
| Self-improving strategy | - | - | - | - | yes (SICA) |
| Cost-aware budget modes | - | - | - | - | yes |
| Reflexion code generation | - | - | - | - | 3-attempt |
| Parallel git worktrees | subagents | - | - | - | yes |
The Moat: Harness > Model
Scaffolding adds ~11 points on SWE-bench — harness engineering matters as much as the model. ForgeGod is the harness:
- Ralph Loop — 24/7 coding from a PRD. Progress lives in git, not LLM context. Fresh agent per story. No context rot.
- 5-Tier Memory — Episodic (what happened) + Semantic (what I know) + Procedural (how I do things) + Graph (how things connect) + Error-Solutions (what fixes what). Memories decay, consolidate, and reinforce automatically.
- Reflexion Coder — 3-attempt code gen with escalating models: local (free) → cloud (cheap) → frontier (when it matters). AST validation at every step.
- SICA — Self-Improving Coding Agent. Modifies its own prompts, model routing, and strategy based on outcomes. 6 safety layers prevent drift.
- Budget Modes: normal → throttle → local-only → halt. Auto-triggered by spend. Run forever on Ollama for $0.
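As a rough illustration of the Reflexion Coder idea above, here is a minimal sketch of three attempts with model escalation and an AST check after each one. The tier names, the `generate` callback, and the feedback string are assumptions made for this example and do not describe ForgeGod's internal API.

```python
import ast
from typing import Callable, Optional, Tuple

# Hypothetical tier names mirroring "local -> cloud -> frontier" in the text.
TIERS = ["local", "cloud", "frontier"]


def is_valid_python(source: str) -> bool:
    """Return True if the source parses as Python (the AST validation step)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def reflexion_generate(
    task: str,
    generate: Callable[[str, str, str], str],
) -> Optional[Tuple[str, str]]:
    """Try each tier once, passing the previous failure back as feedback.

    `generate(tier, task, feedback)` stands in for an LLM call.
    Returns (tier, code) for the first attempt that parses, else None.
    """
    feedback = ""
    for tier in TIERS:
        code = generate(tier, task, feedback)
        if is_valid_python(code):
            return tier, code
        feedback = f"Attempt on the {tier} tier did not parse; fix the syntax."
    return None


def fake_generate(tier: str, task: str, feedback: str) -> str:
    # Toy generator: the local tier emits a syntax error, higher tiers do not.
    if tier == "local":
        return "def health(:\n    return 'ok'"
    return "def health():\n    return 'ok'"


result = reflexion_generate("add health()", fake_generate)
print(result[0] if result else "all attempts failed")
```

A real harness would also run tests and lint before accepting an attempt; the sketch stops at syntax validation to keep the escalation logic visible.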
Getting Started (No Coding Required)
You don't need to be a developer to use ForgeGod. If you can describe what you want in plain English, ForgeGod writes the code.
Option A: Free Local Mode ($0)
- Install Ollama: https://ollama.com/download
- Pull a model: ollama pull qwen3.5:9b
- Install ForgeGod: pip install forgegod
- Run: forgegod init (interactive wizard guides you)
- Try it: forgegod run "Create a simple website with a contact form"
Option B: Cloud Mode (faster, ~$0.01/task)
- Get an OpenAI key: https://platform.openai.com/api-keys
- Install ForgeGod: pip install forgegod
- Run: forgegod init, then paste your key when prompted
- Try it: forgegod run "Build a REST API with user authentication"
Something not working?
Run forgegod doctor — it checks your setup and tells you exactly what to fix.
Quickstart
# Install
pip install forgegod
# Initialize a project
forgegod init
# Single task
forgegod run "Add a /health endpoint to server.py with uptime and version info"
# Plan a project → generates PRD
forgegod plan "Build a REST API for a todo app with auth, CRUD, and tests"
# 24/7 autonomous loop from PRD
forgegod loop --prd .forgegod/prd.json
# Caveman mode — 50-75% token savings with ultra-terse prompts
forgegod run --terse "Add a /health endpoint"
# Check what it learned
forgegod memory
# View cost breakdown
forgegod cost
# Benchmark your models
forgegod benchmark
# Health check
forgegod doctor
Zero-Config Start
ForgeGod auto-detects your environment on first run:
- Finds API keys in env vars (OPENAI_API_KEY, ANTHROPIC_API_KEY)
- Checks if Ollama is running locally
- Detects your project language, test framework, and linter
- Picks the best model for each role based on what's available
- Creates .forgegod/config.toml with sensible defaults
No manual setup required. Just run forgegod init and go.
How the Ralph Loop Works
┌─────────────────────────────────────────────────┐
│                   RALPH LOOP                    │
│                                                 │
│  ┌──────┐   ┌───────┐   ┌─────────┐   ┌─────┐   │
│  │ READ │──▶│ SPAWN │──▶│ EXECUTE │──▶│ VAL │   │
│  │ PRD  │   │ AGENT │   │  STORY  │   │IDATE│   │
│  └──────┘   └───────┘   └─────────┘   └──┬──┘   │
│     ▲                                    │      │
│     │         ┌────────┐    ┌────────┐   │      │
│     └─────────│ROTATE  │◀───│COMMIT  │◀──┘      │
│               │CONTEXT │    │OR RETRY│  pass    │
│               └────────┘    └────────┘          │
│                                                 │
│  Progress is in GIT, not LLM context.           │
│  Fresh agent per story. No context rot.         │
│  Create .forgegod/KILLSWITCH to stop.           │
└─────────────────────────────────────────────────┘
- Read PRD — Pick highest-priority TODO story
- Spawn agent — Fresh context (progress is in git, not memory)
- Execute — Agent uses 19 tools to implement the story
- Validate — Tests, lint, syntax, frontier review
- Commit or retry — Pass: commit + mark done. Fail: retry up to 3x with model escalation
- Rotate — Next story. Context is always fresh.
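The steps above can be sketched as a small control loop. This is an illustrative outline under stated assumptions, not ForgeGod's loop.py: the `Story` fields, the "lower number means higher priority" rule, and the `run_story` callback (which stands in for spawning a fresh agent, validating, and committing) are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List, Optional

MAX_ATTEMPTS = 3  # "retry up to 3x" from the steps above


@dataclass
class Story:
    id: str
    priority: int         # assumption: lower number = higher priority
    status: str = "todo"  # "todo" | "done" | "failed"


def next_story(stories: List[Story]) -> Optional[Story]:
    """Pick the highest-priority story that is still TODO."""
    todo = [s for s in stories if s.status == "todo"]
    return min(todo, key=lambda s: s.priority) if todo else None


def ralph_loop(
    stories: List[Story],
    run_story: Callable[[Story, int], bool],
    killswitch: Path,
) -> List[str]:
    """Work through stories until none are left or the killswitch file exists.

    `run_story(story, attempt)` returns True when validation passes;
    the attempt number lets the caller escalate to a stronger model.
    """
    log = []
    while not killswitch.exists():
        story = next_story(stories)
        if story is None:
            break
        for attempt in range(1, MAX_ATTEMPTS + 1):
            if run_story(story, attempt):
                story.status = "done"
                break
        else:
            # All attempts failed; mark it so the loop moves on.
            story.status = "failed"
        log.append(f"{story.id}:{story.status}")
    return log


stories = [Story("api", 2), Story("auth", 1)]
# Toy runner: "auth" passes on its second attempt, "api" never passes.
print(ralph_loop(stories, lambda s, n: s.id == "auth" and n >= 2,
                 Path(".forgegod/KILLSWITCH")))
```

Because no state is carried between iterations except the story list, each story starts from a clean slate, which is the property the loop relies on to avoid context rot.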
5-Tier Memory System
ForgeGod splits memory into five tiers, each with its own storage strategy and retention policy:
| Tier | What | How | Retention |
|---|---|---|---|
| Episodic | What happened per task | Full outcome records | 90 days |
| Semantic | Extracted principles | Confidence + decay + reinforcement | Indefinite |
| Procedural | Code patterns & fix recipes | Success rate tracking | Indefinite |
| Graph | Entity relationships + causal edges | Auto-extracted from outcomes | Indefinite |
| Error-Solution | Error pattern → fix mapping | Fuzzy match lookup | Indefinite |
Memories decay without reinforcement (30-day half-life), consolidate automatically (merge similar, prune weak), and are injected into every prompt as a Memory Spine ranked by relevance + recency + importance.
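To make the 30-day half-life concrete, the sketch below computes a decayed strength and uses it to rank memories. Only the half-life comes from the text above; the `Memory` fields and the scoring rule (relevance multiplied by decayed confidence) are assumptions chosen for illustration, and the actual ranking formula may differ.

```python
from dataclasses import dataclass
from typing import List

HALF_LIFE_DAYS = 30.0  # from the text: 30-day half-life


@dataclass
class Memory:
    text: str
    confidence: float  # 0..1, assumed stored importance
    age_days: float    # days since the memory was last reinforced
    relevance: float   # 0..1, assumed similarity to the current task


def decayed_strength(confidence: float, age_days: float) -> float:
    """Exponential decay: the strength halves every HALF_LIFE_DAYS."""
    return confidence * 0.5 ** (age_days / HALF_LIFE_DAYS)


def spine_score(m: Memory) -> float:
    # Assumed scoring rule combining relevance, recency, and importance.
    return m.relevance * decayed_strength(m.confidence, m.age_days)


def memory_spine(memories: List[Memory], top_k: int = 3) -> List[Memory]:
    """Return the top_k memories to inject into the prompt."""
    return sorted(memories, key=spine_score, reverse=True)[:top_k]


memories = [
    Memory("pytest needs -x for fast fail", 0.9, 60.0, 0.8),
    Memory("use pathlib over os.path", 0.6, 2.0, 0.8),
]
for m in memory_spine(memories):
    print(f"{spine_score(m):.3f}  {m.text}")
```

Under this rule a memory at full confidence drops to 0.5 after 30 days and 0.25 after 60, so an older high-confidence memory can rank below a fresher, weaker one unless it is reinforced.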
# Check memory health
forgegod memory
# Memory is stored in .forgegod/memory.db (SQLite)
# Global learnings in ~/.forgegod/memory.db (cross-project)
Budget Modes
| Mode | Behavior | Trigger |
|---|---|---|
| normal | Use all configured models | Default |
| throttle | Prefer local, cloud for review only | 80% of daily limit |
| local-only | Ollama only, $0 operation | Manual or 95% limit |
| halt | Stop all LLM calls | 100% of daily limit |
# Check spend
forgegod cost
# Override mode
export FORGEGOD_BUDGET_MODE=local-only
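The trigger thresholds in the table can be read as a simple function of daily spend. The following is a minimal sketch assuming the mode depends only on the fraction of the daily limit used plus an optional manual override; the function name and signature are hypothetical.

```python
from typing import Optional


def budget_mode(
    spent_usd: float,
    daily_limit_usd: float,
    override: Optional[str] = None,
) -> str:
    """Map today's spend to a budget mode using the table's thresholds."""
    if override:
        # A manual override (e.g. FORGEGOD_BUDGET_MODE) wins over spend.
        return override
    if daily_limit_usd <= 0:
        raise ValueError("daily_limit_usd must be positive")
    used = spent_usd / daily_limit_usd
    if used >= 1.00:
        return "halt"
    if used >= 0.95:
        return "local-only"
    if used >= 0.80:
        return "throttle"
    return "normal"


for spent in (1.00, 4.00, 4.75, 5.00):
    print(f"${spent:.2f} of $5.00 -> {budget_mode(spent, 5.00)}")
```

With the default daily_limit_usd of 5.00, throttling would start at $4.00 of spend and cloud calls would stop entirely at $4.75.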
Caveman Mode (--terse)
Ultra-terse prompts that reduce token usage by 50-75% with little to no accuracy loss on coding tasks. The approach draws on recent research:
- Mini-SWE-Agent — 100 lines, >74% SWE-bench Verified
- Chain of Draft — 7.6% tokens, same accuracy
- CCoT — 48.7% shorter, negligible impact
# Add --terse to any command
forgegod run --terse "Build a REST API"
forgegod loop --terse --prd .forgegod/prd.json
forgegod plan --terse "Refactor auth module"
# Or enable globally in config
# .forgegod/config.toml
# [terse]
# enabled = true
Caveman mode compresses system prompts (~200 → ~80 tokens), tool descriptions (3-8 words each), and tool output (tracebacks → last frame only). JSON schemas for planner/reviewer stay byte-identical.
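The "tracebacks → last frame only" compression can be illustrated with a few lines of string handling. This is a sketch of the idea for standard Python tracebacks, not the code in terse.py; the function name is hypothetical.

```python
def last_frame_only(traceback_text: str) -> str:
    """Keep the final stack frame and the exception line of a traceback.

    Text without any 'File "..."' frame lines is returned unchanged
    (apart from surrounding whitespace).
    """
    lines = traceback_text.strip().splitlines()
    frame_starts = [
        i for i, line in enumerate(lines) if line.lstrip().startswith('File "')
    ]
    if not frame_starts:
        return traceback_text.strip()
    # Everything from the last frame header onward, with indentation removed.
    return "\n".join(line.strip() for line in lines[frame_starts[-1]:])


SAMPLE = """Traceback (most recent call last):
  File "app.py", line 10, in <module>
    main()
  File "app.py", line 6, in main
    return 1 / 0
ZeroDivisionError: division by zero
"""

print(last_frame_only(SAMPLE))
```

For the sample, the six-line traceback shrinks to three lines: the frame in main, the failing statement, and the exception message, which is usually what a model needs to attempt a fix.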
Configuration
ForgeGod uses TOML config with 3-level priority: env vars > project > global.
# .forgegod/config.toml
[models]
planner = "openai:gpt-4o-mini" # Cheap planning
coder = "ollama:qwen3-coder-next" # Free local coding
reviewer = "openai:o4-mini" # Quality gate
sentinel = "openai:gpt-4o" # Frontier sampling
escalation = "openai:gpt-4o" # Fallback for hard problems
[budget]
daily_limit_usd = 5.00
mode = "normal"
[loop]
max_iterations = 100
parallel_workers = 2
gutter_detection = true
[ollama]
host = "http://localhost:11434"
model = "qwen3-coder-next"
[terse]
enabled = false # --terse flag or set true here
[security]
sandbox_mode = "standard" # permissive | standard | strict
redact_secrets = true
audit_commands = true
Environment Variables
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..." # Optional
export OPENROUTER_API_KEY="sk-or-..." # Optional
export GOOGLE_API_KEY="AIza..." # Optional (Gemini)
export FORGEGOD_BUDGET_DAILY_LIMIT_USD=10
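The 3-level priority (env vars > project > global) amounts to a layered merge. The sketch below assumes that environment variables follow a FORGEGOD_SECTION_KEY pattern, inferred from FORGEGOD_BUDGET_DAILY_LIMIT_USD and FORGEGOD_BUDGET_MODE; the function names are hypothetical and values read from the environment are left as strings.

```python
from typing import Any, Dict, Iterable, Mapping

ENV_PREFIX = "FORGEGOD_"
Config = Dict[str, Dict[str, Any]]


def merge(base: Mapping[str, Mapping[str, Any]],
          override: Mapping[str, Mapping[str, Any]]) -> Config:
    """Section-wise merge; keys in `override` win over keys in `base`."""
    out: Config = {section: dict(values) for section, values in base.items()}
    for section, values in override.items():
        out.setdefault(section, {}).update(values)
    return out


def env_overrides(environ: Mapping[str, str],
                  sections: Iterable[str]) -> Config:
    """Turn FORGEGOD_<SECTION>_<KEY>=value into {section: {key: value}}."""
    sections = list(sections)
    found: Config = {}
    for name, value in environ.items():
        if not name.startswith(ENV_PREFIX):
            continue
        rest = name[len(ENV_PREFIX):].lower()
        for section in sections:
            if rest.startswith(section + "_"):
                found.setdefault(section, {})[rest[len(section) + 1:]] = value
    return found


def resolve_config(global_cfg: Config, project_cfg: Config,
                   environ: Mapping[str, str]) -> Config:
    """Apply the priority order: env vars > project > global."""
    merged = merge(global_cfg, project_cfg)
    return merge(merged, env_overrides(environ, merged.keys()))


global_cfg = {"budget": {"daily_limit_usd": 5.0, "mode": "normal"}}
project_cfg = {"budget": {"daily_limit_usd": 2.0}}
env = {"FORGEGOD_BUDGET_MODE": "local-only"}  # pass os.environ in practice
print(resolve_config(global_cfg, project_cfg, env))
```

In practice the two dictionaries would come from parsing ~/.forgegod and .forgegod/config.toml with a TOML reader such as the standard library's tomllib (Python 3.11+), and string values from the environment would need type conversion.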
Supported Models
| Provider | Models | Cost | Setup |
|---|---|---|---|
| Ollama | qwen3-coder-next, devstral, any | $0 | ollama serve |
| OpenAI | gpt-4o, gpt-4o-mini, o3, o4-mini | $$ | OPENAI_API_KEY |
| Anthropic | claude-sonnet-4-6, claude-opus-4-6 | $$$ | ANTHROPIC_API_KEY |
| Google Gemini | gemini-2.5-pro, gemini-3-flash | $$ | GOOGLE_API_KEY |
| OpenRouter | 200+ models | varies | OPENROUTER_API_KEY |
Model Leaderboard
Run your own: forgegod benchmark
| Model | Composite | Correctness | Quality | Speed | Cost | Self-Repair |
|---|---|---|---|---|---|---|
| openai:gpt-4o-mini | 81.5 | 10/12 | 7.4 | 12s avg | $0.08 | 4/4 |
| ollama:qwen3.5:9b | 72.3 | 8/12 | 6.8 | 45s avg | $0.00 | 3/4 |
Run forgegod benchmark --update-readme to refresh with your own results.
Architecture
forgegod/
├── cli.py # Typer CLI (init, run, loop, plan, review, cost, memory, status, benchmark, doctor)
├── config.py # TOML config + env vars + 3-level priority
├── router.py # Multi-provider LLM router + circuit breaker + Thompson sampling
├── agent.py # Core agent loop (tools + context compression + sub-agents)
├── coder.py # Reflexion code generation (3 attempts, model escalation, GOAP)
├── loop.py # Ralph loop (24/7 autonomous coding from PRD)
├── planner.py # Task decomposition → PRD
├── reviewer.py # Frontier model quality gate (sample-based)
├── sica.py # Self-improving strategy modification (6 safety layers)
├── memory.py # 5-tier cognitive memory (episodic/semantic/procedural/graph/errors)
├── budget.py # SQLite cost tracking + auto budget modes
├── worktree.py # Parallel git worktree workers
├── tui.py # Rich terminal dashboard
├── terse.py # Caveman mode: terse prompts, tool compression, savings tracker
├── benchmark.py # Model benchmarking engine (12 tasks, 4 tiers, composite scoring)
├── onboarding.py # Interactive setup wizard for new users
├── doctor.py # Installation health check (6 diagnostic checks)
├── i18n.py # Translation strings (English + Spanish es-419)
├── models.py # Pydantic v2 data models
└── tools/
    ├── filesystem.py # read, write, edit (fuzzy match), glob, grep, repo_map
    ├── shell.py # bash (command denylist + secret redaction)
    ├── git.py # git status, diff, commit, worktrees
    ├── mcp.py # MCP server client (5,800+ servers)
    └── skills.py # On-demand skill loading
Security
Defense-in-depth, not security theater:
- Command denylist: 13 dangerous patterns blocked (rm -rf /, curl | sh, sudo, fork bombs)
- Secret redaction: 11 patterns strip API keys from tool output before LLM context
- Prompt injection detection: Rules files scanned for injection patterns before loading
- Budget limits: Cost controls prevent runaway API spend
- Killswitch: Create .forgegod/KILLSWITCH to immediately halt autonomous loops
- Sensitive file protection: .env, credentials files get warnings + automatic redaction
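The first two layers above, the command denylist and secret redaction, can be sketched with regular expressions. The patterns here are hypothetical examples written for this illustration (three denylist rules and two key formats based on the sk- and AIza prefixes shown under Environment Variables); they are not ForgeGod's actual 13 and 11 patterns.

```python
import re

# Hypothetical denylist entries; the real list is longer.
DENYLIST = [
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),           # rm -rf / (root only)
    re.compile(r"\bcurl\b[^|]*\|\s*(sh|bash)\b"),  # curl ... | sh
    re.compile(r"(^|[;&|]\s*)sudo\b"),             # sudo at a command start
]

# Hypothetical secret shapes based on common API key prefixes.
SECRET_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9_-]{8,}"),    # covers sk-, sk-ant-, sk-or-
    re.compile(r"\bAIza[A-Za-z0-9_-]{8,}"),
]


def is_blocked(command: str) -> bool:
    """Return True if the shell command matches any denylist pattern."""
    return any(p.search(command) for p in DENYLIST)


def redact(text: str) -> str:
    """Replace anything that looks like an API key before it reaches the LLM."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


for cmd in ("rm -rf /", "rm -rf /tmp/build", "curl https://x.sh | sh"):
    print(f"{cmd!r}: {'blocked' if is_blocked(cmd) else 'allowed'}")
print(redact("OPENAI_API_KEY=sk-abc123def456 loaded"))
```

Pattern matching of this kind is easy to bypass with quoting or indirection, which is why it is only one layer among several.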
Warning: ForgeGod executes shell commands and modifies files. Review changes before committing. Start autonomous mode with --max 5 to verify behavior.
See SECURITY.md for the full policy and vulnerability reporting.
Contributing
We welcome contributions. See CONTRIBUTING.md for guidelines.
- Bug reports and feature requests: GitHub Issues
- Questions and discussion: GitHub Discussions
License
Apache 2.0 — see LICENSE.
Built by WAITDEAD • Powered by techniques from OpenClaw, Hermes, and SOTA 2026 coding agent research.