Experience memory layer for coding agents — learn from mistakes, inject proven fixes

Project Forge

An experience memory layer for coding agents.

What is Forge?

Forge is an experience memory layer that sits between your coding agent sessions. It is not an orchestrator, not a harness — it is the long-term memory that agents lack.

The problem: LLM coding agents (Claude Code, Cursor, etc.) start every session from scratch. They repeat the same mistakes, forget yesterday's fixes, and can't learn from past failures.

Forge solves this by:

Remembering — automatically captures failures, decisions, and rules from every session
Learning — uses reinforcement learning (Q-values) to measure which experiences actually help
Injecting — feeds the most useful experiences into the next session's context, ranked by proven effectiveness

It runs as Claude Code hooks — zero manual intervention after setup.

Session Start               Mid-session                 Session End
  forge resume                 forge detect                forge writeback
  ↓                            ↓                           ↓
  Load Q-ranked patterns       Match stderr to DB          Parse transcript
  ↓                            ↓                           ↓
  Inject into context          Warn: "Seen this before,    Update Q-values
  "Last time this failed,       try: use async with"       Record experiment
   here's the fix (Q:0.8)"                                 Auto-promote

Installation

Step 1: Install

# pip
pip install git+https://github.com/gksl5355/Project-Forge.git

# or uv (faster)
uv tool install git+https://github.com/gksl5355/Project-Forge.git

Important: forge must be on your system PATH. Virtual-env-only installs will not work with hooks.
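A quick way to confirm the PATH requirement before running setup is sketched below. It uses only the standard library; the helper name `on_path` is made up for this example and is not part of Forge.

```python
import shutil


def on_path(command: str) -> bool:
    """Return True if `command` resolves to an executable on the current PATH."""
    return shutil.which(command) is not None


if not on_path("forge"):
    print("forge is not on PATH; hooks will not be able to call it")
```

If this prints the warning from inside an activated virtual environment's parent shell, the install is probably virtual-env-only and the hooks will likely fail.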

Step 2: Setup

forge setup

This one command:

Creates the experience database (~/.forge/forge.db)
Installs learning hooks (session start/end/failure detection)
Installs guard hooks (secret detection, --no-verify blocking)
Installs team skills (spawn-team, doctor, debate, ralph)
Patches ~/.claude/settings.json (append-only, creates backup)

=== Forge Setup ===

Hooks & Settings:
  + hooks.SessionStart: resume.sh
  + hooks.SessionEnd: writeback.sh
  + hooks.PostToolUse: detect.sh
  = env.AGENT_TEAMS = 1 (ok)           ← existing value preserved
  ! env.SOME_KEY = X (recommends: Y)   ← conflict shown, not overwritten

Skills:
  + ~/.claude/skills/spawn-team/

Proceed? [Y/n]:

Run forge setup -y to skip the confirmation prompt.
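The append-only behaviour shown in the setup preview can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the function name `plan_patch`, the flat dotted keys, and the report format only imitate the sample output and are not taken from Forge's source.

```python
import copy


def plan_patch(existing: dict, desired: dict) -> tuple[dict, list[str]]:
    """Merge `desired` into a copy of `existing` without overwriting any value."""
    merged = copy.deepcopy(existing)
    report = []
    for key, value in desired.items():
        if key not in merged:
            merged[key] = value  # new key: append it
            report.append(f"+ {key} = {value}")
        elif merged[key] == value:
            report.append(f"= {key} = {value} (ok)")  # already as desired
        else:
            # Conflict: keep the user's value and only report the recommendation.
            report.append(f"! {key} = {merged[key]} (recommends: {value})")
    return merged, report


existing = {"env.AGENT_TEAMS": "1", "env.SOME_KEY": "X"}
desired = {
    "env.AGENT_TEAMS": "1",
    "env.SOME_KEY": "Y",
    "hooks.SessionStart": "resume.sh",
}
merged, report = plan_patch(existing, desired)
for line in report:
    print(line)
```

Working on a deep copy keeps the original settings untouched until the user confirms, which is one way to get the "creates backup" guarantee cheaply.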

Step 3: Done

Start coding. Forge learns automatically from every session.

# Check your Forge Score after a few sessions
forge score

# View with full breakdown
forge score --detail

For developers (editable install)

git clone https://github.com/gksl5355/Project-Forge.git
cd Project-Forge
pip install -e ".[dev]"     # or: make dev
forge setup

Features

Automatic Experience Learning

Every session goes through a learn → remember → inject cycle:

Phase	Hook	What happens
Start	`forge resume`	Loads top experiences by Q-value, injects into agent context
During	`forge detect`	Matches stderr/failures against known patterns, warns in real-time
End	`forge writeback`	Parses transcript, extracts new failures, updates Q-values

No manual intervention needed. Forge gets smarter with every session.
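To make the mid-session step more concrete, here is a minimal sketch of matching stderr against stored patterns. It assumes a case-insensitive substring match and a simple `Experience` record; both are illustrative choices, and the real matching rule in Forge may well differ. The sample patterns are made up.

```python
from dataclasses import dataclass


@dataclass
class Experience:
    pattern: str  # text expected to appear in stderr (assumed matching rule)
    hint: str     # suggested fix shown to the agent
    q: float      # current Q-value


def detect(stderr: str, known: list[Experience]) -> list[Experience]:
    """Return known experiences whose pattern occurs in stderr, best Q first."""
    text = stderr.lower()
    hits = [e for e in known if e.pattern.lower() in text]
    return sorted(hits, key=lambda e: e.q, reverse=True)


# Made-up sample data for illustration only.
known = [
    Experience("unclosed client session", "use async with", 0.8),
    Experience("modulenotfounderror", "install the dev extras", 0.4),
]
for hit in detect("ResourceWarning: Unclosed client session <...>", known):
    print(f"[WARN] {hit.pattern} Q:{hit.q:.2f} → {hit.hint}")
```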

Q-Value Learning (MemRL)

Based on MemRL — each experience has a Q-value that measures how useful it actually is:

Q ← Q + α(reward - Q)

reward = 1.0  →  Warning helped (failure was avoided)
reward = 0.0  →  Warning ignored (same error repeated)

Time decay: Q *= (1 - 0.005)^days_since_last_used

High-Q experiences get shown first. Low-Q ones fade away. The system self-corrects.
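The update rule and time decay above translate directly into a few lines of Python. This is a sketch rather than Forge's actual code: the function names are invented here, `alpha=0.1` is the default listed under Configuration, `0.005` is the decay rate from the formula, and the starting value of 0.5 is an assumption.

```python
def update_q(q: float, reward: float, alpha: float = 0.1) -> float:
    """EMA update: move Q a fraction alpha of the way toward the reward."""
    return q + alpha * (reward - q)


def decay_q(q: float, days_since_last_used: float, rate: float = 0.005) -> float:
    """Shrink Q by (1 - rate) for every day the experience went unused."""
    return q * (1.0 - rate) ** days_since_last_used


q = 0.5                       # assumed starting value
q = update_q(q, reward=1.0)   # warning helped: Q rises toward 1
q = update_q(q, reward=0.0)   # warning ignored: Q falls toward 0
decayed = decay_q(q, days_since_last_used=30)
print(round(q, 3), round(decayed, 3))
```

Because each update only moves Q by a fraction `alpha`, a single lucky or unlucky session cannot swing the ranking much; an experience has to help repeatedly to reach a high Q.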

Forge Score

One number that tells you how well Forge is working:

forge score
# === Forge Score (workspace: default) ===
#
#   Forge Score:     0.68 / 1.00
#
#   Learning effect (QWHR):  0.72
#   Context hit rate:        0.65
#   Token efficiency:        0.58
#   Patterns: 47 | Sessions: 23

forge score --detail     # full breakdown with routing, circuit breaker, etc.

The score combines 8 internal metrics, with weights tuned through parameter sweeps. You don't need to know the formula; just watch the number go up.

Smart Context Injection

Forge doesn't dump all experiences at once. It ranks them using:

Q-value — proven effectiveness
Recency — recent failures weighted higher (configurable decay)
Relevance — tag overlap with current session

The top experiences are formatted in a token-efficient format and injected at session start.
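One possible way to combine the three signals and respect a token budget is sketched below. The weights, the exponential recency term, the Jaccard tag overlap, and the rough "4 characters per token" estimate are all assumptions chosen for illustration; only the `max_tokens` default of 3000 comes from the configuration shown later in this README.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    q: float
    days_old: float
    tags: frozenset[str]


def score(c: Candidate, session_tags: frozenset[str],
          w_q: float = 0.6, w_recency: float = 0.2, w_rel: float = 0.2,
          decay: float = 0.05) -> float:
    """Weighted sum of Q-value, recency, and tag relevance (assumed weights)."""
    recency = (1.0 - decay) ** c.days_old
    union = c.tags | session_tags
    relevance = len(c.tags & session_tags) / len(union) if union else 0.0
    return w_q * c.q + w_recency * recency + w_rel * relevance


def select(cands: list[Candidate], session_tags: frozenset[str],
           max_tokens: int = 3000) -> list[Candidate]:
    """Greedily take the best-scoring candidates that still fit the budget."""
    chosen, used = [], 0
    for c in sorted(cands, key=lambda c: score(c, session_tags), reverse=True):
        cost = max(1, len(c.text) // 4)  # crude token estimate
        if used + cost > max_tokens:
            continue
        chosen.append(c)
        used += cost
    return chosen


a = Candidate("use async with for aiohttp sessions", 0.8, 1, frozenset({"python", "async"}))
b = Candidate("pin numpy below 2.0 for this project", 0.3, 40, frozenset({"python", "deps"}))
tags = frozenset({"python", "async"})
print([c.text for c in select([b, a], tags)])
```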

Adaptive Warning Formats

Forge automatically tests different warning formats (A/B testing) and converges on whichever format actually helps your agent more:

Essential: [WARN] pattern → hint (minimal tokens)
Annotated: [WARN] pattern Q:0.75 → hint (balanced)
Concise: [WARN] pattern Q:0.75 → hint_short (default)
Detailed: Full stats with seen/helped counts
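The four formats could be rendered along these lines. This is a hedged sketch: the function name, the truncation rule used for `hint_short`, and the exact layout of the detailed format are assumptions, since the README only shows the general shape of each variant.

```python
def format_warning(pattern: str, hint: str, q: float, seen: int, helped: int,
                   style: str = "concise", short_len: int = 40) -> str:
    """Render one warning in the requested style (layout details are assumed)."""
    if style == "essential":
        return f"[WARN] {pattern} → {hint}"
    if style == "annotated":
        return f"[WARN] {pattern} Q:{q:.2f} → {hint}"
    if style == "concise":
        # Assumed definition of hint_short: truncate long hints.
        short = hint if len(hint) <= short_len else hint[: short_len - 1] + "…"
        return f"[WARN] {pattern} Q:{q:.2f} → {short}"
    if style == "detailed":
        return f"[WARN] {pattern} Q:{q:.2f} seen:{seen} helped:{helped} → {hint}"
    raise ValueError(f"unknown style: {style}")


for style in ("essential", "annotated", "concise", "detailed"):
    print(format_warning("unclosed client session", "use async with", 0.75, 4, 3, style))
```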

Guard Hooks

Protective hooks that prevent common agent failure modes:

Hook	What it does
`block-no-verify.sh`	Blocks `--no-verify` — prevents bypassing pre-commit hooks
`guard-secrets.sh`	Detects API keys, tokens, private keys in writes
`suggest-compact.sh`	Suggests `/compact` after 50+ tool calls
`cost-tracker.sh`	Logs session metrics for efficiency tracking

Circuit Breaker

Automatically detects when a session is stuck in a failure loop:

Tracks consecutive failures and total tool calls per session
Trips when limits are exceeded (configurable)
Resets on success
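The three rules above fit in a small class. The sketch below assumes that "resets on success" means the consecutive-failure counter returns to zero, and the two limits are placeholder values rather than Forge's defaults.

```python
from dataclasses import dataclass


@dataclass
class CircuitBreaker:
    max_consecutive_failures: int = 3  # assumed limit
    max_tool_calls: int = 200          # assumed limit
    consecutive_failures: int = 0
    tool_calls: int = 0

    @property
    def tripped(self) -> bool:
        return (self.consecutive_failures >= self.max_consecutive_failures
                or self.tool_calls >= self.max_tool_calls)

    def record(self, success: bool) -> bool:
        """Record one tool call and report whether the breaker is now tripped."""
        self.tool_calls += 1
        if success:
            self.consecutive_failures = 0  # reset on success
        else:
            self.consecutive_failures += 1
        return self.tripped


cb = CircuitBreaker()
for outcome in (False, False, True, False, False, False):
    if cb.record(outcome):
        print(f"stuck after {cb.tool_calls} tool calls")
```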

Model Routing

Learns which model works best for each task type:

quick tasks  → claude-haiku-4-5      (fast, cheap)
standard     → claude-sonnet-4-6     (balanced)
deep tasks   → claude-opus-4-6       (thorough)
review       → claude-sonnet-4-6     (good at review)

Routing accuracy improves as more session data accumulates.
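A simple way to picture routing that improves with data is a success-rate table layered on top of the defaults listed above. The `Router` class, the `min_samples` threshold, and the success-rate criterion are assumptions for illustration; the README does not describe the actual learning rule.

```python
from collections import defaultdict

# Defaults taken from the table above.
DEFAULT_ROUTES = {
    "quick": "claude-haiku-4-5",
    "standard": "claude-sonnet-4-6",
    "deep": "claude-opus-4-6",
    "review": "claude-sonnet-4-6",
}


class Router:
    def __init__(self, min_samples: int = 5):
        self.min_samples = min_samples
        # (task_type, model) -> [successes, runs]
        self.stats = defaultdict(lambda: [0, 0])

    def record(self, task_type: str, model: str, success: bool) -> None:
        entry = self.stats[(task_type, model)]
        entry[0] += int(success)
        entry[1] += 1

    def route(self, task_type: str) -> str:
        """Pick the model with the best observed success rate, else the default."""
        best = DEFAULT_ROUTES.get(task_type, DEFAULT_ROUTES["standard"])
        best_rate = -1.0
        for (seen_type, model), (ok, runs) in self.stats.items():
            if seen_type == task_type and runs >= self.min_samples and ok / runs > best_rate:
                best, best_rate = model, ok / runs
        return best


router = Router(min_samples=2)
print(router.route("quick"))  # no data yet, so the default is used
```

Requiring a minimum number of runs before overriding a default keeps one unusual session from changing the route.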

Team Orchestration Support

Forge integrates with /spawn-team to learn from multi-agent runs:

forge recommend --complexity MEDIUM
# → sonnet:2+haiku:1 (3 runs, success: 85%, confidence: medium)

forge resume --team-brief
# → Recent team failures + recommended config

Global Promotion

When a pattern appears in 2+ projects, Forge automatically promotes it to a global experience that benefits all workspaces.
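The "2+ projects" rule amounts to counting distinct projects per pattern. A minimal sketch, assuming sightings are available as `(project, pattern)` pairs (a made-up shape, not Forge's schema):

```python
from collections import defaultdict


def patterns_to_promote(sightings: list[tuple[str, str]],
                        min_projects: int = 2) -> list[str]:
    """Return patterns seen in at least `min_projects` distinct projects."""
    projects_by_pattern = defaultdict(set)
    for project, pattern in sightings:
        projects_by_pattern[pattern].add(project)
    return sorted(
        pattern
        for pattern, projects in projects_by_pattern.items()
        if len(projects) >= min_projects
    )


# Made-up sample data for illustration only.
sightings = [
    ("api-server", "unclosed client session"),
    ("api-server", "unclosed client session"),  # same project, counted once
    ("scraper", "unclosed client session"),
    ("scraper", "missing migration"),
]
print(patterns_to_promote(sightings))
```

Using a set per pattern means repeated failures inside one project never trigger promotion on their own.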

Commands

Everyday

Command	Description
`forge setup`	Full setup (DB + hooks + skills + settings)
`forge score`	View your Forge Score
`forge score --detail`	Full score breakdown
`forge config`	View/change settings
`forge stats`	Workspace statistics

Data Management

Command	Description
`forge record failure`	Record a failure pattern with hint
`forge record decision`	Record a decision with rationale
`forge record rule`	Record a project rule (block/warn/log)
`forge list`	List experiences by type
`forge detail PATTERN`	Detailed view of a pattern
`forge search -t TAG`	Search by tag
`forge edit`	Edit existing records

Analysis

Command	Description
`forge trend`	Fitness trend over time
`forge recommend`	Team config recommendation
`forge decay`	Apply time decay to stale patterns
`forge promote ID`	Promote to global or knowledge
`forge ingest`	Ingest team run data
`forge dedup`	Merge duplicate patterns

Hooks (automatic, not manually called)

Command	Trigger	Description
`forge resume`	SessionStart	Inject experiences into context
`forge detect`	PostToolUse	Real-time failure matching
`forge writeback`	SessionEnd	Learn from session transcript

Configuration

forge config                    # view basic settings
forge config --set alpha=0.15   # change a setting
forge config --advanced         # view all parameters (40+)

Basic settings (~/.forge/config.yml):

max_tokens: 3000          # max context tokens for injection
l0_max_entries: 50         # max patterns to show
llm_model: claude-haiku-4-5-20251001
alpha: 0.1                 # EMA learning rate
routing_enabled: true      # model routing on/off
circuit_breaker_enabled: true

All settings are optional. Defaults are pre-optimized.

Architecture

forge/
├── cli.py              # Typer CLI (all commands)
├── config.py           # ForgeConfig + YAML loading
├── engines/            # Core engines
│   ├── resume.py       # Session start: context injection
│   ├── detect.py       # Mid-session: failure matching
│   ├── writeback.py    # Session end: learning
│   ├── fitness.py      # Forge Score computation
│   ├── routing.py      # Model routing
│   ├── prompt_optimizer.py  # A/B testing, hint scoring
│   ├── sweep.py        # Parameter optimization
│   └── ...
├── core/               # Core logic
│   ├── qvalue.py       # Q-value EMA updates
│   ├── context.py      # L0/L1 context formatting
│   ├── circuit_breaker.py
│   └── ...
├── storage/            # SQLite storage
│   ├── db.py           # Schema, connections
│   ├── models.py       # Dataclass models
│   └── queries.py      # Raw SQL queries
├── hooks/              # Shell hook templates
└── skills/             # Bundled SKILL.md files

Data flow:

Agent session
  ↓ SessionStart hook
forge resume → DB query → context injection
  ↓ PostToolUse hook (on failure)
forge detect → pattern match → real-time warning
  ↓ SessionEnd hook
forge writeback → transcript parse → Q update → experiment record

What Gets Installed

Component	Location	Purpose
Experience DB	`~/.forge/forge.db`	SQLite — failures, decisions, rules, experiments
Learning hooks	`~/.forge/hooks/*.sh`	Session start/end/failure detection
Guard hooks	`~/.forge/hooks/*.sh`	Secret guard, no-verify block, compact suggest
Team skills	`~/.claude/skills/`	spawn-team, doctor, debate, ralph
Settings patch	`~/.claude/settings.json`	Hook registration (append-only, backup created)
Config	`~/.forge/config.yml`	Optional overrides (created on first `forge config --set`)

Metrics

Metric	Value
Tests	1,203 (all passing)
Source modules	40
Test files	42
Lines of code	~8,900
DB schema	v5
External dependencies	2 (typer, pyyaml)
Python	3.12+

Tech Stack

Python 3.12+ — runtime
SQLite — built-in DB, zero config, no external server
Typer — CLI framework
PyYAML — config parsing

Acknowledgements

MemRL — EMA-based Q-value learning. Core insight: Q measures "hint usefulness", not "failure severity"
OpenViking (ByteDance) — L0/L1/L2 layered context loading for token efficiency
Claude Code — Hook system that makes automatic learning possible

License

MIT

This version

1.2.0

Mar 24, 2026

Download files

Source Distribution

forge_memory-1.2.0.tar.gz (202.2 kB)

Uploaded Mar 24, 2026 Source

Built Distribution

forge_memory-1.2.0-py3-none-any.whl (120.0 kB)

Uploaded Mar 24, 2026 Python 3

File details

Details for the file forge_memory-1.2.0.tar.gz.

File metadata

Download URL: forge_memory-1.2.0.tar.gz
Upload date: Mar 24, 2026
Size: 202.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for forge_memory-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`2ded0f4ec401cf75383d4d63c28bc62ea8d9083957f72647001cbc65c7360067`
MD5	`427204496724fbbc286da00741c07ced`
BLAKE2b-256	`bb3ddf83be537146f6c65b68ab84b0bb24ee3aae24bdd160684b3afd0c5d14c3`

File details

Details for the file forge_memory-1.2.0-py3-none-any.whl.

File metadata

Download URL: forge_memory-1.2.0-py3-none-any.whl
Upload date: Mar 24, 2026
Size: 120.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for forge_memory-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6a95a3fbf8d2f4c83080a3b9beefbd0dfe511accb63b48e6cd28b3d6f75fb2a6`
MD5	`3389fa2fa1b950803fb0c42735c27027`
BLAKE2b-256	`d7076fa5aba8a5c2ad1a1c02bc9cc53ea06108c4fdbaae5c13d308ead99045ea`
