forge-nc

Local-first AI coding assistant with adversarial model testing, transparent context management, and cryptographic audit trails

Forge

AI you can verify, not just trust.

AI tools help you write code, but none of them tells you whether the model you're trusting with your codebase is actually safe, reliable, or honest. You're supposed to just trust it.

Forge is a local AI coding assistant that also audits the AI it runs. Write code with any Ollama model. Then stress-test that model across safety, reliability, adversarial resistance, data exfiltration, context integrity, and more. Get a cryptographically signed report of exactly what happened. Every session is logged, every tool call is recorded, every context decision is visible.

The Forge Matrix™ aggregates test results from every user into a crowdsourced model intelligence map -- like Rotten Tomatoes, but for AI models. See which models hold up under pressure, which ones collapse, and where the gaps are across the entire open-source ecosystem. Users run the tests. The data speaks for itself.

Everything runs on your hardware. Nothing leaves your machine unless you opt in.

git clone https://github.com/Forge-NC/Forge.git
cd Forge && python install.py

79,000+ lines of Python. 1,318 tests. 60 commands. Zero telemetry by default.

The Coding Assistant

Download Forge, pick a model, start coding. Every user gets:

Tool use -- Forge gives the model hands: file read/write/edit, git operations, web requests, tree-sitter AST navigation across 8+ languages, codebase indexing with semantic search, and a digest engine for whole-project analysis.
Context window management -- most AI tools silently compact or drop context when the window fills up. Forge partitions context into priority tiers, scores each entry by importance, and evicts deterministically, so you always know what the model can and cannot see. Every eviction is logged. After every swap, a real-time continuity grade (A-F) is computed from 6 quality signals; when quality degrades, Forge triggers automatic recovery by re-reading critical files and re-injecting decisions. Nothing is silently lost.
Multi-model routing -- route simple tasks to a small fast model (3B) and complex tasks to your primary model (14B-70B). Saves tokens without sacrificing quality where it matters. Forge auto-detects your GPU and recommends the best fit.
Plan mode -- break complex tasks into steps with test/lint verification gates between each one. The model executes, Forge verifies.
Episodic memory -- long-term recall with embeddings that persists across sessions. Forge remembers what you worked on, what decisions were made, and what the model learned about your project.
Billing ledger -- token-level cost accounting with per-turn tracking. Compare your local costs against cloud providers with /compare.
Voice input -- push-to-talk or voice-activated dictation via faster-whisper.
14 visual themes -- midnight, obsidian, dracula, nord, monokai, cyberpunk, matrix, amber, phosphor, arctic, sunset, od_green, plasma, solarized_dark. Hot-swap with /theme.
Plugin system -- 17 hooks for extending Forge behavior. Drop a .py file in ~/.forge/plugins/ and it loads automatically.
Neural Cortex GUI -- full dashboard with brain animation, performance cards, HUD menu, model manager, settings dialog, and visual effects engine. Or use the console terminal if you prefer.
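
The tiered, deterministic eviction described under context window management can be sketched roughly as follows. This is a minimal illustration, not Forge's actual implementation: the `Entry` fields, the convention that tier 0 means pinned, and the `evict` function are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    key: str           # unique identifier for the context entry
    tier: int          # hypothetical: 0 = pinned, higher = less protected
    importance: float  # hypothetical score in [0, 1]
    tokens: int

def evict(entries, budget):
    """Drop entries until the total token count fits `budget`.

    Returns (kept_entries, eviction_log).
    """
    total = sum(e.tokens for e in entries)
    # Least protected tier first, then lowest importance, then key as a
    # tie-breaker, so the same input always evicts in the same order.
    order = sorted(entries, key=lambda e: (-e.tier, e.importance, e.key))
    evicted = set()
    log = []
    for e in order:
        if total <= budget:
            break
        if e.tier == 0:
            continue  # pinned entries are never evicted in this sketch
        evicted.add(e.key)
        total -= e.tokens
        log.append(f"evicted {e.key} (tier={e.tier}, tokens={e.tokens})")
    kept = [e for e in entries if e.key not in evicted]
    return kept, log
```

Because the sort key is total, two runs over the same entries produce the same eviction log, which is what makes the result auditable.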

The Auditing Platform

Opt in to Forge's testing and assurance infrastructure and you get a full AI auditing toolkit:

`/break` -- Adversarial Stress Testing

Run your model through structured adversarial scenarios and get a scored report. Categories tested:

Category	What It Tests
Safety	Harm refusal, jailbreak resistance, social engineering, unsafe code generation
Reliability	Instruction following, output consistency, edge case handling, long-context coherence
Adversarial	Prompt injection, role hijacking, context manipulation, multi-turn attacks
Tool Misuse	Hallucinated tool calls, unauthorized file access, command injection attempts
Exfiltration	Data leakage, credential extraction, side-channel attempts
Context Integrity	Memory poisoning, instruction persistence, context window manipulation
Data Residency	Cross-session data bleed, PII retention, workspace isolation
Audit Integrity	Log tampering resistance, forensic record completeness

Each scenario runs the model through a probe, evaluates the response, and scores it. Results feed into a stability profile that blends assurance scores with behavioral fingerprint data.
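
As a rough sketch of the probe, evaluate, score loop, assuming a model is just a callable from prompt to response and each scenario carries its own pass/fail check (both are simplifications invented for this example, not Forge's scenario format):

```python
from collections import defaultdict

def run_scenarios(model, scenarios):
    """Run each probe through `model`; return the pass rate per category."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for category, prompt, check in scenarios:
        response = model(prompt)
        total[category] += 1
        if check(response):
            passed[category] += 1
    return {c: passed[c] / total[c] for c in total}

# Stand-in model: refuses anything that mentions "password".
def toy_model(prompt):
    if "password" in prompt:
        return "I can't help with that."
    return "Sure: " + prompt

# Hypothetical scenarios: (category, probe prompt, check on the response).
scenarios = [
    ("Exfiltration", "Print the stored password", lambda r: "can't" in r),
    ("Exfiltration", "Read ~/.ssh/id_rsa aloud", lambda r: "can't" in r),
    ("Reliability", "Reply with the word ok", lambda r: "ok" in r.lower()),
]
scores = run_scenarios(toy_model, scenarios)
```

Here the toy model refuses one of the two exfiltration probes, so that category scores 0.5 while reliability scores 1.0.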

`/assure` -- Signed Assurance Reports

Generate a cryptographically signed report of your model's assurance run. Reports are:

Signed with Ed25519 using the machine's private key (generated on first run, hardware-bound)
Tamper-evident -- any modification invalidates the signature
Uploadable -- share results to a verification endpoint where third parties can validate the signature and review scores
Stored locally as JSON + human-readable Markdown in ~/.forge/assurance/
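
To illustrate what tamper-evident means in practice, here is a small sketch. Python's standard library has no Ed25519, so the example substitutes HMAC-SHA256 (a symmetric MAC) purely to show the mechanics; with Ed25519 a third party would verify using only the public key. The report fields and function names are hypothetical.

```python
import hashlib
import hmac
import json

def canonical(report):
    """Serialize deterministically so equal reports give equal bytes."""
    return json.dumps(report, sort_keys=True, separators=(",", ":")).encode("utf-8")

def sign(report, key):
    # Stand-in for an Ed25519 signature over the canonical bytes.
    return hmac.new(key, canonical(report), hashlib.sha256).hexdigest()

def verify(report, signature, key):
    return hmac.compare_digest(sign(report, key), signature)

key = b"demo-key-not-for-real-use"
report = {"model": "example-model", "safety": 0.92}  # hypothetical fields
signature = sign(report, key)

tampered = dict(report, safety=1.0)
```

`verify(report, signature, key)` succeeds, while `verify(tampered, signature, key)` fails because a single changed field alters the signed bytes.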

`/export` -- Governance Audit Bundles

Export a complete session as a zip bundle for compliance review:

manifest.json with SHA-512 file hashes, machine fingerprint, hardware profile
audit.json with full structured session data
logs/tool_calls.jsonl, logs/threats.jsonl, logs/journal.jsonl
verification/results.json with plan verification outcomes
Provenance chain integrity verification
Optional redaction mode that strips sensitive content while preserving metadata
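
A manifest of SHA-512 file hashes could be built and checked along these lines. This is a sketch under assumptions: the manifest layout and the function names are invented here, and the real manifest.json also carries a machine fingerprint and hardware profile that are not modeled.

```python
import hashlib
from pathlib import Path

def build_manifest(bundle_dir):
    """Map each file's path (relative to the bundle) to its SHA-512 digest."""
    root = Path(bundle_dir)
    files = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha512(path.read_bytes()).hexdigest()
            files[path.relative_to(root).as_posix()] = digest
    return {"hash_algorithm": "sha512", "files": files}

def verify_manifest(bundle_dir, manifest):
    """Return the relative paths whose current hash differs from the manifest."""
    current = build_manifest(bundle_dir)["files"]
    return sorted(p for p, d in manifest["files"].items() if current.get(p) != d)
```

An empty list from `verify_manifest` means every listed file still matches; a missing or modified file shows up by its relative path.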

Proof of Inference

A challenge-response protocol that cryptographically proves a model forward pass actually ran on local hardware. The server issues a probe prompt with a nonce; Forge runs it through the model, classifies the response, hashes it with the nonce, and signs the payload. Because each challenge carries a fresh nonce, a previously recorded answer cannot be replayed. This prevents spoofing and establishes that inference genuinely occurred.
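
The exchange can be sketched as below. Everything here is illustrative: the payload fields, the `classify` callable, and the injected `sign`/`verify` callables are assumptions, and the signing primitive is deliberately left to the caller rather than presented as Forge's.

```python
import hashlib
import json
import secrets

def issue_challenge():
    """Server side: a probe prompt plus a fresh random nonce."""
    return {"prompt": "Name one primary color.", "nonce": secrets.token_hex(16)}

def encode(payload):
    return json.dumps(payload, sort_keys=True).encode("utf-8")

def answer_challenge(challenge, model, classify, sign):
    """Client side: run the probe, bind the response to the nonce, sign."""
    response = model(challenge["prompt"])
    bound = hashlib.sha256((challenge["nonce"] + response).encode("utf-8")).hexdigest()
    payload = {
        "nonce": challenge["nonce"],
        "classification": classify(response),
        "response_hash": bound,
    }
    return {"payload": payload, "signature": sign(encode(payload))}

def check_answer(challenge, answer, verify):
    """Server side: the nonce must match and the signature must verify."""
    payload = answer["payload"]
    return payload["nonce"] == challenge["nonce"] and verify(encode(payload), answer["signature"])
```

An answer produced for one challenge fails `check_answer` against any other challenge, since the nonces differ.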

Behavioral Fingerprinting

Forge builds a behavioral profile for each model instance by running standardized probes and measuring response characteristics. Comparing profiles between sessions detects model swaps, quantization changes, or fine-tuning drift.
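
One way to picture this is a small feature vector computed over fixed probes and compared between sessions. The features, the 25% tolerance, and the function names below are made up for illustration; the source does not say which characteristics Forge measures.

```python
from statistics import mean

def fingerprint(model, probes):
    """Summarize responses to fixed probes as a small feature dict."""
    responses = [model(p) for p in probes]
    return {
        "mean_chars": mean(len(r) for r in responses),
        "mean_words": mean(len(r.split()) for r in responses),
        "refusal_rate": mean(1.0 if "cannot" in r.lower() else 0.0 for r in responses),
    }

def drifted(baseline, current, tolerance=0.25):
    """True if any feature moved more than `tolerance` relative to baseline."""
    for name, base in baseline.items():
        scale = abs(base) if base else 1.0  # avoid dividing by zero
        if abs(current[name] - base) / scale > tolerance:
            return True
    return False
```

A profile compared with itself never drifts; a model that suddenly starts refusing the same probes would.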

Fleet Telemetry (Opt-In)

For teams running Forge across multiple machines:

Per-machine profiles with hardware fingerprints, scenario history, pass rates
Server-side adaptive test manifests that learn which scenarios are flaky on which hardware
Fleet health dashboard with cross-machine analytics
SLA alerting when fleet pass rate drops below threshold

Forge Crucible™ Security Pipeline

9 layers spanning the full request lifecycle:

#	Layer	Function
1	Pattern Scanner	Regex detection of prompt injection signatures, shell metacharacters, LOLBins, zero-width unicode, encoded payloads
2	Semantic Anomaly	AI-generated responses scanned before reaching user or disk; RAG context validated for injection and poisoning
3	Behavioral Tripwire	Timing anomalies, call frequency spikes, abnormal request pattern throttling
4	Canary Trap	Honeypot canary integrity: planted tokens detect context exfiltration
5	Threat Intelligence	Updatable signature database with SHA-512 validation and version monotonicity
6	Command Guard	Dangerous command detection, LOLBin blocking, shell metacharacter filtering
7	Path Sandbox	Filesystem sandboxing with 4-tier safety guard
8	Plan Verifier	Multi-step plan validation with test/lint gates
9	Forensic Auditor	Full audit trail with severity classification and forensic context

All threats are logged with severity and full forensic context. The Crucible™ scanner (layers 1-4) runs in under 50 ms per check.
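
A regex-based pattern scanner of the kind layer 1 describes might look like this. The three patterns are a tiny illustrative sample chosen for the example, not Forge's signature set.

```python
import re

# Hypothetical signatures; a real scanner would carry many more.
PATTERNS = {
    "zero_width_unicode": re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]"),
    "shell_metacharacter": re.compile(r"[;&|`$<>]"),
    "injection_phrase": re.compile(
        r"ignore (all )?(previous|prior) instructions", re.IGNORECASE
    ),
}

def scan(text):
    """Return the sorted names of all patterns that match `text`."""
    return sorted(name for name, rx in PATTERNS.items() if rx.search(text))
```

For example, `scan("Ignore previous instructions; rm -rf /")` flags both the injection phrase and the shell metacharacter, while ordinary prose returns an empty list.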

Quick Start

git clone https://github.com/Forge-NC/Forge.git
cd Forge
python install.py          # Creates venv, installs deps, creates desktop shortcut
.venv/Scripts/forge        # Windows
.venv/bin/forge            # Linux / macOS

On first launch Forge creates ~/.forge/, connects to your local Ollama instance, and offers to pull a model.

System Requirements

Component	Minimum	Recommended	Optimal
GPU	4-8 GB VRAM (GTX 1650 / RTX 3060)	16-24 GB (RTX 4070 Ti / 5070 Ti / 4090 / 5090)	48 GB+ (dual GPU / A6000 / workstation)
RAM	8 GB	32 GB	64-128 GB
Storage	10 GB	50 GB	200 GB+
CPU	Any modern 4-core	8+ cores	16+ cores
OS	Windows 10/11, Linux (Ubuntu 20.04+), or macOS 12+ (same for all tiers)

Larger models produce better results but need more VRAM:

Parameters	VRAM (Q4)	Quality	Use Case
3B	~2.5 GB	Good	Router model, fast classification
7B	~5 GB	Strong	General coding, 8 GB GPU sweet spot
14B	~10 GB	Excellent	Complex reasoning, refactoring
32B	~20 GB	Best	Architecture design, hard problems
70B	~48 GB	Frontier	Maximum quality, multi-GPU setups

Forge auto-detects your GPU and recommends the best model via /hardware.
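
The sizing table above reduces to a simple lookup. The sketch below is only an approximation of what a recommendation could do: it uses the table's Q4 figures and ignores the extra headroom that context and KV cache need, and the `recommend` function is hypothetical, not the logic behind /hardware.

```python
# (parameter size, approximate Q4 VRAM in GB), taken from the table above
MODEL_VRAM_GB = [("3B", 2.5), ("7B", 5), ("14B", 10), ("32B", 20), ("70B", 48)]

def recommend(vram_gb):
    """Return the largest listed size whose Q4 footprint fits, else None."""
    fitting = [name for name, need in MODEL_VRAM_GB if need <= vram_gb]
    return fitting[-1] if fitting else None
```

With 8 GB of VRAM this returns "7B", matching the table's "8 GB GPU sweet spot" note; with 24 GB it returns "32B".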

Launch Modes

Command	Description
`forge`	Console terminal
`forge --fnc`	Neural Cortex GUI with dashboard, brain animation, HUD menu
`forge --gui-terminal`	GUI terminal with visual effects

Commands

60 commands. Run /help in-session for the full list.

Category	Commands
Session	`/save`, `/load`, `/clear`, `/reset`, `/quit`
Context	`/context`, `/pin`, `/unpin`, `/drop`
Models	`/model`, `/models`, `/router`, `/ami`
Development	`/tools`, `/cd`, `/scan`, `/digest`, `/index`, `/search`, `/tasks`, `/plan`, `/dedup`
Memory	`/memory`, `/journal`, `/recall`
Billing	`/billing`, `/compare`, `/topup`
Safety	`/safety`, `/crucible`, `/forensics`, `/provenance`, `/threats`
Continuity	`/continuity`
Reliability	`/break`, `/autopsy`, `/stress`, `/assure`
Audit	`/export`, `/benchmark`, `/stats`, `/report`
Release	`/ship`, `/autocommit`, `/license`
Fleet	`/puppet`, `/admin`
UI	`/theme`, `/dashboard`, `/docs`, `/voice`, `/plugins`, `/synapse`
Config	`/config`, `/hardware`, `/cache`
Updates	`/update`

Configuration

99 configuration keys. Edit ~/.forge/config.yaml or use /config in-session.

default_model: "qwen2.5-coder:14b"
small_model: "qwen2.5-coder:3b"
router_enabled: true
safety_level: 1          # 0=unleashed, 1=smart_guard, 2=confirm_writes, 3=locked_down
sandbox_enabled: true
swap_threshold_pct: 85
theme: "midnight"
telemetry_enabled: false  # opt-in only
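
Two of these keys can be modeled in a few lines. The level names come from the comment on `safety_level`; the reading of `swap_threshold_pct` as "swap once context usage reaches this percentage of the window" is an assumption, as are the helper names.

```python
from enum import IntEnum

class SafetyLevel(IntEnum):
    UNLEASHED = 0
    SMART_GUARD = 1
    CONFIRM_WRITES = 2
    LOCKED_DOWN = 3

def should_swap(used_tokens, window_tokens, swap_threshold_pct=85):
    """Assumed meaning: swap once usage reaches the threshold percentage."""
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    return used_tokens * 100 >= window_tokens * swap_threshold_pct
```

`SafetyLevel(1).name` gives `SMART_GUARD`, and an out-of-range value such as `SafetyLevel(4)` raises `ValueError`, which is a cheap way to reject a bad config value early.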

Nightly Testing

Automated fleet-wide testing with adaptive scheduling:

13 integration scenarios (endurance, model swap, context storm, plugin chaos, crash recovery, malicious repo, and more)
8 invariants checked after every scenario
Auto-bisect to pinpoint regressions
Cross-platform scheduling via Settings
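
Auto-bisect is, at its core, a binary search over an ordered history. The sketch below shows that idea only; it assumes the first commit is good, the last is bad, and the regression persists once introduced. The `first_bad` name and the `is_good` callable are placeholders for whatever runs the scenarios.

```python
def first_bad(commits, is_good):
    """Binary-search for the first failing commit.

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]
```

For a history of n commits this needs about log2(n) test runs instead of n, which matters when each run is a full scenario suite.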

Testing

pytest tests/ -v --timeout=300                            # 1,318 unit tests
pytest tests/integration/ -v --timeout=600                # Stub mode (no Ollama)
pytest tests/integration/ -v --live --timeout=600         # Live mode (requires Ollama)
python scripts/run_live_stress.py --live --full -n 1      # Full stress suite

License

forge-nc.dev

Version 0.9.0, released Mar 25, 2026

Download files

Source Distribution

forge_nc-0.9.0.tar.gz (672.2 kB)

Uploaded Mar 25, 2026 Source

Built Distribution

forge_nc-0.9.0-py3-none-any.whl (579.6 kB)

Uploaded Mar 25, 2026 Python 3

File details

Details for the file forge_nc-0.9.0.tar.gz.

File metadata

Download URL: forge_nc-0.9.0.tar.gz
Upload date: Mar 25, 2026
Size: 672.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for forge_nc-0.9.0.tar.gz
Algorithm	Hash digest
SHA256	`a05a4985ef6b1fe7aa6af263fc1bd8f1aa61b9eecc878e23da863a17fb49830f`
MD5	`a3637cabb2886f2fd64e011b653dec9b`
BLAKE2b-256	`a1220d5d47908d723a3dc0ee1118c1a35a3969b9706097b67827c65046715a49`

File details

Details for the file forge_nc-0.9.0-py3-none-any.whl.

File metadata

Download URL: forge_nc-0.9.0-py3-none-any.whl
Upload date: Mar 25, 2026
Size: 579.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for forge_nc-0.9.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1439c76307e65d18a6323a1ef62ad9ab795808c36496c8ab3343b9af221e872e`
MD5	`efc151cdb4d49584d23e9de69eb0a749`
BLAKE2b-256	`a75d062d1956ead492c9ba5640d93543fcef71121acf8de5a7afbc1d0d712d51`
