A self-improving code reasoning engine with persistent semantic memory
Project description
Neo
A self-improving code reasoning engine that learns from experience using persistent semantic memory. Neo uses multi-agent reasoning to analyze code, generate solutions, and continuously improve through feedback loops.
- Fact-Based Memory: Learns from every solution attempt using a scoped, supersession-based fact store
- Semantic Retrieval: Vector search finds relevant facts via Jina Code embeddings
- Code-First Generation: No diff parsing failures
- Local Storage: Privacy-first JSON storage in ~/.neo/facts/ directory
- Model-Agnostic: Works with any LM provider
- Three integration surfaces — each on equal footing:
- Run as an Agent (CAR / A2A) — host Neo as an Agent2Agent v1.0 endpoint other agents (or orchestrators) can call directly. Real inference path, not a CLI wrapper.
- Claude Code Plugin — six slash commands + a specialized agent inside Anthropic's Claude Code CLI.
- Codex Plugin — same six skills, packaged for OpenAI Codex CLI.
Why Neo? Why Care?
If you've been Vibe Coding, then Vibe Planning, then Context Engineering, and on and on, you have likely hit walls where the models are both powerful and limited, brilliant and incompetent, wise and ignorant, humble yet overconfident.
Worse, your speedy AI Code Assistant sometimes goes rogue and overwrites key code in a project, or writes redundant code even after just reading documentation and the source code, or violates your project's patterns and design philosophy.... It can be infuriating. Why doesn't the model remember? Why doesn't it learn? Why can't it keep the context of the code patterns and tech stack? ... -> This is what Neo is designed to solve.
Neo is the missing context layer for AI Code Assistants. It learns from every solution attempt, using vector embeddings to retrieve relevant patterns for new problems. It then applies the learned patterns to generate solutions, and continuously improves through feedback loops.
Table of Contents
- Design Philosophy
- How It Works
- Quick Start
- Run as an Agent (CAR / A2A)
- Claude Code Plugin
- Codex Plugin
- Works Alongside Your AI Tools
- Installation
- Usage
- Architecture
- Performance
- Configuration
- LM Adapters
- Extending Neo
- Key Features
- Development
- Research & References
- License
- Contributing
- Changelog
Design Philosophy
Fact-Based Learning: Neo builds a semantic memory of facts — constraints, architectural decisions, patterns, review learnings, decisions, known unknowns, and failures — using vector embeddings for retrieval.
Code-First Output: Instead of generating diffs that need parsing, Neo outputs executable code blocks directly, eliminating extraction failures.
Scoped Storage: Facts are scoped to global, organization, or project level, stored locally in ~/.neo/facts/ for privacy and offline access.
Model-Agnostic: Works with OpenAI, Anthropic, Google, local models, or Ollama via a simple adapter interface.
How It Works
User Problem → Neo CLI → Semantic Retrieval → Reasoning → Code Generation
↓
[Vector Search]
[Pattern Matching]
[Confidence Scoring]
↓
Executable Code + Memory Update
Neo retrieves relevant facts using Jina Code embeddings (768-dimensional vectors), applies learned patterns, generates solutions, and stores new facts for continuous improvement.
- Jina's embeddings model (open source) is downloaded automatically when you first run Neo. This model runs locally on your machine to generate vector embeddings.
The Construct
Neo includes The Construct - a curated library of architecture and design patterns with semantic search capabilities. Think of it as your personal reference library for common engineering patterns, indexed and searchable using the same embedding technology that powers Neo's reasoning memory.
What is The Construct?
The Construct is a collection of vendor-agnostic design patterns covering:
- Rate Limiting: Token bucket, sliding window, distributed rate limiting
- Caching: Cache-aside, write-through, invalidation strategies
- More domains: Additional patterns contributed by the community
Each pattern follows a structured format inspired by the Gang of Four:
- Intent: What problem does this solve?
- Forces: Key constraints and tradeoffs
- Solution: Conceptual structure (no framework-specific code)
- Consequences: Benefits, risks, and observability signals
- References: Links to real-world implementations
Using The Construct
# List all patterns
neo construct list
# Filter by domain
neo construct list --domain rate-limiting
# Show a specific pattern
neo construct show rate-limiting/token-bucket
# Semantic search across patterns
neo construct search "how to prevent api abuse"
# Build the search index
neo construct index
Pattern Quality Standards
All patterns must:
- Include author attribution
- Be under 300 lines
- Remain vendor-agnostic (no AWS/GCP/Azure-specific solutions)
- Include concrete consequences and observability guidance
See /construct/README.md for contribution guidelines.
- When you ask Neo for help:
- Your query is embedded locally using the Jina model
- Neo searches the fact store for relevant knowledge (using cosine similarity)
- Retrieved facts are organized into layers: constraints, relevant knowledge, recent changes, known unknowns
- This combined context is sent to your chosen LLM API (OpenAI/Anthropic/Google)
- The LLM generates a solution informed by both your query and past facts
- The result is stored back as a new fact in local memory for future use
Local storage: ~/.neo/facts/facts_global.json ← Global-scoped facts ~/.neo/facts/facts_org_{id}.json ← Organization-scoped facts ~/.neo/facts/facts_project_{id}.json ← Project-scoped facts
Privacy:
- Your code never leaves your machine during embedding/search
- Only your prompt + retrieved facts are sent to the LLM API
- This is the same as using the LLM directly, but with added context from something akin to memory.
Your Prompt
↓
Local Jina Embedding (768-dim vector)
↓
Cosine Similarity Search (finds relevant facts)
↓
Retrieve Facts from ~/.neo/facts/
↓
Assemble Context: Constraints → Knowledge → Recent Changes → Known Unknowns
↓
→→→ NETWORK CALL →→→ LLM API (OpenAI/Anthropic/etc.)
↓
Solution Generated
↓
Store as New Fact in Local Memory
Quick Start
# Install from PyPI (recommended)
pip install neo-reasoner
# Or install with specific LM provider
pip install neo-reasoner[openai] # For GPT (same provider as the default)
pip install neo-reasoner[anthropic] # For Claude
pip install neo-reasoner[google] # For Gemini
pip install neo-reasoner[all] # All providers
# Set API key
export OPENAI_API_KEY=sk-...
# Test Neo
neo --version
See QUICKSTART.md for 5-minute setup guide
Run as an Agent (CAR / A2A)
Neo integrates with Parslee's Common Agent Runtime (CAR) as a first-class peer of the CLI and the plugins. The integration runs both directions:
- Inbound (host) —
neo serveexposes Neo as an Agent2Agent v1.0 endpoint. Other agents and orchestrators call Neo'sneo.processtool over A2A directly. No CLI shell-out, no subprocess parsing. - Outbound (inference) — set
provider="car"to route Neo's own LLM calls through CAR's unified inference layer. CAR's adaptive router picks local backends (Candle + MLX for Qwen3, Gemma 4) or remote providers (OpenAI, Anthropic, Google) per call based on task complexity, context-window headroom, and per-model latency/cost. Rust-enforced policies, deterministic eventlog/replay, and semantic conversation compaction all come for free.
A single CarRuntime is shared per process — if neo serve is running and the same process makes outbound calls, both surfaces see the same state, policies, tool registry, and eventlog.
Install the CAR extras
# CAR-backed serving and inference both require the car-runtime Python bindings
pip install "neo-reasoner[car]"
car-runtime ships as a sealed binary under a separate license (the rest of Neo stays Apache-2.0). Skip this extra if you only need the plugins and the CLI with direct provider SDKs.
Inbound: host Neo as an A2A endpoint
# 1. Start the CAR daemon (default ws://127.0.0.1:9100)
python -m car_runtime.server
# or, if installed standalone:
car-server
# 2. In another terminal, host Neo as an A2A endpoint
neo serve
neo serve boots a CarRuntime, registers Neo as the neo.process tool with its full schema (src/neo/car_tool_schema.py), installs the Python tools.execute handler, and binds the A2A HTTP listener. It blocks until SIGINT/SIGTERM.
Outbound: use CAR as Neo's inference layer
# Switch Neo's default provider
neo --config set --config-key provider --config-value car
# Let CAR's adaptive router pick the backend per call (recommended)
neo --config set --config-key model --config-value ""
# Or pin a specific model — local or remote
neo --config set --config-key model --config-value qwen3-32b
neo --config set --config-key model --config-value gpt-5
The CAR daemon must be running (car-server / python -m car_runtime.server). From Python:
from neo.adapters import create_adapter
adapter = create_adapter("car") # router picks a code-capable model
adapter = create_adapter("car", model="Qwen3-4B") # pin a specific backend
adapter = create_adapter(
"car",
intent_hint={"task": "reasoning", "prefer_local": True}, # override the default
)
Default intent: CarAdapter sends intent_json={"task": "code"} on every call unless you supply your own intent_hint. Neo's workload is overwhelmingly code reasoning (review, optimization, debugging, generation), so the router gets to pick a code-capable model rather than the chat default. CAR's task enum is chat | classify | reasoning | code. The rest of IntentHint (prefer_local, prefer_fast, require: ModelCapability[]) is how you express what else you need without pinning a model ID.
Known limitation upstream: CAR's
route_modelcurrently scores prompts as "simple" by heuristic length and picks the cheap chat-tier model (e.g.gpt-4.1-mini) even when models likegpt-5.3-codexando3are registered and ranked as fallbacks. Tracked at Parslee-ai/car-releases#52. Thetask=codedefault is Neo's local workaround — substantive prompts do escalate; trivial ones don't.
Discover what's installed
# Detects native CLI, car-server, Python bindings, and the default daemon port
neo car status
# Also surfaced in --version output
neo --version
If the CLI/daemon are present but the Python bindings aren't, Neo reports that state cleanly. CAR install options live at Parslee-ai/car-releases.
Why use the CAR surfaces
- Real inference path both ways — inbound, callers see Neo as a typed A2A tool; outbound, Neo gets local-first inference with automatic remote fallback through one provider-agnostic protocol
- One runtime per host — session state, tool registry, policies, and the eventlog stay consistent across A2A inbound and inference outbound in the same process
- Local-first inference, free fallback — Qwen3 / Gemma 4 on-device via Candle + MLX; remote OpenAI / Anthropic / Google when the router decides the task needs it
- Policies enforced in Rust — deny rules and capability requirements run before any side-effecting call
- Memory is shared across all surfaces —
~/.neo/facts/and per-project indexes are the same whether you invoke via CLI, plugin,neo serve, or CAR inference
Claude Code Plugin
Neo ships as a Claude Code plugin with a specialized agent and six slash commands. Anthropic's Claude Code CLI installs it from Parslee's plugin marketplace:
# Add the marketplace
/plugin marketplace add Parslee-ai/claude-code-plugins
# Install Neo
/plugin install neo
Once installed:
- Slash commands:
/neo,/neo-review,/neo-optimize,/neo-architect,/neo-debug,/neo-pattern - Specialized agent: invoke with
Use the Neo agent to ...for delegated semantic reasoning - Shared memory: same
~/.neo/facts/store used by the CLI and the Codex plugin
Examples:
/neo-review src/api/handlers.py
/neo-optimize process_large_dataset function
/neo-architect Should I use microservices or monolith?
/neo-debug Race condition in task processor
The plugin wraps the local neo CLI, so the binary must be installed first (pip install neo-reasoner[openai] and OPENAI_API_KEY set, or your provider of choice).
Plugin sources live under .claude-plugin/ — plugin.json is the manifest, agents/neo.md defines the agent, and commands/*.md defines each slash command.
Codex Plugin
Neo ships as a Codex plugin with the same six skills, packaged for OpenAI Codex CLI. Add the marketplace and install Neo from Codex's plugin directory:
# Add Parslee's hosted marketplace
codex plugin marketplace add Parslee-ai/neo
# Or, from a local checkout, point Codex at the in-tree marketplace
codex plugin marketplace add ./
Once installed:
- Skills:
$neo,$neo-review,$neo-optimize,$neo-architect,$neo-debug,$neo-pattern - Shared memory: same
~/.neo/facts/store used by the CLI and the Claude Code plugin
Examples:
$neo-review src/api/handlers.py
$neo-optimize process_large_dataset function
$neo-architect Should I use microservices or monolith?
$neo-debug Race condition in task processor
The plugin wraps the local neo CLI, so the binary must be installed first (pip install neo-reasoner[openai] and OPENAI_API_KEY set, or your provider of choice). Anything you teach Neo from Codex is immediately available in the Claude Code plugin and the CAR endpoint, and vice versa — there is one fact store per host.
Plugin sources live under plugins/neo/ — see the manifest and skill definitions.
Works Alongside Your AI Tools
Neo automatically reads project-local agent instruction docs from a wide range
of ecosystems and folds them into its reasoning context — no configuration
needed. If you've already invested in writing a CLAUDE.md, an AGENTS.md,
.cursor/rules/, .github/copilot-instructions.md, or a Spec Kit project,
neo respects that work.
| Tool | Files / dirs neo discovers |
|---|---|
| Claude / Claude Code | CLAUDE.md, .claude/CLAUDE.md, .claude/agents/*.md, .claude/commands/*.md |
| Codex / AGENTS.md spec | AGENTS.md, .github/AGENTS.md, .codex/**/*.md |
| Cursor | .cursorrules, .cursor/rules/**/*.md, .cursor/rules/**/*.mdc |
| GitHub Copilot | .github/copilot-instructions.md |
| Windsurf | .windsurfrules |
| Continue | .continue/**/*.md |
| Augment | .augment/**/*.md |
| Spec Kit | .specify/**/*.md |
| Aider | .aider/*.md |
| Codeium | .codeium/*.md |
Discovered docs surface in neo's prompt under PROJECT-LOCAL AGENT CONTEXT, included unconditionally — independent of relevance ranking — because their value is global to the project. Per-file cap of 6KB and total cap of 32KB keep prompt growth bounded.
This means neo composes well with whichever AI coding workflow you already use:
- Claude Code users get the deepest integration via the Claude Code Plugin, but neo runs standalone too.
- Codex CLI users get parity via the Codex Plugin — same six skills, packaged for Codex. Neo also automatically picks up
AGENTS.md(the cross-tool standard Codex co-led) plus anything under.codex/. - Cursor / Windsurf / Aider / Continue / Augment users — the rules dirs you've curated land in every neo session's context.
- GitHub Copilot users —
.github/copilot-instructions.mdis read on every invocation. - Spec Kit projects — your specs are folded into neo's reasoning context, no manual paste.
Adding a new tool is a one-liner: extend the discovery rules in
src/neo/agent_context.py. The list is the load-bearing surface for keeping
this current as new agent ecosystems emerge.
Installation
From PyPI (Recommended)
# Install Neo
pip install neo-reasoner
# With specific LM provider
pip install neo-reasoner[openai] # GPT (recommended)
pip install neo-reasoner[anthropic] # Claude
pip install neo-reasoner[google] # Gemini
pip install neo-reasoner[all] # All providers
# Verify installation
neo --version
Updating Neo
Neo supports both manual and fully automatic updates:
Manual Updates
# Option 1: Use neo's built-in update command (simplest)
neo update
# Option 2: Update with pip
pip install --upgrade neo-reasoner
# Option 3: Use pipx for isolated installation (recommended for end users)
pipx install neo-reasoner # First-time install
pipx upgrade neo-reasoner # Update to latest version
pipx upgrade-all # Update all pipx packages
Fully Automatic Updates
Automatic update installation is enabled by default for pipx and virtualenv installs. You can set it explicitly with:
# Enable auto-install (persisted in ~/.neo/config.json)
neo --config set --config-key auto_install_updates --config-value true
# Or use environment variable
export NEO_AUTO_INSTALL_UPDATES=1
When enabled, Neo will:
- Check for updates once every hour using a stale-while-revalidate cache
- Automatically download and install new versions in the background
- Notify you when updates complete
- Log all auto-update activity to
~/.neo/auto_update.log
Example output when auto-install is enabled:
$ neo "your query"
⚡ Auto-installing neo update: 0.18.0 → 0.18.1
This happens in the background. Please wait...
✓ Auto-update completed: 0.18.1
Restart neo to use the new version.
[Neo] Processing your query...
Update Notifications (Default)
By default, Neo checks for updates once every hour and displays a notification when a new version is available. This check happens in the background and will not interrupt your workflow.
To disable update checks entirely:
export NEO_SKIP_UPDATE_CHECK=1
From Source (Development)
# Clone repository
git clone https://github.com/Parslee-ai/neo.git
cd neo
# Install in development mode with all dependencies
pip install -e ".[dev,all]"
# Verify installation
neo --version
Dependencies
Core dependencies are automatically installed via pyproject.toml:
- numpy >= 1.24.0
- scikit-learn >= 1.3.0
- datasketch >= 1.6.0
- fastembed >= 0.3.0
- faiss-cpu >= 1.7.0
- jsonschema >= 4.0.0
- pyyaml >= 6.0
- openai >= 1.0.0 (default provider; base install is runnable with just
OPENAI_API_KEY) - tree-sitter >= 0.23, < 0.26
- tree-sitter-language-pack >= 0.13.0, < 1.0
Optional: Additional LM Providers
OpenAI is bundled in the base install. Add others as needed:
pip install anthropic # Claude
pip install google-genai>=0.2.0 # Gemini (requires Python 3.10+)
pip install requests # Ollama
See INSTALL.md for detailed installation instructions
Usage
CLI Interface
# Ask Neo a question
neo "how do I fix the authentication bug?"
# With working directory context
neo --cwd /path/to/project "optimize this function"
# Build the per-project semantic index (powers smart file selection)
neo --index
# Incrementally refresh the index after meaningful changes (re-embeds only changed files)
neo --update
# Preview the assembled context without making an LLM call
neo --dry-run "your query"
# Check version and memory stats
neo --version
# Inspect detected local CAR runtime surfaces
neo car status
Memory Maintenance
# Compact fact files by dropping old invalid tombstones (default: > 30 days since last access)
neo memory prune
# Across every local project Neo has touched
neo memory prune --all
# Preview without writing
neo memory prune --dry-run --max-invalid-age-days 14
Use prune when a ~/.neo/facts/facts_project_*.json file grows much larger than its 500-valid-fact cap — that gap is tombstone bloat from supersession. Defaults are conservative; raising --max-invalid-age-days is safe, lowering it past ~7 may evict tombstones still referenced by recent supersession chains.
Timeout Requirements
Neo makes blocking LLM API calls that typically take 30-120 seconds. When calling Neo from scripts or automation, use appropriate timeouts:
# From shell (10 minute timeout)
timeout 600 neo "your query"
# From Python subprocess
subprocess.run(["neo", query], timeout=600)
Insufficient timeouts will cause failures during LLM inference, not context gathering.
Output Format
Neo outputs executable code blocks with confidence scores:
def solution():
# Neo's generated code
pass
Personality System
Neo responds with personality (Matrix-inspired quotes) when displaying version info:
$ neo --version
"What is real? How do you define 'real'?"
neo 0.18.1
Provider: openai | Model: gpt-5.5
Stage: Sleeper | Memory: 0.0%
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
0 facts | 0.00 avg confidence
Load Program - Training Neo's Memory
"The Operator uploads a program into Neo's head."
Neo can bootstrap its memory by importing facts from HuggingFace datasets. This is NOT model fine-tuning - it's retrieval learning that expands local semantic memory with reusable code knowledge.
# Install datasets library
pip install datasets
# Load patterns from MBPP (recommended starter - 1000 Python problems)
neo --load-program mbpp --split train --limit 1000
# Load from OpenAI HumanEval (164 hand-written coding problems)
neo --load-program openai_humaneval --split test
# Load from BigCode HumanEvalPack (multi-language variants)
neo --load-program bigcode/humanevalpack --split test --limit 500
# Dry run to preview
neo --load-program mbpp --dry-run
# Custom column mapping
neo --load-program my_dataset \
--columns '{"text":"pattern","code":"solution"}'
Output (Matrix-style):
"I know kung fu."
Loaded: 847 facts
Deduped: 153 duplicates
Index rebuilt: 1.2s
Memory: 1247 total facts
How it works:
- Acquire: Pull dataset from HuggingFace
- Normalize: Map rows to fact schema
- Dedupe: Hash-based deduplication against existing memory
- Embed: Generate local embeddings (Jina Code v2)
- Store: Add as facts to the fact store
- Report: Matrix quote + counts
Key points:
- NOT fine-tuning - just expanding retrieval memory
- Facts start at 0.3 confidence (trainable via real-world usage)
- Automatic deduplication prevents memory bloat
- Uses local embeddings (no data leaves your machine)
- Stored in
~/.neo/facts/alongside learned facts
See docs/LOAD_PROGRAM.md for detailed documentation
Architecture
Fact-Based Memory
Neo uses a scoped, supersession-based fact store with Jina Code v2 embeddings (768 dimensions) for semantic retrieval:
- Typed Facts: Eight kinds — CONSTRAINT, ARCHITECTURE, DECISION, PATTERN, REVIEW, FAILURE, KNOWN_UNKNOWN, and EPISODE (instance-specific events with
{when, where, why, with_whom}context). - Scoped Organization: Facts are scoped to global, organization, or project level, with per-scope valid-fact caps (200 / 100 / 500 / 50). Org and project are auto-detected from git remotes.
- Supersession & Pre-Write Dedup: New facts with cosine similarity ≥ 0.85 to an existing fact short-circuit (bump the existing fact's access count) or supersede it. The pre-write canonical-signature check uses entity abstraction + verb-synonym folding to catch near-duplicates before they hit the store.
- Confidence + Effectiveness Ranking:
rank_score = recall_decay(sim)·confidence + success_bonus·effectiveness_f + provenance_bonus. The Ebbinghaus recall-probability transform gives frequently-recalled facts slower decay; LessonL-style effectiveness (c/nover reuse outcomes) multiplies the success bonus. Curated facts (CONSTRAINT/ARCHITECTURE/DECISION andseed/community/synthesized-tagged facts) bypass decay. - Hybrid Retrieval: 0.7·dense (Jina) + 0.3·BM25. Half the result slots ranked by full
rank_score, half by raw cosine — novel-but-relevant facts aren't crowded out by validated winners. - Triple-Trigger Consolidation: REVIEW facts cluster into PATTERN / FAILURE archetypes when ANY of count-delta ≥10, elapsed ≥1h, or confidence-decile entropy >0.9 fires. Clusters of ≥3 get an NREM-style Hebbian confidence bump; non-curated facts decay 3% globally after each pass.
- Dual-Buffer Probation: New non-curated facts enter with a
probationtag and a 3-day stale window (vs 7/14 normal); promoted automatically onaccess_count ≥ 2orsuccess_count > 0— quietly evicts noise while keeping real signal. - Four-Layer Context: Retrieved facts are organized into constraints, relevant knowledge, recent changes, and known unknowns. The four-layer state model is from Beyond Conversation: A State-Based Context Architecture for Enterprise AI Agents (Liotta, 2025) — see
papers/state-based-context-architecture.pdf. The token-budget enforcement inmemory/context.pyis ported from the engine described in Memgine: A Deterministic Memory Engine for Stateful AI Agents (Liotta, 2026) — seepapers/memgine-deterministic-memory-engine.pdf. Both are evaluated by StateBench; the 95.8% decision-accuracy result on the v1.0 development split is what drove Neo's move from a separate "Recently Changed" section to inline(changed from: X)annotations.
Output Schemas
Neo generates structured outputs with executable code and planning artifacts:
CodeSuggestion - Executable code with actionable metadata:
@dataclass
class CodeSuggestion:
# Core fields
file_path: str
unified_diff: str # Legacy: backward compatibility
code_block: str = "" # Primary: executable Python code
description: str
confidence: float
tradeoffs: list[str]
# Executable artifacts (v0.8.0+)
patch_content: str = "" # Full unified diff content
apply_command: str = "" # Shell command to apply (advisory)
rollback_command: str = "" # Shell command to undo (advisory)
test_command: str = "" # Shell command to verify (advisory)
dependencies: list[str] = [] # Other suggestion IDs this depends on
estimated_risk: str = "" # "low", "medium", or "high"
blast_radius: float = 0.0 # 0.0-100.0 percentage of codebase affected
PlanStep - Incremental planning with step-level metadata:
@dataclass
class PlanStep:
# Core fields
description: str
rationale: str
dependencies: list[int] = []
# Incremental planning (v0.8.0+)
preconditions: list[str] = [] # Conditions before execution
actions: list[str] = [] # Concrete actions to perform
exit_criteria: list[str] = [] # Success verification criteria
risk: str = "low" # "low", "medium", "high"
retrieval_keys: list[str] = [] # Step-scoped memory retrieval
failure_signatures: list[str] = [] # Known failure patterns
verifier_checks: list[str] = [] # Validation checks (Solver-Critic-Verifier)
expanded: bool = False # Tracks seed → expansion
These schemas enable:
- Actionable Output: Commands and patches ready for execution
- Incremental Planning: Seed plans expand only when blocked (as-needed decomposition)
- Step-Level Learning: Failure signatures attach to specific steps for ReasoningBank
- Multi-Agent Reasoning: Verifier checks support MapCoder's Solver-Critic-Verifier pattern
Code Smell Detection in Context Assembly
Neo scans the relevance-ranked file set during context assembly and surfaces known issues to the model under KNOWN ISSUES IN NEARBY CODE. Detectors are intentionally high-precision (false positives turn into prompt bloat that hurts more than it helps):
- TODO / FIXME / HACK / XXX markers (any text file)
- Python stubs:
pass-only /...-only /raise NotImplementedError - Python bare
except:and swallowed exceptions (except ...: pass) - Hardcoded credentials matching well-known prefixed shapes (OpenAI
sk-, AWSAKIA, GitHubghp_, Slackxox*-)
Per-file cap of 8 + global cap of 20 findings keeps the prompt bounded. Magic numbers and generic high-entropy secret detection are intentionally out of scope — they'd add more noise than signal at this stage.
Smart File Selection
The context gatherer picks files using three signals:
- ProjectIndex semantic boost: when
.neo/index.jsonexists (runneo --indexonce per repo), per-project FAISS over tree-sitter chunks projects top-k chunk hits back to per-file boosts up to +1.0 cosine. Chunks embedsymbols + imports + first ~600 chars of body, so prompt keywords match what a file is, not assertion strings inside tests. Test-file matches are demoted 0.4× unless the prompt mentions test/spec. - Tree-sitter symbol overlap: the parser extracts function/class names + imports from top candidates and adds up to +1.2 for substring matches against prompt tokens (length-3 floor).
- EPISODE-history feedback loop: each Neo run stashes touched file paths as
file:<rel>tags on EPISODE facts. On the next run, the gatherer queries for similar past prompts and gives those files up to +0.5 boost — past file selections measurably influence future ones.
A per-file chunk cap of 2 prevents large files from eating the budget; a one-time first-run hint fires if the index is missing.
Learning Feedback Loop
After each Neo run, the next invocation diffs your repo against the suggestions it made and classifies the result. All confidence deltas are modulated by ±arch_mod (∈ {−0.1, 0, +0.1}) from the architectural-quality snapshot — see Architectural Quality Feedback Loop below.
| Outcome | Trigger | Effect |
|---|---|---|
| ACCEPTED | Code-block overlap ≥ 0.8 (modern path) or unified-diff overlap > 0.3 (legacy path) | linked fact conf +0.2 ± arch_mod, success_count +1, effectiveness "better" |
| MODIFIED | User changed the file differently | linked fact conf −0.2 ± arch_mod (floored at 0.1) + new REVIEW at conf 0.4 |
| UNVERIFIED | File touched but suggestion had no diff to compare | linked fact conf +0.1 ± arch_mod, success_count +1 (no REVIEW) |
| INDEPENDENT | File touched, never suggested by Neo | new REVIEW at conf 0.2; capped 5/session, 50/project |
Storage Architecture
- Scoped JSON Files: Facts stored in
~/.neo/facts/— separate files per scope (global, org, project), with inline embeddings (no separate FAISS index for memory). - Bi-Temporal Supersession: similar facts are soft-deleted by stamping
event_time_endrather than dropped. Tombstones persist untilpurge_dead_factsruns on the next cold start. - Constraint Auto-Ingestion: CLAUDE.md and similar files are automatically scanned and ingested as CONSTRAINT facts.
- Sessions & Metrics:
~/.neo/sessions/holds session manifests + replay logs;~/.neo/metrics.jsonllogs every retrieve / add_fact / lm_call / overseer_tick (disable withNEO_METRICS=off). - Project Index (separate system): Tree-sitter code indexing uses FAISS for per-repository semantic search in
.neo/.
Performance
Neo improves over time as it learns from experience. Initial performance depends on available facts. Performance grows as the semantic memory builds up successful solutions, failure learnings, and architectural decisions.
Memory-Driven Reasoning Effort (gpt-5* models)
Neo monetizes its learning into inference cost. Each query's reasoning.effort
parameter is sized from the strength of the memory hit:
| Memory + difficulty | Effort |
|---|---|
| ≥3 patterns, avg confidence ≥ 0.8 | low |
| Some patterns, avg confidence 0.5–0.8 | medium (API default) |
| No relevant patterns OR avg confidence < 0.5 | high |
| No patterns AND difficulty == "hard" | xhigh |
Familiar queries get cheap thinking; novel-and-hard queries get max thinking.
Cap with NEO_REASONING_EFFORT={none,low,medium,high,xhigh} for cost control.
Model note: the effort vocabulary differs by model. gpt-5.5 (the default) accepts the full
none / low / medium / high / xhighrange. Oldergpt-5-codexonly acceptslow / medium / high— if you switch back to that model, setNEO_REASONING_EFFORT=highto cap the auto-selector.
Architectural Quality Feedback Loop
When a session ends, neo snapshots three structural metrics — import cycles, god files (LOC + function-count thresholds), and max nesting depth — and diffs against the previous snapshot at the next outcome detection. A regression weakens the accept/boost or strengthens the modify/penalty by 0.1; an improvement does the reverse. Confidence becomes a signal of "helped the codebase," not just "got accepted."
Configuration
CLI Configuration Management
Neo provides a simple CLI for managing persistent configuration:
# List all configuration values
neo --config list
# Get a specific value
neo --config get --config-key provider
# Set a value
neo --config set --config-key provider --config-value anthropic
neo --config set --config-key model --config-value claude-sonnet-4-5-20250929
neo --config set --config-key api_key --config-value sk-ant-...
# Reset to defaults
neo --config reset
Exposed Configuration Fields:
provider- LM provider (openai, anthropic, google, azure, ollama, local)model- Model name (e.g., gpt-5.5, claude-sonnet-4-5-20250929)api_key- API key for the chosen providerbase_url- Base URL for local/Ollama endpointsmemory_backend- Memory backend: "fact_store" (default) or "legacy"auto_install_updates- Automatically install updates in background (true/false)constraint_auto_scan- Auto-scan CLAUDE.md for constraints (true/false, default: true)log_level- Logging level: DEBUG, INFO, WARNING, or ERRORreasoning_effort_cap- Optional cap for OpenAI gpt-5 reasoning effort
Configuration is stored in ~/.neo/config.json. Environment variables override
stored config values for the current process.
Secure API Key Storage
On macOS, Neo stores API keys in Keychain rather than config.json. Run:
# Securely prompt for and store an API key in Keychain
neo --config set --config-key api_key
NeoConfig.load() reads the Keychain entry automatically.
Linux / Windows: this command currently raises — Keychain support is macOS-only. Either set the provider env var directly (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) or export NEO_ALLOW_PLAINTEXT_API_KEY=1 first so the key is persisted in config.json.
Environment Variables
Credentials
# Provider-specific (read by NeoConfig.load() when set)
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GOOGLE_API_KEY=...
# Neo-generic override (takes precedence over provider-specific keys)
export NEO_PROVIDER=openai
export NEO_MODEL=gpt-5.5
export NEO_API_KEY=sk-...
export NEO_BASE_URL=http://localhost:11434 # for Ollama/local endpoints
Behavior
export NEO_REASONING_EFFORT=high # cap auto-effort selection
export NEO_AUTO_INSTALL_UPDATES=1 # auto-install background updates
export NEO_SKIP_UPDATE_CHECK=1 # disable update checks entirely
export NEO_LOG_LEVEL=INFO # DEBUG/INFO/WARNING/ERROR
export NEO_TEMPERATURE=0.7 # generation temperature
export NEO_MAX_TOKENS=4096 # per-call max output tokens
Install Sanity
If you have multiple local installs, make sure the neo command and your test
interpreter import the same package:
which neo
neo --version
python3 -c "import neo; print(neo.__file__)"
Observability
export NEO_METRICS=off # disable ~/.neo/metrics.jsonl writes
Neo writes structured per-operation events (retrieve / add_fact / lm_call / overseer_tick) to ~/.neo/metrics.jsonl and per-session manifests + JSONL outcome logs to ~/.neo/sessions/.
LM Adapters
OpenAI (Default)
from neo.adapters import OpenAIAdapter
adapter = OpenAIAdapter(model="gpt-5.5", api_key="sk-...")
Default model: gpt-5.5. GPT-5/Codex models use the /v1/responses endpoint automatically.
Anthropic
from neo.adapters import AnthropicAdapter
adapter = AnthropicAdapter(model="claude-sonnet-4-5-20250929")
Default model: claude-sonnet-4-5-20250929
Note: Requires Python 3.10+ and google-genai>=0.2.0
from neo.adapters import GoogleAdapter
adapter = GoogleAdapter(model="gemini-2.0-flash")
Default model: gemini-2.0-flash. Uses the google-genai SDK.
Ollama
from neo.adapters import OllamaAdapter
adapter = OllamaAdapter(model="llama2")
CAR (Common Agent Runtime)
from neo.adapters import CarAdapter
# Default: router-picked model with intent_json={"task": "code"} so the
# router selects a code-capable backend rather than the chat default.
adapter = CarAdapter()
# pin a specific backend if you need to:
adapter = CarAdapter(model="Qwen3-4B")
# override the default intent (CAR task enum: chat | classify | reasoning | code):
adapter = CarAdapter(intent_hint={"task": "reasoning", "prefer_local": True})
Requires the [car] extra (pip install neo-reasoner[car]) and a running car-server. See Run as an Agent (CAR / A2A) for the full setup.
Extending Neo
Add a New LM Provider
from neo.cli import LMAdapter
class CustomAdapter(LMAdapter):
def generate(self, messages, stop=None, max_tokens=4096, temperature=0.7):
# Your implementation
return response_text
def name(self):
return "custom/model-name"
Key Features
- Three integration surfaces on equal footing:
- Run as an Agent (CAR / A2A) — host Neo as an Agent2Agent v1.0 endpoint via
neo serve; other agents callneo.processdirectly - Claude Code Plugin — six slash commands + a specialized agent inside Claude Code
- Codex Plugin — the same six skills, packaged for OpenAI Codex CLI
- Run as an Agent (CAR / A2A) — host Neo as an Agent2Agent v1.0 endpoint via
- Fact-Based Memory: Learns from every solution attempt using a scoped, supersession-based fact store
- Semantic Retrieval: Vector search finds relevant facts via Jina Code embeddings
- Code-First Generation: No diff parsing failures
- Scoped Storage: Privacy-first JSON storage in ~/.neo/facts/ with global, org, and project scopes
- Model-Agnostic: Works with any LM provider
- The Construct: Curated library of architecture patterns with semantic search
- Project Indexing: Tree-sitter based multi-language code indexing with FAISS
- Prompt Enhancement: Analyze and improve prompt effectiveness
Development
Running Tests
# Run all tests
pytest
# Run specific test
pytest tests/test_neo.py
# Run with coverage
pytest --cov=neo
Research & References
The 0.18 memory architecture lands deterministic techniques from a focused reading of recent work on long-horizon agent memory and code generation. Citations below are anchored to the file where the technique is actually implemented — the full PDFs are checked into papers/ for reproducibility.
Academic Papers
Memory architecture & lifecycle
-
SCM Sleep Memory: Sleep-Consolidation in Continual Memory Paper 2604.20943
- 4-D ValueTagger composite (novelty, validation, task, repetition); adaptive forgetting threshold; NREM Hebbian strengthening + global downscale; triple-trigger consolidation gate.
- Implementation:
src/neo/memory/value_score.py,store.synthesize_reviews.
-
Memory Systems Survey (1) Paper 2603.07670
- Provenance taxonomy (
STRUCTURAL > OBSERVED > INFERRED); dual-buffer / probation consolidation; Layer-1/2/3 observability split. - Implementation:
src/neo/memory/models.py:42,store.py(probation tag),memory/metrics.py.
- Provenance taxonomy (
-
Memory Survey 2 — Zep / AriGraph bi-temporal pattern Paper 2512.13564 §5.2.2
- Bi-temporal stamps (
event_time/event_time_end/ingest_time); supersession via soft-delete. - Implementation:
src/neo/memory/models.py:241.
- Bi-temporal stamps (
-
Trajectory Memory — Canonical-signature dedup Paper 2603.10600 §7
- Entity abstraction + verb-synonym folding + context strip as a pre-write dedup signature.
- Implementation:
src/neo/memory/generalize.py.
-
Memori — Hybrid dense+BM25 retrieval Paper 2603.19935 §3.3
- Sparse BM25 channel (k1=1.5, b=0.75) min-max-normalized and weighted with the dense channel at 0.7/0.3.
- Implementation:
src/neo/memory/bm25.py(sparse channel),store._fuse_dense_sparse(0.7/0.3 fusion).
-
MemMachine — Query-shape routing & nucleus episode expansion Paper 2604.04853 §4.6, §5.3, §8.4.1
- DIRECT / CHAIN / SPLIT prompt classification with per-branch retrieval; episode-peer expansion at retrieval time; k=20–30 sweet spot.
- Implementation:
src/neo/memory/query_routing.py,store.pynucleus expansion.
-
LessonL — Effectiveness multiplier on reuse outcomes Paper 2505.23946
- Per-fact
c/neffectiveness as a success-bonus multiplier; half-by-rank / half-by-cosine slot allocation (Algorithm 1). - Implementation:
src/neo/memory/models.py:130, 233;store.retrieve_relevant.
- Per-fact
-
Ebbinghaus Recall — Spaced-repetition decay for retrieval Hou et al., paper 2404.00573
- Recall-probability transform
p_n(t) = (1 − exp(−r·exp(−t/g_n))) / (1 − e⁻¹)applied to similarity scores for fluid facts. - Implementation:
src/neo/math_utils.py:40,models.rank_score.
- Recall-probability transform
-
Episodic Memory — Five-property episodic context Paper 2502.06975 Table 1
{when, where, why, with_whom}instance-specific event context.- Implementation:
src/neo/memory/models.py:320EpisodeContext.
-
Multiple Memory Systems — Retrieval / context unit split Paper 2508.15294 §3
- Embed concise keywords (
retrieval_text); inject full narrative (context_text) — same fact, two surfaces. - Implementation:
src/neo/memory/models.py:373.
- Embed concise keywords (
Engine & multi-agent reasoning
-
MapCoder — Solver–Critic–Verifier multi-agent collaboration Islam et al., paper 2405.11403 | GitHub
- Per-step confidence, multi-plan iteration scaffolding.
- Implementation:
PlanStep.confidenceinsrc/neo/models.py.
-
CodeSim — MODIFY / NO_MODIFY decision token Hou et al., paper 2502.05664
- Simulator emits an explicit "no modification needed" token; planner uses it as an override on the agreement-of-outputs heuristic. (Distinct from the 2023 ACM CodeSim paper of the same name.)
- Implementation:
src/neo/engine.py:427.
-
SICA — Asynchronous structured-output watchdog & cache-hit observability Paper 2504.15228 §A.2, Table 1
- Daemon-thread tick loop emitting
overseer_tickevents; loop detection via 5-identical-actions-in-a-row; LM-call cache-hit-rate tracking. - Implementation:
src/neo/overseer.py,src/neo/adapters.py:237.
- Daemon-thread tick loop emitting
In-house papers (Parslee) — the foundational research behind Neo's context architecture
-
Beyond Conversation: A State-Based Context Architecture for Enterprise AI Agents Liotta, 2025 | PDF
- The theoretical foundations for the four-layer state model (constraints / valid facts / invalidated facts / known unknowns), supersession semantics, and the six classes of state failure (resurrection, hallucination, scope leak, stale reasoning, authority violation, temporal decay).
- Implementation:
src/neo/memory/context.py,ContextResultinsrc/neo/memory/models.py.
-
Memgine: A Deterministic Memory Engine for Stateful AI Agents Liotta, 2026 | PDF
- The production engine implementing the full specification: query-relevance sorting, engine-level access control, adaptive inline repair, layered token-budget enforcement (2/3 constraint cap, greedy first-fit accumulation with
at_least_one,Fact.size_hint()heuristic). Achieves 95.8% decision accuracy on the StateBench v1.0 development split with GPT-5.2 (97.3% with Opus 4.6). - Implementation:
_accumulate_within_budgetinsrc/neo/memory/context.py;Fact.size_hint()insrc/neo/memory/models.py; design notes indocs/solutions/token-budget-enforcement.md.
- The production engine implementing the full specification: query-relevance sorting, engine-level access control, adaptive inline repair, layered token-budget enforcement (2/3 constraint cap, greedy first-fit accumulation with
-
StateBench — github.com/parslee-ai/statebench · parslee-ai.github.io/statebench
- The conformance test suite that evaluates the two papers above. PyPI / HuggingFace Dataset / Space. Reference baselines (
state_based,rolling_summary,fact_extraction_with_supersession, etc.) on the v1.0 test split set the bar Neo's port is measured against.
- The conformance test suite that evaluates the two papers above. PyPI / HuggingFace Dataset / Space. Reference baselines (
Background reading (in papers/ but not directly cited in code)
The following papers shaped the design vocabulary but aren't wired into a specific implementation today: 2506.18902 (Jina v4 — Neo currently uses Jina v2), 2508.21290 (Jina Code Embeddings), 2509.17489 (MapCoder-Lite), 2511.20857 (Evo-Memory).
Historical influences (cited in legacy modules under deprecation): ReasoningBank (2509.25140) informed the original src/neo/persistent_reasoning.py; the 0.18 fact store supersedes it.
Technologies & Libraries
Embedding & Search:
-
Jina Embeddings v2 (Code) HuggingFace | GitHub
- 768-dimensional embeddings optimized for code similarity
- Local inference (no API calls)
- Used in: Neo's semantic memory and pattern retrieval
-
FAISS (Facebook AI Similarity Search) GitHub | Docs
- Efficient vector similarity search and clustering
- Billion-scale index support
- Used in: Neo's fast pattern matching (<13ms avg)
-
- Lightweight local embedding generation
- ONNX Runtime backend
- Used in: Neo's local embedding pipeline
Datasets (for Load Program):
-
MBPP (Mostly Basic Programming Problems) HuggingFace | Paper
- 1,000 crowd-sourced Python programming problems
- Used for: Bootstrapping Neo's semantic memory
-
HumanEval HuggingFace | Paper
- 164 hand-written programming problems
- Used for: Quality pattern seeding
Citation
If you use Neo in academic research, please cite:
@software{neo2025,
title={Neo: Self-Improving Code Reasoning Engine with Persistent Semantic Memory},
author={Parslee AI},
year={2025},
url={https://github.com/Parslee-ai/neo},
note={Memory architecture draws on SCM Sleep Memory (2604.20943), MemMachine (2604.04853), LessonL (2505.23946), and the bi-temporal/Ebbinghaus/dual-buffer techniques cataloged in the README's Research \& References section}
}
License
Apache License 2.0 - See LICENSE for details.
Contributing
See CONTRIBUTING.md for contribution guidelines.
Changelog
See CHANGELOG.md for version history.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file neo_reasoner-0.21.0.tar.gz.
File metadata
- Download URL: neo_reasoner-0.21.0.tar.gz
- Upload date:
- Size: 615.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8bf295599c42b42f65ab8df01ffc5fbb0e48a824cbcc5d2e019b607ed6adf1d6
|
|
| MD5 |
cfbd69afdd86ab53735e2017b2e69fa3
|
|
| BLAKE2b-256 |
6f75db71de860008c31a4206ea9a089ff4613f6151d7adc0598a856a39be519a
|
Provenance
The following attestation bundles were made for neo_reasoner-0.21.0.tar.gz:
Publisher:
publish.yml on Parslee-ai/neo
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
neo_reasoner-0.21.0.tar.gz -
Subject digest:
8bf295599c42b42f65ab8df01ffc5fbb0e48a824cbcc5d2e019b607ed6adf1d6 - Sigstore transparency entry: 1810221335
- Sigstore integration time:
-
Permalink:
Parslee-ai/neo@9d8978aeac33116c2d3207768aec44affe1fac8b -
Branch / Tag:
refs/tags/v0.21.0 - Owner: https://github.com/Parslee-ai
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@9d8978aeac33116c2d3207768aec44affe1fac8b -
Trigger Event:
release
-
Statement type:
File details
Details for the file neo_reasoner-0.21.0-py3-none-any.whl.
File metadata
- Download URL: neo_reasoner-0.21.0-py3-none-any.whl
- Upload date:
- Size: 379.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2c60e8829638993ace3f465575cdd725b7685840169366e898e93c8cfda29678
|
|
| MD5 |
dc844d4d6d4329cf7a75792f2eecd545
|
|
| BLAKE2b-256 |
4d6a50b17eac398eff852e0b3d87448047331af6ebaaea9b6578befb40c9d2d9
|
Provenance
The following attestation bundles were made for neo_reasoner-0.21.0-py3-none-any.whl:
Publisher:
publish.yml on Parslee-ai/neo
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
neo_reasoner-0.21.0-py3-none-any.whl -
Subject digest:
2c60e8829638993ace3f465575cdd725b7685840169366e898e93c8cfda29678 - Sigstore transparency entry: 1810221411
- Sigstore integration time:
-
Permalink:
Parslee-ai/neo@9d8978aeac33116c2d3207768aec44affe1fac8b -
Branch / Tag:
refs/tags/v0.21.0 - Owner: https://github.com/Parslee-ai
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@9d8978aeac33116c2d3207768aec44affe1fac8b -
Trigger Event:
release
-
Statement type: