Context8
Collective problem-solving memory for coding agents, powered by Actian VectorAI DB
Context7 gives your agent the docs. Context8 gives it what the docs don't cover.
Every time a coding agent solves an uncommon error, the solution vanishes after the session. Context8 stores those solutions in a vector database so any agent — yours or your team's — can find them next time.
Agent hits error → searches Context8 → finds a past solution → applies it
↓
Agent solves new error → logs it to Context8 → future agents benefit
Prerequisites
| Requirement | Why |
|---|---|
| Docker Desktop | Runs the Actian VectorAI DB container locally |
| Python 3.10+ | Runs the Context8 CLI and MCP server |
Quick Start
# 1. Install context8 + the Actian VectorAI DB client
pip install context8 "actian-vectorai @ https://github.com/hackmamba-io/actian-vectorAI-db-beta/raw/main/actian_vectorai-0.1.0b2-py3-none-any.whl"
# Or with uv
uv pip install context8 "actian-vectorai @ https://github.com/hackmamba-io/actian-vectorAI-db-beta/raw/main/actian_vectorai-0.1.0b2-py3-none-any.whl"
# 2. Start the database (pulls and runs the Docker container)
context8 start
# 3. Initialize and seed with 24 curated problem-solution pairs
context8 init --seed
# 4. Add to your coding agent (pick one)
context8 add claude # Claude Code
context8 add cursor # Cursor
context8 add windsurf # Windsurf
# 5. Verify everything works
context8 doctor
Restart your agent. It now has three new tools: context8_search, context8_log, and context8_stats.
Why two packages? The actian-vectorai SDK is distributed by Actian as a beta wheel and is not yet on PyPI. Context8 is on PyPI. Once Actian publishes their SDK to PyPI, this becomes a single pip install context8.
Context8 vs Context7 vs Skills
Coding agents have multiple ways to get help. Here's where each one fits and where it falls short:
The Context Layers
| Layer | Source | What It Covers | Limits |
|---|---|---|---|
| Context 1–6 | Codebase, conversation, memory | Your current project's files and history | Only knows your code |
| Context7 | Official documentation (Upstash) | API references, common usage patterns, getting-started guides | Only covers documented knowledge |
| Skills / CLAUDE.md | Hand-written rules | Project conventions, tool-specific patterns, coding style | Manual maintenance, doesn't learn |
| Context8 | Agent problem-solving history (Actian VectorAI DB) | Uncommon errors, workarounds, integration bugs, agent-discovered fixes | Needs seeding and accumulation |
When Each One Helps (and When It Doesn't)
| Scenario | Context7 (Docs) | Skills / Rules | Context8 (Memory) |
|---|---|---|---|
| "How do I use the useQuery hook?" | Best fit: it's in the React Query docs | Partial: if someone wrote a skill for it | Overkill: docs cover this |
| "What's our team's folder naming convention?" | Won't help: not in public docs | Best fit: written in CLAUDE.md | Won't help: not a problem/solution |
| ERESOLVE unable to resolve dependency tree after upgrading npm | Partial: npm docs mention peer deps vaguely | Won't help: too specific | Best fit: exact error with proven fix |
| Hydration mismatch in Next.js 15 + React 19 RC | Outdated: docs haven't caught up | Won't help | Best fit: another agent hit this last week |
| torch.cuda.OutOfMemoryError during fine-tuning even with batch_size=1 | Partial: PyTorch docs cover CUDA basics | Won't help | Best fit: solution with 4 ranked fix strategies |
| docker compose volume empty on Windows WSL2 | Won't help: Docker docs assume Linux | Maybe: if someone added a WSL tip | Best fit: exact OS-specific workaround |
The Key Difference
Context7: "Here's what the library author wrote in the docs"
Skills: "Here's what a human wrote as a rule for this project"
Context8: "Here's what an agent actually did to fix this exact problem last Tuesday"
Context7 is a librarian — it finds the official answer. Skills are a style guide — they enforce conventions. Context8 is a colleague — it remembers what worked in practice.
They're complementary. Use all three:
Agent encounters error
├── Check Skills/CLAUDE.md → "Do we have a rule for this?" (instant, project-specific)
├── Search Context7 → "What do the docs say?" (official, broad coverage)
└── Search Context8 → "Has any agent solved this before?" (practical, battle-tested)
How It Works
Context8 is an MCP server backed by Actian VectorAI DB. When your agent encounters an error:
- Search: The agent calls context8_search("TypeError Cannot read properties of undefined map React Suspense").
- Match: Context8 runs a hybrid search. Dense semantic vectors find meaning-similar problems, sparse keyword vectors catch exact error tokens, and metadata filters narrow results by language and framework.
- Return: The agent gets ranked solutions with code diffs, confidence scores, and context.
- Learn: After solving a new problem, the agent calls context8_log(problem=..., solution=...) to store it.
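The steps above can be sketched as a small loop. This is only an illustration under stated assumptions: `ToyMemory` and its token-overlap ranking are hypothetical stand-ins for the real context8_search and context8_log tools, and `handle_error` is not part of Context8. It shows the search-first, log-after-solving flow and nothing more.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Hypothetical in-memory stand-in for context8_search / context8_log."""

    records: list[dict] = field(default_factory=list)

    def search(self, query: str, limit: int = 3) -> list[dict]:
        # Rank stored problems by how many query tokens they share.
        q = set(query.lower().split())
        scored = [(len(q & set(r["problem"].lower().split())), r) for r in self.records]
        hits = [r for score, r in sorted(scored, key=lambda p: -p[0]) if score > 0]
        return hits[:limit]

    def log(self, problem: str, solution: str) -> None:
        self.records.append({"problem": problem, "solution": solution})


def handle_error(memory: ToyMemory, error: str, solve) -> str:
    """Search first; if nothing matches, solve the error and log the fix."""
    hits = memory.search(error)
    if hits:
        return hits[0]["solution"]
    solution = solve(error)
    memory.log(error, solution)
    return solution
```

The first time an error is seen, `solve` runs and its result is logged; a later query that shares tokens with the stored problem returns the stored fix without calling `solve` again.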
Three Search Strategies, Fused Together
| Strategy | Vector Space | What It Catches | Example |
|---|---|---|---|
| Dense search | problem (384d, MiniLM) | Semantic meaning | "undefined array access" matches "null reference on collection" |
| Dense search | code_context (768d, CodeBERT) | Code patterns | data?.items ?? [] matches optional chaining null safety |
| Sparse search | keywords (BM25) | Exact tokens | ModuleNotFoundError matches ModuleNotFoundError exactly |
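To make the sparse row concrete, here is a minimal sketch of textbook Okapi BM25 scoring. Context8's own custom tokenizer is not shown here, so the `tokenize` regex, the `k1` and `b` defaults, and the function names are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Keep identifiers such as ModuleNotFoundError as single tokens.
    return [t.lower() for t in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text)]


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with Okapi BM25.

    Assumes docs is non-empty and at least one document has tokens.
    """
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    # Number of documents that contain each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (len(docs) - n + 0.5) / (n + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

A document that does not contain the exact token scores zero, which is why sparse search complements the dense vectors: it rewards literal matches on error names that embeddings may blur.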
Results are fused with Reciprocal Rank Fusion (RRF) and filtered by language, framework, and more. The QueryAnalyzer auto-detects query type and adjusts fusion weights:
| Query Type | Dense Weight | Code Weight | Sparse Weight |
|---|---|---|---|
| Error message (TypeError: ...) | 0.40 | 0.15 | 0.45 |
| Error + code context | 0.35 | 0.30 | 0.35 |
| Code snippet only | 0.25 | 0.55 | 0.20 |
| Natural language question | 0.60 | 0.15 | 0.25 |
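One plausible way to combine these weights with RRF is to scale each ranking's reciprocal-rank contribution by its weight. The sketch below assumes that scheme; exactly how Context8 applies the weights is not specified here. The dictionary keys, the function name, and the constant `k = 60` (a commonly used RRF default) are illustrative assumptions; only the weight values come from the table above.

```python
from collections import defaultdict

# Fusion weights from the table above, in the order (dense, code, sparse).
# The key names are hypothetical labels for the four query types.
FUSION_WEIGHTS = {
    "error_message": (0.40, 0.15, 0.45),
    "error_with_code": (0.35, 0.30, 0.35),
    "code_only": (0.25, 0.55, 0.20),
    "natural_language": (0.60, 0.15, 0.25),
}


def weighted_rrf(rankings: list[list[str]], weights: tuple[float, ...], k: int = 60) -> list[str]:
    """Fuse ranked ID lists; each list adds weight / (k + rank) per ID."""
    scores: dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Example: three rankings of the same candidate IDs, fused for an error-message query.
dense = ["a", "b", "c"]
code = ["c", "a"]
sparse = ["b", "a", "c"]
fused = weighted_rrf([dense, code, sparse], FUSION_WEIGHTS["error_message"])
```

Because RRF works on ranks rather than raw scores, the dense, code, and sparse searches do not need comparable score scales; the weights only shift how much each ranking's vote counts.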
MCP Tools
Once connected, your agent has access to:
context8_search
Search for past solutions to a problem.
Input: query (required), code_context, language, framework, limit
Output: Ranked solutions with problem, fix, code diff, confidence, tags
context8_log
Log a resolved problem for future agents.
Input: problem (required), solution (required), error_type, code_snippet,
code_diff, stack_trace, language, framework, libraries, tags, confidence
Output: Confirmation + record ID (or duplicate detection)
context8_stats
Knowledge base health check.
Input: (none)
Output: Record count, collection status, vector spaces, endpoint
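As an illustration of the context8_log input contract listed above, the following sketch validates an arguments dictionary before it is sent. The field names come from the tool description; the helper `build_log_arguments` is hypothetical, and the value types of each field (for example, whether libraries is a list) are not specified here and are left unchecked.

```python
# Field names are taken from the context8_log description; the helper is hypothetical.
LOG_REQUIRED = ("problem", "solution")
LOG_OPTIONAL = (
    "error_type", "code_snippet", "code_diff", "stack_trace",
    "language", "framework", "libraries", "tags", "confidence",
)


def build_log_arguments(**fields) -> dict:
    """Validate and return an arguments dict for a context8_log call."""
    missing = [name for name in LOG_REQUIRED if not fields.get(name)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    unknown = sorted(set(fields) - set(LOG_REQUIRED) - set(LOG_OPTIONAL))
    if unknown:
        raise ValueError(f"unknown fields: {', '.join(unknown)}")
    # Drop optional fields that were passed as None.
    return {key: value for key, value in fields.items() if value is not None}
```

Checking arguments on the caller's side gives an agent a clear error before the request reaches the server, though the server remains the authority on what it accepts.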
CLI Reference
Setup
context8 start # Start the Actian VectorAI DB container
context8 stop # Stop the container
context8 init # Create the collection
context8 init --seed # Create + seed with starter data
context8 init --seed --force # Drop, recreate, and reseed
Agent Integration
context8 add claude # Add to Claude Code (~/.claude/settings.json)
context8 add claude-project # Add to project-level Claude config
context8 add cursor # Add to Cursor (.cursor/mcp.json)
context8 add windsurf # Add to Windsurf (.windsurf/mcp.json)
context8 remove claude # Remove from Claude Code
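Conceptually, adding Context8 to an agent means registering the context8 serve command in that agent's MCP configuration. The sketch below merges one entry into a JSON config that uses an mcpServers object, the layout Cursor's mcp.json uses. The function name and the exact entry that context8 add writes are assumptions, not taken from the Context8 source.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Merge one server entry into an MCP JSON config, keeping existing entries."""
    # Start from the existing config if there is one, otherwise from an empty dict.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = {"command": command, "args": args}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))
    return config
```

For example, `add_mcp_server(Path(".cursor/mcp.json"), "context8", "context8", ["serve"])` would register the server under the name context8 while leaving any other configured servers in place.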
Operations
context8 stats # Show knowledge base statistics
context8 doctor # Full health check (Docker, DB, SDK, models, agents)
context8 search "query" # Search from the command line
context8 search "query" -l python # Search with language filter
context8 serve # Start MCP server (agents call this automatically)
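If context8 doctor reports that the database is unreachable, a quick manual check is whether anything is listening on the database port. The sketch below is a generic TCP reachability test, not the logic doctor actually runs; the default port 50051 is the gRPC port shown in the architecture diagram, and the function name is hypothetical.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 50051, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A True result only shows that something accepts connections on that port; it does not confirm that the service behind it is healthy.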
Architecture
┌─────────────────────────────────────────────────────────────┐
│ Coding Agent (Claude Code / Cursor / Windsurf) │
└──────────────────────────┬──────────────────────────────────┘
│ MCP (stdio)
┌──────────────────────────▼──────────────────────────────────┐
│ Context8 MCP Server │
│ │
│ ┌────────────────┐ ┌───────────────┐ ┌────────────────┐ │
│ │ Embedding │ │ Search │ │ Storage │ │
│ │ Pipeline │ │ Engine │ │ Service │ │
│ │ │ │ │ │ │ │
│ │ MiniLM 384d │ │ Dense+Sparse │ │ Named Vecs │ │
│ │ CodeBERT 768d │ │ RRF Fusion │ │ Filters │ │
│ │ BM25 Sparse │ │ QueryAnalyze │ │ Dedup │ │
│ └────────────────┘ └───────────────┘ └────────────────┘ │
└──────────────────────────┬──────────────────────────────────┘
│ gRPC :50051
┌──────────▼──────────┐
│ Actian VectorAI DB │
│ (Docker Container) │
│ │
│ Collection: │
│ context8_store │
│ │
│ Named Vectors: │
│ • problem 384d │
│ • solution 384d │
│ • code_ctx 768d │
│ Sparse: keywords │
│ Payload: metadata │
└─────────────────────┘
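The layout of a stored point in the diagram can be sketched as a small data structure. The vector names and dimensions are the ones shown in the diagram; the `StoredPoint` class, its field names, and the token-id-to-weight representation of the sparse vector are hypothetical and do not reflect the Actian SDK's actual types.

```python
from dataclasses import dataclass, field

# Named dense vectors and their dimensions, as shown in the diagram.
VECTOR_DIMS = {"problem": 384, "solution": 384, "code_ctx": 768}


@dataclass
class StoredPoint:
    """Hypothetical shape of one point in the context8_store collection."""

    vectors: dict[str, list[float]]                            # named dense vectors
    keywords: dict[int, float] = field(default_factory=dict)   # sparse: token id -> weight
    payload: dict = field(default_factory=dict)                # metadata used by filters

    def __post_init__(self) -> None:
        # Reject unknown vector names and wrong dimensions early.
        for name, vec in self.vectors.items():
            expected = VECTOR_DIMS.get(name)
            if expected is None:
                raise ValueError(f"unknown vector name: {name}")
            if len(vec) != expected:
                raise ValueError(f"{name}: expected {expected} dims, got {len(vec)}")
```

Only the vectors that are present get validated, so a point without code_ctx is accepted, which matches code embeddings being opt-in.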
Hackathon: Advanced Features Used
Built for the Actian VectorAI DB Build Challenge
This project uses all three advanced features required by the hackathon:
| Feature | How Context8 Uses It | Why It Matters |
|---|---|---|
| Hybrid Fusion | Dense semantic + sparse BM25 keyword vectors, fused with RRF | Error messages contain both meaning and exact tokens — you need both |
| Filtered Search | Metadata filters by language, framework, error type, resolution status | A Python agent doesn't need TypeScript solutions |
| Named Vectors | 3 separate spaces: problem (384d), solution (384d), code_context (768d) | Error descriptions, fix descriptions, and code are semantically different domains |
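The filtered-search row can be illustrated with a minimal in-memory sketch. In a vector database the filter is normally evaluated inside the engine alongside the vector search; here it is applied to a plain Python list purely to show the effect. The function names and the record layout (a payload dict per record) are assumptions.

```python
def matches(payload: dict, filters: dict) -> bool:
    """True when every filter key is present in the payload with an equal value."""
    return all(payload.get(key) == value for key, value in filters.items())


def filtered_search(
    records: list[dict], scores: list[float], filters: dict, limit: int = 5
) -> list[dict]:
    """Keep records whose payload passes the filters, best score first."""
    kept = [(s, r) for s, r in zip(scores, records) if matches(r["payload"], filters)]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in kept[:limit]]
```

With a filter such as {"language": "python"}, a higher-scoring TypeScript record is dropped before ranking, so a Python agent only sees Python solutions.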
Tech Stack
| Component | Technology | Purpose |
|---|---|---|
| Vector Database | Actian VectorAI DB | Storage, indexing, HNSW search |
| Dense Embeddings | sentence-transformers/all-MiniLM-L6-v2 | 384d text vectors (problems, solutions) |
| Code Embeddings | microsoft/codebert-base | 768d code-aware vectors (opt-in) |
| Sparse Embeddings | Custom BM25 tokenizer | Exact keyword matching |
| MCP Server | Python mcp SDK | stdio transport to agents |
| CLI | Click + Rich | Terminal UX with tables, panels, health checks |
| CI/CD | GitHub Actions | Lint → Test → Build → Publish to PyPI |
| Package | uv / pip / hatchling | PEP 517 compatible |
Seed Data
Context8 ships with 24 curated problem-solution pairs to solve the cold start problem:
| Category | Count | Examples |
|---|---|---|
| Python environment | 5 | venv conflicts, PEP 668, asyncio in Jupyter, CUDA OOM |
| Node.js / npm | 3 | peer deps, ESM vs CJS, heap out of memory |
| React / Next.js | 3 | hydration mismatch, setState in render, streaming API routes |
| TypeScript | 2 | type narrowing to never, path alias resolution |
| Docker | 2 | volume mounts on WSL2, port conflicts |
| Database | 1 | connection pool exhaustion in serverless |
| Git | 1 | lockfile merge conflicts |
| Rust | 2 | WASM no_std, borrow checker in loops |
| AI / ML | 2 | OpenAI rate limits, HuggingFace generation issues |
| Build tools | 1 | Vite prebundling cache |
| Cross-platform | 1 | Windows long path ENOENT |
Run context8 init --seed to load them. Your agents start finding solutions immediately.
Development
# Clone and set up
git clone https://github.com/hallelx2/context8.git
cd context8
uv venv && source .venv/bin/activate # or: .venv\Scripts\activate on Windows
# Install context8 + dev deps + actian client
uv pip install -e ".[all]" "actian-vectorai @ https://github.com/hackmamba-io/actian-vectorAI-db-beta/raw/main/actian_vectorai-0.1.0b2-py3-none-any.whl"
# Start the DB and verify
context8 start
context8 doctor
# Run tests (29 unit tests, no DB needed)
pytest tests/ -v
# Lint + format
ruff check src/ tests/
ruff format src/ tests/
Project Structure
context8/
├── src/context8/
│ ├── cli.py # CLI commands (start/stop/init/add/remove/stats/doctor/search)
│ ├── server.py # MCP server (context8_search, context8_log, context8_stats)
│ ├── agents.py # Agent config management (add/remove from Claude/Cursor/etc.)
│ ├── search.py # Hybrid search engine + QueryAnalyzer
│ ├── embeddings.py # MiniLM + CodeBERT + BM25 pipeline
│ ├── storage.py # Actian VectorAI DB operations (with sparse fallback)
│ ├── models.py # ResolutionRecord dataclass
│ ├── config.py # Constants, paths, agent registry
│ └── seed.py # 24 curated problem-solution starter records
├── tests/ # 29 unit tests (models, tokenizer, agents, query analyzer)
├── docs/ # Architecture, 8 build plans, bottleneck analysis
├── .github/workflows/
│ ├── ci.yml # Lint → Test (3.10 + 3.12) → Build
│ └── publish.yml # CI → Publish to PyPI → GitHub Release
├── docker-compose.yml
├── pyproject.toml
└── CLAUDE.md # Agent instructions for this codebase
Releasing
# Bump version in pyproject.toml and src/context8/__init__.py, then:
git tag v0.2.0
git push --tags
# CI runs → PyPI publishes → GitHub Release created automatically
License
Built with Actian VectorAI DB for the Actian VectorAI DB Build Challenge