Extensible MCP server for semantic code search with plugin architecture supporting multiple embedding providers, vector databases, and data sources.
CodeWeaver
The missing abstraction layer between AI and your code
🎯 What is CodeWeaver?
CodeWeaver gives both humans and AI a deep, structural understanding of your project — not just text search, but real context: symbols, blocks, relationships, intent. MCP is just the delivery mechanism; CodeWeaver is the capability.
If you want AI that actually knows your code instead of guessing, this is the foundation.
⚠️ Alpha Release: CodeWeaver is in active development. Use it, break it, shape it, help make it better.
🔍 Why CodeWeaver Exists
The Problems
| Problem | Impact |
|---|---|
| 🔴 Poor Context = Poor Results | Agents are better at generating new code than understanding existing structure |
| 💸 Massive Inefficiency | Agents read the same huge files repeatedly (50%+ context waste is common) |
| 🔧 Wrong Abstraction | Tools built for humans, not for how agents actually work |
| 🔒 No Ownership | Existing solutions locked into specific IDEs or agent clients like Claude Code |
The result: Shallow, inconsistent, fragile context. And you don't control it.
CodeWeaver's Approach
- ✅ One focused capability: Structural + semantic code understanding
- ✅ Hybrid search built for code, not text
- ✅ Works offline, airgapped, or degraded
- ✅ Deploy it however you want
- ✅ One great tool instead of 30 mediocre ones
📖 Read the detailed rationale →
🚀 Getting Started
Quick Install
# Add CodeWeaver to your project
uv add --prerelease allow --dev code-weaver
# Initialize config and MCP setup
cw init
# Verify setup
cw doctor
# Start the server
cw server
📝 Note: cw init defaults to CodeWeaver's recommended profile, which requires:
- 🔑 Voyage AI API key (generous free tier)
- 🗄️ Qdrant instance (cloud or local, generous free tier for cloud, free local)
🐳 Prefer Docker? See Docker setup guide →
MCP Configuration
cw init will add CodeWeaver to your project's .mcp.json:
{
  "mcpServers": {
    "codeweaver": {
      "type": "http",
      "url": "http://127.0.0.1:9328/mcp"
    }
  }
}
💡 Why HTTP? Unlike most MCP servers, CodeWeaver defaults to streamable-http transport for a more predictable, smoother experience.
⚠️ Warning: While stdio transport is technically possible, it's untested and may cause issues due to complex background orchestration. Use at your own risk!
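cw init writes this entry for you. If you manage .mcp.json by hand or from a script, the merge can be sketched roughly as follows. The helper names here are hypothetical and are not part of CodeWeaver; only the entry itself comes from the configuration shown above.

```python
import json
from pathlib import Path

# Entry copied from the MCP configuration shown above.
CODEWEAVER_ENTRY = {"type": "http", "url": "http://127.0.0.1:9328/mcp"}


def merge_codeweaver_entry(config: dict) -> dict:
    """Return a copy of an .mcp.json mapping with the codeweaver server added.

    Other servers are preserved; an existing "codeweaver" entry is replaced.
    """
    servers = dict(config.get("mcpServers", {}))
    servers["codeweaver"] = dict(CODEWEAVER_ENTRY)
    return {**config, "mcpServers": servers}


def update_mcp_json(path: Path) -> None:
    """Hypothetical helper: read, merge, and rewrite an .mcp.json file."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(merge_codeweaver_entry(config), indent=2) + "\n")
```

Calling update_mcp_json(Path(".mcp.json")) from the project root would add the entry without touching other configured servers.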
✨ Features
- 🧠 Smart Search
- 🌐 Language Support
- 🔄 Resilient & Offline
- ⚙️ Flexible Configuration
- 🔌 Provider Support
- 🛠️ Developer Experience
🏗️ How It Works
CodeWeaver combines AST-level understanding, semantic relationships, and hybrid embeddings (sparse + dense) to deliver both contextual and literal understanding of your codebase.
The goal: give AI the fragments it should see, not whatever it can grab.
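To make "hybrid" concrete: a dense (embedding) search and a sparse (keyword-weighted) search each return their own ranked list, and the two lists have to be merged. This README does not say how CodeWeaver fuses them, so the sketch below uses reciprocal rank fusion, a common general-purpose technique, purely as an illustration. The chunk IDs are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several best-first lists of chunk IDs into one ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Higher fused scores rank first.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result lists, ordered best-first.
dense_hits = ["auth/session.py::login", "auth/tokens.py::refresh", "api/routes.py::me"]
sparse_hits = ["api/routes.py::me", "auth/session.py::login", "docs/auth.md"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Chunks that both searches agree on rise to the top, while chunks found by only one search still appear lower in the list.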
Architecture Highlights
┌─────────────────────────────────────────────────────────┐
│                      Your Codebase                      │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
        ┌────────────────┐
        │ Live Indexing  │ ← AST parsing + semantic analysis
        └────────┬───────┘
                 │
                 ▼
        ┌────────────────────────┐
        │  Hybrid Vector Store   │ ← Sparse + Dense embeddings
        └────────┬───────────────┘
                 │
                 ▼
        ┌─────────────────┐
        │ Reranking Layer │ ← Relevance optimization (heuristic and reranking model)
        └────────┬────────┘
                 │
                 ▼
        ┌──────────────────┐
        │  MCP Interface   │ ← Simple "find_code" tool (`find_code("authentication api")`)
        └────────┬─────────┘
                 │
                 ▼
            ┌─────────┐
            │   AI    │
            └─────────┘
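As a rough illustration of what AST-based chunking means at the indexing stage, the sketch below splits Python source into one chunk per top-level function or class using the standard-library ast module. It is not CodeWeaver's chunker, which covers many languages and is not described in detail here. The function name and chunk fields are assumptions made for the example.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-based and inclusive.
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append(
                {
                    "name": node.name,
                    "kind": type(node).__name__,
                    "start_line": node.lineno,
                    "end_line": node.end_lineno,
                    "text": text,
                }
            )
    return chunks


example = '''\
import os

def login(user):
    return user.upper()

class Session:
    def close(self):
        pass
'''

chunks = chunk_python_source(example)
```

Because chunk boundaries follow the syntax tree, a search hit returns a whole function or class instead of an arbitrary window of lines.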
CLI Commands
cw server # Run the MCP server
cw doctor # Full setup diagnostic
cw index # Run indexing without server
cw init # Set up MCP + config
cw list # List providers, models, capabilities
cw status # Live server status, health, index state
cw search # Test the search engine
cw config # View resolved configuration
📊 Current Status (Alpha)
Stability Snapshot: Strong Core, Prickly Edges
| Component | Status | Notes |
|---|---|---|
| 🔄 Live indexing & file watching | ⭐⭐⭐⭐ | Runs continuously; reliable |
| 🌳 AST-based chunking | ⭐⭐⭐⭐ | Full semantic/AST for 26 languages |
| 📝 Context-aware chunking | ⭐⭐⭐⭐ | 136+ languages, heuristic AST-lite |
| 🔌 Provider integration | ⭐⭐⭐ | Voyage/FastEmbed reliable, others vary |
| 🛡️ Automatic fallback | ⭐⭐⭐ | Seamless offline/degraded mode |
| 💻 CLI | ⭐⭐⭐⭐ | Core commands fully wired and tested |
| 🐳 Docker build | ⭐⭐⭐ | Skip local Qdrant setup entirely |
| 🔗 MCP interface | ⭐⭐⭐ | Core ops reliable, some edge cases |
| 🌐 HTTP endpoints | ⭐⭐⭐ | Health, metrics, state, versions stable |
Legend: ⭐⭐⭐⭐ = solid | ⭐⭐⭐ = works with quirks | ⭐⭐ = experimental | ⭐ = chaos gremlin
🗺️ Roadmap
The enhancement issues describe detailed plans. Short version:
- 📚 Way better docs – comprehensive guides and tutorials
- 🤖 AI-powered context curation – agents identify purpose and intent
- 🔧 Data provider integration – Tavily, DuckDuckGo, Context7, and more
- 💉 True DI system – replace existing registry
- 🕸️ Advanced orchestration – integrate pydantic-graph
What Will Stay: One Tool
One tool. We give AI agents one simple tool: find_code.
Agents just need to explain what they need. No complex schemas. No novella-length prompts.
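For a sense of what that looks like on the wire, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a message. The `query` argument name is an assumption, since this README does not document the tool's input schema; a real client would first complete the MCP initialize handshake and could read the actual schema from `tools/list`.

```python
import json


def build_find_code_request(query: str, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 `tools/call` message for the find_code tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "find_code",
            # Assumed argument name; check the tool's published input schema.
            "arguments": {"query": query},
        },
    }
    return json.dumps(message)


request = build_find_code_request("authentication api")
```

In practice an MCP client library builds and sends this message for you; the point is that the agent supplies a plain-language description and nothing more.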
📚 Documentation
For Users
For Developers
Product Philosophy
- 💭 Product Decisions – transparency matters
- 🤔 Why CodeWeaver? – detailed rationale
🤝 Contributing
PRs, issues, weird edge cases, feature requests — all welcome!
This is still early, and the best time to help shape the direction.
How to Contribute
- 🍴 Fork the repository
- 🌿 Create a feature branch
- ✨ Make your changes
- ✅ Add tests if applicable
- 📝 Update documentation
- 🚀 Submit a PR
You'll need to agree to our Contributor License Agreement.
Found a Bug?
🐛 Report it here – include as much detail as possible!
🔗 Links
Project
- 📦 Repository: github.com/knitli/codeweaver
- 🐛 Issues: Report bugs & request features
- 📋 Changelog: View release history
Company
- 🏢 Knitli: knitli.com
- ✍️ Blog: blog.knitli.com
- 🐦 X/Twitter: @knitli_inc
- 💼 LinkedIn: company/knitli
- 💻 GitHub: @knitli
Support the Project
We're a one-person company at the moment and make no money. If you like CodeWeaver and want to keep it going, please consider sponsoring me 😄
📦 Package Info
- Python package: codeweaver
- CLI commands: cw / codeweaver
- Python requirement: ≥3.12 (tested on 3.12, 3.13, 3.14)
- Entry point: codeweaver.cli.app:main
📄 License
Licensed under MIT OR Apache 2.0 — you choose! Some vendored code is Apache 2.0 only and some is MIT only. Everything is permissively licensed.
The project follows the REUSE specification. Every file has detailed licensing information, and we regularly generate a software bill of materials.
📊 Telemetry
By default, CodeWeaver collects anonymized telemetry to help us improve it. See the implementation or read the README.
Opt out: export CODEWEAVER__TELEMETRY__DISABLE_TELEMETRY=true
Opt in to detailed feedback (helps us improve): export CODEWEAVER__TELEMETRY__TOOLS_OVER_PRIVACY=true
⚠️ API Stability
Warning: The API will change. Our priority right now is giving you and your coding agent an awesome tool.
To deliver on that, we can't get locked into API contracts while we're in alpha. We also want you to be able to extend and build on CodeWeaver — once we get to stable releases.
Built with ❤️ by Knitli