Full-stack AI enablement platform
🐬 dolphin
⚠️ EXPERIMENTAL - This library is under active development. APIs and interfaces are unstable and subject to change without notice.
A semantic code search and knowledge management system with AI-native interfaces (MCP, REST API, CLI).
Quick Start
Installation
Core Installation (~200MB)
# install with uv (recommended)
uv pip install pb-dolphin
# ⚠️ IMPORTANT: Set OPENAI_API_KEY as an environment variable
export OPENAI_API_KEY="sk-your-key-here"
Optional: Cross-Encoder Reranking (~2GB additional)
For improved search quality (+20-30% MRR):
uv pip install pb-dolphin[reranking]
Trade-off: Better relevance but 2-3x slower searches. See Advanced Features for configuration.
Basic Usage
# Initialize global knowledge store and index a repository
dolphin init
dolphin add-repo my-project /path/to/project
dolphin index my-project
# Search your indexed code
dolphin search "authentication logic"
# Start API server
dolphin serve
Core Commands
- dolphin init - Initialize configuration (auto-creates ~/.dolphin/config.toml)
- dolphin init --repo - Create repo-specific config in current directory
- dolphin add-repo <name> <path> - Register a repository for indexing
- dolphin index <name> - Index a repository with language-aware chunking
- dolphin search <query> - Search indexed code semantically
- dolphin serve - Start REST API server (port 7777)
- dolphin config --show - Display current configuration
Architecture
High-Level Overview
┌──────────────────────────────────────────┐
│  AI Interfaces (Claude, Continue, etc)   │
└──────────────┬───────────────────────────┘
               │ MCP Protocol
               ▼
┌──────────────────────────────────────────┐
│          Dolphin Knowledge Base          │
│  ┌─────────────┐    ┌─────────────────┐  │
│  │ MCP Bridge  │◄──►│    REST API     │  │
│  │ (TypeScript)│    │ (Python/FastAPI)│  │
│  └─────────────┘    └────────┬────────┘  │
└──────────────────────────────┼───────────┘
                               │
                ┌──────────────┴──────────────┐
                ▼                             ▼
           ┌─────────┐                  ┌──────────┐
           │LanceDB  │                  │ SQLite   │
           │(Vectors)│                  │(Metadata)│
           └─────────┘                  └──────────┘
Key Features
- Language-Aware Chunking - Code parsing for Python, TypeScript, JavaScript, Markdown
- Semantic Search - OpenAI embeddings with LanceDB vector storage
- REST API - FastAPI server with search, retrieval, and metadata endpoints
- Unified CLI - Single dolphin command for all operations
- Configuration - Per-repo chunking and ignore configuration
- MCP Support - MCP server implementation available at bunx dolphin-mcp
Environment Variables
# Required when using OpenAI embeddings (recommended for production)
export OPENAI_API_KEY="sk-your-openai-api-key-here"
Configuration
Dolphin uses a multi-level configuration system:
- Repo-specific (./.dolphin/config.toml) - Optional per-repository chunking settings
- User-global (~/.dolphin/config.toml) - Auto-created on first use
You can use dolphin init to initialize your config and edit from there.
# ~/.dolphin/config.toml
default_embed_model = "large" # or "small"
[embedding]
provider = "openai"
batch_size = 100
[retrieval]
top_k = 8
score_cutoff = 0.0
MCP Configuration
The small companion MCP interface can be run via Bun without installation. Add the following to your AI application's config:
{
"mcpServers": {
"dolphin": {
"command": "bunx",
"args": ["dolphin-mcp"]
}
}
}
Make sure you are running the HTTP retrieval server: uv run dolphin serve
Available MCP tools: search_knowledge, fetch_chunk, fetch_lines, get_vector_store_info
REST API
# Start server
dolphin serve
# Search
curl -X POST http://127.0.0.1:7777/v1/search \
-H "Content-Type: application/json" \
-d '{"query": "authentication", "top_k": 5}'
# List repositories
curl http://127.0.0.1:7777/v1/repos
# Health check
curl http://127.0.0.1:7777/v1/health
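The same search call can be made from Python with only the standard library. This is a rough sketch that mirrors the curl request above: the endpoint, port, and JSON body come from that example, while the function names build_search_request and search are hypothetical. The response schema is not documented here, so the sketch returns the parsed JSON without assuming any fields.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7777"  # default address used in the curl examples


def build_search_request(
    query: str, top_k: int = 5, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build the POST /v1/search request without sending it."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, top_k: int = 5, base_url: str = BASE_URL, timeout: float = 10.0):
    """Send the request and return the decoded JSON body as-is."""
    request = build_search_request(query, top_k, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# With `dolphin serve` running, a call would look like:
#     results = search("authentication", top_k=5)
```

Separating request construction from sending keeps the payload easy to inspect or test without a running server.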
Advanced Features
Cross-Encoder Reranking
Cross-encoder reranking improves search result relevance by re-scoring each candidate result against the query, one (query, result) pair at a time, using an ML model. This yields 20-30% improvements in ranking quality as measured by MRR (Nogueira & Cho, 2019).
Performance Impact:
- ⚠️ 2-3x slower searches - cross-encoder is compute-intensive
- ⚠️ ~2GB install size - requires torch and sentence-transformers
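The Quick Start quotes the quality gain as MRR (mean reciprocal rank). As a reminder of what that metric measures, here is a small worked example: for each query, take 1/rank of the first relevant result, then average over queries. The file names and rankings are made-up toy data, not Dolphin benchmark results.

```python
def mean_reciprocal_rank(ranked_results: list[list[str]], relevant: list[str]) -> float:
    """Average of 1/rank of the first relevant hit per query (0 if it never appears)."""
    if not ranked_results:
        return 0.0
    total = 0.0
    for results, target in zip(ranked_results, relevant):
        for rank, item in enumerate(results, start=1):
            if item == target:
                total += 1.0 / rank
                break
    return total / len(ranked_results)


# Toy data: two queries, each with one known relevant chunk.
targets = ["auth/login.py", "db/session.py"]
baseline = [
    ["auth/utils.py", "auth/login.py", "README.md"],  # relevant hit at rank 2
    ["db/session.py", "db/models.py", "README.md"],   # relevant hit at rank 1
]
reranked = [
    ["auth/login.py", "auth/utils.py", "README.md"],  # relevant hit moved to rank 1
    ["db/session.py", "db/models.py", "README.md"],
]

print(mean_reciprocal_rank(baseline, targets))  # 0.75
print(mean_reciprocal_rank(reranked, targets))  # 1.0
```

Moving a single relevant result from rank 2 to rank 1 raises MRR from 0.75 to 1.0 in this toy case, which is why reranking the top of the list can shift the metric noticeably.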
Installation
uv pip install pb-dolphin[reranking]
Configuration
Enable in your ~/.dolphin/config.toml:
[retrieval.reranking]
enabled = true # Enable cross-encoder reranking
model = "cross-encoder/ms-marco-MiniLM-L-6-v2" # HuggingFace model
device = "" # Auto-detect (CPU or CUDA if available)
batch_size = 32 # Higher = faster but more memory
candidate_multiplier = 4 # Rerank top_k × multiplier candidates
score_threshold = 0.3 # Minimum relevance score (0-1)
Restart the API server to apply changes:
uv run dolphin serve
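To make the roles of candidate_multiplier and score_threshold concrete, the sketch below shows one plausible reading of those settings: fetch top_k × candidate_multiplier candidates, score each (query, candidate) pair, drop pairs below the threshold, and keep the best top_k. This is an assumption about the flow, not Dolphin's actual implementation. The token_overlap scorer is a deliberately simple stand-in so the example runs without torch; a real cross-encoder such as the model named in the config would take its place.

```python
from typing import Callable


def candidate_pool_size(top_k: int, candidate_multiplier: int) -> int:
    """Number of first-stage results handed to the reranker."""
    return top_k * candidate_multiplier


def rerank(
    query: str,
    candidates: list[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 8,
    score_threshold: float = 0.3,
) -> list[tuple[str, float]]:
    """Score every (query, candidate) pair, filter by threshold, return the best top_k."""
    scored = [(text, score_fn(query, text)) for text in candidates]
    kept = [pair for pair in scored if pair[1] >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


def token_overlap(query: str, text: str) -> float:
    """Stand-in scorer (NOT a cross-encoder): fraction of query tokens found in text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    return len(query_tokens & text_tokens) / len(query_tokens) if query_tokens else 0.0


candidates = ["login page styles", "parse config file", "check user login token"]
results = rerank("user login", candidates, token_overlap, top_k=2)
print(candidate_pool_size(8, 4))  # 32
print(results)
```

With the sample values from the config (top_k = 8, candidate_multiplier = 4), the reranker would see 32 candidates per query under this reading, which is consistent with the slower searches noted above.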
Development Status
Current: Beta (0.1.x)
- ✅ Core indexing and search pipeline
- ✅ Language-aware chunking (Python, TS, JS, Markdown)
- ✅ REST API with MCP bridge available at bunx dolphin-mcp
- ⚠️ Developmental stage
Upcoming:
- Performance optimization
- Production hardening
- Evaluation framework
- Expanded language support
Requirements
- Python ≥3.12
- OpenAI API key (for embeddings)
- Bun (for MCP bridge)
- Git (for repository scanning)
Testing
# Run all tests
uv run pytest
# Run specific test suite
uv run pytest tests/unit/
uv run pytest tests/integration/
License
MIT License
Acknowledgments
Built with LanceDB, OpenAI, FastAPI, and Bun
⚠️ Remember: This is experimental software under active development. Use at your own risk.