Full-stack AI enablement platform
🐬 Dolphin
Hybrid search across all your repositories.
Dolphin indexes your repositories and lets you perform hybrid (semantic + keyword) search across them.
Quickstart
# Install
uv pip install pb-dolphin
# Set your OpenAI key (used for embeddings)
export OPENAI_API_KEY="sk-..."
# Initialize, add a repo, and search
dolphin init
dolphin add-repo my-project /path/to/project
dolphin index my-project
dolphin search "database connection pooling"
Dolphin indexes your code with language-aware chunking, embeds it, and returns ranked results.
Want live re-indexing as you edit files? Start the server:
dolphin serve
Agent Integration
A small companion MCP server is available via bunx dolphin-mcp. Add this to your AI app's MCP config:
{
"mcpServers": {
"dolphin": {
"command": "bunx",
"args": ["dolphin-mcp"]
}
}
}
With dolphin serve running, your agent can search, retrieve chunks, and read files from your indexed repos.
Additionally, a Claude skill is available in this repo's marketplace as a personal Plugin.
How it works
          You / Agent
               |
               v
┌───────────────────────────────────────┐
│                Dolphin                │
│                                       │
│   CLI ─── REST API ─── MCP Bridge     │
│              |                        │
│       ┌──────┴──────┐                 │
│       v             v                 │
│    LanceDB       SQLite               │
│   (vectors) (metadata + BM25)         │
└───────────────────────────────────────┘
Indexing: Your code is scanned, split into semantic chunks using language-aware AST parsers, embedded via OpenAI, and stored in LanceDB (vectors) and SQLite (metadata + full-text).
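To make the idea of language-aware chunking concrete, here is a minimal sketch using Python's standard ast module to split a file into one chunk per top-level function or class. This is not Dolphin's actual chunker: the Chunk type and the chunk_python_source function are hypothetical names chosen for illustration, and the real parsers may split code differently.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical chunk record; not Dolphin's real schema."""
    name: str
    start_line: int
    end_line: int
    text: str


def chunk_python_source(source: str) -> list[Chunk]:
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-based and inclusive; lineno points at
            # the def/class line, so decorators are not included in this sketch.
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append(Chunk(node.name, node.lineno, node.end_lineno, text))
    return chunks
```

Splitting on syntactic boundaries keeps each chunk a complete unit, which tends to embed more meaningfully than a fixed-size window that cuts a function in half.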
Searching: Your query is embedded and matched against both vector similarity and BM25 keyword relevance. Results are fused with Reciprocal Rank Fusion, optionally reranked with a cross-encoder, and returned as structured snippets with file paths, line numbers, and scores.
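Reciprocal Rank Fusion itself is simple enough to sketch in a few lines. The function below is an illustration of the general technique, not Dolphin's code: the function name is hypothetical, and k = 60 is the constant commonly used in the RRF literature, which may differ from what Dolphin uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document ids into one scored ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


vector_hits = ["a", "b", "c"]  # e.g. ids ranked by vector similarity
keyword_hits = ["b", "d", "a"]  # e.g. ids ranked by BM25
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Because RRF uses only ranks, it needs no score normalization between the vector and keyword result lists.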
Features
Intelligent hybrid search
- Hybrid vector + BM25 keyword search with RRF fusion
- Optional cross-encoder reranking for +20-30% ranking improvement
- MMR diversity to reduce redundant results
- Filter by repo, language, path, or glob pattern
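For readers unfamiliar with MMR (Maximal Marginal Relevance), the sketch below shows the general greedy selection technique in plain Python. It is an illustration only: the function names, the lambda_ weight, and the use of cosine similarity over raw vectors are assumptions, not a description of Dolphin's implementation.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query_vec: list[float],
    candidates: dict[str, list[float]],
    top_k: int,
    lambda_: float = 0.7,
) -> list[str]:
    """Greedily pick top_k ids, trading relevance against redundancy.

    lambda_ = 1.0 ranks purely by relevance; lower values penalize
    candidates that resemble results already selected.
    """
    selected: list[str] = []
    remaining = dict(candidates)
    while remaining and len(selected) < top_k:

        def mmr_score(doc_id: str) -> float:
            relevance = cosine(query_vec, remaining[doc_id])
            redundancy = max(
                (cosine(remaining[doc_id], candidates[s]) for s in selected),
                default=0.0,
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        del remaining[best]
    return selected
```

With near-duplicate chunks in the candidate set, a lower lambda_ pushes the second duplicate down in favor of a less similar but still relevant result.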
Language-aware indexing
- AST-based chunking for Python, TypeScript, JavaScript, Markdown, SQL, and Svelte
- Fallback text chunking for everything else
- Respects .gitignore and an optional repo-specific Dolphin config (dolphin init --repo)
Live sync
- File-watching built into dolphin serve, so edits are re-indexed automatically
- Git-aware: handles branch switches gracefully
Multiple interfaces
- dolphin CLI with compact, verbose, and JSON output modes
- FastAPI server with full search and retrieval endpoints
- MCP server for integration via bunx dolphin-mcp
CLI reference
| Command | What it does |
|---|---|
| dolphin init | Create config at ~/.dolphin/config.toml |
| dolphin add-repo <name> <path> | Register a repository |
| dolphin index <name> | Index (or re-index) a repository |
| dolphin search <query> | Search across indexed repos |
| dolphin serve | Start API server with file-watching |
| dolphin status | Show indexed repos and stats |
| dolphin repos | List registered repositories |
| dolphin rm-repo <name> | Remove a repo and its data |
| dolphin config --show | Display current config |
Search options
dolphin search "error handling" \
--repo myapp \
--lang py \
--path src/ \
--top-k 10 \
--verbose # or --json for scripting
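For scripting, the --json mode can be driven from Python with the standard subprocess module. The sketch below uses only the flags shown above; the helper names are hypothetical, and it assumes that --json prints a single JSON document to stdout. The shape of that JSON is not documented here, so the result is returned as-is without assuming any fields.

```python
import json
import subprocess


def build_search_command(query, repo=None, lang=None, path=None, top_k=None):
    """Build the argument list for `dolphin search` with JSON output."""
    cmd = ["dolphin", "search", query, "--json"]
    if repo:
        cmd += ["--repo", repo]
    if lang:
        cmd += ["--lang", lang]
    if path:
        cmd += ["--path", path]
    if top_k is not None:
        cmd += ["--top-k", str(top_k)]
    return cmd


def run_search(query, **filters):
    """Run the search and return the parsed JSON (schema not assumed)."""
    result = subprocess.run(
        build_search_command(query, **filters),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems when the query contains spaces or special characters.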
Configuration
Dolphin auto-creates its config at ~/.dolphin/config.toml when you run dolphin init. The defaults work well out of the box.
default_embed_model = "small" # "small" (faster) or "large" (better)
[retrieval]
top_k = 8
[retrieval.hybrid_search]
enabled = true
fusion_method = "rrf"
For per-repo overrides (custom ignore patterns, chunking settings), run dolphin init --repo inside a repository.
Full config reference: docs/ARCHITECTURE.md
Optional: cross-encoder reranking
For the best possible search quality, enable cross-encoder reranking. This re-scores results pairwise against your query using an ML model.
uv pip install "pb-dolphin[reranking]"
Then in ~/.dolphin/config.toml:
[retrieval.reranking]
enabled = true
Trade-offs: ~2GB disk for model weights, 2-3x slower searches.
Requirements
| Dependency | Purpose |
|---|---|
| Python 3.12+ | Core runtime |
| uv | Python package management |
| OpenAI API key | Embedding generation |
| Bun | MCP bridge runtime (optional) |
| Git | Repository scanning |
Troubleshooting
Server not responding?
curl http://127.0.0.1:7777/v1/health # check health
lsof -i :7777 # check port
dolphin serve # start it
No search results?
dolphin status # verify repos are indexed
dolphin index <repo-name> --full --force # force re-index
MCP not connecting?
- Make sure dolphin serve is running
- Check that Bun is installed: bun --version
- Set DOLPHIN_API_URL if the server isn't at http://127.0.0.1:7777
License
MIT - Plastic Beach, LLC