Full-stack AI enablement platform

🐬 dolphin

⚠️ EXPERIMENTAL - This library is under active development. APIs and interfaces are unstable and subject to change without notice.

A semantic code search and knowledge management system with AI-native interfaces (MCP, REST API, CLI).

Quick Start

Installation

Core Installation (~200MB)

# Install core functionality with pip
pip install pb-dolphin

# Or with uv (recommended)
uv pip install pb-dolphin

# ⚠️ IMPORTANT: Set OPENAI_API_KEY as an environment variable
export OPENAI_API_KEY="sk-your-key-here"

Optional: Cross-Encoder Reranking (~2GB additional)

To improve search quality (+20-30% Mean Reciprocal Rank, MRR):

# With pip (quote the extra so shells such as zsh do not expand the brackets)
pip install "pb-dolphin[reranking]"

# With uv (recommended)
uv pip install "pb-dolphin[reranking]"

Trade-off: Better relevance but 2-3x slower searches. See Advanced Features for configuration.

Optional: MCP Orchestrator

For MCP server management capabilities:

# With pip (quote the extra so shells such as zsh do not expand the brackets)
pip install "pb-dolphin[orchestrator]"

# With uv
uv pip install "pb-dolphin[orchestrator]"

Basic Usage

# Initialize global knowledge store and index a repository
dolphin init
dolphin add-repo my-project /path/to/project
dolphin index my-project

# Search your indexed code
dolphin search "authentication logic"

# Start API server
dolphin serve

Core Commands

dolphin init - Initialize configuration (auto-creates ~/.dolphin/config.toml)
dolphin init --repo - Create repo-specific config in current directory
dolphin add-repo <name> <path> - Register a repository for indexing
dolphin index <name> - Index a repository with language-aware chunking
dolphin search <query> - Search indexed code semantically
dolphin serve - Start REST API server (port 7777)
dolphin config --show - Display current configuration

Architecture

High-Level Overview

┌──────────────────────────────────────────┐
│   AI Interfaces (Claude, Continue, etc)  │
└──────────────┬───────────────────────────┘
               │ MCP Protocol
               ▼
┌──────────────────────────────────────────┐
│          Dolphin Knowledge Base          │
│  ┌─────────────┐    ┌─────────────────┐  │
│  │ MCP Bridge  │◄──►│ REST API        │  │
│  │ (TypeScript)│    │ (Python/FastAPI)│  │
│  └─────────────┘    └────────┬────────┘  │
└──────────────────────────────┼───────────┘
                               │
               ┌───────────────┴────────────┐
               ▼                            ▼
          ┌─────────┐                ┌──────────┐
          │LanceDB  │                │ SQLite   │
          │(Vectors)│                │(Metadata)│
          └─────────┘                └──────────┘

Key Features

Language-Aware Chunking - Intelligent code parsing for Python, TypeScript, JavaScript, Markdown
Semantic Search - OpenAI embeddings with LanceDB vector storage
MCP Support - Native Model Context Protocol integration for Claude Desktop
REST API - FastAPI server with search, retrieval, and metadata endpoints
Unified CLI - Single dolphin command for all operations
Auto-Configuration - Smart config hierarchy (repo → user → defaults)
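
To make the idea of language-aware chunking concrete, here is a minimal sketch for Python source only. It is not Dolphin's chunker: the function name `chunk_python_source` is hypothetical, and it simply uses the standard library `ast` module to split a file at top-level function and class boundaries instead of at fixed character offsets.

```python
import ast


def chunk_python_source(source: str):
    """Return (name, start_line, end_line, text) for each top-level def or class.

    Hypothetical helper for illustration; line numbers are 1-based and the
    span starts at the `def`/`class` keyword (decorators are not included).
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append((node.name, node.lineno, node.end_lineno, text))
    return chunks


EXAMPLE_SOURCE = (
    "import os\n"
    "\n"
    "def a():\n"
    "    return 1\n"
    "\n"
    "class B:\n"
    "    def m(self):\n"
    "        return 2\n"
)

EXAMPLE_CHUNKS = chunk_python_source(EXAMPLE_SOURCE)
```

Each chunk keeps a whole definition together, which is the property that makes a chunk meaningful to embed and to show as a search result.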

Environment Variables

Dolphin requires the following environment variables depending on your usage:

Required for OpenAI Embeddings

# Required when using OpenAI embeddings (recommended for production)
export OPENAI_API_KEY="sk-your-openai-api-key-here"

Getting Your OpenAI API Key

Visit OpenAI Platform
Sign up or log in to your account
Navigate to API Keys
Click "Create new secret key"
Copy the key and set it as OPENAI_API_KEY

Configuration

Dolphin resolves configuration from several levels, in order of precedence (repo → user → built-in defaults):

Repo-specific (./.dolphin/config.toml) - Per-repository chunking settings
User-global (~/.dolphin/config.toml) - Auto-created on first use

Example Config

# ~/.dolphin/config.toml
default_embed_model = "large"  # or "small"

[embedding]
provider = "openai"
batch_size = 100

[retrieval]
top_k = 8
score_cutoff = 0.15

Claude Desktop Integration (MCP)

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "dolphin": {
      "command": "bun",
      "args": ["run", "/path/to/dolphin/mcp-bridge/src/index.ts"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Start the server: dolphin serve

Available MCP tools: search_knowledge, fetch_chunk, fetch_lines, get_vector_store_info
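
If claude_desktop_config.json already lists other MCP servers, the dolphin entry has to be added without overwriting them. The sketch below shows one way to do that on the parsed JSON; `add_dolphin_server` is a hypothetical helper, and the bridge path is the placeholder from the snippet above, not a real location.

```python
import json


def add_dolphin_server(config: dict, bridge_entry: str, api_key: str) -> dict:
    """Return a copy of `config` with a "dolphin" entry under "mcpServers".

    Hypothetical helper: existing servers are preserved and the input dict
    is left unmodified.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["dolphin"] = {
        "command": "bun",
        "args": ["run", bridge_entry],
        "env": {"OPENAI_API_KEY": api_key},
    }
    updated["mcpServers"] = servers
    return updated


existing = {"mcpServers": {"other": {"command": "node"}}}
result = add_dolphin_server(
    existing,
    "/path/to/dolphin/mcp-bridge/src/index.ts",  # placeholder path
    "sk-...",  # placeholder; read the real key from the environment
)
rendered = json.dumps(result, indent=2)
```

Writing `rendered` back to the config file is left out on purpose, since the file's location depends on the operating system.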

REST API

# Start server
dolphin serve

# Search
curl -X POST http://127.0.0.1:7777/v1/search \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication", "top_k": 5}'

# List repositories
curl http://127.0.0.1:7777/v1/repos

# Health check
curl http://127.0.0.1:7777/v1/health
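
The same search call can be made from Python with only the standard library. This is a minimal sketch that mirrors the curl example; the helper names `build_search_request` and `search` are hypothetical, and the shape of the JSON response is not documented here, so the result is returned as parsed JSON without further assumptions.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7777"  # address used in the curl examples


def build_search_request(query: str, top_k: int = 5,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /v1/search request without sending it."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, top_k: int = 5, timeout: float = 10.0):
    """Send the request to a running `dolphin serve` and parse the JSON body."""
    request = build_search_request(query, top_k)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `search("authentication")` requires the server to be running; `build_search_request` can be inspected or tested without it.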

Advanced Features

Cross-Encoder Reranking

Cross-encoder reranking improves search result relevance by re-scoring results with a more sophisticated ML model.

Performance Impact:

✅ +20-30% improvement in Mean Reciprocal Rank (MRR)
✅ Better first-result quality - more relevant top results
⚠️ 2-3x slower searches - cross-encoder is compute-intensive
⚠️ ~2GB install size - requires torch and sentence-transformers
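
For readers unfamiliar with the metric: MRR averages 1/rank of the first relevant result over a set of queries, so it rewards putting a correct hit at position 1. The sketch below computes it on made-up ranks; the numbers are illustrative only and are not Dolphin benchmark results.

```python
def mean_reciprocal_rank(first_relevant_ranks):
    """Average of 1/rank over queries.

    Ranks are 1-based; None means no relevant result was returned for that
    query and contributes 0 to the sum.
    """
    if not first_relevant_ranks:
        return 0.0
    total = sum(1.0 / rank for rank in first_relevant_ranks if rank is not None)
    return total / len(first_relevant_ranks)


# Hypothetical ranks of the first relevant hit for three queries.
example_mrr = mean_reciprocal_rank([1, 2, 4])  # (1 + 0.5 + 0.25) / 3
```

Moving the second query's first relevant hit from rank 2 to rank 1 would raise this value from about 0.583 to 0.75, which is the kind of change a reranker aims for.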

Installation

# With uv (recommended; quote the extra so shells such as zsh do not expand the brackets)
uv pip install "pb-dolphin[reranking]"

# Or with pip
pip install "pb-dolphin[reranking]"

Configuration

Enable in your ~/.dolphin/config.toml:

[retrieval.reranking]
enabled = true  # Enable cross-encoder reranking
model = "cross-encoder/ms-marco-MiniLM-L-6-v2"  # HuggingFace model
device = ""  # Auto-detect (CPU or CUDA if available)
batch_size = 32  # Higher = faster but more memory
candidate_multiplier = 4  # Rerank top_k × multiplier candidates
score_threshold = 0.3  # Minimum relevance score (0-1)

Restart the API server to apply changes:

dolphin serve

When to Use Reranking

Enable when:

Search quality is critical
You can accept higher latency
You have sufficient compute resources
Precision matters more than speed

Don't enable when:

Speed is the priority
Install size matters
Basic vector search plus hybrid search is sufficient

How It Works

Normal Search:
Query → Embeddings → Vector Search → Top Results

With Reranking:
Query → Embeddings → Vector Search → Fetch top_k×4 candidates
      → Cross-encoder re-scores each (query, result) pair
      → Re-sort by cross-encoder scores → Top Results

The cross-encoder model evaluates each query-result pair directly, providing more accurate relevance scores than simple vector similarity.
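
The reranking flow above can be sketched in a few lines. This is not Dolphin's implementation: `rerank`, `candidate_count`, and `token_overlap` are hypothetical names, and `token_overlap` is a toy stand-in for a real cross-encoder so the example runs without torch. Applying the threshold before sorting is also an assumption.

```python
from typing import Callable, List, Tuple


def candidate_count(top_k: int, candidate_multiplier: int = 4) -> int:
    """Number of candidates to fetch from vector search before reranking."""
    return top_k * candidate_multiplier


def rerank(query: str, candidates: List[str],
           score_fn: Callable[[str, str], float],
           top_k: int = 8, score_threshold: float = 0.3) -> List[Tuple[str, float]]:
    """Score each (query, candidate) pair, drop low scores, keep the best top_k."""
    scored = [(text, score_fn(query, text)) for text in candidates]
    kept = [pair for pair in scored if pair[1] >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


def token_overlap(query: str, text: str) -> float:
    """Toy scorer: fraction of query tokens that also appear in the text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


candidates = [
    "render the home page",
    "authentication middleware",
    "authentication logic for user login",
]
ranked = rerank("authentication logic", candidates, token_overlap, top_k=2)
```

With the cross-encoder extra installed, `score_fn` would wrap the configured model instead of the toy scorer; the surrounding fetch, score, filter, and sort steps stay the same.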

Development Status

Current: Pre-alpha (0.1.x)

✅ Core indexing and search pipeline
✅ Language-aware chunking (Python, TS, JS, Markdown)
✅ REST API with MCP bridge
⚠️ Still in development: APIs and interfaces are unstable

Upcoming:

Performance optimization
Production hardening
Evaluation framework
Expanded language support

Requirements

Python ≥3.12
OpenAI API key (for embeddings)
Bun (for MCP bridge)
Git (for repository scanning)

Testing

# Run all tests
uv run pytest

# Run specific test suite
uv run pytest tests/unit/
uv run pytest tests/integration/

License

MIT License

Acknowledgments

Built with LanceDB, OpenAI, FastAPI, and Bun

⚠️ Remember: This is experimental software under active development. Use at your own risk.

Release history

0.2.4 - Feb 28, 2026
0.2.3 - Feb 23, 2026
0.2.2 - Feb 22, 2026
0.2.1 - Jan 27, 2026
0.2.0 - Jan 26, 2026
0.1.13 - Nov 8, 2025
0.1.12 - Nov 4, 2025
0.1.11 - Nov 3, 2025 (this version)
0.1.10 - Nov 3, 2025
0.1.9 - Nov 2, 2025
0.1.8 - Nov 2, 2025
0.1.7 - Nov 2, 2025
0.1.6 - Nov 2, 2025

Download files

Source Distribution

pb_dolphin-0.1.11.tar.gz (97.5 kB)

Uploaded Nov 3, 2025 Source

Built Distribution

pb_dolphin-0.1.11-py3-none-any.whl (127.7 kB)

Uploaded Nov 3, 2025 Python 3

File details

Details for the file pb_dolphin-0.1.11.tar.gz.

File metadata

Download URL: pb_dolphin-0.1.11.tar.gz
Upload date: Nov 3, 2025
Size: 97.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for pb_dolphin-0.1.11.tar.gz
Algorithm	Hash digest
SHA256	`b35e7b8269bc7c3048e31d6da54f8f94a80b6b9f53b66c003be1eb0ab2622f0f`
MD5	`6bb508afb9ca9bfd7982261b9463c496`
BLAKE2b-256	`8c025377dffbca6d380ca0968360e09a13b7c5826802affe72d6e4cb8355e463`

File details

Details for the file pb_dolphin-0.1.11-py3-none-any.whl.

File metadata

Download URL: pb_dolphin-0.1.11-py3-none-any.whl
Upload date: Nov 3, 2025
Size: 127.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for pb_dolphin-0.1.11-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ea826031f18655be9f28c2e565ec058c6c603f71801aefb8845f1b97da1ff4b4`
MD5	`673e98b6d1874616ac09f9e6a0166841`
BLAKE2b-256	`108a1e8f61790208cdd086b65fcded7f7617a38173be3728bc71e44d34ff664e`
