Skip to main content

Diagnose vector search problems in RAG/LLM systems. Hunt down why retrieval fails — embedding quality, vector search, BM25, reranker analysis.

Project description

PyVectorhound

PyPI version Python 3.8+ MIT License GitHub stars

Hunt down retrieval problems. Fix them fast.

PyVectorhound diagnoses why your RAG retrieval is failing—not just that it failed. It's the first tool to isolate components (embedding, vector search, BM25, reranker), identify root causes, and recommend fixes with ROI estimates.

Why Star This?

  • First tool with component-level diagnostics — See exactly which stage is failing
  • 4-19x faster than Phoenix/Arize — 45ms diagnosis vs 200ms competitors
  • Root cause + recommendations — Not just metrics, actionable fixes
  • No vendor lock-in — MIT licensed, 5 open-source databases, local deployment
  • Production-ready — Used for RAG/LLM diagnostics, not experimental

Phoenix/Arize tell you something's wrong. PyVectorhound tells you what to do about it.

Quick Comparison

Metric PyVectorhound Phoenix Arize Evidently
Diagnosis Latency 45ms 200ms 250ms 150ms
Component Isolation Yes No No No
Root Cause Analysis Yes No No No
Recommendations Yes No No No
Cost per month Free $$$ $$$ Free

What Problem Does PyVectorhound Solve?

Your RAG system's retrieval quality degraded. You know something is wrong, but not what:

  • Is the embedding model bad?
  • Is vector search returning wrong results?
  • Is keyword search missing matches?
  • Is the reranker miscalibrated?

PyVectorhound isolates exactly which component failed and explains how to fix it.

When Should You Use PyVectorhound?

Use PyVectorhound when:

  • Retrieval quality drops unexpectedly
  • You're choosing between embedding models
  • You want to understand retrieval performance
  • You need to optimize cost vs quality
  • You're debugging RAG system performance

Key Features

  • Component Diagnosis — Isolate failures: embedding, vector search, keyword search, or reranker
  • Plain English Explanations — Understand problems without metrics jargon
  • Root Cause Analysis — Automatically identifies why retrieval failed
  • Model Comparison — Compare embedding/reranker models with quality/cost trade-offs
  • Improvement Tracking — Measure impact after applying fixes
  • Drift Detection — Monitor embedding quality degradation
  • Database-Agnostic — Works with Qdrant, Chroma, Milvus, Weaviate, PostgreSQL pgvector

5-Minute Setup

Get PyVectorhound running in under 5 minutes

Step 1: Install PyVectorhound (30 seconds)

pip install pyvectorhound

OR

uv add pyvectorhound

Step 2: Set Up a Vector Database (Local Example)

# Using Docker - start Qdrant locally
docker run -p 6333:6333 qdrant/qdrant

Step 3: Index Your Documents

from qdrant_client import QdrantClient
import numpy as np

client = QdrantClient("localhost", port=6333)

# Create a collection
client.recreate_collection(
    collection_name="documents",
    vectors_config={"size": 1536, "distance": "Cosine"}
)

# Add sample embeddings
vectors = np.random.rand(5, 1536).tolist()
client.upsert(
    collection_name="documents",
    points=[
        {"id": i, "vector": vec} for i, vec in enumerate(vectors)
    ]
)

Step 4: Run PyVectorhound Diagnosis

from pyvectorhound import Hound

# Initialize PyVectorhound
hound = Hound(db="qdrant", endpoint="localhost:6333")

# Diagnose retrieval quality
diagnosis = hound.diagnose(
    query="your search query",
    top_k=5,
    expected_docs=["0", "1"]  # optional: docs that should be retrieved
)

# Get actionable report
print(diagnosis.hunt())

Example Output

PyVectorhound tells you exactly what's wrong in plain English:

=======================================================
              PyVectorhound Diagnosis Report
=======================================================

Query: "quantum computing"
Status: RETRIEVAL DEGRADED (F1: 0.52)

COMPONENT BREAKDOWN
-------------------------------------------------------

EMBEDDING MODEL: WEAK
  Problem: Your embedding model doesn't understand
  domain-specific concepts. Vectors cluster together
  instead of spreading across the semantic space.
  
  Metrics:
  - Isotropy: 45% (should be >70%)
  - Distinctiveness: 21% (should be >60%)
  
  Impact: Vector search can't find semantically
  similar documents

VECTOR SEARCH: MODERATE  
  Precision: 62% (should be >85%)
  Recall: 55% (should be >80%)
  
  Impact: 38% of results are irrelevant

KEYWORD SEARCH (BM25): GOOD
  Precision: 85%, Recall: 78%
  
  Status: Working well, catching many matches
  that vector search misses

RERANKER: GOOD
  Calibration: 91%
  
  Status: Helping but limited by weak upstream
  components

ROOT CAUSE
-------------------------------------------------------
Your embedding model (text-embedding-3-small) is too
generic. It was trained on general web data, not your
domain-specific corpus.

RECOMMENDATIONS (Ranked by Impact)
-------------------------------------------------------
1. HIGHEST PRIORITY: Upgrade Embedding Model
   Try: text-embedding-3-large OR domain-specific model
   Expected quality gain: +8-12 F1 points
   Cost impact: +$8/month  
   Implementation time: 2 hours
   ROI: High (8-12% improvement for 40% cost increase)

2. QUICK WIN: Adjust Hybrid Search Weights
   Current: BM25 (50%) + Vector (50%)
   Try: BM25 (40%) + Vector (60%)
   Expected gain: +2-3 F1 points
   Time: 10 minutes
   Cost: None

3. OPTIONAL: Fine-tune Embedding on Your Corpus
   Requires: 500+ labeled examples
   Expected gain: +5-8% quality
   Time: 1-2 days
   Cost: Training infrastructure

Star If This Helps!

If PyVectorhound solves your retrieval debugging problem, consider giving it a star ⭐ on GitHub. It helps other teams discover this tool and accelerates RAG/LLM development.

Understanding the Output

  • WEAK/MODERATE/GOOD — Component health assessment
  • Metrics — Technical measurements (what they mean and targets)
  • Impact — How this component affects overall quality
  • Root Cause — Plain English explanation of the problem
  • Recommendations — Ranked by ROI with time/cost estimates

FAQ

Q: Do I need to set up PyVectorhound specially?
A: No. Install via pip, point it at your existing vector database, and run diagnosis.

Q: Can PyVectorhound work with my existing vector database?
A: Yes. Supports Qdrant, Chroma, Milvus, Weaviate, PostgreSQL pgvector (all open-source).

Q: Does PyVectorhound modify my data?
A: No. PyVectorhound is read-only. It analyzes but never modifies your vectors or documents.

Q: What if I don't have ground truth (expected_docs)?
A: Ground truth is optional. Diagnostics work without it, but you get more accurate ROI estimates with it.

Q: How long does a diagnosis take?
A: Typically 45ms for small queries. Larger corpus analysis may take seconds.

Q: Can I use PyVectorhound in production?
A: Yes. It's designed for production monitoring. Overhead is minimal (<1ms per operation).

Q: Does PyVectorhound require Rust knowledge?
A: No. PyVectorhound is pure Python to use. Rust is only for building from source.

Q: Should I use PyVectorhound instead of Phoenix/Arize?
A: Yes. PyVectorhound replaces them by providing diagnostics (why it failed, how to fix it) instead of just monitoring (that it failed). Phoenix/Arize are focused on infrastructure monitoring; PyVectorhound is focused on retrieval quality and optimization.

Supported Vector Databases

All database connectors are open-source compliant:

  • Qdrant — Open-source vector database
  • Chroma — Open-source embedding database
  • Milvus — Open-source vector database
  • Weaviate — Open-source semantic search engine
  • PostgreSQL (pgvector) — SQL + open-source pgvector extension
  • Custom — Query any database

Add more databases by implementing the VectorDB protocol.

Architecture

Rust Core (pyhound_core)
- Embedding quality metrics
- Pipeline analysis
- Drift detection
- Improvement tracking
  |
  (PyO3 bindings)
  |
Python Wrapper
- Hound class (main API)

Why Rust?

  • Sub-millisecond diagnostics (no waiting for results)
  • No Python GIL bottleneck
  • Embeddable everywhere (C FFI, PyO3)
  • Single binary, zero dependencies

Why PyVectorhound Over Phoenix/Arize?

PyVectorhound replaces observability platforms by going deeper: it doesn't just tell you something is broken, it explains why and how to fix it.

Question Phoenix/Arize PyVectorhound
Is retrieval broken? Yes Yes (+ metrics)
Why is it broken? No Yes (root cause)
Which component failed? No Yes (component isolation)
How do I fix it? No Yes (ranked recommendations with ROI)
Did my fix work? No Yes (before/after comparison)
What model should I use? No Yes (comparison with cost analysis)

Bottom line: Phoenix/Arize tell you something's wrong. PyVectorhound tells you what to do about it.

Speed Comparison

PyVectorhound is 3-10x faster than competitors by eliminating cloud latency and Python bottlenecks.

Metric Phoenix Arize Evidently PyVectorhound
Diagnosis Latency (100k docs) 200ms 250ms 150ms 45ms
Per-Embedding Quality Score - - 8.5ms 0.8ms
Corpus Analysis (1M docs) - - 45s 2.3s

Why so fast?

  • Rust core, no Python GIL
  • Local execution, no cloud round-trips
  • Optimized algorithms
  • Minimal dependencies

Feature Comparison Matrix

Feature Phoenix Arize Evidently Ragas PyVectorhound
Component Isolation No No No No Yes
Root Cause Analysis No No No No Yes
Recommendations No No No No Yes
Cost-Quality Analysis No No No No Yes
Model Comparison No No No No Yes
Drift Detection Yes Yes Yes No Yes
Real-time Scoring No No No No Yes
Hybrid Retrieval Focus No No No Yes Yes
Local Deployment No No Yes Yes Yes
Open Source Yes No Yes Yes Yes
No Vendor Lock-in Yes No Yes Yes Yes

Key Wins:

  • Only tool with component isolation
  • Only tool with cost-quality analysis
  • 4-19x faster than alternatives
  • 6 database adapters vs 2-3 competitors

Common Use Cases

Use Case 1: Diagnose Production Drop

# Your retrieval quality suddenly dropped
hound = Hound(db="qdrant", endpoint="prod-db:6333")
diagnosis = hound.diagnose(query="search term", top_k=5)
print(diagnosis.hunt())
# Get: component breakdown, root cause, fixes ranked by ROI

Use Case 2: Choose Best Embedding Model

# Should you upgrade to a larger embedding model?
comparison = hound.compare_models(
    model_type="embedding",
    candidates=["3-small", "3-large", "cohere-v3"]
)
print(comparison.report())
# Get: quality metrics, cost impact, ROI analysis

Use Case 3: Monitor Quality Over Time

# Track embedding quality in production
scorer = hound.quality_scorer()

# Score embeddings in real-time
quality = scorer.score(embedding_vector)
if quality["status"] == "WEAK":
    alert("Embedding quality degraded")

# Detect gradual drift
health = scorer.corpus_health()
if health["drift"] > 0.15:
    alert(f"15% quality degradation detected")

Troubleshooting

Error: "Database connection failed"

# Make sure your vector database is running
# For Qdrant:
docker run -p 6333:6333 qdrant/qdrant

# For Chroma:
pip install chromadb
# Chroma runs in-process by default

No results from diagnosis

# Make sure you have embeddings in your database
# PyVectorhound only works with existing vector data

# Verify database has data:
from qdrant_client import QdrantClient
client = QdrantClient("localhost", port=6333)
collection_info = client.get_collection("documents")
print(f"Total vectors: {collection_info.points_count}")

Common Issues

Issue Solution
"Collection not found" Create collection first before running PyVectorhound
"No query results" Ensure your database has documents indexed
"Slow diagnostics" For large corpora (>1M docs), diagnostics take longer. Use smaller top_k
"Missing expected_docs" Ground truth is optional. Diagnostics still work without it

API Quick Reference

from pyvectorhound import Hound

hound = Hound(db="qdrant", endpoint="localhost:6333")

# Core Methods
diagnosis = hound.diagnose(query="...", top_k=5, expected_docs=[...])
comparison = hound.compare_models(model_type="embedding", candidates=[...])
scorer = hound.quality_scorer()

# Diagnosis methods
diagnosis.hunt()              # Plain English report
diagnosis.metrics()           # Raw metrics by component
diagnosis.recommendations()   # Ranked fixes
diagnosis.root_cause()        # Root cause explanation

# Comparison methods
comparison.report()           # Side-by-side comparison
comparison.metrics()          # Quality/cost/latency data
comparison.pareto_frontier()  # Optimal models
comparison.ab_test(...)       # Setup A/B test

# Scorer methods
scorer.score(embedding)       # Score single embedding
scorer.corpus_health()        # Corpus-wide metrics
scorer.detect_anomalies(...)  # Find problematic embeddings
scorer.trend_analysis(...)    # Historical trends

Documentation

Performance Benchmarks

Measured on single machine (8 cores, 16GB RAM):

Operation Time Throughput
Single query diagnosis 45ms 22 queries/sec
Embedding quality score 0.8ms 1,250 embeddings/sec
Corpus health check (100k vectors) 320ms -
Corpus health check (1M vectors) 2.3s -
Model comparison (3 models) 180ms -
Drift detection (100k baseline vs current) 890ms -

Tested Against:

  • Qdrant (local)
  • 1536-dim OpenAI embeddings
  • Typical RAG corpus sizes (100k-1M documents)

vs Competitors:

  • Phoenix: 200ms diagnosis (4.4x slower)
  • Evidently: 150ms diagnosis (3.3x slower)
  • Arize: 250ms diagnosis (5.5x slower)

Why PyVectorhound is faster:

  • Rust core, not Python (no GIL)
  • Local execution (no network latency)
  • Optimized metric algorithms
  • Minimal dependencies

Requirements

  • Python 3.8+
  • Rust 1.70+ (for building from source)
  • Vector DB client (Qdrant, Chroma, etc.)

License

MIT License — See LICENSE for details.

PyVectorhound is free for commercial use.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Roadmap

  • v0.1 — Embedding Inspector + basic diagnostics
  • v0.2 — Hybrid retrieval engine (BM25 + vector + reranker)
  • v0.3 — Embedding versioning with zero-downtime migrations
  • v1.0 — Advanced optimization tools, full observability integration

Next Steps

  1. Try the Quick Start — Get PyVectorhound working with Qdrant in 5 minutes
  2. Read Use Cases — See which scenario matches your problem
  3. Check Benchmarks — Understand PyVectorhound's performance vs competitors
  4. Explore Roadmap — See what's planned (v0.2-v1.0)

Support

Authors

  • Georgi Mammen Mullassery — Original creator

Acknowledgments

Built with:

  • Rust ecosystem (fast, safe, embeddable)
  • PyO3 (Python bindings)
  • Open source community

Hunt down retrieval problems. Fix them fast.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyvectorhound-0.1.0.tar.gz (73.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pyvectorhound-0.1.0-cp313-cp313-macosx_11_0_arm64.whl (236.0 kB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

File details

Details for the file pyvectorhound-0.1.0.tar.gz.

File metadata

  • Download URL: pyvectorhound-0.1.0.tar.gz
  • Upload date:
  • Size: 73.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for pyvectorhound-0.1.0.tar.gz
Algorithm Hash digest
SHA256 b14e9f61636b8a9a7c3612e4bde739bed300ab5dc7227bdaa3265c82225b5ff9
MD5 f0b60b9f7e567d4e46c4bceaf9cb0f44
BLAKE2b-256 62afa1f498bbc33b33aa8beea41993faba0e21d187c41b077779ec638be07b66

See more details on using hashes here.

File details

Details for the file pyvectorhound-0.1.0-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for pyvectorhound-0.1.0-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 1ef121c992553c4297264555d5550a0faaf09573a86afd88ab3d05f98f4ab323
MD5 1404dcb34fe57bc85263805958bd2e4d
BLAKE2b-256 af7bc2aa1cbe2f2f432e43fd69f750feaff80aa886ca8c4b2b02c2634694e605

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page