
Yet Another RAG Pipeline (YARP) with a strong focus on an in-memory vector DB with ANN

Project description

YARP - Yet Another RAG Pipeline

Documentation

YARP (Yet Another RAG Pipeline) is a lightweight, high-performance Python library focused on in-memory vector database operations with Approximate Nearest Neighbor (ANN) search. Built for fast document similarity search and retrieval-augmented generation (RAG) applications.

🚀 Key Features

  • Fast In-Memory Vector Search: Uses Annoy (Spotify's ANN library) for lightning-fast similarity search
  • Hybrid Scoring: Combines semantic similarity (via sentence transformers) with lexical similarity (Levenshtein distance)
  • Easy Document Management: Add, delete, and update documents dynamically
  • Persistence: Save and load your vector indices to/from disk
  • Lightweight: Minimal dependencies, maximum performance
  • Configurable: Adjustable similarity metrics, tree counts, and scoring weights
  • Type Safe: Built with Pydantic models for reliable data handling
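
To make the hybrid-scoring idea concrete, here is a minimal stdlib sketch (not YARP's actual internals) of blending a semantic similarity with a normalized Levenshtein similarity, assuming both are scaled to [0, 1] before weighting:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def hybrid_score(semantic_sim: float, a: str, b: str,
                 weight_semantic: float = 0.5,
                 weight_levenshtein: float = 0.5) -> float:
    """Blend semantic and lexical similarity into one 0-100% score."""
    max_len = max(len(a), len(b)) or 1
    lexical_sim = 1.0 - levenshtein(a, b) / max_len  # normalize to [0, 1]
    return 100.0 * (weight_semantic * semantic_sim
                    + weight_levenshtein * lexical_sim)
```

The weights trade off meaning-level matches (embeddings) against surface-level matches (edit distance), which is why the hybrid approach also works for near-duplicate detection.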

📦 Installation

pip install python-yarp

Development Installation

git clone https://github.com/regmibijay/yarp.git
cd yarp
pip install -e .

🔧 Quick Start

Basic Usage

from yarp import LocalMemoryIndex

# Initialize with your documents
documents = [
    "The cat sat on the mat",
    "Python programming language", 
    "Machine learning with transformers",
    "Natural language processing",
    "Vector similarity search"
]

# Create and build the index
index = LocalMemoryIndex(documents, model_name="all-MiniLM-L6-v2")
index.process()

# Search for similar documents
results = index.query("programming languages", top_k=3)

# Access results
for result in results:
    print(f"Document: {result.document}")
    print(f"Score: {result.matching_score:.2f}%")
    print("---")

Advanced Usage with Hybrid Scoring

from yarp import LocalMemoryIndex

# Initialize index
index = LocalMemoryIndex(documents)
index.process(num_trees=256, metrics_type="angular")

# Query with custom weights
results = index.query(
    "machine learning algorithms",
    top_k=5,
    weight_semantic=0.7,      # 70% semantic similarity
    weight_levenshtein=0.3,   # 30% lexical similarity
    search_k=100             # Search more candidates for better accuracy
)

# Invert results (lowest to highest scores)
inverted_results = results.invert(inplace=False)

Document Management

# Add new documents
index.add("New document about artificial intelligence")
index.add(["Multiple", "documents", "at once"])

# Delete documents  
index.delete("The cat sat on the mat")

# Query updated index
results = index.query("AI and machine learning")

Persistence

# Save index to disk
index.backup("/path/to/backup/directory")

# Load index from disk
loaded_index = LocalMemoryIndex.load("/path/to/backup/directory")

# Continue using loaded index
results = loaded_index.query("your query here")

📖 API Reference

LocalMemoryIndex

The main class for creating and managing vector indices.

Constructor

LocalMemoryIndex(documents: List[str], model_name: str = "all-MiniLM-L6-v2")
  • documents: List of text documents to index
  • model_name: SentenceTransformer model name for embeddings

Methods

process(num_trees: int = 128, metrics_type: str = "angular")

Build the vector index with specified parameters.

  • num_trees: Number of trees in Annoy index (more trees = better accuracy, slower build)
  • metrics_type: Distance metric ("angular", "euclidean", "manhattan", "hamming", "dot")

query(q: str, top_k: int = 5, weight_semantic: float = 0.5, weight_levenshtein: float = 0.5, search_k: int = 50)

Search for similar documents.

  • q: Query string
  • top_k: Number of results to return
  • weight_semantic: Weight for semantic similarity (0.0-1.0)
  • weight_levenshtein: Weight for lexical similarity (0.0-1.0)
  • search_k: Number of candidates to search (higher = better accuracy)
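
The role of search_k can be pictured as a candidate-pool size: the ANN index inspects roughly search_k candidates, and the best top_k survive. A stdlib sketch of that final rerank step (the candidate retrieval itself is Annoy's job and not shown; pool and its scores are hypothetical):

```python
import heapq

def rerank(candidates, score_fn, top_k):
    """Keep the top_k candidates by exact score, highest first."""
    return heapq.nlargest(top_k, candidates, key=score_fn)

# Hypothetical candidate pool with precomputed scores.
pool = {"doc a": 0.91, "doc b": 0.42, "doc c": 0.77}
best = rerank(pool, score_fn=pool.get, top_k=2)
```

A larger pool costs more time to score but makes it less likely that a true nearest neighbor is missed by the approximate index.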

Returns LocalMemorySearchResult object.

add(documents: str | List[str])

Add new documents to the index. Automatically rebuilds the index.

delete(document: str)

Remove a document from the index. Automatically rebuilds the index.

backup(path: str)

Save the index and metadata to disk.

load(path: str, model_name: str = "all-MiniLM-L6-v2")

Class method to load an index from disk.

Data Models

LocalMemorySearchResult

Container for search results with built-in iteration and sorting capabilities.

class LocalMemorySearchResult(BaseModel):
    results: List[LocalMemorySearchResultEntry]
    
    def __iter__(self):
        """Iterate over results"""
        
    def invert(self, inplace: bool = True):
        """Reverse sort order of results"""

LocalMemorySearchResultEntry

Individual search result entry.

class LocalMemorySearchResultEntry(BaseModel):
    document: str           # The matched document
    matching_score: float   # Similarity score (0-100%)
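
A stdlib sketch of how such a result container could behave (using dataclasses rather than Pydantic, so the real models differ in detail):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ResultEntry:
    document: str
    matching_score: float  # 0-100%

@dataclass
class SearchResult:
    results: List[ResultEntry] = field(default_factory=list)

    def __iter__(self):
        return iter(self.results)

    def invert(self, inplace: bool = True):
        """Reverse the sort order of results."""
        reordered = list(reversed(self.results))
        if inplace:
            self.results = reordered
            return self
        return SearchResult(results=reordered)
```

With inplace=False the original ordering is preserved and a new container is returned, matching the invert(inplace=False) call in the Advanced Usage example.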

🎯 Use Cases

  • Document Similarity Search: Find similar documents in large collections
  • RAG Applications: Retrieve relevant context for language model prompts
  • Content Recommendation: Recommend similar articles, products, or content
  • Semantic Search: Search beyond exact keyword matching
  • Duplicate Detection: Find near-duplicate documents with hybrid scoring
  • Question Answering: Retrieve relevant passages for Q&A systems

⚡ Performance

YARP is optimized for speed and memory efficiency:

  • Fast Indexing: Efficient embedding generation and Annoy index building
  • Quick Queries: Sub-millisecond search times for most datasets
  • Memory Efficient: Stores embeddings in optimized Annoy format
  • Scalable: Tested with thousands of documents

Benchmarks

Operation    | Small (10 docs) | Medium (100 docs) | Large (1K docs)
Index Build  | <1s             | ~3s               | ~15s
Query Time   | <1ms            | <5ms              | <10ms
Memory Usage | ~10MB           | ~50MB             | ~200MB

Benchmarks run on a standard laptop with the all-MiniLM-L6-v2 model

🛠️ Configuration

Model Selection

Choose from various SentenceTransformer models based on your needs:

# Lightweight and fast
index = LocalMemoryIndex(docs, model_name="all-MiniLM-L6-v2")

# Better accuracy, slower
index = LocalMemoryIndex(docs, model_name="all-mpnet-base-v2")

# Multilingual support
index = LocalMemoryIndex(docs, model_name="paraphrase-multilingual-MiniLM-L12-v2")

Distance Metrics

  • angular: Cosine similarity (default, good for text)
  • euclidean: L2 distance
  • manhattan: L1 distance
  • hamming: Hamming distance (for binary vectors)
  • dot: Dot product similarity
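
Annoy reports "angular" distance as the Euclidean distance between normalized vectors, i.e. d = sqrt(2 - 2·cos(u, v)), so cosine similarity can be recovered as cos = 1 - d²/2. A small stdlib check of that relationship:

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def angular_distance(u, v):
    """Annoy-style angular distance: L2 distance of normalized vectors."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * cosine(u, v)))

def cosine_from_angular(d):
    """Invert the relationship: cos = 1 - d^2 / 2."""
    return 1.0 - d * d / 2.0
```

This is why "angular" is the natural default for text embeddings: it ranks neighbors by cosine similarity while remaining a proper distance Annoy can index.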

Tuning Parameters

  • num_trees: Higher values increase accuracy but slow down indexing
  • search_k: Higher values increase query accuracy but slow down search
  • weight_semantic/weight_levenshtein: Balance between semantic and lexical matching

🚦 Error Handling

YARP provides specific exception types for different error conditions:

from yarp.exceptions import (
    LocalMemoryTreeNotBuildException,
    LocalMemoryBadRequestException
)

try:
    results = index.query("test query")
except LocalMemoryTreeNotBuildException:
    print("Index not built yet - call process() first")
except LocalMemoryBadRequestException as e:
    print(f"Invalid request: {e}")

🧪 Testing

Run the test suite:

# Run all tests
pytest

# Run with coverage
pytest --cov=yarp

# Run only fast tests (skip integration)
pytest -m "not slow"

# Run integration tests
pytest -m integration

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Quick Development Setup

# Clone the repository
git clone https://github.com/regmibijay/yarp.git
cd yarp

# Install in development mode with dev dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run tests
pytest

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📈 Roadmap

  • Support for more embedding models (OpenAI, Cohere, etc.)
  • Batch query operations
  • Distributed index support
  • Integration with popular vector databases
  • Web API interface
  • Advanced filtering capabilities

📞 Support


Made with ❤️ for the Python community

Project details


Download files

Download the file for your platform.

Source Distribution

python_yarp-0.1.1.tar.gz (113.7 kB)

Uploaded Source

Built Distribution


python_yarp-0.1.1-py3-none-any.whl (12.0 kB)

Uploaded Python 3

File details

Details for the file python_yarp-0.1.1.tar.gz.

File metadata

  • Download URL: python_yarp-0.1.1.tar.gz
  • Upload date:
  • Size: 113.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for python_yarp-0.1.1.tar.gz
Algorithm Hash digest
SHA256 5ee302138eaf14b44871cd83cb3e6982a87aea3845333ee6305d586e3e80821d
MD5 82b36d897462a211a8e0fd157c773d17
BLAKE2b-256 7c3e39ec913dd489aea68db97712c2bd75ed4ceb1d30db5933a905323ac7753b


Provenance

The following attestation bundles were made for python_yarp-0.1.1.tar.gz:

Publisher: pip_package.yml on regmibijay/yarp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file python_yarp-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: python_yarp-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 12.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for python_yarp-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 b61592c22242180edc884c89002318eebbde4da3580298e8c007c39aa6c73f2a
MD5 65bf62fab537849f52f89d02d44514f7
BLAKE2b-256 24b474825ba104022689ee67323577e5f30384fc792dc2f185108f1fbf42b804


Provenance

The following attestation bundles were made for python_yarp-0.1.1-py3-none-any.whl:

Publisher: pip_package.yml on regmibijay/yarp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
