Unified retrieval module for RAG system with multiple vector database support
Retriever
Unified retrieval module for RAG system with support for multiple vector databases.
Features
- Multiple vector database backends: Qdrant, ChromaDB, Milvus
- Filename search: Separate collection for efficient filename-based search
- Context enrichment: Fetch neighboring chunks for better context
- Category filtering: Filter results by accessible categories
- Unified interface: Single API for all vector stores
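The category filtering feature listed above can be pictured as an allow-list check on retrieved chunks. The following is a minimal sketch under stated assumptions, not the package's implementation: the `Chunk` type, its `category` field, and `filter_by_categories` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical retrieved chunk; not the library's document type."""
    text: str
    category: str


def filter_by_categories(chunks: list[Chunk], accessible: set[str]) -> list[Chunk]:
    """Keep only chunks whose category is in the accessible set, preserving order."""
    return [c for c in chunks if c.category in accessible]


results = [
    Chunk("Q3 revenue summary", "finance"),
    Chunk("Onboarding checklist", "hr"),
    Chunk("API rate limits", "engineering"),
]

# A caller with access to "hr" and "engineering" never sees the finance chunk.
visible = filter_by_categories(results, {"hr", "engineering"})
print([c.text for c in visible])
```

Whether the real package filters after retrieval or pushes the filter down into the vector database query is not stated here; the sketch shows only the post-retrieval variant.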
Installation
poetry add donkit-retriever
Usage
Basic Setup
from donkit.retriever import create_vectorstore_service, RetrievalConfig
from langchain.embeddings import OpenAIEmbeddings
# Configure retrieval options
config = RetrievalConfig(
    vector_database="qdrant",
    retriever_options={
        "filename_search": True,
        "partial_search": True,
        "max_retrieved_docs": 10,
    },
)
# Create service
embeddings = OpenAIEmbeddings()
service = create_vectorstore_service(
    db_type="qdrant",
    embeddings=embeddings,
    config=config,
    collection_name="my_collection",
    database_uri="http://localhost:6333",
)
# Search documents (search_documents is awaited, so call it from inside an async function)
documents = await service.search_documents(
    query="What is RAG?",
    k=5,
)
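Because `search_documents` is awaited, the call needs an event loop when it is used from an ordinary script. The sketch below shows one way to drive it with `asyncio.run`. `StubService` is a hypothetical stand-in so the snippet runs without a database; in real code the service would come from `create_vectorstore_service` as shown above.

```python
import asyncio


class StubService:
    """Hypothetical stand-in for the real service; returns canned results."""

    async def search_documents(self, query: str, k: int) -> list[str]:
        docs = [f"result {i} for {query!r}" for i in range(1, 11)]
        return docs[:k]


async def main() -> list[str]:
    service = StubService()
    # The await is legal here because main() is a coroutine.
    return await service.search_documents(query="What is RAG?", k=5)


# asyncio.run creates the event loop, runs main() to completion, and closes the loop.
documents = asyncio.run(main())
print(len(documents))
```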
Supported Vector Databases
Qdrant
service = create_vectorstore_service(
    db_type="qdrant",
    embeddings=embeddings,
    config=config,
    database_uri="http://localhost:6333",
)
ChromaDB
service = create_vectorstore_service(
    db_type="chroma",
    embeddings=embeddings,
    config=config,
    database_uri="http://localhost:8000",
)
Milvus
service = create_vectorstore_service(
    db_type="milvus",
    embeddings=embeddings,
    config=config,
    database_uri="http://localhost:19530",
)
Configuration Options
from donkit.retriever import RetrievalConfig, RetrieverOptions
config = RetrievalConfig(
    vector_database="qdrant",  # qdrant | chroma | milvus
    retriever_options=RetrieverOptions(
        filename_search=True,  # Enable filename-based search
        partial_search=True,  # Fetch neighboring chunks
        max_retrieved_docs=10,  # Max documents to retrieve
    ),
    ranker="http://ranker-service:8000",  # Optional reranker URL
)
Architecture
VectorstoreModule
Each database has its own module implementing VectorstoreModuleAbstract:
- QdrantVectorstoreModule
- ChromaVectorstoreModule
- MilvusVectorstoreModule
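The one-module-per-database layout can be illustrated with an abstract base class and a small registry that maps a `db_type` string to a concrete class. This is a sketch under stated assumptions only: `VectorstoreModuleSketch`, `InMemoryModule`, `REGISTRY`, and `create_module` are hypothetical names, and the real `VectorstoreModuleAbstract` interface may look quite different.

```python
from abc import ABC, abstractmethod


class VectorstoreModuleSketch(ABC):
    """Hypothetical stand-in for VectorstoreModuleAbstract."""

    @abstractmethod
    def search(self, query: str, k: int) -> list[str]:
        """Return up to k matching documents."""


class InMemoryModule(VectorstoreModuleSketch):
    """Toy backend that does keyword matching over a list of strings."""

    def __init__(self, docs: list[str]) -> None:
        self.docs = list(docs)

    def search(self, query: str, k: int) -> list[str]:
        terms = query.lower().split()
        hits = [d for d in self.docs if any(t in d.lower() for t in terms)]
        return hits[:k]


# Hypothetical registry: a real one would map "qdrant", "chroma", "milvus".
REGISTRY: dict[str, type[VectorstoreModuleSketch]] = {"memory": InMemoryModule}


def create_module(db_type: str, **kwargs) -> VectorstoreModuleSketch:
    """Look up the backend class for db_type and instantiate it."""
    try:
        cls = REGISTRY[db_type]
    except KeyError:
        raise ValueError(f"Unsupported vector database: {db_type!r}") from None
    return cls(**kwargs)


module = create_module("memory", docs=["RAG overview", "Qdrant setup", "Milvus tuning"])
print(module.search("rag", k=5))
```

Callers depend only on the abstract interface, so adding a backend means registering one more class rather than touching the calling code.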
VectorstoreService
Orchestrates search operations:
- Filename search (if enabled)
- Vector search
- Neighbor fetching (if partial_search enabled)
- Document combination and deduplication
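The four steps above can be sketched as a single function. Everything in this example is an assumption made for illustration: the `Chunk` record, the `retrieve` function, and the caller-supplied `vector_search` callable are hypothetical and do not reflect the actual `VectorstoreService` code. The keyword arguments merely mirror the option names from the configuration section.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Chunk:
    """Hypothetical chunk record: position `index` within file `filename`."""
    filename: str
    index: int
    text: str


def retrieve(
    query: str,
    corpus: list[Chunk],
    vector_search: Callable[[str], list[Chunk]],
    filename_search: bool = True,
    partial_search: bool = True,
    max_retrieved_docs: int = 10,
) -> list[Chunk]:
    by_key = {(c.filename, c.index): c for c in corpus}
    hits: list[Chunk] = []

    # 1. Filename search (if enabled): match query words against filenames.
    if filename_search:
        words = query.lower().split()
        hits += [c for c in corpus if any(w in c.filename.lower() for w in words)]

    # 2. Vector search, delegated to the caller-supplied function.
    hits += vector_search(query)

    # 3. Neighbor fetching: surround each hit with its previous and next chunk.
    if partial_search:
        expanded: list[Chunk] = []
        for c in hits:
            before = by_key.get((c.filename, c.index - 1))
            after = by_key.get((c.filename, c.index + 1))
            expanded += [n for n in (before, c, after) if n is not None]
        hits = expanded

    # 4. Combine and deduplicate in first-seen order, then cap the count.
    unique = list(dict.fromkeys(hits))
    return unique[:max_retrieved_docs]


corpus = [
    Chunk("guide.md", 0, "Intro"),
    Chunk("guide.md", 1, "RAG combines retrieval and generation"),
    Chunk("guide.md", 2, "Next steps"),
    Chunk("faq.md", 0, "What is a vector store?"),
]

found = retrieve("rag basics", corpus, lambda q: [corpus[1]])
print([(c.filename, c.index) for c in found])
```

Deduplicating with `dict.fromkeys` keeps the first occurrence of each chunk, so a chunk found by both filename search and vector search appears once, at its earliest position.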
Development
# Install dependencies
poetry install
# Run tests
poetry run pytest
# Run linter
poetry run ruff check .
License
Proprietary