Simple vector database operations with Qdrant
Ragger Simple
A simple Python package for vector database operations using Qdrant, designed for semantic search and document retrieval.
Features
- Connect to a Qdrant vector database (local or cloud)
- Parse, chunk, and process text into vector embeddings
- Search for relevant text chunks based on semantic similarity
- List collections in the database
- Purge collection data
- View collection statistics
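To make "semantic similarity" concrete, here is a minimal sketch of ranking chunks by cosine similarity between embedding vectors. It is an illustration only: the vectors are hand-written toy values rather than output from a sentence-transformers model, `cosine_similarity` and `top_k` are hypothetical helpers, and the distance metric this package configures in Qdrant is not stated here, so cosine is an assumption.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: List[float], chunks: Dict[str, List[float]], k: int = 5
) -> List[Tuple[str, float]]:
    """Return the k chunk names whose vectors are most similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in chunks.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (assumed values, for illustration only).
toy_chunks = {
    "cats": [1.0, 0.0, 0.0],
    "kittens": [0.9, 0.1, 0.0],
    "stock markets": [0.0, 0.0, 1.0],
}
ranked = top_k([1.0, 0.05, 0.0], toy_chunks, k=2)
print(ranked)
```

In practice the vector database performs this ranking with an index instead of a full scan; the sketch only shows what "most similar" means.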
Use Cases
- Create a semantic search engine for your documents
- Build a question-answering system with context retrieval
- Implement similarity search for content recommendations
- Create a knowledge base with semantic retrieval
Installation
pip install ragger-simple
Usage
Python API
Import and initialize VectorDB:
from ragger_simple import VectorDB
db = VectorDB(
collection_name="my_documents",
model_name="all-MiniLM-L6-v2",
model_path=None,
qdrant_url=None,
qdrant_api_key=None,
qdrant_path=None,
qdrant_timeout=500.0,
)
Constructor parameters:
- collection_name (str, default: "documents"): Qdrant collection name
- model_name (str, default: "all-MiniLM-L6-v2"): sentence-transformers model
- model_path (str, optional): local folder with your model
- qdrant_url (str, optional): cloud URL (use with API key)
- qdrant_api_key (str, optional): cloud API key
- qdrant_path (str, optional): local path for database storage
- qdrant_timeout (float, default: 500): request timeout in seconds
Methods (signatures with parameter types, defaults, and return types):
# Add documents to the database
db.add_documents(
documents: Dict[str, str],
chunk_size: int = 200,
overlap: int = 50,
)
# Search for relevant chunks
results = db.search(
query: str,
k: int = 5,
) -> List[Dict]
# List all collections
collections = db.list_collections() -> List[str]
# Delete a collection
success = db.delete_collection(collection_name: Optional[str] = None) -> bool
# Purge all points from a collection but keep its structure
success = db.purge_collection(collection_name: Optional[str] = None) -> bool
# Get statistics about a collection
stats = db.get_collection_stats(collection_name: Optional[str] = None) -> Dict[str, Any]
Example:
documents = {
"Article 1": "This is the content of article 1...",
"Article 2": "This is the content of article 2..."
}
db.add_documents(documents, chunk_size=200, overlap=50)
results = db.search("your query here", k=5)
print(results)
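The chunk_size and overlap arguments can be pictured as a sliding window over the text. The sketch below assumes both values count whitespace-separated words (the configuration guidelines describe chunk sizes in words); the package's actual splitting logic may differ, and `chunk_words` is a hypothetical helper, not part of ragger-simple.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Small numbers keep the windows easy to inspect.
sample = " ".join(f"w{i}" for i in range(10))
pieces = chunk_words(sample, chunk_size=4, overlap=2)
for piece in pieces:
    print(piece)
```

With these settings each chunk repeats the last two words of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.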
CLI Commands
The CLI provides these commands:
# Initialize vector database (saves config for future commands)
ragger-simple init --model all-MiniLM-L6-v2 --collection documents --qdrant-url "https://your-qdrant-instance.com" --qdrant-key "your-api-key" --qdrant-path "/path/to/local/db"
# Process documents into vector database
ragger-simple process --input documents.json --chunk-size 200 --overlap 50
# Search for relevant chunks
ragger-simple search --query "your query here" --k 5 --output results.json
# List all collections
ragger-simple list-collections
# Delete all points from a collection but keep its structure
ragger-simple purge-collection --collection documents --confirm
# Completely delete a collection from the database
ragger-simple delete-collection --collection documents --confirm
# View collection statistics
ragger-simple collection-stats --collection documents
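The layout of the file passed to --input is not described here. Assuming it mirrors the Dict[str, str] accepted by add_documents (a JSON object mapping document names to their text), a file could be prepared as in this sketch. The file layout and the `write_documents` helper are assumptions, and the example writes to a temporary directory.

```python
import json
import tempfile
from pathlib import Path

# Same shape as the Python API example: document name -> document text.
documents = {
    "Article 1": "This is the content of article 1...",
    "Article 2": "This is the content of article 2...",
}


def write_documents(path: Path, docs: dict) -> None:
    """Write the documents mapping as UTF-8 JSON (assumed input layout)."""
    path.write_text(json.dumps(docs, ensure_ascii=False, indent=2), encoding="utf-8")


out_dir = Path(tempfile.mkdtemp())
input_path = out_dir / "documents.json"
write_documents(input_path, documents)

# Read the file back to confirm it round-trips.
loaded = json.loads(input_path.read_text(encoding="utf-8"))
print(loaded == documents)
```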
The CLI saves your connection settings in ~/.ragger-simple/config.json for convenience.
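Persisting settings in a JSON file generally follows the pattern sketched below. The key names ("collection", "model") and the helper functions are hypothetical, since the actual contents of config.json are not documented here, and the example uses a temporary directory rather than your home directory.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_config(config_path: Path, settings: Dict[str, Any]) -> None:
    """Create the parent folder if needed and write settings as JSON."""
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(settings, indent=2), encoding="utf-8")


def load_config(config_path: Path) -> Dict[str, Any]:
    """Return the saved settings, or an empty dict if nothing was saved yet."""
    if not config_path.exists():
        return {}
    return json.loads(config_path.read_text(encoding="utf-8"))


# Stand-in for ~/.ragger-simple/config.json inside a temporary directory.
base = Path(tempfile.mkdtemp())
config_path = base / ".ragger-simple" / "config.json"
save_config(config_path, {"collection": "documents", "model": "all-MiniLM-L6-v2"})
restored = load_config(config_path)
print(restored)
```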
Configuration Guidelines
- Collection naming: Use descriptive names for different document sets
- Chunk size:
- Smaller (100-200 words): Better for precise Q&A
- Larger (300-500 words): Better for contextual understanding
- Model selection:
- all-MiniLM-L6-v2: Good balance of performance and speed
- all-mpnet-base-v2: Higher quality but slower
- Local vs Cloud:
- Local: Specify qdrant_path for persistence
- Cloud: Use both qdrant_url and qdrant_api_key
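The local versus cloud guidance can be enforced before constructing the database object. The helper below is a hypothetical sketch, not part of ragger-simple: it returns only the constructor arguments relevant to the chosen mode and rejects a cloud URL without an API key. Treating cloud and local settings as mutually exclusive is an assumption made for this example.

```python
from typing import Dict, Optional


def connection_kwargs(
    qdrant_url: Optional[str] = None,
    qdrant_api_key: Optional[str] = None,
    qdrant_path: Optional[str] = None,
) -> Dict[str, str]:
    """Pick cloud or local settings and validate the combination."""
    if qdrant_url or qdrant_api_key:
        # Cloud mode: the guidelines call for both values together.
        if not (qdrant_url and qdrant_api_key):
            raise ValueError("cloud mode needs both qdrant_url and qdrant_api_key")
        if qdrant_path:
            # Assumption: mixing cloud and local settings is a mistake.
            raise ValueError("choose either cloud settings or qdrant_path, not both")
        return {"qdrant_url": qdrant_url, "qdrant_api_key": qdrant_api_key}
    if qdrant_path:
        # Local mode: data persists under this path.
        return {"qdrant_path": qdrant_path}
    # Nothing specified: leave the choice to the package defaults.
    return {}


local = connection_kwargs(qdrant_path="./qdrant_data")
cloud = connection_kwargs(
    qdrant_url="https://your-qdrant-instance.com", qdrant_api_key="your-api-key"
)
print(local)
print(cloud)
```

The result can then be unpacked into the constructor, for example VectorDB(collection_name="my_documents", **local).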
License
MIT