VecStream
A lightweight, efficient vector database with similarity search capabilities, optimized for machine learning applications.
Features
- 🚀 Fast in-memory vector storage and retrieval
- 🔍 Semantic similarity search using cosine similarity
- 📊 Built-in text embedding using Sentence Transformers
- 💻 Clean command-line interface for easy interaction
- 🛠 Python API for programmatic access
Installation
pip install vecstream
CLI Usage
VecStream provides a command-line interface for common operations:
Add a text entry
vecstream add "Your text here" text_id_1
Search for similar entries
vecstream search "Query text" --k 5 --threshold 0.5
Get vector by ID
vecstream get text_id_1
Remove vector
vecstream remove text_id_1
Show database info
vecstream info
Clear database
vecstream clear
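For scripted bulk loading, the `add` command shown above can be driven from Python. The following is a minimal sketch, not part of VecStream: `build_add_command` and `add_many` are hypothetical helper names, and the only thing assumed about the CLI is the `vecstream add "<text>" <id>` form listed above.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def build_add_command(text: str, text_id: str) -> List[str]:
    """Build the argv list for `vecstream add "<text>" <text_id>`."""
    return ["vecstream", "add", text, text_id]


def add_many(
    entries: Iterable[Tuple[str, str]],
    run: Callable[..., object] = subprocess.run,
) -> int:
    """Run one `vecstream add` per (text_id, text) pair and return the count.

    `run` is injectable so the loop can be exercised without the CLI installed.
    """
    count = 0
    for text_id, text in entries:
        # check=True makes subprocess.run raise if the CLI exits non-zero.
        run(build_add_command(text, text_id), check=True)
        count += 1
    return count
```

Calling `add_many([("text_id_1", "Your text here")])` issues the same command as the first CLI example. Each entry starts a new process, so for large batches the Python API described below is probably the better fit.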
Python API Usage
from vecstream import VectorStore, IndexManager, QueryEngine
from sentence_transformers import SentenceTransformer
# Initialize components
store = VectorStore()
index_manager = IndexManager(store)
query_engine = QueryEngine(index_manager)
model = SentenceTransformer('all-MiniLM-L6-v2')
# Add vectors
text = "Example text"
vector = model.encode(text)
store.add("doc1", vector)
# Search
query_vector = model.encode("Search query")
results = query_engine.search(query_vector, k=5)
for doc_id, similarity in results:
    print(f"Match {doc_id}: {similarity:.4f}")
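The CLI accepts a `--threshold` option, but the Python example above does not show an equivalent. Assuming `query_engine.search` returns `(id, similarity)` pairs, as the loop above suggests, a cutoff can be applied to the results afterwards. `filter_by_threshold` is a hypothetical helper, and `sample_results` is stand-in data rather than real search output.

```python
from typing import Iterable, List, Tuple


def filter_by_threshold(
    results: Iterable[Tuple[str, float]], threshold: float
) -> List[Tuple[str, float]]:
    """Keep (id, similarity) pairs scoring at least `threshold`, best first."""
    kept = [(doc_id, sim) for doc_id, sim in results if sim >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Stand-in for the output of query_engine.search(query_vector, k=5).
sample_results = [("doc1", 0.82), ("doc2", 0.31), ("doc3", 0.57)]
for doc_id, similarity in filter_by_threshold(sample_results, threshold=0.5):
    print(f"Match {doc_id}: {similarity:.4f}")
```

With the sample data, only `doc1` and `doc3` pass the 0.5 cutoff.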
Performance
- Fast query response times (typically < 10ms)
- Efficient memory usage
- Linear scaling with dataset size
- Support for concurrent queries
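Latency depends on hardware, vector count, and dimensionality, so it is worth measuring on your own machine. The sketch below times a brute-force cosine scan in NumPy as a rough stand-in. It does not call VecStream, it uses random vectors instead of real embeddings, and `time_brute_force_query` is a hypothetical helper name.

```python
import time

import numpy as np


def time_brute_force_query(n_vectors: int, dim: int = 384, repeats: int = 5) -> float:
    """Return the median time in seconds for one full scan over n_vectors."""
    rng = np.random.default_rng(0)
    # Random unit-length vectors stand in for real embeddings.
    data = rng.standard_normal((n_vectors, dim)).astype(np.float32)
    data /= np.linalg.norm(data, axis=1, keepdims=True)
    query = data[0]

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        scores = data @ query               # dot product equals cosine for unit vectors
        top5 = np.argsort(scores)[-5:]      # indices of the five highest scores
        timings.append(time.perf_counter() - start)
    return float(np.median(timings))


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} vectors: {time_brute_force_query(n) * 1000:.3f} ms")
```

If scan time grows roughly in proportion to the number of vectors, that matches the linear scaling listed above.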
Technical Details
- Uses Sentence Transformers for text embedding
- 384-dimensional vectors by default
- Cosine similarity for vector comparison
- In-memory storage with optional persistence
- Rich CLI interface with progress indicators
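The cosine comparison listed above can be sketched in a few lines of NumPy. This illustrates the general technique over a plain dictionary and is not VecStream's implementation. `cosine_similarity` and `top_k` are hypothetical names, and the two-dimensional vectors are toy data chosen so the result is easy to check by hand.

```python
from typing import Dict, List, Tuple

import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """cos(a, b) = (a . b) / (|a| |b|); returns 0.0 if either vector is all zeros."""
    denom = float(np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.dot(a, b)) / denom if denom else 0.0


def top_k(
    query: np.ndarray, vectors: Dict[str, np.ndarray], k: int = 5
) -> List[Tuple[str, float]]:
    """Score every stored vector against `query` and return the k best."""
    scored = [(vid, cosine_similarity(query, vec)) for vid, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


store = {
    "a": np.array([1.0, 0.0]),
    "b": np.array([0.0, 1.0]),
    "c": np.array([1.0, 1.0]),
}
print(top_k(np.array([1.0, 0.0]), store, k=2))
```

Here `"a"` ranks first with similarity 1.0, followed by `"c"` at about 0.707. `"b"` is orthogonal to the query and falls outside the top two.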
Requirements
- Python 3.8+
- numpy
- scipy
- scikit-learn
- sentence-transformers
- click
- rich
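To confirm that the packages above are importable in the current environment, a small check such as the following may help. It is a sketch only: `missing_modules` and `REQUIRED_MODULES` are hypothetical names, and the check looks for importable modules rather than verifying installed versions.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Import names differ from the PyPI names for two of the packages:
# scikit-learn -> sklearn, sentence-transformers -> sentence_transformers.
REQUIRED_MODULES = [
    "numpy", "scipy", "sklearn", "sentence_transformers", "click", "rich",
]


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("All dependencies found." if not missing else f"Missing: {missing}")
```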
License
MIT License
Author
Torin Etheridge
Download files
Source Distribution
vecstream-0.1.1.tar.gz (10.9 kB)
Built Distribution
vecstream-0.1.1-py3-none-any.whl (11.3 kB)
File details
Details for the file vecstream-0.1.1.tar.gz.
File metadata
- Download URL: vecstream-0.1.1.tar.gz
- Upload date:
- Size: 10.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8483ded65a7ddc6bd1a5e3e0ec051c505f5e7d34177d571edd9b74df294316a0 |
| MD5 | 250ae6438ec045793a095c5b1a0b1790 |
| BLAKE2b-256 | 9b2ac8744f0d134e3a0763f2f7cfe6aca80d67f527fdc611117568446df2e3cc |
File details
Details for the file vecstream-0.1.1-py3-none-any.whl.
File metadata
- Download URL: vecstream-0.1.1-py3-none-any.whl
- Upload date:
- Size: 11.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9325a455ccc0c68367ae6310453e555fe1ec993a78d43b34452ed9bcd0e9e342 |
| MD5 | f24a132030b194da1b63d9c1702201b5 |
| BLAKE2b-256 | 258d8a1830c3d2f06fe16282f53119fd1dbab860c1a947f5e411740c1054ea3b |