VecStream

A lightweight, efficient vector database with similarity search capabilities, optimized for machine learning applications.

Features

  • 🚀 Fast in-memory vector storage and retrieval
  • 🔍 Semantic similarity search using cosine similarity
  • 📊 Built-in text embedding using Sentence Transformers
  • 💻 Clean command-line interface (CLI) for easy interaction
  • 🛠 Python API for programmatic access

Installation

pip install vecstream

CLI Usage

VecStream provides a command-line interface for common operations:

Add a text entry

vecstream add "Your text here" text_id_1

Search for similar entries

vecstream search "Query text" --k 5 --threshold 0.5
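
The `--k` and `--threshold` options are not described further here. The sketch below shows one plausible reading, assuming `--k` caps the number of results and `--threshold` is a minimum similarity score. The `filter_results` helper and the sample scores are hypothetical and not part of VecStream:

```python
def filter_results(scores, k=5, threshold=0.5):
    """Return up to k (id, similarity) pairs scoring at least threshold, best first."""
    # Drop entries below the minimum similarity (assumed meaning of --threshold).
    kept = [(doc_id, sim) for doc_id, sim in scores.items() if sim >= threshold]
    # Highest similarity first, then cap at k results (assumed meaning of --k).
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


# Hypothetical similarity scores keyed by text ID.
sample_scores = {"text_id_1": 0.91, "text_id_2": 0.42, "text_id_3": 0.67}
top = filter_results(sample_scores, k=5, threshold=0.5)
for doc_id, sim in top:
    print(f"{doc_id}: {sim:.2f}")
```

Under this reading, raising the threshold trades recall for precision, while `k` only limits how many of the surviving matches are shown.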

Get vector by ID

vecstream get text_id_1

Remove vector

vecstream remove text_id_1

Show database info

vecstream info

Clear database

vecstream clear

Python API Usage

from vecstream import VectorStore, IndexManager, QueryEngine
from sentence_transformers import SentenceTransformer

# Initialize components
store = VectorStore()
index_manager = IndexManager(store)
query_engine = QueryEngine(index_manager)
model = SentenceTransformer('all-MiniLM-L6-v2')

# Add vectors
text = "Example text"
vector = model.encode(text)
store.add("doc1", vector)

# Search
query_vector = model.encode("Search query")
results = query_engine.search(query_vector, k=5)
for doc_id, similarity in results:  # avoid shadowing the built-in id()
    print(f"Match {doc_id}: {similarity:.4f}")

Performance

  • Fast query response times (typically < 10 ms)
  • Efficient memory usage
  • Linear scaling with dataset size
  • Support for concurrent queries
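
How VecStream serves concurrent queries is not specified here. As an illustration only, read-only searches over an in-memory mapping can run in parallel threads because no query mutates shared state. The `vectors` data and the `search` function below are hypothetical stand-ins, not the VecStream API, and the linear scan is an assumption about how "linear scaling" arises:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory store of unit-length vectors, so the dot product
# of two vectors equals their cosine similarity.
vectors = {
    "doc1": [1.0, 0.0],
    "doc2": [0.0, 1.0],
    "doc3": [0.6, 0.8],
}


def search(query, k=2):
    """Linear scan: score every stored vector once, then keep the k best."""
    scored = [
        (doc_id, sum(q * v for q, v in zip(query, vec)))
        for doc_id, vec in vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


queries = [[1.0, 0.0], [0.0, 1.0]]

# Each worker only reads from `vectors`, so no locking is needed here.
with ThreadPoolExecutor(max_workers=2) as pool:
    all_results = list(pool.map(search, queries))

for query, results in zip(queries, all_results):
    print(query, results)
```

In CPython, threads running pure-Python arithmetic like this do not run in parallel on multiple cores, so the sketch demonstrates safe concurrent reads rather than a speedup.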

Technical Details

  • Uses Sentence Transformers for text embedding
  • 384-dimensional vectors by default (the output size of the all-MiniLM-L6-v2 model)
  • Cosine similarity for vector comparison
  • In-memory storage with optional persistence
  • Rich CLI with progress indicators
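
Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it depends on direction only and ranges from -1 to 1. A minimal pure-Python sketch follows; the `cosine_similarity` helper is illustrative and is not VecStream's implementation, and returning 0.0 for a zero vector is an assumption made here:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as unrelated (assumed convention).
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # perpendicular vectors
print(same_direction, orthogonal)
```

Because the score ignores magnitude, two texts whose embeddings point the same way score close to 1 regardless of vector length.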

Requirements

  • Python 3.8+
  • numpy
  • scipy
  • scikit-learn
  • sentence-transformers
  • click
  • rich

License

MIT License

Author

Torin Etheridge

Download files

Source Distribution

vecstream-0.1.1.tar.gz (10.9 kB)

Uploaded Source

Built Distribution

vecstream-0.1.1-py3-none-any.whl (11.3 kB)

Uploaded Python 3

File details

Details for the file vecstream-0.1.1.tar.gz.

File metadata

  • Download URL: vecstream-0.1.1.tar.gz
  • Size: 10.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for vecstream-0.1.1.tar.gz
Algorithm Hash digest
SHA256 8483ded65a7ddc6bd1a5e3e0ec051c505f5e7d34177d571edd9b74df294316a0
MD5 250ae6438ec045793a095c5b1a0b1790
BLAKE2b-256 9b2ac8744f0d134e3a0763f2f7cfe6aca80d67f527fdc611117568446df2e3cc

File details

Details for the file vecstream-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: vecstream-0.1.1-py3-none-any.whl
  • Size: 11.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for vecstream-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 9325a455ccc0c68367ae6310453e555fe1ec993a78d43b34452ed9bcd0e9e342
MD5 f24a132030b194da1b63d9c1702201b5
BLAKE2b-256 258d8a1830c3d2f06fe16282f53119fd1dbab860c1a947f5e411740c1054ea3b
