VecStream

A lightweight, efficient vector database with similarity search capabilities, optimized for machine learning applications.

Features

  • 🚀 Fast in-memory vector storage and retrieval
  • 🔍 Semantic similarity search using cosine similarity
  • 📊 Built-in text embedding using Sentence Transformers
  • 💻 Clean CLI interface for easy interaction
  • 🛠 Python API for programmatic access

Installation

pip install vecstream

CLI Usage

VecStream provides a command-line interface for common operations:

Add a text entry

vecstream add "Your text here" text_id_1

Search for similar entries

vecstream search "Query text" --k 5 --threshold 0.5
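
As a rough illustration of what the --k and --threshold options presumably control, the sketch below drops entries whose similarity falls under a minimum and keeps the k best of the rest. It assumes the threshold is an inclusive lower bound on cosine similarity and that results come back best-first; the text above confirms neither, and top_k is a hypothetical helper, not part of VecStream.

```python
import heapq

def top_k(scores, k=5, threshold=0.5):
    """Return up to k (id, score) pairs with score >= threshold, best first.

    Hypothetical helper: treats the threshold as an inclusive lower bound.
    """
    kept = [(doc_id, s) for doc_id, s in scores.items() if s >= threshold]
    return heapq.nlargest(k, kept, key=lambda pair: pair[1])

# Made-up similarity scores, for illustration only.
scores = {"a": 0.91, "b": 0.42, "c": 0.77, "d": 0.50}
print(top_k(scores, k=2, threshold=0.5))  # [('a', 0.91), ('c', 0.77)]
```

Filtering before selecting means a strict threshold can return fewer than k results, or none at all.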

Get vector by ID

vecstream get text_id_1

Remove vector

vecstream remove text_id_1

Show database info

vecstream info

Clear database

vecstream clear

Python API Usage

from vecstream import VectorStore, IndexManager, QueryEngine
from sentence_transformers import SentenceTransformer

# Initialize components
store = VectorStore()
index_manager = IndexManager(store)
query_engine = QueryEngine(index_manager)
model = SentenceTransformer('all-MiniLM-L6-v2')

# Add vectors
text = "Example text"
vector = model.encode(text)
store.add("doc1", vector)

# Search
query_vector = model.encode("Search query")
results = query_engine.search(query_vector, k=5)
for doc_id, similarity in results:  # avoid shadowing the built-in id()
    print(f"Match {doc_id}: {similarity:.4f}")

Performance

  • Fast query response times (typically < 10ms)
  • Efficient memory usage
  • Query time scales linearly with dataset size
  • Support for concurrent queries
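
The latency and scaling figures above can be checked on your own hardware. The sketch below assumes a brute-force scan, with one dot product per stored unit vector. That is the simplest design consistent with the description, but it is not confirmed as VecStream's implementation. normalize, scan, and time_query are illustrative names, and the random vectors stand in for real embeddings.

```python
import math
import random
import time

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def scan(store, query):
    """Score every stored unit vector against the query: one dot product each."""
    q = normalize(query)
    return [(doc_id, sum(x * y for x, y in zip(vec, q)))
            for doc_id, vec in store.items()]

def time_query(n, dim=384, seed=0):
    """Build n random unit vectors, run one scan, return (comparisons, seconds)."""
    rng = random.Random(seed)
    store = {f"doc{i}": normalize([rng.gauss(0, 1) for _ in range(dim)])
             for i in range(n)}
    query = [rng.gauss(0, 1) for _ in range(dim)]
    start = time.perf_counter()
    results = scan(store, query)
    return len(results), time.perf_counter() - start

for n in (100, 200, 400):
    count, elapsed = time_query(n)
    print(f"{n} vectors: {count} comparisons, {elapsed * 1000:.2f} ms")
```

Under this assumption the number of comparisons equals the number of stored vectors, so doubling the dataset should roughly double the query time. Absolute timings depend on the machine and will differ from a NumPy-based implementation.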

Technical Details

  • Uses Sentence Transformers for text embedding
  • 384-dimensional vectors by default (the embedding size of the all-MiniLM-L6-v2 model)
  • Cosine similarity for vector comparison
  • In-memory storage with optional persistence
  • Rich CLI interface with progress indicators
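
For reference, the cosine similarity of two vectors is their dot product divided by the product of their lengths. It ranges from -1 (opposite directions) to 1 (same direction). The following plain-Python sketch implements that definition; it is not VecStream's own code. cosine_similarity is an illustrative name, and returning 0.0 for a zero-length vector is an assumption made here to avoid division by zero.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (sequences of floats)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))   # 1.0  (same direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # 0.0  (orthogonal)
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # -1.0 (opposite)
```

Because the measure depends only on direction, scaling either vector by a positive constant leaves the score unchanged.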

Requirements

  • Python 3.8+
  • numpy
  • scipy
  • scikit-learn
  • sentence-transformers
  • click
  • rich

License

MIT License

Author

Torin Etheridge
