A high-performance, local-first embedding utility for RAG and GraphRAG.
local-vectors
A Python package for generating text embeddings locally with Hugging Face Transformers.
Installation
Install via pip:
pip install local-vectors
Or using uv for faster dependency management:
uv add local-vectors
Usage
local-vectors is designed for high-performance, local-first embedding tasks. It automatically detects your hardware (supporting MPS for Apple Silicon and CUDA for NVIDIA GPUs) to ensure optimal performance without manual configuration.
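The source of detect_device is not shown here, so the following is only a sketch of how such a check could be written on top of PyTorch. The function name pick_device and the CUDA, then MPS, then CPU preference order are assumptions for illustration, not the package's actual implementation.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu", in that order of preference.

    Hypothetical helper: a stand-in for what a device detector might do.
    """
    try:
        import torch  # treated as optional so the sketch runs without it
    except ImportError:
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # torch.backends.mps is only present in newer PyTorch releases
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


print(pick_device())
```

Falling back to "cpu" when PyTorch is missing keeps the check safe to call in minimal environments.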
1. Basic Embedding
Generate high-quality FP32 embeddings using standard sentence-transformer models.
from local_vectors import LocalEmbedder, detect_device
# Automatically selects CUDA, MPS, or CPU
device = detect_device()
client = LocalEmbedder("sentence-transformers/all-MiniLM-L6-v2", device=device)
text = "The quick brown fox jumps over the lazy dog."
embeddings = client.embed_text(text)
print(f"Dimension: {client.model_metadata['dims']}")
# Output: Dimension: 384
2. Binary Quantization
For large-scale retrieval, local-vectors supports binary embeddings. A bit-packed vector stores one bit per dimension instead of a 32-bit float, which cuts storage by a factor of 32, and search uses Hamming distance to keep retrieval accuracy high.
# Generate bit-packed binary vectors
binary_dict = client.embed_text(text, to_binary=True)
print(f"Binary dimension: {client.model_metadata['binary_dims']}")
# This returns a uint8 array suitable for Hamming distance search
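To make the idea concrete, here is a minimal standard-library sketch of sign-based binarization and Hamming distance. The helpers binarize and hamming are hypothetical, and thresholding at zero with most-significant-bit-first packing is an assumption; the package may quantize and pack differently.

```python
def binarize(vector):
    """Pack the sign of each component into bytes, 8 dimensions per byte."""
    packed = bytearray((len(vector) + 7) // 8)
    for i, value in enumerate(vector):
        if value > 0:
            packed[i // 8] |= 0x80 >> (i % 8)  # most significant bit first
    return bytes(packed)


def hamming(a, b):
    """Count the bit positions at which two packed vectors differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


query = binarize([0.3, -0.1, 0.8, -0.5, 0.2, 0.9, -0.7, 0.4])
doc_near = binarize([0.2, -0.3, 0.6, -0.4, 0.1, 0.7, -0.2, 0.5])
doc_far = binarize([-0.3, 0.1, -0.8, 0.5, -0.2, -0.9, 0.7, -0.4])

print(hamming(query, doc_near))  # 0: every sign matches
print(hamming(query, doc_far))   # 8: every sign is flipped
```

Under this scheme a 384-dimensional embedding packs into 48 bytes.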
3. Vector Database Integration (LanceDB)
The package includes a built-in wrapper for LanceDB, allowing you to manage and search your embeddings locally without a server.
from local_vectors import LanceDBConnection
import pyarrow as pa
# Initialize local database
db = LanceDBConnection("./my_vectors")
# Define a schema for FP32 vectors
schema = pa.schema([
    pa.field("text", pa.string()),
    pa.field("vector_full", pa.list_(pa.float32(), 384)),
])
db.create_table("documents", schema=schema)
# Batch embed and update
sentences = ["Machine learning is fascinating.", "I love AI."]
data = [
    {"text": s, "vector_full": client.embed_text(s)[0]["vector_full"]}
    for s in sentences
]
db.update_table("documents", data=data, mode="append")
4. Efficient Semantic Search
You can search FP32 vectors with cosine similarity, or binary vectors with Hamming distance, which is much cheaper to compute.
# Search FP32 table
results = db.search_table(
    table_name="documents",
    query_vector=client.embed_text("AI and tech")[0]["vector_full"],
    metric="cosine",
    top_k=2
)
for r in results:
    print(f"Text: {r['text']}, Score: {r['_distance']}")
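For intuition about what a cosine search returns, the sketch below runs a brute-force top-k over a few toy vectors using only the standard library. It assumes the reported _distance is cosine distance (1 minus cosine similarity), so lower values mean closer matches; cosine_distance and top_k are hypothetical helpers, not part of the package.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, rows, k):
    """Score every row against the query and keep the k closest."""
    scored = [
        {"text": r["text"], "_distance": cosine_distance(query, r["vector"])}
        for r in rows
    ]
    return sorted(scored, key=lambda r: r["_distance"])[:k]


rows = [
    {"text": "a", "vector": [1.0, 0.0]},
    {"text": "b", "vector": [0.0, 1.0]},
    {"text": "c", "vector": [1.0, 1.0]},
]
results = top_k([1.0, 0.1], rows, k=2)
for r in results:
    print(f"Text: {r['text']}, Distance: {r['_distance']:.3f}")
```

A real vector database replaces the full scan with an index, but the ranking rule is the same.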
Pro-Tips for Production
- Lazy Loading: Models are downloaded on-the-fly and cached locally in ~/.cache/local-vectors.
- Hardware Acceleration: If you are running on a machine with a dedicated GPU, local-vectors will prioritize it automatically to speed up batch processing.
- Data Types: Use the to_binary=True flag when dealing with millions of documents to keep your memory footprint low.
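A quick back-of-the-envelope calculation shows why the binary flag matters at scale. The sketch assumes raw vector storage only (4 bytes per FP32 dimension, 1 bit per binary dimension) and ignores index and metadata overhead; index_size_bytes is a hypothetical helper.

```python
def index_size_bytes(num_docs, dims, binary=False):
    """Estimate raw vector storage, excluding index and metadata overhead."""
    bytes_per_vector = (dims + 7) // 8 if binary else dims * 4
    return num_docs * bytes_per_vector


fp32_size = index_size_bytes(1_000_000, 384)
binary_size = index_size_bytes(1_000_000, 384, binary=True)

print(f"FP32:   {fp32_size / 1e6:.0f} MB")    # 1536 MB
print(f"Binary: {binary_size / 1e6:.0f} MB")  # 48 MB
```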
Cache & Model Management
To keep performance high, local-vectors caches model weights and metadata locally. If you need to switch a model version or clear the cache, use the refresh_model method:
# This will delete local cached data for the current model and re-initialize
client.refresh_model()
Warning: If you refresh a model and the vector dimensions change (e.g., switching from a 384-dim to a 768-dim model), existing LanceDB tables using the old dimensions will become incompatible. You will need to create a new table or migrate your data.
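One way to catch this early is to compare the vector length against the table's declared dimension before writing. The guard below is a sketch; check_dims is a hypothetical helper, and in practice the expected value would come from your table schema or from client.model_metadata['dims'].

```python
def check_dims(vector, expected_dims):
    """Raise a clear error if a vector does not match the table's dimension."""
    if len(vector) != expected_dims:
        raise ValueError(
            f"Embedding has {len(vector)} dims but the table expects "
            f"{expected_dims}; create a new table or migrate your data."
        )
    return vector


check_dims([0.0] * 384, expected_dims=384)  # passes silently

try:
    check_dims([0.0] * 768, expected_dims=384)
except ValueError as err:
    print(err)
```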
Collaborative Filtering or RAG?
While this package is perfect for Retrieval-Augmented Generation (RAG), you can also use these local embeddings for Recommendation Systems. By embedding user interaction history alongside document content, you can calculate similarities locally to suggest related content without ever sending user data to an external API.
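As an illustration, one simple approach is to average the embeddings of items a user has interacted with and rank the catalog against that profile. This is content-based scoring rather than true collaborative filtering, and it is an assumption about how you might use the vectors, not something the package prescribes. The helper names and the toy 3-dimensional vectors are hypothetical; real vectors would come from client.embed_text.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def user_profile(history_vectors):
    """Average the unit-length embeddings of items the user interacted with."""
    unit = [normalize(v) for v in history_vectors]
    dims = len(unit[0])
    mean = [sum(v[i] for v in unit) / len(unit) for i in range(dims)]
    return normalize(mean)


def recommend(profile, catalog, k):
    """Rank catalog items by cosine similarity to the user profile."""
    scored = [
        (sum(p * c for p, c in zip(profile, normalize(vec))), name)
        for name, vec in catalog.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)[:k]]


history = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
catalog = {
    "gpu tuning guide": [1.0, 0.1, 0.0],
    "bread recipes": [0.0, 0.1, 1.0],
    "vector search intro": [0.7, 0.3, 0.1],
}
print(recommend(user_profile(history), catalog, k=2))
```

Everything here runs in-process, so the interaction history never leaves the machine.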
File details
Details for the file local_vectors-0.1.0.tar.gz.
File metadata
- Download URL: local_vectors-0.1.0.tar.gz
- Upload date:
- Size: 19.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.11.8 {"installer":{"name":"uv","version":"0.11.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 886953147670eb87e0f5cf02d0d1d8590cc98af4ef2587afbd5263637aa7025e |
| MD5 | 7f71b17f34aab03ccff282c0721b1c37 |
| BLAKE2b-256 | c34a56270508a32c1f8f2f3dcd42ab67a739415d55c8107dbe4ae46ac8b6d40d |
File details
Details for the file local_vectors-0.1.0-py3-none-any.whl.
File metadata
- Download URL: local_vectors-0.1.0-py3-none-any.whl
- Upload date:
- Size: 15.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.11.8 {"installer":{"name":"uv","version":"0.11.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ba86a7b5554085a72cddfc6f8556e7451784e855d086e0ed9d04c41b3ce76dab |
| MD5 | e2e2e29dc9dc3575840eee72fac67d11 |
| BLAKE2b-256 | 999779f8d6403edc945cdf2edb6a3a97b81f4034a9bde52eaa84a5bbf544d753 |