A Rust-based in-memory vector store with fastembed
FastEmbed VectorStore
A high-performance, Rust-based in-memory vector store with FastEmbed integration for Python applications.
Overview
FastEmbed VectorStore is a lightweight in-memory vector store that uses Rust and the FastEmbed library for text embedding and similarity search. It is designed for applications that need quick semantic search without the overhead of an external database system.
Features
- 🚀 High Performance: Built in Rust with Python bindings for optimal speed
- 🧠 Multiple Embedding Models: Support for 30+ pre-trained embedding models including BGE, Nomic, GTE, and more
- 💾 In-Memory Storage: Fast in-memory vector storage with persistence capabilities
- 🔍 Similarity Search: Cosine similarity-based search with customizable result limits
- 💾 Save/Load: Persist and restore vector stores to/from JSON files
- 🐍 Python Integration: Seamless Python API with PyO3 bindings
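To make the similarity search feature concrete, here is a minimal pure-Python sketch of cosine-similarity ranking. It is an illustration only, not the library's Rust implementation; the names `cosine_similarity` and `top_n` and the two-dimensional toy vectors are assumptions made up for this example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def top_n(
    query_vec: Sequence[float],
    docs: List[Tuple[str, Sequence[float]]],
    n: int,
) -> List[Tuple[str, float]]:
    """Score every (text, vector) pair against the query and keep the n best."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Hypothetical toy "embeddings"; real models produce hundreds of dimensions.
toy_docs = [
    ("about cats", [1.0, 0.0]),
    ("about dogs", [0.0, 1.0]),
    ("cats and dogs", [1.0, 1.0]),
]
for text, score in top_n([1.0, 0.1], toy_docs, n=2):
    print(f"{text}: {score:.4f}")
```

The result limit corresponds to the `n` argument here: every stored vector is scored, and only the highest-scoring entries are returned.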
Supported Embedding Models
The library supports a wide variety of embedding models:
- BGE Models: BGEBaseENV15, BGELargeENV15, BGESmallENV15 (with quantized variants)
- Nomic Models: NomicEmbedTextV1, NomicEmbedTextV15 (with quantized variants)
- GTE Models: GTEBaseENV15, GTELargeENV15 (with quantized variants)
- Multilingual Models: MultilingualE5Small, MultilingualE5Base, MultilingualE5Large
- Specialized Models: ClipVitB32, JinaEmbeddingsV2BaseCode, ModernBertEmbedLarge
- And many more...
Installation
Prerequisites
- Python 3.8 or higher
- Rust toolchain (to build from source)
Install from PyPI
pip install fastembed-vectorstore
From Source
- Clone the repository:
git clone https://github.com/sauravniraula/fastembed_vectorstore.git
cd fastembed_vectorstore
- Install the package:
maturin develop
Quick Start
from fastembed_vectorstore import FastembedVectorstore, FastembedEmbeddingModel
# Initialize with a model
model = FastembedEmbeddingModel.BGESmallENV15
vectorstore = FastembedVectorstore(model)
# Optional Configurations
# vectorstore = FastembedVectorstore(
#     model,
#     show_download_progress=False,  # default: True
#     cache_directory="fastembed_cache",  # default: fastembed_cache
# )
# Add documents
documents = [
    "The quick brown fox jumps over the lazy dog",
    "A quick brown dog jumps over the lazy fox",
    "The lazy fox sleeps while the quick brown dog watches",
    "Python is a programming language",
    "Rust is a systems programming language",
]
# Embed and store documents
success = vectorstore.embed_documents(documents)
print(f"Documents embedded: {success}")
# Search for similar documents
query = "What is Python?"
results = vectorstore.search(query, n=3)
for doc, similarity in results:
    print(f"Document: {doc}")
    print(f"Similarity: {similarity:.4f}")
    print("---")
# Save the vector store
vectorstore.save("my_vectorstore.json")
# Load the vector store later
loaded_vectorstore = FastembedVectorstore.load(model, "my_vectorstore.json")
# Optional Configurations
# loaded_vectorstore = FastembedVectorstore.load(
#     model,
#     "my_vectorstore.json",
#     show_download_progress=False,  # default: True
#     cache_directory="fastembed_cache",  # default: fastembed_cache
# )
API Reference
FastembedEmbeddingModel
Enum containing all supported embedding models. Choose based on your use case:
- Small models: Faster, lower memory usage (e.g., BGESmallENV15)
- Base models: Balanced performance (e.g., BGEBaseENV15)
- Large models: Higher quality embeddings (e.g., BGELargeENV15)
- Quantized models: Reduced memory usage (e.g., BGESmallENV15Q)
FastembedVectorstore
Constructor
vectorstore = FastembedVectorstore(
    model: FastembedEmbeddingModel,
    show_download_progress: bool | None = ...,
    cache_directory: str | os.PathLike[str] | None = ...,
)
Args:
- model: Embedding model to use.
- show_download_progress: Whether to show model download progress. Defaults to True.
- cache_directory: Directory to cache/download model files. Defaults to ./fastembed.
Methods
embed_documents(documents: List[str]) -> bool
Embeds a list of documents and stores them in the vector store.
search(query: str, n: int) -> List[Tuple[str, float]]
Searches for the most similar documents to the query. Returns a list of tuples containing (document, similarity_score).
save(path: str) -> bool
Saves the vector store to a JSON file.
load(model: FastembedEmbeddingModel, path: str) -> FastembedVectorstore
Loads a vector store from a JSON file.
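As a rough illustration of what persisting an in-memory store to JSON involves, the sketch below writes documents and their vectors to a file and reads them back. The file layout (`documents` and `embeddings` keys) and the helper names `save_store` and `load_store` are hypothetical; this README does not describe the library's actual on-disk format, so treat this as a conceptual example rather than a compatible reader or writer.

```python
import json
import os
import tempfile
from typing import List, Tuple


def save_store(path: str, documents: List[str], embeddings: List[List[float]]) -> None:
    """Write documents and their vectors to one JSON file (hypothetical layout)."""
    if len(documents) != len(embeddings):
        raise ValueError("each document needs exactly one embedding")
    payload = {"documents": documents, "embeddings": embeddings}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)


def load_store(path: str) -> Tuple[List[str], List[List[float]]]:
    """Read back the documents and vectors written by save_store."""
    with open(path, "r", encoding="utf-8") as fh:
        payload = json.load(fh)
    return payload["documents"], payload["embeddings"]


# Round trip through a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    store_path = os.path.join(tmp_dir, "toy_store.json")
    save_store(store_path, ["first doc", "second doc"], [[0.1, 0.2], [0.3, 0.4]])
    restored_docs, restored_vecs = load_store(store_path)

print(restored_docs)
print(restored_vecs)
```

Stored vectors are only comparable with query vectors produced by the same embedding model, which is presumably why `load` takes a `model` argument: a store should be reloaded with the model that was used to build it.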
Performance Considerations
- Memory Usage: All embeddings are stored in memory, so consider the size of your document collection
- Model Selection: Smaller models are faster but may have lower quality embeddings
- Batch Processing: The embed_documents method processes documents in batches for efficiency
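For the memory point above, a back-of-the-envelope estimate can help with sizing. The sketch assumes each embedding value is stored as a 4-byte float and uses 384 dimensions, the output size published for bge-small-en-v1.5; both figures are assumptions about this library's internals and should be checked against the model you pick. It counts only the raw vectors, not the document text or container overhead.

```python
def estimate_embedding_bytes(
    num_documents: int,
    dimensions: int,
    bytes_per_value: int = 4,  # assumption: 32-bit floats
) -> int:
    """Lower-bound estimate of the memory held by the raw embedding vectors."""
    if num_documents < 0 or dimensions < 0 or bytes_per_value <= 0:
        raise ValueError("arguments must be non-negative (bytes_per_value positive)")
    return num_documents * dimensions * bytes_per_value


# Example: 100,000 documents embedded into 384 dimensions.
total_bytes = estimate_embedding_bytes(100_000, 384)
print(f"{total_bytes / (1024 * 1024):.1f} MiB")  # prints "146.5 MiB"
```

A larger model with more dimensions scales this figure linearly, which is one reason to start with a small model when the collection is large.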
Use Cases
- Semantic Search: Find documents similar to a query
- Document Clustering: Group similar documents together
- Recommendation Systems: Find similar items or content
- Question Answering: Retrieve relevant context for Q&A systems
- Content Discovery: Help users find related content
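For the question-answering use case, the (document, similarity_score) tuples returned by search can be turned into a context string for a downstream Q&A step. The helper below is a sketch: `build_context`, the `min_similarity` threshold, and the `max_chars` budget are hypothetical names and defaults chosen for illustration, not part of the library.

```python
from typing import List, Tuple


def build_context(
    results: List[Tuple[str, float]],
    min_similarity: float = 0.5,
    max_chars: int = 1000,
) -> str:
    """Join sufficiently similar documents into one context string."""
    parts: List[str] = []
    used = 0
    for doc, score in results:
        if score < min_similarity:
            continue  # skip weak matches
        if used + len(doc) > max_chars:
            break  # stop once the character budget would be exceeded
        parts.append(doc)
        used += len(doc)
    return "\n\n".join(parts)


# Hand-written stand-in for the output of a search call.
sample_results = [
    ("Python is a programming language", 0.82),
    ("Rust is a systems programming language", 0.61),
    ("The quick brown fox jumps over the lazy dog", 0.12),
]
print(build_context(sample_results))
```

A suitable threshold depends on the embedding model and the data, so the default here is only a placeholder.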
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Author
- Saurav Niraula - sauravniraula
- Email: developmentsaurav@gmail.com
Acknowledgments
Download files
File details
Details for the file fastembed_vectorstore-0.5.0.tar.gz.
File metadata
- Download URL: fastembed_vectorstore-0.5.0.tar.gz
- Upload date:
- Size: 34.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b9bf3ad38d3b913464f60bfc90ab958480df77cb474bbb0ec4f539613464338d |
| MD5 | a15c68d2d8b328a254d62c3e30d24d40 |
| BLAKE2b-256 | 94234bb4e3587c65c683aa4d43c8172fdee0c6492917aa02f4d8749ead6708b2 |
File details
Details for the file fastembed_vectorstore-0.5.0-cp314-cp314t-win_amd64.whl.
File metadata
- Download URL: fastembed_vectorstore-0.5.0-cp314-cp314t-win_amd64.whl
- Upload date:
- Size: 8.8 MB
- Tags: CPython 3.14t, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2353d6332ee0ffe7d81f99c67075bf7c9bfc85a1d84057b5593441c53a99a288 |
| MD5 | 473eb3afa2dbf2c646accdab75357ea4 |
| BLAKE2b-256 | 3a8e6d1c51c7e6475a144c3f7dce0bc88be452723e9743d528d6a143fb56c4db |
File details
Details for the file fastembed_vectorstore-0.5.0-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: fastembed_vectorstore-0.5.0-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 11.8 MB
- Tags: CPython 3.14t, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fd56cf96ea4700bf6c710cf8a296cc01531c5e3ec993f1943c9730be1387fe4a |
| MD5 | f70cc8b3aea0cd41ca68544e8f6545cd |
| BLAKE2b-256 | 32e1d86cc1a40c000aa8b72d432452080d0b66661bf9cfc9c3946cde9b94a313 |
File details
Details for the file fastembed_vectorstore-0.5.0-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: fastembed_vectorstore-0.5.0-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 11.1 MB
- Tags: CPython 3.14t, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b22a954f0d10ca50e1befa5258f407f66259d5267b507350e5f79eda76ab815d |
| MD5 | 9add3cb3164f568aaa9cad3d64a7b2a7 |
| BLAKE2b-256 | b21459c0afe7f53865ac08606fcd089ea0faaf03bf15f63e4ed438182d5e42c9 |
File details
Details for the file fastembed_vectorstore-0.5.0-cp314-cp314t-macosx_11_0_arm64.whl.
File metadata
- Download URL: fastembed_vectorstore-0.5.0-cp314-cp314t-macosx_11_0_arm64.whl
- Upload date:
- Size: 9.0 MB
- Tags: CPython 3.14t, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 675555cd3a314e2f1f6cb6a25cd2bedb4c3d9ca90ba5f833acc875fb03582fab |
| MD5 | abc5fb22b22d66a9afbb6ccc9617309c |
| BLAKE2b-256 | 3b96b6751ae00c49645e815a4e8059072be13d19d1138249a063051e4f2afa65 |
File details
Details for the file fastembed_vectorstore-0.5.0-cp314-cp314t-macosx_10_12_x86_64.whl.
File metadata
- Download URL: fastembed_vectorstore-0.5.0-cp314-cp314t-macosx_10_12_x86_64.whl
- Upload date:
- Size: 10.0 MB
- Tags: CPython 3.14t, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0ca7606729803276e93e5f01439d109e9c98b40e9cbf3e76737e08be7ebd3969 |
| MD5 | 95b986c3b1b3d8d1d7b290d5116ccdf3 |
| BLAKE2b-256 | 362806401d03fe7366c776e7206cbdc4c3448d2beae2b71ba4f9a3dbb6ca5023 |
File details
Details for the file fastembed_vectorstore-0.5.0-cp38-abi3-win_amd64.whl.
File metadata
- Download URL: fastembed_vectorstore-0.5.0-cp38-abi3-win_amd64.whl
- Upload date:
- Size: 8.8 MB
- Tags: CPython 3.8+, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e67c88f2311c37c43315b4254baf2034263e746e4091936e8c7b40f20989a305 |
| MD5 | 1102515492f2cef2adef908db81034c0 |
| BLAKE2b-256 | 66b43c69009d924a51e0b0283fcb23da0456e08cec7482d0682bf7c5bf4fe442 |
File details
Details for the file fastembed_vectorstore-0.5.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: fastembed_vectorstore-0.5.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 11.8 MB
- Tags: CPython 3.8+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 984a1e176c04653f0e433e469475c08449a0a702a60464195d98ab28e09eb75b |
| MD5 | 87ff598d726d83f8fe4b969b48a50616 |
| BLAKE2b-256 | 57204611cc2b82fbb427ec4fe428f70de20392f9e87412de2d087763c5654698 |
File details
Details for the file fastembed_vectorstore-0.5.0-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: fastembed_vectorstore-0.5.0-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 11.1 MB
- Tags: CPython 3.8+, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 31c843ea2dfff66c5ceb809f24f75d875275bcfaee9e93e44ceef86c2af510c5 |
| MD5 | 6ae8d7c0a287809dd92cb4d575ffb3c2 |
| BLAKE2b-256 | 022c6bc7824e9a1a4c7703e258142f58a17799c605e89ce326d0ae58e7798554 |
File details
Details for the file fastembed_vectorstore-0.5.0-cp38-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: fastembed_vectorstore-0.5.0-cp38-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 9.0 MB
- Tags: CPython 3.8+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d1c5e63d59880f673262e924b74708e0e8e9419da8fac9846bed8781d6bcd77d |
| MD5 | fffb6cf9e9b2ea9f8abeb5d4ecbf4b91 |
| BLAKE2b-256 | 863b224306030fd32b46f46bd7a928a1cca2a56db6fef242b2b5c609cdf231ed |
File details
Details for the file fastembed_vectorstore-0.5.0-cp38-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: fastembed_vectorstore-0.5.0-cp38-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 10.0 MB
- Tags: CPython 3.8+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.3 {"installer":{"name":"uv","version":"0.10.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 05f3e172c7da84dcd09fafb3c5ec6d757c2947e773e768a1c3daa22b940ab6ee |
| MD5 | c754330221b1fa940097851260ebf531 |
| BLAKE2b-256 | 54ee5e031c882502f60f8c370671b1f154b08554c386eee6eae472d02b564980 |