
A Rust-based in-memory vector store with fastembed


FastEmbed VectorStore


A high-performance, Rust-based in-memory vector store with FastEmbed integration for Python applications.

Overview

FastEmbed VectorStore is a lightweight in-memory vector store that uses Rust and the FastEmbed library for text embedding and similarity search. It is designed for applications that need quick semantic search without running an external database.

Features

  • 🚀 High Performance: Built in Rust with Python bindings for optimal speed
  • 🧠 Multiple Embedding Models: Support for 30+ pre-trained embedding models including BGE, Nomic, GTE, and more
  • 💾 In-Memory Storage: Fast in-memory vector storage with persistence capabilities
  • 🔍 Similarity Search: Cosine similarity-based search with customizable result limits
  • 💾 Save/Load: Persist and restore vector stores to/from JSON files
  • 🐍 Python Integration: Seamless Python API with PyO3 bindings

Supported Embedding Models

The library supports a wide variety of embedding models:

  • BGE Models: BGEBaseENV15, BGELargeENV15, BGESmallENV15 (with quantized variants)
  • Nomic Models: NomicEmbedTextV1, NomicEmbedTextV15 (with quantized variants)
  • GTE Models: GTEBaseENV15, GTELargeENV15 (with quantized variants)
  • Multilingual Models: MultilingualE5Small, MultilingualE5Base, MultilingualE5Large
  • Specialized Models: ClipVitB32, JinaEmbeddingsV2BaseCode, ModernBertEmbedLarge
  • And many more...

Installation

Prerequisites

  • Python 3.8 or higher
  • Rust toolchain (to build from source)

Install from PyPI

pip install fastembed-vectorstore

From Source

  1. Clone the repository:
git clone https://github.com/sauravniraula/fastembed_vectorstore.git
cd fastembed_vectorstore
  2. Install the package (requires maturin, for example via pip install maturin):
maturin develop

Quick Start

from fastembed_vectorstore import FastembedVectorstore, FastembedEmbeddingModel

# Initialize with a model
model = FastembedEmbeddingModel.BGESmallENV15
vectorstore = FastembedVectorstore(model)

# Optional Configurations
# vectorstore = FastembedVectorstore(
#     model,
#     show_download_progress=False,           # default: True
#     cache_directory="fastembed_cache",      # default: fastembed_cache
# )

# Add documents
documents = [
    "The quick brown fox jumps over the lazy dog",
    "A quick brown dog jumps over the lazy fox",
    "The lazy fox sleeps while the quick brown dog watches",
    "Python is a programming language",
    "Rust is a systems programming language"
]

# Embed and store documents
success = vectorstore.embed_documents(documents)
print(f"Documents embedded: {success}")

# Search for similar documents
query = "What is Python?"
results = vectorstore.search(query, n=3)

for doc, similarity in results:
    print(f"Document: {doc}")
    print(f"Similarity: {similarity:.4f}")
    print("---")

# Save the vector store
vectorstore.save("my_vectorstore.json")

# Load the vector store later
loaded_vectorstore = FastembedVectorstore.load(model, "my_vectorstore.json")

# Optional Configurations
# loaded_vectorstore = FastembedVectorstore.load(
#     model,
#     "my_vectorstore.json",
#     show_download_progress=False,           # default: True
#     cache_directory="fastembed_cache",      # default: fastembed_cache
# )

API Reference

FastembedEmbeddingModel

Enum containing all supported embedding models. Choose based on your use case:

  • Small models: Faster, lower memory usage (e.g., BGESmallENV15)
  • Base models: Balanced performance (e.g., BGEBaseENV15)
  • Large models: Higher quality embeddings (e.g., BGELargeENV15)
  • Quantized models: Reduced memory usage (e.g., BGESmallENV15Q)

FastembedVectorstore

Constructor

vectorstore = FastembedVectorstore(
    model: FastembedEmbeddingModel,
    show_download_progress: bool | None = ...,
    cache_directory: str | os.PathLike[str] | None = ...,
)

Args:

  • model: Embedding model to use.
  • show_download_progress: Whether to show model download progress. Defaults to True.
  • cache_directory: Directory to cache/download model files. Defaults to fastembed_cache.

Methods

embed_documents(documents: List[str]) -> bool

Embeds a list of documents and stores them in the vector store.

search(query: str, n: int) -> List[Tuple[str, float]]

Searches for the most similar documents to the query. Returns a list of tuples containing (document, similarity_score).
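
To make the similarity score concrete, here is a minimal pure-Python sketch of cosine-similarity ranking. It is not the library's Rust implementation; the helper names cosine_similarity and top_n and the two-dimensional toy vectors are illustrative assumptions only.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n(
    query_vec: Sequence[float],
    docs: List[Tuple[str, Sequence[float]]],
    n: int,
) -> List[Tuple[str, float]]:
    """Score every (text, vector) pair against the query and keep the n best, highest first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy 2-dimensional "embeddings"; real models produce hundreds of dimensions.
toy_docs = [
    ("about python", [1.0, 0.0]),
    ("about rust", [0.0, 1.0]),
    ("about both", [1.0, 1.0]),
]
results = top_n([1.0, 0.0], toy_docs, n=2)
for text, score in results:
    print(f"{text}: {score:.4f}")
```

The output has the same (document, similarity_score) shape that search is documented to return, which is why the Quick Start loop can unpack each result into two names.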

save(path: str) -> bool

Saves the vector store to a JSON file.

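load(model: FastembedEmbeddingModel, path: str, show_download_progress: bool | None = ..., cache_directory: str | os.PathLike[str] | None = ...) -> FastembedVectorstore

Loads a vector store from a JSON file. Call it on the class, as in FastembedVectorstore.load(model, path); it accepts the same optional show_download_progress and cache_directory arguments as the constructor.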
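
A common pattern with save and load is to embed documents once and reuse the saved file on later runs. The sketch below is one possible way to do that; load_or_build is a hypothetical helper, not part of this package, and it only assumes that the store object has the documented save(path) method.

```python
import os
from typing import Callable, TypeVar

StoreT = TypeVar("StoreT")


def load_or_build(
    path: str,
    load: Callable[[str], StoreT],
    build: Callable[[], StoreT],
) -> StoreT:
    """Return load(path) if the file exists; otherwise build a store, save it to path, and return it."""
    if os.path.exists(path):
        return load(path)
    store = build()
    store.save(path)  # documented signature: save(path: str) -> bool
    return store


# Hypothetical wiring with this package (not executed here, since it downloads a model):
#
# from fastembed_vectorstore import FastembedVectorstore, FastembedEmbeddingModel
#
# model = FastembedEmbeddingModel.BGESmallENV15
#
# def build():
#     store = FastembedVectorstore(model)
#     store.embed_documents(["Python is a programming language"])
#     return store
#
# store = load_or_build(
#     "my_vectorstore.json",
#     lambda p: FastembedVectorstore.load(model, p),
#     build,
# )
```

Passing load and build as callables keeps the helper independent of the model choice. Load with the same model that was used when the file was saved, as the Quick Start does.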

Performance Considerations

  • Memory Usage: All embeddings are stored in memory, so consider the size of your document collection
  • Model Selection: Smaller models are faster but may produce lower-quality embeddings
  • Batch Processing: The embed_documents method processes documents in batches for efficiency
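
For the memory point above, a rough back-of-the-envelope estimate can help with sizing. This sketch assumes each embedding value is stored as a 4-byte float and uses 384 dimensions, the output size of the bge-small-en-v1.5 model; both are assumptions about how this library stores vectors, and the estimate leaves out the document text and any container overhead. The function name estimate_embedding_bytes is hypothetical.

```python
def estimate_embedding_bytes(
    num_documents: int,
    dimensions: int,
    bytes_per_value: int = 4,  # assumption: 32-bit floats
) -> int:
    """Approximate bytes needed for the raw embedding values alone."""
    if num_documents < 0 or dimensions <= 0 or bytes_per_value <= 0:
        raise ValueError("counts and sizes must be positive (num_documents may be 0)")
    return num_documents * dimensions * bytes_per_value


# 100,000 documents at 384 dimensions each.
size = estimate_embedding_bytes(100_000, 384)
print(f"{size / 1_000_000:.1f} MB")  # prints 153.6 MB
```

Larger models emit wider vectors, so the same collection takes proportionally more memory.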

Use Cases

  • Semantic Search: Find documents similar to a query
  • Document Clustering: Group similar documents together
  • Recommendation Systems: Find similar items or content
  • Question Answering: Retrieve relevant context for Q&A systems
  • Content Discovery: Help users find related content

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Author

Acknowledgments

  • Built with FastEmbed for efficient text embeddings
  • Uses PyO3 for Python-Rust bindings
  • Inspired by the need for fast, lightweight vector storage solutions


