
Embed anything at lightning speed


Highly Performant, Modular and Memory Safe
Ingestion, Inference and Indexing in Rust 🦀
Python docs »
Rust docs »
Benchmarks · FAQ · Adapters · Collaborations · Notebooks

EmbedAnything is a minimalist yet highly performant, modular, lightweight, multisource, multimodal, and local embedding pipeline built in Rust. Whether you're working with text, images, audio, PDFs, websites, or other media, EmbedAnything streamlines the process of generating embeddings from various sources and streams them straight to a vector database for memory-efficient indexing. It supports dense, sparse, ONNX, model2vec, and late-interaction embeddings, offering flexibility for a wide range of use cases.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. How to add a custom model and chunk size

🚀 Key Features

  • No PyTorch Dependency: Easy to deploy on the cloud, with a low memory footprint.
  • Highly Modular: Choose any vector DB adapter for RAG with a single line of code.
  • Candle Backend: Supports BERT, Jina, ColPali, SPLADE, ModernBERT, rerankers, and Qwen.
  • ONNX Backend: Supports BERT, Jina, ColPali, ColBERT, SPLADE, rerankers, ModernBERT, and Qwen.
  • Cloud Embedding Models: Supports OpenAI, Cohere, and Gemini.
  • MultiModality: Works with text sources (PDF, TXT, MD), images (JPG), and audio (WAV).
  • GPU Support: Hardware acceleration on GPUs.
  • Chunking: Built-in chunking methods such as semantic and late chunking.
  • Vector Streaming: File processing, inference, and indexing run on separate threads, reducing latency.

💡What is Vector Streaming

Embedding models are computationally expensive and time-consuming. By separating document preprocessing from model inference, you can significantly reduce pipeline latency and improve throughput.

Vector streaming transforms a sequential bottleneck into an efficient, concurrent workflow.

The embedding process runs separately from the main process, using Rust's MPSC channels to keep throughput high, and embeddings are written directly to the vector database, so they never accumulate in memory. Read more in our blog.
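For intuition only (this is not the library's actual Rust internals), the producer-consumer shape of vector streaming can be sketched in Python, with a bounded queue standing in for the MPSC channel. The function names, the fixed-size chunker, and the placeholder "embedding" are all hypothetical:

```python
# Sketch of the vector-streaming idea: chunking and "embedding" run on
# separate threads connected by a bounded queue, so results are indexed
# as they are produced instead of accumulating in memory.
import queue
import threading

def chunk_documents(docs, chunk_size, out_q):
    """Producer: split documents into fixed-size chunks and stream them out."""
    for doc in docs:
        for i in range(0, len(doc), chunk_size):
            out_q.put(doc[i:i + chunk_size])
    out_q.put(None)  # sentinel: no more chunks

def embed_and_index(in_q, index):
    """Consumer: 'embed' each chunk (dummy vector here) and index it immediately."""
    while True:
        chunk = in_q.get()
        if chunk is None:
            break
        vector = [float(len(chunk)), float(sum(map(ord, chunk)) % 97)]  # placeholder embedding
        index.append((chunk, vector))

docs = ["alpha beta gamma", "delta epsilon"]
q = queue.Queue(maxsize=8)  # bounded: caps memory between the two stages
index = []
producer = threading.Thread(target=chunk_documents, args=(docs, 5, q))
consumer = threading.Thread(target=embed_and_index, args=(q, index))
producer.start(); consumer.start()
producer.join(); consumer.join()
```

The bounded queue is the key design point: the producer blocks when the consumer falls behind, so memory stays flat regardless of corpus size.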


🦀 Why Embed Anything

➡️ Faster execution.
➡️ No PyTorch dependency, hence a low memory footprint and easy cloud deployment.
➡️ True multithreading.
➡️ Runs embedding models locally and efficiently.
➡️ Built-in chunking methods such as semantic and late chunking.
➡️ Supports a range of models: dense, sparse, late-interaction, rerankers, ModernBERT.
➡️ Memory safety: Rust enforces memory safety at compile time, preventing the leaks and crashes that can plague other languages.

🍓 Our Past Collaborations:

We have collaborated with reputable enterprises such as Elastic, Weaviate, SingleStore, and Milvus, as well as Analytics Vidhya DataHour.

You can get in touch with us for further collaborations.

Benchmarks

Inference Speed benchmarks.

These measure only embedding-model inference speed, on the ONNX runtime. Code

Benchmarks against other frameworks are coming soon! 🚀

⭐ Supported Models

We support any Hugging Face model on Candle. We also support the ONNX runtime for models such as BERT and ColPali.

How to add a custom model on Candle: from_pretrained_hf

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="model link from huggingface"
)
config = TextEmbedConfig(chunk_size=1000, batch_size=32)
data = embed_anything.embed_file("file_address", embedder=model, config=config)
| Model | HF link |
|---|---|
| Jina | Jina models |
| BERT | All BERT-based models |
| CLIP | openai/clip-* |
| Whisper | OpenAI Whisper models |
| ColPali | starlight-ai/colpali-v1.2-merged-onnx |
| ColBERT | answerdotai/answerai-colbert-small-v1, jinaai/jina-colbert-v2, and more |
| SPLADE | SPLADE models and other SPLADE-like models |
| Model2Vec | model2vec, minishlab/potion-base-8M |
| Qwen3-Embedding | Qwen/Qwen3-Embedding-0.6B |
| Reranker | Jina reranker models, Xenova/bge-reranker, Qwen/Qwen3-Reranker-4B |

Splade Models:

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.SparseBert, "prithivida/Splade_PP_en_v1"
)
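For intuition, SPLADE-style sparse embeddings can be thought of as maps from vocabulary terms to weights, and relevance as a dot product over the terms two vectors share. A minimal sketch (the term weights below are made up for illustration):

```python
# Illustrative comparison of SPLADE-style sparse embeddings: each embedding
# is a sparse term-to-weight map; similarity is the dot product over
# terms present in both maps.
def sparse_dot(a: dict, b: dict) -> float:
    # iterate over the smaller map for efficiency
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    return sum(w * large[t] for t, w in small.items() if t in large)

query = {"capital": 1.2, "france": 1.5}
doc_paris = {"paris": 0.9, "capital": 1.0, "france": 1.1}
doc_pizza = {"pizza": 1.4, "love": 0.7}

score_paris = sparse_dot(query, doc_paris)  # 1.2*1.0 + 1.5*1.1 = 2.85
score_pizza = sparse_dot(query, doc_pizza)  # no shared terms -> 0.0
```

Because most weights are zero, such vectors index naturally into inverted-index stores, which is why sparse models pair well with keyword-style retrieval.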

ONNX-Runtime: from_pretrained_onnx

BERT

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Bert, model_id="onnx_model_link"
)

ColPali

model: ColpaliModel = ColpaliModel.from_pretrained_onnx("starlight-ai/colpali-v1.2-merged-onnx", None)

Colbert

sentences = [
    "The quick brown fox jumps over the lazy dog",
    "The cat is sleeping on the mat",
    "The dog is barking at the moon",
    "I love pizza",
    "The dog is sitting in the park",
]

model = ColbertModel.from_pretrained_onnx("jinaai/jina-colbert-v2", path_in_repo="onnx/model.onnx")
embeddings = model.embed(sentences, batch_size=2)
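To make late interaction concrete: ColBERT-style models keep one embedding per token, and score a query against a document by summing, over query tokens, the maximum similarity to any document token (MaxSim). A toy sketch with hand-made two-dimensional token vectors (not real model output):

```python
# MaxSim, the scoring rule behind ColBERT-style late interaction,
# demonstrated on tiny hand-made per-token embeddings.
import math

def cos(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def maxsim(query_tokens, doc_tokens):
    # for each query token, take its best-matching document token, then sum
    return sum(max(cos(q, d) for d in doc_tokens) for q in query_tokens)

query = [[1.0, 0.0], [0.0, 1.0]]              # two query-token embeddings
doc_a = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]  # tokens aligned with the query
doc_b = [[-1.0, 0.0], [0.0, -1.0]]            # tokens pointing away from it

score_a = maxsim(query, doc_a)
score_b = maxsim(query, doc_b)  # doc_a should outscore doc_b
```

Keeping per-token vectors is what lets late-interaction models match fine-grained terms that a single pooled vector would smear together.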

ModernBERT

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype = Dtype.Q4F16
)

ReRankers

reranker = Reranker.from_pretrained("jinaai/jina-reranker-v1-turbo-en", dtype=Dtype.F16)

results: list[RerankerResult] = reranker.rerank(
    ["What is the capital of France?"],
    ["France is a country in Europe.", "Paris is the capital of France."],
    2,
)
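Conceptually, a reranker scores every query-document pair and returns the best top_k. The sketch below is illustrative only: a simple Jaccard word-overlap score stands in for the cross-encoder model, and the function name `rerank` merely mirrors the call above:

```python
# Conceptual reranking: score each document against the query, sort, keep top_k.
# A word-overlap score substitutes for the real cross-encoder model here.
def rerank(query: str, documents: list, top_k: int):
    q_terms = set(query.lower().split())
    def score(doc: str) -> float:
        d_terms = set(doc.lower().split())
        return len(q_terms & d_terms) / len(q_terms | d_terms)  # Jaccard overlap
    ranked = sorted(documents, key=score, reverse=True)
    return ranked[:top_k]

docs = ["France is a country in Europe.", "Paris is the capital of France."]
top = rerank("What is the capital of France?", docs, top_k=2)
```

The real model is far more accurate than word overlap, but the pipeline shape is the same: score pairs, sort, truncate.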

Embed 4

# Initialize the model once
model: EmbeddingModel = EmbeddingModel.from_pretrained_cloud(
    WhichModel.CohereVision, model_id="embed-v4.0"
)

Qwen 3 - Embedding

# Initialize the model once
model: EmbeddingModel = EmbeddingModel.from_pretrained_hf(
    WhichModel.Qwen3, model_id="Qwen/Qwen3-Embedding-0.6B"
)

For Semantic Chunking

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="sentence-transformers/all-MiniLM-L12-v2"
)

# with semantic encoder
semantic_encoder = EmbeddingModel.from_pretrained_hf(
    WhichModel.Jina, model_id="jinaai/jina-embeddings-v2-small-en"
)
config = TextEmbedConfig(
    chunk_size=1000,
    batch_size=32,
    splitting_strategy="semantic",
    semantic_encoder=semantic_encoder,
)
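The intuition behind semantic splitting can be sketched without the model: split text into sentences, then start a new chunk whenever the next sentence is too dissimilar from the previous one. In this hedged sketch, word overlap stands in for the semantic_encoder's embedding similarity, and the threshold is an arbitrary illustration:

```python
# Sketch of semantic chunking: adjacent sentences stay in one chunk while
# they remain similar; a topic shift starts a new chunk. Word overlap is a
# stand-in for real embedding similarity.
def similarity(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def semantic_chunks(sentences, threshold=0.2):
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if similarity(prev, cur) >= threshold:
            chunks[-1].append(cur)  # same topic: extend the current chunk
        else:
            chunks.append([cur])    # topic shift: start a new chunk
    return [" ".join(c) for c in chunks]

sentences = [
    "The dog is sitting in the park",
    "The dog is barking at the moon",
    "I love pizza",
]
chunks = semantic_chunks(sentences)  # the two dog sentences merge; pizza splits off
```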

For late-chunking

config = TextEmbedConfig(
    chunk_size=1000,
    batch_size=8,
    splitting_strategy="sentence",
    late_chunking=True,
)

# Embed a single file
data: list[EmbedData] = model.embed_file("test_files/attention.pdf", config=config)
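The idea behind late chunking, sketched with dummy numbers: the whole document passes through the model once, producing one contextual embedding per token, and chunk vectors are then pooled from those token embeddings, so each chunk's vector still reflects full-document context. The token vectors and spans below are made up for illustration:

```python
# Late chunking, illustrated: embed all tokens with full-document context
# first, then mean-pool token vectors per chunk span.
def mean_pool(token_vectors):
    dim = len(token_vectors[0])
    n = len(token_vectors)
    return [sum(v[i] for v in token_vectors) / n for i in range(dim)]

# one (dummy) contextual embedding per token for a 6-token "document"
token_embeddings = [
    [1.0, 0.0], [0.8, 0.2], [0.6, 0.4],
    [0.0, 1.0], [0.2, 0.8], [0.4, 0.6],
]
chunk_spans = [(0, 3), (3, 6)]  # token ranges for two chunks

chunk_embeddings = [mean_pool(token_embeddings[s:e]) for s, e in chunk_spans]
```

Contrast this with naive chunking, which embeds each chunk in isolation and so loses any context that lies outside the chunk boundary.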

🧑‍🚀 Getting Started

💚 Installation

pip install embed-anything

For GPUs and special models such as ColPali:

pip install embed-anything-gpu

🚧❌ If a CUDA error appears while running on Windows, run the following:

import os
os.add_dll_directory("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/bin")

📒 Notebooks

End-to-End Retrieval and Reranking using VectorDB Adapters
ColPali-Onnx
Adapters
Qwen3 Embeddings
Benchmarks

Usage

➡️ Usage for versions 0.3 and later

model = EmbeddingModel.from_pretrained_local(
    WhichModel.Bert, model_id="Hugging_face_link"
)
data = embed_anything.embed_file("test_files/test.pdf", embedder=model)

Using ONNX Models

To use ONNX models, you can either use the ONNXModel enum or pass the model_id of the Hugging Face model.

Using the enum is the safest route, since those models are tested. If you want to use other models, such as fine-tuned ones, pass hf_model_id and path_in_repo to load the model as below.

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Jina, hf_model_id = "jinaai/jina-embeddings-v2-small-en", path_in_repo="model.onnx"
)

To see all ONNX models supported, with their model_name, see here

⁉️FAQ

Do I need to know Rust to use or contribute to EmbedAnything?

No. EmbedAnything ships pyo3 bindings, so you can call every function from Python without any issues. To contribute, check out our guidelines and the adapter examples in the python folder.

How is it different from fastembed?

We provide both backends, Candle and ONNX. On top of that, we offer an end-to-end pipeline: you can ingest different data types, run inference with any supported model, and index to any vector database. Fastembed is only an ONNX wrapper.

We've received quite a few questions about why we're using Candle.

One of the main reasons is that Candle doesn't require any specific ONNX format models, which means it can work seamlessly with any Hugging Face model. This flexibility has been a key factor for us. However, we also recognize that we’ve been compromising a bit on speed in favor of that flexibility.

🚧 Contributing to EmbedAnything

First of all, thank you for taking the time to contribute to this project. We truly appreciate your contributions, whether it's bug reports, feature suggestions, or pull requests. Your time and effort are highly valued in this project. 🚀

This document provides guidelines and best practices to help you to contribute effectively. These are meant to serve as guidelines, not strict rules. We encourage you to use your best judgment and feel comfortable proposing changes to this document through a pull request.

  • Roadmap
  • Quick Start
  • Guidelines

🏎️ RoadMap

    Accomplishments

    One of the aims of EmbedAnything is to let AI engineers easily use state-of-the-art embedding models on typical files and documents. Much has already been accomplished: these are the formats we support right now, with a few more to come.

    🖼️ Modalities and Sources

    We’re excited to share that we've expanded our platform to support multiple modalities, including:

    • Audio files

    • Markdowns

    • Websites

    • Images

    • Videos

    • Graph

    This gives you the flexibility to work with various data types all in one place! 🌐

    ⚙️ Performance

    We now support both the Candle and ONNX backends
    ➡️ Support for GGUF models

    🫐Embeddings:

    We have had multimodality in our infrastructure from day one. It already covers websites, images, and audio, and we want to expand it further:

    ➡️ Graph embedding -- build DeepWalk embeddings depth-first and word2vec
    ➡️ Video Embedding
    ➡️ Yolo Clip

    🌊Expansion to other Vector Adapters

    We currently support a wide range of vector databases for streaming embeddings, including:

    • Elastic: thanks to the amazing and active Elastic team for their contribution
    • Weaviate
    • Pinecone
    • Qdrant
    • Milvus
    • Chroma

    How to add an adapter: https://starlight-search.com/blog/2024/02/25/adapter-development-guide.md

    💥 Create WASM demos to integrate EmbedAnything directly into the browser.

    💜 Add support for ingestion from remote sources

    ➡️ Support for S3 buckets
    ➡️ Support for Azure Storage
    ➡️ Support for Google Drive / Dropbox

    But we're not stopping there! We're actively working to expand this list.

    Want to contribute? If you'd like to add support for your favorite vector database, we'd love your help! Check out our contribution.md for guidelines, or reach out directly at turingatverge@gmail.com. Let's build something amazing together! 💡

    A big thank you to all our stargazers!

