
Embed anything at lightning speed


Inference, Ingestion, and Indexing – supercharged by Rust 🦀
Python docs »
Rust docs »
View Demo · Benches · Vector Streaming Adapters · Search in Audio Space

EmbedAnything is a minimalist, lightning-fast, lightweight, multisource, multimodal, and local embedding pipeline built in Rust. Whether you're working with text, images, audio, PDFs, websites, or other media, EmbedAnything streamlines the process of generating embeddings from various sources and streams them, memory-efficiently, to the vector database of your choice. It supports dense, sparse, ONNX, and late-interaction embeddings, offering flexibility for a wide range of use cases.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. How to add a custom model and chunk size

🚀 Key Features

  • Local Embedding: Works with local embedding models like BERT and Jina
  • ONNX Models: Works with ONNX models for BERT and ColPali
  • ColPali: Support for ColPali in the GPU version, on both ONNX and Candle
  • Splade: Support for sparse embeddings for hybrid search
  • ReRankers: Support for reranking models for better RAG
  • ColBERT: Support for ColBERT on ONNX
  • ModernBERT: Increase your token length to 8K
  • Cloud Embedding Models: Supports OpenAI and Cohere
  • MultiModality: Works with text sources (PDF, txt, md), images (JPG), and audio (.wav)
  • Rust: All file processing is done in Rust for speed and efficiency
  • GPU Support: Hardware acceleration on GPUs is taken care of as well
  • Python Interface: Packaged as a Python library for seamless integration into your existing projects
  • Vector Streaming: Continuously create and stream embeddings even on low-resource machines

💡 What is Vector Streaming?

Vector Streaming lets you process a file and generate its embeddings chunk by chunk: with a 10 GB file, embeddings are produced continuously, optionally segmented semantically, and streamed straight into the vector database of your choice, so the full set of embeddings never has to sit in RAM at once. The embedding work runs separately from the main process, using Rust's MPSC channels to keep performance high.
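A minimal sketch of what this looks like from Python, assuming a vector-database adapter is attached (`my_adapter` and the `adapter` keyword below stand in for the Vector Streaming Adapters workflow linked above; see those adapters for concrete implementations):

import embed_anything
from embed_anything import EmbeddingModel, TextEmbedConfig, WhichModel

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="sentence-transformers/all-MiniLM-L6-v2"
)
config = TextEmbedConfig(chunk_size=256, batch_size=32)

my_adapter = ...  # placeholder: a vector-database adapter, e.g. for Weaviate

# With an adapter attached, each batch of chunk embeddings is handed to the
# database as soon as it is produced, instead of accumulating in RAM.
data = embed_anything.embed_file(
    "large_document.pdf", embedder=model, config=config, adapter=my_adapter
)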


🦀 Why Embed Anything

➡️ Faster execution
➡️ Memory safety: Rust enforces memory safety at compile time, preventing the memory leaks and crashes that plague other languages
➡️ True multithreading
➡️ Runs embedding models locally and efficiently
➡️ Candle allows inference on CUDA-enabled GPUs right out of the box
➡️ Reduced memory usage
➡️ Supports a range of models: dense, sparse, late-interaction, reranker, ModernBERT

⭐ Supported Models

We support any Hugging Face model on Candle, and we also support the ONNX runtime for BERT and ColPali.

How to add a custom model on Candle: from_pretrained_hf

import embed_anything
from embed_anything import EmbeddingModel, TextEmbedConfig, WhichModel

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="model link from huggingface"
)
config = TextEmbedConfig(chunk_size=200, batch_size=32)
data = embed_anything.embed_file("file_address", embedder=model, config=config)
Model      HF link
Jina       Jina models
Bert       All BERT-based models
CLIP       openai/clip-*
Whisper    OpenAI Whisper models
ColPali    starlight-ai/colpali-v1.2-merged-onnx
ColBERT    answerdotai/answerai-colbert-small-v1, jinaai/jina-colbert-v2 and more
Splade     Splade models and other Splade-like models
Reranker   Jina reranker models, Xenova/bge-reranker

Splade Models:

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.SparseBert, model_id="prithivida/Splade_PP_en_v1"
)

ONNX-Runtime: from_pretrained_onnx

BERT

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, model_id="onnx_model_link"
)

ColPali

model: ColpaliModel = ColpaliModel.from_pretrained_onnx("starlight-ai/colpali-v1.2-merged-onnx", None)

Colbert

sentences = [
    "The quick brown fox jumps over the lazy dog",
    "The cat is sleeping on the mat",
    "The dog is barking at the moon",
    "I love pizza",
    "The dog is sitting in the park",
]

model = ColbertModel.from_pretrained_onnx("jinaai/jina-colbert-v2", path_in_repo="onnx/model.onnx")
embeddings = model.embed(sentences, batch_size=2)
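Unlike dense models, ColBERT returns one vector per token, so texts are compared with late interaction (MaxSim) rather than a single dot product. A minimal NumPy sketch of that scoring rule, assuming each embedding above is a (tokens × dim) matrix (the exact shape of the returned objects may differ):

import numpy as np

def maxsim(query_tokens: np.ndarray, doc_tokens: np.ndarray) -> float:
    # Late interaction: match every query token against its best-matching
    # document token, then sum those per-token maxima into one score.
    sim = query_tokens @ doc_tokens.T  # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())

# e.g., treat the first sentence as the query and score it against the rest
query = np.asarray(embeddings[0].embedding)
scores = [maxsim(query, np.asarray(e.embedding)) for e in embeddings[1:]]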

ModernBERT

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype=Dtype.Q4F16
)

ReRankers

reranker = Reranker.from_pretrained("jinaai/jina-reranker-v1-turbo-en", dtype=Dtype.F16)

results: list[RerankerResult] = reranker.rerank(
    ["What is the capital of France?"],
    ["France is a country in Europe.", "Paris is the capital of France."],
    2,
)

For Semantic Chunking

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="sentence-transformers/all-MiniLM-L12-v2"
)

# with a semantic encoder to pick chunk boundaries
semantic_encoder = EmbeddingModel.from_pretrained_hf(
    WhichModel.Jina, model_id="jinaai/jina-embeddings-v2-small-en"
)
config = TextEmbedConfig(
    chunk_size=256,
    batch_size=32,
    splitting_strategy="semantic",
    semantic_encoder=semantic_encoder,
)
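Embedding a file with this config then uses the same call as elsewhere in this README (the file path below is illustrative):

data = embed_anything.embed_file(
    "test_files/test.pdf", embedder=model, config=config
)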

🧑‍🚀 Getting Started

💚 Installation

pip install embed-anything

For GPUs and special models like ColPali:

pip install embed-anything-gpu

Usage

➡️ Usage for version 0.3 and later

To use local embeddings (we support BERT and Jina):

model = EmbeddingModel.from_pretrained_local(
    WhichModel.Bert, model_id="Hugging_face_link"
)
data = embed_anything.embed_file("test_files/test.pdf", embedder=model)

For multimodal embeddings, we support CLIP.

Requirements: a directory with the pictures you want to search; for example, test_files with images of cats, dogs, etc.

import embed_anything
import numpy as np
from PIL import Image
from embed_anything import EmbedData

model = embed_anything.EmbeddingModel.from_pretrained_local(
    embed_anything.WhichModel.Clip,
    model_id="openai/clip-vit-base-patch16",
    # revision="refs/pr/15",
)
data: list[EmbedData] = embed_anything.embed_directory("test_files", embedder=model)
embeddings = np.array([d.embedding for d in data])
query = ["Photo of a monkey?"]
query_embedding = np.array(
    embed_anything.embed_query(query, embedder=model)[0].embedding
)
similarities = np.dot(embeddings, query_embedding)
max_index = np.argmax(similarities)
Image.open(data[max_index].text).show()  # the `text` field holds the image path

Audio Embedding using Whisper

Requirements: audio .wav files.

import embed_anything
from embed_anything import (
    AudioDecoderModel,
    EmbeddingModel,
    embed_audio_file,
    TextEmbedConfig,
)
# choose any whisper or distilwhisper model from https://huggingface.co/distil-whisper or https://huggingface.co/collections/openai/whisper-release-6501bba2cf999715fd953013
audio_decoder = AudioDecoderModel.from_pretrained_hf(
    "openai/whisper-tiny.en", revision="main", model_type="tiny-en", quantized=False
)
embedder = EmbeddingModel.from_pretrained_hf(
    embed_anything.WhichModel.Bert,
    model_id="sentence-transformers/all-MiniLM-L6-v2",
    revision="main",
)
config = TextEmbedConfig(chunk_size=200, batch_size=32)
data = embed_anything.embed_audio_file(
    "test_files/audio/samples_hp0.wav",
    audio_decoder=audio_decoder,
    embedder=embedder,
    text_embed_config=config,
)
print(data[0].metadata)

Using ONNX Models

To use ONNX models, you can either use the ONNXModel enum or the model_id from the Hugging Face model.

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, model_name=ONNXModel.AllMiniLML6V2Q
)

For some models, you can also specify the dtype to use for the model.

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype=Dtype.Q4F16
)

The method above is preferred, since these models are tested and known to work correctly. If you want to use other models, such as fine-tuned ones, you can load them with hf_model_id and path_in_repo, as shown below.

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Jina, hf_model_id="jinaai/jina-embeddings-v2-small-en", path_in_repo="model.onnx"
)

To see all the ONNX models supported with model_name, see here

🚧 Contributing to EmbedAnything

First of all, thank you for taking the time to contribute to this project. We truly appreciate your contributions, whether it's bug reports, feature suggestions, or pull requests. Your time and effort are highly valued in this project. 🚀

This document provides guidelines and best practices to help you contribute effectively. These are meant to serve as guidelines, not strict rules. We encourage you to use your best judgment and to feel comfortable proposing changes to this document through a pull request.

  • Roadmap
  • Quick Start
  • Guidelines

🏎️ RoadMap

Accomplishments

One of the aims of EmbedAnything is to let AI engineers easily use state-of-the-art embedding models on typical files and documents. Much of this has already been accomplished: the formats listed here are supported today, and a few more are on the way.

Adding Fine-tuning

One of the major goals this year is to support fine-tuning these models on your own data, much as Sentence Transformers does.

🖼️ Modalities and Sources

We're excited to share that we've expanded our platform to support multiple modalities, including:

  • Audio files
  • Markdown files
  • Websites
  • Images
  • Videos
  • Graphs

This gives you the flexibility to work with various data types all in one place! 🌐

💜 Product

We've rolled out some major updates in version 0.3 to improve both functionality and performance. Here's what's new:

  • Semantic Chunking: an optimized chunking strategy for better Retrieval-Augmented Generation (RAG) workflows.
  • Streaming for Efficient Indexing: memory-efficient streaming of embeddings into vector databases. Want to know more? Check out our article on this feature: https://www.analyticsvidhya.com/blog/2024/09/vector-streaming/
  • Zero-Shot Applications: explore our zero-shot application demos to see the power of these updates in action.
  • Intuitive Functions: version 0.3 includes a complete refactor for more intuitive functions, making the platform easier to use.
  • Chunkwise Streaming: instead of file-by-file streaming, we now support chunkwise streaming, allowing more flexible and efficient data processing.

Check out the latest release and see how these features can supercharge your generative-AI pipeline! ✨

🚀 Coming Soon

⚙️ Performance

We've received quite a few questions about why we're using Candle, so here's a quick explanation:

One of the main reasons is that Candle doesn't require models in a specific ONNX format, which means it can work seamlessly with any Hugging Face model. That flexibility has been a key factor for us, but we also recognize that it has cost us some speed.

What's next? To address this, we're introducing Candle-ONNX alongside our existing Candle backend, bringing:

  • Support for GGUF models
  • Significantly faster performance

Stay tuned for these exciting updates! 🚀

🫐 Embeddings

Our infrastructure has been multimodal from day one. We already support websites, images, and audio, and we want to expand further to:

☑️ Graph embeddings -- build DeepWalk embeddings depth-first with word2vec
☑️ Video embeddings
☑️ YOLO + CLIP

🌊 Expansion to Other Vector Adapters

We currently support a range of vector databases for streaming embeddings, including:

  • Elastic: thanks to the amazing and active Elastic team for the contribution
  • Weaviate
  • Pinecone

But we're not stopping there! We're actively working to expand this list.

Want to contribute? If you'd like to add support for your favorite vector database, we'd love your help! Check out our contribution.md for guidelines, or reach out directly at starlight-search@proton.me. Let's build something amazing together! 💡
