
Embed anything at lightning speed

Inference, Ingestion, and Indexing – supercharged by Rust 🦀
Python docs »
Rust docs »
View Demo · Benches · Vector Streaming Adapters · Search in Audio Space

EmbedAnything is a minimalist, high-performance, lightweight, multisource, multimodal, and local embedding pipeline built in Rust. Whether you're working with text, images, audio, PDFs, websites, or other media, EmbedAnything streamlines generating embeddings from these sources and streaming them to a vector database with memory-efficient indexing. It supports dense, sparse, ONNX, and late-interaction embeddings, offering flexibility for a wide range of use cases.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. How to add custom model and chunk size

🚀 Key Features

  • Local Embedding: Works with local embedding models such as BERT and Jina.
  • ONNX Models: Works with ONNX models for BERT and ColPali.
  • ColPali: Supported in the GPU version, on both ONNX and Candle.
  • Splade: Support for sparse embeddings for hybrid search.
  • ReRankers: Support for reranking models for better RAG.
  • ColBERT: Support for ColBERT on ONNX.
  • ModernBERT: Increase your token length to 8K.
  • Cloud Embedding Models: Supports OpenAI and Cohere.
  • MultiModality: Works with text sources such as PDF, TXT, and MD files, JPG images, and WAV audio.
  • Rust: All file processing is done in Rust for speed and efficiency.
  • GPU support: Hardware acceleration on GPU is taken care of as well.
  • Python Interface: Packaged as a Python library for seamless integration into your existing projects.
  • Vector Streaming: Continuously create and stream embeddings when resources are limited.

💡What is Vector Streaming

Vector Streaming lets you generate embeddings for files and stream them as they are produced. If you have a 10 GB file, for example, it generates embeddings continuously, chunk by chunk (chunks can be segmented semantically), and stores them in the vector database of your choice, so the full set of embeddings never has to be held in RAM at once. Embedding runs separately from the main process, using Rust's MPSC (multi-producer, single-consumer) channels to maintain high performance.
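The producer/consumer idea described above can be sketched in plain Python. This is only an illustration built on the standard library's queue and threading modules, not EmbedAnything's Rust implementation; fake_embed, read_chunks, producer, and stream_to_store are hypothetical names, and a plain list stands in for the vector database.

```python
import queue
import threading


def fake_embed(chunk: str) -> list[float]:
    # Hypothetical stand-in for a real embedding model.
    return [float(len(chunk)), float(sum(map(ord, chunk)) % 97)]


def read_chunks(text: str, chunk_size: int):
    # Yield fixed-size pieces lazily instead of splitting everything up front.
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def producer(text: str, chunk_size: int, channel: queue.Queue) -> None:
    for chunk in read_chunks(text, chunk_size):
        # put() blocks while the bounded channel is full (back-pressure).
        channel.put((chunk, fake_embed(chunk)))
    channel.put(None)  # sentinel: nothing more to send


def stream_to_store(text, chunk_size=8, batch_size=2, capacity=4):
    channel = queue.Queue(maxsize=capacity)
    worker = threading.Thread(target=producer, args=(text, chunk_size, channel))
    worker.start()
    store, batch = [], []
    while True:
        item = channel.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) == batch_size:
            store.extend(batch)  # stand-in for an upsert into a vector database
            batch = []
    store.extend(batch)  # flush the final partial batch
    worker.join()
    return store


stored = stream_to_store("a" * 20, chunk_size=8)
print(len(stored))
```

Because the channel is bounded, the producer pauses whenever the consumer falls behind, which caps how many embeddings wait in memory at any moment.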

🦀 Why Embed Anything

➡️Faster execution.
➡️Memory management: Rust enforces memory safety at compile time, preventing many of the memory bugs and crashes that can plague other languages.
➡️True multithreading.
➡️Running embedding models locally and efficiently.
➡️Candle allows inference on CUDA-enabled GPUs right out of the box.
➡️Lower memory usage.
➡️Supports a range of models: Dense, Sparse, Late-interaction, ReRanker, ModernBert.

⭐ Supported Models

We support any Hugging Face model on Candle. We also support ONNX Runtime for BERT and ColPali.

How to add custom model on candle: from_pretrained_hf

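import embed_anything
from embed_anything import EmbeddingModel, TextEmbedConfig, WhichModel

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="model link from huggingface"
)
# chunk_size and batch_size control how the file is split and embedded.
config = TextEmbedConfig(chunk_size=200, batch_size=32)
data = embed_anything.embed_file("file_address", embedder=model, config=config)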

Model    | HF link
Jina     | Jina Models
Bert     | All Bert based models
CLIP     | openai/clip-*
Whisper  | OpenAI Whisper models
ColPali  | starlight-ai/colpali-v1.2-merged-onnx
Colbert  | answerdotai/answerai-colbert-small-v1, jinaai/jina-colbert-v2 and more
Splade   | Splade Models and other Splade like models
Reranker | Jina Reranker Models, Xenova/bge-reranker
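
To make the chunk_size and batch_size settings in TextEmbedConfig concrete, here is a minimal sketch of chunking followed by batching. It assumes chunks are counted in whitespace-separated words, which the source does not state (the library may count characters or tokens instead); chunk_words and batched are hypothetical helpers, not part of embed_anything.

```python
def chunk_words(text: str, chunk_size: int) -> list[str]:
    # Split the text into consecutive chunks of at most chunk_size words.
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def batched(items: list[str], batch_size: int) -> list[list[str]]:
    # Group chunks so the model embeds up to batch_size of them per call.
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


text = "one two three four five six seven"
chunks = chunk_words(text, chunk_size=2)
batches = batched(chunks, batch_size=3)
print(chunks)
print(batches)
```

Smaller chunks give finer-grained retrieval at the cost of more vectors; larger batches reduce the number of model calls but need more memory per call.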

Splade Models:

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.SparseBert, "prithivida/Splade_PP_en_v1"
)
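
As a rough illustration of how sparse embeddings can feed hybrid search, the sketch below scores a document by blending a dense dot product with a sparse one. The dictionary representation (token id to weight), the alpha weighting, and the helper names are all assumptions made for this example; they are not taken from EmbedAnything's API.

```python
def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    # Only token ids present in both vectors contribute to the score.
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[token] for token, weight in a.items() if token in b)


def dense_dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def hybrid_score(dense_q, dense_d, sparse_q, sparse_d, alpha=0.5):
    # alpha=1.0 uses only the dense score, alpha=0.0 only the sparse score.
    return alpha * dense_dot(dense_q, dense_d) + (1 - alpha) * sparse_dot(sparse_q, sparse_d)


sparse_query = {3: 1.0, 7: 2.0}
sparse_doc = {7: 0.5, 9: 4.0}
score = hybrid_score([1.0, 0.0], [0.5, 0.5], sparse_query, sparse_doc)
print(score)
```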

ONNX-Runtime: from_pretrained_onnx

BERT

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Bert, model_id="onnx_model_link"
)

ColPali

model: ColpaliModel = ColpaliModel.from_pretrained_onnx("starlight-ai/colpali-v1.2-merged-onnx", None)

Colbert

sentences = [
    "The quick brown fox jumps over the lazy dog",
    "The cat is sleeping on the mat",
    "The dog is barking at the moon",
    "I love pizza",
    "The dog is sitting in the park",
]

model = ColbertModel.from_pretrained_onnx("jinaai/jina-colbert-v2", path_in_repo="onnx/model.onnx")
embeddings = model.embed(sentences, batch_size=2)
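
Late-interaction models such as ColBERT produce one vector per token rather than one per text, and relevance is usually computed with a MaxSim score. The sketch below shows that scoring rule on toy vectors. It assumes each text is represented as a list of token vectors; the exact shape of the objects returned by model.embed is not described in the source, and maxsim is a hypothetical helper.

```python
def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens: list[list[float]], doc_tokens: list[list[float]]) -> float:
    # For every query token, keep its best match among the document tokens,
    # then sum those best matches.
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # matches both query tokens exactly
doc_b = [[0.5, 0.5]]              # partially matches each query token
print(maxsim(query, doc_a), maxsim(query, doc_b))
```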

ModernBERT

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype = Dtype.Q4F16
)

ReRankers

reranker = Reranker.from_pretrained("jinaai/jina-reranker-v1-turbo-en", dtype=Dtype.F16)

results: list[RerankerResult] = reranker.rerank(
    ["What is the capital of France?"],
    ["France is a country in Europe.", "Paris is the capital of France."],
    2,
)

For Semantic Chunking

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="sentence-transformers/all-MiniLM-L12-v2"
)

# with semantic encoder
semantic_encoder = EmbeddingModel.from_pretrained_hf(WhichModel.Jina, model_id = "jinaai/jina-embeddings-v2-small-en")
config = TextEmbedConfig(chunk_size=256, batch_size=32, splitting_strategy = "semantic", semantic_encoder=semantic_encoder)
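
One common way to chunk semantically is to start a new chunk wherever the similarity between neighbouring sentence embeddings drops. The sketch below shows that idea with hand-written toy vectors; it is an assumption about the general technique, not a description of what splitting_strategy="semantic" does internally, and semantic_chunks and the threshold value are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_chunks(sentences, vectors, threshold=0.5):
    # Start a new chunk whenever two neighbouring sentences are dissimilar.
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur, sentence in zip(vectors, vectors[1:], sentences[1:]):
        if cosine(prev, cur) < threshold:
            chunks.append([sentence])
        else:
            chunks[-1].append(sentence)
    return [" ".join(chunk) for chunk in chunks]


sentences = ["Cats purr.", "Cats nap.", "Rust is fast."]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy sentence embeddings
chunks = semantic_chunks(sentences, vectors)
print(chunks)
```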

🧑‍🚀 Getting Started

💚 Installation

pip install embed-anything

For GPUs and using special models like ColPali

pip install embed-anything-gpu

Usage

➡️ Usage for version 0.3 and later

To use local embedding (we support Bert and Jina):

import embed_anything
from embed_anything import EmbeddingModel, WhichModel

model = EmbeddingModel.from_pretrained_local(
    WhichModel.Bert, model_id="Hugging_face_link"
)
data = embed_anything.embed_file("test_files/test.pdf", embedder=model)

For multimodal embedding: we support CLIP

Requirements: a directory containing the pictures you want to search. For example, we use test_files, which contains images of cats, dogs, and so on.

import numpy as np
from PIL import Image

import embed_anything
from embed_anything import EmbedData

model = embed_anything.EmbeddingModel.from_pretrained_local(
    embed_anything.WhichModel.Clip,
    model_id="openai/clip-vit-base-patch16",
    # revision="refs/pr/15",
)
# Embed every image in the directory.
data: list[EmbedData] = embed_anything.embed_directory("test_files", embedder=model)
embeddings = np.array([item.embedding for item in data])
# Embed the text query with the same model.
query = ["Photo of a monkey?"]
query_embedding = np.array(
    embed_anything.embed_query(query, embedder=model)[0].embedding
)
# Pick the image whose embedding has the highest dot product with the query.
similarities = np.dot(embeddings, query_embedding)
max_index = np.argmax(similarities)
# Here the text field is used as the path of the matching image.
Image.open(data[max_index].text).show()
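
The example above ranks images by raw dot product. If the embeddings are not unit-length (the source does not say either way), long vectors can dominate the ranking, so a cosine-based top-k is a common alternative. The sketch below uses only the standard library and toy vectors; normalize and top_k are hypothetical helpers.

```python
import math


def normalize(v: list[float]) -> list[float]:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def top_k(query: list[float], vectors: list[list[float]], k: int = 2):
    # Rank by cosine similarity: the dot product of unit-length vectors.
    q = normalize(query)
    scores = [sum(a * b for a, b in zip(q, normalize(v))) for v in vectors]
    order = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:k]]


vectors = [[10.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
best = top_k([1.0, 1.0], vectors, k=1)
print(best)
```

With raw dot products the first vector would win purely because of its length; after normalisation the second vector, which points closest to the query direction, ranks first.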

Audio Embedding using Whisper

Requirements: audio files in .wav format.

import embed_anything
from embed_anything import (
    AudioDecoderModel,
    EmbeddingModel,
    embed_audio_file,
    TextEmbedConfig,
)
# choose any whisper or distilwhisper model from https://huggingface.co/distil-whisper or https://huggingface.co/collections/openai/whisper-release-6501bba2cf999715fd953013
audio_decoder = AudioDecoderModel.from_pretrained_hf(
    "openai/whisper-tiny.en", revision="main", model_type="tiny-en", quantized=False
)
embedder = EmbeddingModel.from_pretrained_hf(
    embed_anything.WhichModel.Bert,
    model_id="sentence-transformers/all-MiniLM-L6-v2",
    revision="main",
)
config = TextEmbedConfig(chunk_size=200, batch_size=32)
data = embed_anything.embed_audio_file(
    "test_files/audio/samples_hp0.wav",
    audio_decoder=audio_decoder,
    embedder=embedder,
    text_embed_config=config,
)
print(data[0].metadata)

Using ONNX Models

To use ONNX models, you can either use the ONNXModel enum or the model_id from the Hugging Face model.

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Bert, model_name = ONNXModel.AllMiniLML6V2Q
)

For some models, you can also specify the dtype to use for the model.

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype = Dtype.Q4F16
)

The methods above are the most reliable way to load a model, because these models are tested. To use other models, such as fine-tuned ones, load them with hf_model_id and path_in_repo, as below.

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Jina, hf_model_id = "jinaai/jina-embeddings-v2-small-en", path_in_repo="model.onnx"
)

To see all the ONNX models supported with model_name, see here

🚧 Contributing to EmbedAnything

First of all, thank you for taking the time to contribute to this project. We truly appreciate your contributions, whether it's bug reports, feature suggestions, or pull requests. Your time and effort are highly valued in this project. 🚀

This document provides guidelines and best practices to help you to contribute effectively. These are meant to serve as guidelines, not strict rules. We encourage you to use your best judgment and feel comfortable proposing changes to this document through a pull request.

  • Roadmap
  • Quick Start
  • Guidelines

🏎️ RoadMap

    Accomplishments

    One of the aims of EmbedAnything is to allow AI engineers to easily use state-of-the-art embedding models on typical files and documents. A lot has already been accomplished here: the formats listed below are supported today, and a few more are still to come.

    Adding Fine-tuning

    One of the major goals for this year is to add fine-tuning of these models on your own data, as a simple sentence transformer does.

    🖼️ Modalities and Source

    We’re excited to share that we've expanded our platform to support multiple modalities, including:

    • Audio files

    • Markdowns

    • Websites

    • Images

    • Videos

    • Graph

    This gives you the flexibility to work with various data types all in one place! 🌐

    💜 Product

    We’ve rolled out some major updates in version 0.3 to improve both functionality and performance. Here’s what’s new:

    • Semantic Chunking: Optimized chunking strategy for better Retrieval-Augmented Generation (RAG) workflows.

    • Streaming for Efficient Indexing: We’ve introduced streaming for memory-efficient indexing in vector databases. Want to know more? Check out our article on this feature here: https://www.analyticsvidhya.com/blog/2024/09/vector-streaming/

    • Zero-Shot Applications: Explore our zero-shot application demos to see the power of these updates in action.

    • Intuitive Functions: Version 0.3 includes a complete refactor for more intuitive functions, making the platform easier to use.

    • Chunkwise Streaming: Instead of file-by-file streaming, we now support chunkwise streaming, allowing for more flexible and efficient data processing.

    Check out the latest release and see how these features can supercharge your generative AI pipeline! ✨

    🚀Coming Soon

    ⚙️ Performance

    We've received quite a few questions about why we're using Candle, so here's a quick explanation:

    One of the main reasons is that Candle doesn't require any specific ONNX format models, which means it can work seamlessly with any Hugging Face model. This flexibility has been a key factor for us. However, we also recognize that we’ve been compromising a bit on speed in favor of that flexibility.

    What’s Next? To address this, we’re excited to announce that we’re introducing Candle-ONNX alongside our existing Hugging Face-based framework.

    ➡️ Support for GGUF models
    ➡️ Significantly faster performance

    Stay tuned for these exciting updates! 🚀

    🫐Embeddings:

    We have had multimodality in our infrastructure from day one. It already covers websites, images, and audio, and we want to expand it further to:

    ☑️ Graph embedding: build DeepWalk embeddings, depth first, with word2vec
    ☑️ Video Embedding
    ☑️ Yolo Clip

    🌊Expansion to other Vector Adapters

    We currently support a wide range of vector databases for streaming embeddings, including:

    • Elastic: thanks to the amazing and active Elastic team for the contribution
    • Weaviate
    • Pinecone

    But we're not stopping there! We're actively working to expand this list.

    Want to Contribute? If you’d like to add support for your favorite vector database, we’d love to have your help! Check out our contribution.md for guidelines, or feel free to reach out directly at starlight-search@proton.me. Let's build something amazing together! 💡
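
For readers wondering what a vector database adapter might involve, here is a minimal in-memory sketch. Everything in it is hypothetical: Record, InMemoryAdapter, and the upsert method are names invented for illustration and do not reflect EmbedAnything's actual adapter interface, which is described in the project's own guidelines.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    text: str
    embedding: list[float]


@dataclass
class InMemoryAdapter:
    # Hypothetical adapter: a dict keyed by chunk text stands in for a database.
    rows: dict[str, Record] = field(default_factory=dict)

    def upsert(self, batch: list[Record]) -> int:
        # Insert new records and overwrite existing ones with the same key.
        for record in batch:
            self.rows[record.text] = record
        return len(batch)


adapter = InMemoryAdapter()
adapter.upsert([Record("hello", [0.1, 0.2]), Record("world", [0.3, 0.4])])
adapter.upsert([Record("hello", [0.5, 0.6])])  # overwrites the first "hello"
print(len(adapter.rows))
```

Accepting embeddings one batch at a time is what lets an adapter receive streamed chunks without the caller first collecting every vector in memory.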
