
Embed anything at lightning speed


Inference, Ingestion, and Indexing – supercharged by Rust 🦀
Python docs »
Rust docs »
View Demo · Benches · Vector Streaming Adapters · Search in Audio Space

EmbedAnything is a minimalist, fast, lightweight, multisource, multimodal, and local embedding pipeline built in Rust. Whether you're working with text, images, audio, PDFs, websites, or other media, EmbedAnything streamlines generating embeddings from these sources and streaming them to a vector database (memory-efficient indexing). It supports dense, sparse, ONNX, and late-interaction embeddings, offering flexibility for a wide range of use cases.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. How to add custom model and chunk size

🚀 Key Features

  • Local Embedding: Works with local embedding models like BERT and Jina
  • ONNX Models: Works with ONNX models for BERT and ColPali
  • ColPali: Supported in the GPU version on both ONNX and Candle
  • Splade: Support for sparse embeddings for hybrid search
  • ReRankers: Support for reranking models for better RAG
  • ColBERT: Support for ColBERT on ONNX
  • ModernBERT: Increase your token length to 8K
  • Cloud Embedding Models: Supports OpenAI and Cohere
  • MultiModality: Works with text sources such as PDF, txt, and md files, JPG images, and .wav audio
  • Rust: All file processing is done in Rust for speed and efficiency
  • GPU Support: Hardware acceleration on GPU is handled for you
  • Python Interface: Packaged as a Python library for seamless integration into your existing projects
  • Vector Streaming: Continuously create and stream embeddings when resources are limited

💡What is Vector Streaming

Vector Streaming lets you generate embeddings for files and stream them as they are produced. For a 10 GB file, for example, it generates embeddings chunk by chunk (chunks can be segmented semantically) and stores them in the vector database of your choice, so the full set of embeddings never has to sit in RAM at once. Embedding runs separately from the main process, using Rust's MPSC channels, to maintain high performance.
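The idea can be illustrated with a minimal Python sketch. This is not EmbedAnything's implementation (that lives in Rust): a bounded `queue.Queue` stands in for the MPSC channel, and `chunk_text`, `toy_embed`, `stream_embeddings`, and the list used as a sink are hypothetical names invented for this illustration.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator, List

Vector = List[float]


def chunk_text(text: str, chunk_size: int) -> Iterator[str]:
    """Lazily yield fixed-size character chunks of the input."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def toy_embed(chunk: str) -> Vector:
    """Hypothetical stand-in for a real embedding model."""
    return [float(len(chunk)), float(sum(map(ord, chunk)) % 97)]


def stream_embeddings(
    chunks: Iterable[str],
    embed: Callable[[str], Vector],
    sink: Callable[[List[Vector]], None],
    batch_size: int = 2,
    buffer_size: int = 4,
) -> int:
    """Embed chunks on a worker thread and flush them to `sink` in small batches."""
    # Bounded queue: the producer blocks when the consumer falls behind,
    # so at most `buffer_size` embeddings wait in memory.
    channel: queue.Queue = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the stream

    def producer() -> None:
        for chunk in chunks:
            channel.put(embed(chunk))
        channel.put(done)

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    total = 0
    batch: List[Vector] = []
    while True:
        item = channel.get()
        if item is done:
            break
        batch.append(item)
        if len(batch) == batch_size:
            sink(batch)  # e.g. an upsert into a vector database
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        sink(batch)
        total += len(batch)
    worker.join()
    return total


stored_batches: List[List[Vector]] = []
count = stream_embeddings(
    chunk_text("a" * 25, chunk_size=10), toy_embed, stored_batches.append
)
print(count, [len(batch) for batch in stored_batches])
```

Because the queue is bounded, memory use depends on `buffer_size` and `batch_size` rather than on the size of the input.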


🦀 Why Embed Anything

➡️ Faster execution.
➡️ Memory management: Rust enforces memory safety at compile time, preventing many of the memory bugs and crashes that can plague other languages.
➡️ True multithreading.
➡️ Runs embedding models locally and efficiently.
➡️ Candle allows inference on CUDA-enabled GPUs right out of the box.
➡️ Lower memory usage.
➡️ Supports a range of models: Dense, Sparse, Late-interaction, ReRanker, ModernBERT.

⭐ Supported Models

We support any Hugging Face model on Candle. We also support ONNX Runtime for BERT and ColPali.

How to add a custom model on Candle: from_pretrained_hf

import embed_anything
from embed_anything import EmbeddingModel, TextEmbedConfig, WhichModel

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="model link from huggingface"
)
config = TextEmbedConfig(chunk_size=200, batch_size=32)
data = embed_anything.embed_file("file_address", embedder=model, config=config)

| Model | HF link |
| --- | --- |
| Jina | Jina Models |
| Bert | All Bert based models |
| CLIP | openai/clip-* |
| Whisper | OpenAI Whisper models |
| ColPali | starlight-ai/colpali-v1.2-merged-onnx |
| Colbert | answerdotai/answerai-colbert-small-v1, jinaai/jina-colbert-v2 and more |
| Splade | Splade Models and other Splade like models |
| Reranker | Jina Reranker Models, Xenova/bge-reranker |

Splade Models:

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.SparseBert, "prithivida/Splade_PP_en_v1"
)
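
Sparse embeddings are typically paired with dense ones for hybrid retrieval. The sketch below shows one common way to combine the two scores. It is an illustration under stated assumptions, not part of the EmbedAnything API: the vectors are hand-written toy values, the names `sparse_dot`, `cosine`, and `hybrid_score` are hypothetical, and the weighted sum with `alpha` is just one of several possible fusion schemes.

```python
import math
from typing import Dict, List

SparseVector = Dict[int, float]  # token id -> weight, zero entries omitted


def sparse_dot(a: SparseVector, b: SparseVector) -> float:
    """Dot product over the token ids that both vectors share."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(weight * b[token] for token, weight in a.items() if token in b)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_score(
    dense_query: List[float],
    dense_doc: List[float],
    sparse_query: SparseVector,
    sparse_doc: SparseVector,
    alpha: float = 0.5,
) -> float:
    """Weighted sum of dense and sparse scores (alpha=1.0 means dense only)."""
    dense = cosine(dense_query, dense_doc)
    sparse = sparse_dot(sparse_query, sparse_doc)
    return alpha * dense + (1.0 - alpha) * sparse


# Toy values: the only shared token id is 7, so the sparse score is 1.0 * 2.0.
score = hybrid_score(
    dense_query=[1.0, 0.0],
    dense_doc=[1.0, 0.0],
    sparse_query={3: 0.5, 7: 1.0},
    sparse_doc={7: 2.0, 9: 1.0},
)
print(score)
```

In practice the two scores live on different scales, so they are usually normalized before being combined.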

ONNX-Runtime: from_pretrained_onnx

BERT

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Bert, model_id="onnx_model_link"
)

ColPali

model: ColpaliModel = ColpaliModel.from_pretrained_onnx("starlight-ai/colpali-v1.2-merged-onnx", None)

Colbert

sentences = [
    "The quick brown fox jumps over the lazy dog",
    "The cat is sleeping on the mat",
    "The dog is barking at the moon",
    "I love pizza",
    "The dog is sitting in the park",
]

model = ColbertModel.from_pretrained_onnx("jinaai/jina-colbert-v2", path_in_repo="onnx/model.onnx")
embeddings = model.embed(sentences, batch_size=2)
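
Late-interaction models such as ColBERT keep one vector per token and compare a query with a document using the MaxSim operator. The following sketch shows that scoring step only. The token matrices are hand-written toy values rather than output from `model.embed`, and `dot` and `maxsim` are hypothetical helper names.

```python
from typing import Dict, List

Matrix = List[List[float]]  # one row per token embedding


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens: Matrix, doc_tokens: Matrix) -> float:
    """For each query token take its best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


query: Matrix = [[1.0, 0.0], [0.0, 1.0]]
documents: Dict[str, Matrix] = {
    "doc_a": [[1.0, 0.0], [0.5, 0.5]],
    "doc_b": [[0.0, 1.0], [1.0, 0.0]],
}

# Rank documents from highest to lowest late-interaction score.
ranking = sorted(documents, key=lambda name: maxsim(query, documents[name]), reverse=True)
print(ranking)
```

Each query token is matched independently, which is why a document that covers every query token well (`doc_b` here) outranks one that matches only some of them.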

ModernBERT

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype = Dtype.Q4F16
)

ReRankers

reranker = Reranker.from_pretrained("jinaai/jina-reranker-v1-turbo-en", dtype=Dtype.F16)

results: list[RerankerResult] = reranker.rerank(["What is the capital of France?"], ["France is a country in Europe.", "Paris is the capital of France."], 2)

For Semantic Chunking

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="sentence-transformers/all-MiniLM-L12-v2"
)

# with semantic encoder
semantic_encoder = EmbeddingModel.from_pretrained_hf(WhichModel.Jina, model_id = "jinaai/jina-embeddings-v2-small-en")
config = TextEmbedConfig(chunk_size=256, batch_size=32, splitting_strategy = "semantic", semantic_encoder=semantic_encoder)
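
To give an intuition for what a semantic splitting strategy does, here is a minimal sketch that starts a new chunk whenever consecutive sentences stop being similar. It does not reproduce EmbedAnything's internal algorithm: `toy_embed` is a hypothetical bag-of-words stand-in for the semantic encoder, and the `threshold` value is an arbitrary assumption.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List

BagOfWords = Dict[str, float]


def toy_embed(sentence: str) -> BagOfWords:
    """Hypothetical bag-of-words embedding; a real semantic encoder would go here."""
    return dict(Counter(re.findall(r"[a-z]+", sentence.lower())))


def cosine(a: BagOfWords, b: BagOfWords) -> float:
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(
    sentences: List[str],
    embed: Callable[[str], BagOfWords],
    threshold: float = 0.2,
) -> List[List[str]]:
    """Group consecutive sentences; break where similarity drops below threshold."""
    chunks: List[List[str]] = []
    previous: BagOfWords = {}
    for sentence in sentences:
        current = embed(sentence)
        if chunks and cosine(previous, current) >= threshold:
            chunks[-1].append(sentence)  # same topic: extend the open chunk
        else:
            chunks.append([sentence])  # topic shift: start a new chunk
        previous = current
    return chunks


sentences = [
    "The dog is barking at the moon",
    "The dog is sitting in the park",
    "I love pizza",
    "Pizza with cheese is great",
]
for chunk in semantic_chunks(sentences, toy_embed):
    print(chunk)
```

With a real encoder the similarity reflects meaning rather than shared words, but the break-on-drop logic stays the same.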

🧑‍🚀 Getting Started

💚 Installation

pip install embed-anything

For GPUs and using special models like ColPali

pip install embed-anything-gpu

Usage

➡️ Usage for version 0.3 and later

To use local embedding (we support Bert and Jina):

model = EmbeddingModel.from_pretrained_local(
    WhichModel.Bert, model_id="Hugging_face_link"
)
data = embed_anything.embed_file("test_files/test.pdf", embedder=model)

For multimodal embedding (we support CLIP):

Requirements: a directory with the pictures you want to search. For example, test_files contains images of cats, dogs, etc.

import embed_anything
import numpy as np
from embed_anything import EmbedData
from PIL import Image

model = embed_anything.EmbeddingModel.from_pretrained_local(
    embed_anything.WhichModel.Clip,
    model_id="openai/clip-vit-base-patch16",
    # revision="refs/pr/15",
)
# Embed every image in the directory.
data: list[EmbedData] = embed_anything.embed_directory("test_files", embedder=model)
embeddings = np.array([item.embedding for item in data])

# Embed the text query with the same model.
query = ["Photo of a monkey?"]
query_embedding = np.array(
    embed_anything.embed_query(query, embedder=model)[0].embedding
)

# Rank images by dot-product similarity and open the best match
# (this example treats `text` as the image path).
similarities = np.dot(embeddings, query_embedding)
max_index = np.argmax(similarities)
Image.open(data[max_index].text).show()
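
The example above ranks by raw dot product and returns a single match. If the embeddings are not unit length, it can help to normalize them first so the score becomes cosine similarity, and to return the top k results. The sketch below uses only the standard library and toy vectors. `normalize` and `top_k` are hypothetical helper names, and whether a given model's embeddings are already normalized is something to check rather than assume.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def top_k(
    query: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (index, cosine similarity) pairs for the k best matches."""
    unit_query = normalize(query)
    scores = [
        (index, sum(a * b for a, b in zip(unit_query, normalize(embedding))))
        for index, embedding in enumerate(embeddings)
    ]
    return heapq.nlargest(k, scores, key=lambda pair: pair[1])


# Toy vectors: index 0 has the largest raw dot product with the query,
# but index 1 points in exactly the same direction as the query.
toy_embeddings = [[10.0, 1.0], [0.5, 0.5], [0.0, -1.0]]
matches = top_k([1.0, 1.0], toy_embeddings, k=2)
print(matches)
```

Here the long vector at index 0 would win on raw dot product, while cosine similarity prefers index 1.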

Audio Embedding using Whisper

Requirements: audio files in .wav format.

import embed_anything
from embed_anything import (
    AudioDecoderModel,
    EmbeddingModel,
    embed_audio_file,
    TextEmbedConfig,
)
# choose any whisper or distilwhisper model from https://huggingface.co/distil-whisper or https://huggingface.co/collections/openai/whisper-release-6501bba2cf999715fd953013
audio_decoder = AudioDecoderModel.from_pretrained_hf(
    "openai/whisper-tiny.en", revision="main", model_type="tiny-en", quantized=False
)
embedder = EmbeddingModel.from_pretrained_hf(
    embed_anything.WhichModel.Bert,
    model_id="sentence-transformers/all-MiniLM-L6-v2",
    revision="main",
)
config = TextEmbedConfig(chunk_size=200, batch_size=32)
data = embed_anything.embed_audio_file(
    "test_files/audio/samples_hp0.wav",
    audio_decoder=audio_decoder,
    embedder=embedder,
    text_embed_config=config,
)
print(data[0].metadata)

Using ONNX Models

To use ONNX models, you can either use the ONNXModel enum or the model_id from the Hugging Face model.

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Bert, model_name = ONNXModel.AllMiniLML6V2Q
)

For some models, you can also specify the dtype to use for the model.

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype = Dtype.Q4F16
)

The method above is the safest choice because these models are tested. To use other models, such as fine-tuned ones, pass hf_model_id and path_in_repo as shown below.

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Jina, hf_model_id = "jinaai/jina-embeddings-v2-small-en", path_in_repo="model.onnx"
)

To see all the ONNX models supported with model_name, see here

🚧 Contributing to EmbedAnything

First of all, thank you for taking the time to contribute to this project. We truly appreciate your contributions, whether it's bug reports, feature suggestions, or pull requests. Your time and effort are highly valued in this project. 🚀

This document provides guidelines and best practices to help you to contribute effectively. These are meant to serve as guidelines, not strict rules. We encourage you to use your best judgment and feel comfortable proposing changes to this document through a pull request.

  • Roadmap
  • Quick Start
  • Guidelines

🏎️ RoadMap

    Accomplishments

    One of the aims of EmbedAnything is to let AI engineers easily use state-of-the-art embedding models on typical files and documents. Much has already been accomplished: the formats listed below are supported today, and a few more are still to come.

    Adding Fine-tuning

    One of our major goals this year is to add fine-tuning of these models on your own data, as a simple sentence transformer does.

    🖼️ Modalities and Source

    We’re excited to share that we've expanded our platform to support multiple modalities, including:

    • Audio files

    • Markdowns

    • Websites

    • Images

    • Videos

    • Graph

    This gives you the flexibility to work with various data types all in one place! 🌐

    💜 Product

    We’ve rolled out some major updates in version 0.3 to improve both functionality and performance. Here’s what’s new:

    • Semantic Chunking: Optimized chunking strategy for better Retrieval-Augmented Generation (RAG) workflows.

    • Streaming for Efficient Indexing: We’ve introduced streaming for memory-efficient indexing in vector databases. Want to know more? Check out our article on this feature here: https://www.analyticsvidhya.com/blog/2024/09/vector-streaming/

    • Zero-Shot Applications: Explore our zero-shot application demos to see the power of these updates in action.

    • Intuitive Functions: Version 0.3 includes a complete refactor for more intuitive functions, making the platform easier to use.

    • Chunkwise Streaming: Instead of file-by-file streaming, we now support chunkwise streaming, allowing for more flexible and efficient data processing.

    Check out the latest release and see how these features can supercharge your generative AI pipeline! ✨

    🚀Coming Soon

    ⚙️ Performance

    We've received quite a few questions about why we're using Candle, so here's a quick explanation:

    One of the main reasons is that Candle doesn't require any specific ONNX format models, which means it can work seamlessly with any Hugging Face model. This flexibility has been a key factor for us. However, we also recognize that we’ve been compromising a bit on speed in favor of that flexibility.

    What’s Next? To address this, we’re introducing Candle-ONNX alongside our existing Hugging Face framework.

    ➡️ Support for GGUF models

    • Significantly faster performance

    Stay tuned for these exciting updates! 🚀

    🫐Embeddings:

    We have had multimodality in our infrastructure from day one. Websites, images, and audio are already covered, and we want to expand further to:

    ☑️ Graph embedding: build DeepWalk embeddings (depth-first walks plus word2vec)
    ☑️ Video embedding
    ☑️ YOLO CLIP

    🌊Expansion to other Vector Adapters

    We currently support a wide range of vector databases for streaming embeddings, including:

    • Elastic: thanks to the amazing and active Elastic team for the contribution
    • Weaviate
    • Pinecone

    But we're not stopping there! We're actively working to expand this list.

    Want to Contribute? If you’d like to add support for your favorite vector database, we’d love to have your help! Check out our contribution.md for guidelines, or feel free to reach out directly at starlight-search@proton.me. Let's build something amazing together! 💡
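
As a rough idea of what a vector database adapter has to provide, here is a minimal, self-contained sketch. Everything in it is hypothetical and invented for illustration: the `Record` shape, the `VectorAdapter` protocol, and `InMemoryAdapter` are not EmbedAnything's actual adapter interface, so consult contribution.md and the existing adapters for the real one.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Protocol


@dataclass
class Record:
    """Hypothetical record shape: one chunk, its vector, and its metadata."""
    id: str
    embedding: List[float]
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


class VectorAdapter(Protocol):
    """Hypothetical interface a streaming target would need to satisfy."""

    def upsert(self, records: Iterable[Record]) -> int:
        ...


class InMemoryAdapter:
    """Toy adapter that stores records in a dict instead of a real database."""

    def __init__(self, dimension: int) -> None:
        self.dimension = dimension
        self.rows: Dict[str, Record] = {}

    def upsert(self, records: Iterable[Record]) -> int:
        written = 0
        for record in records:
            if len(record.embedding) != self.dimension:
                raise ValueError(
                    f"expected {self.dimension} dimensions, got {len(record.embedding)}"
                )
            self.rows[record.id] = record  # insert, or overwrite an existing id
            written += 1
        return written


adapter: VectorAdapter = InMemoryAdapter(dimension=2)
written = adapter.upsert(
    [
        Record("doc-0", [0.1, 0.2], "first chunk"),
        Record("doc-1", [0.3, 0.4], "second chunk"),
        Record("doc-0", [0.5, 0.6], "first chunk, re-embedded"),
    ]
)
print(written, len(adapter.rows))
```

Accepting an iterable of records in `upsert` lets the same method receive streamed batches of any size.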
