
Embed anything at lightning speed


Inference, Ingestion, and Indexing in Rust 🦀
Python docs »
Rust docs »
Benchmarks · FAQ · Adapters · Collaborations

EmbedAnything is a minimalist yet highly performant, lightweight, multisource, multimodal, local embedding pipeline built in Rust. Whether you're working with text, images, audio, PDFs, websites, or other media, EmbedAnything streamlines generating embeddings from various sources and streaming them to a vector database with memory-efficient indexing. It supports dense, sparse, ONNX, model2vec, and late-interaction embeddings, offering flexibility for a wide range of use cases.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. How to add custom model and chunk size

🚀 Key Features

  • Candle Backend: Supports BERT, Jina, ColPali, Splade, ModernBERT
  • ONNX Backend: Supports BERT, Jina, ColPali, ColBERT, Splade, Reranker, ModernBERT
  • Cloud Embedding Models: Supports OpenAI and Cohere.
  • MultiModality: Works with text sources such as PDF, .txt, and .md files, images (JPG), and audio (.WAV)
  • Rust: All file processing is done in Rust for speed and efficiency
  • GPU Support: Hardware acceleration on GPU is supported as well.
  • Python Interface: Packaged as a Python library for seamless integration into your existing projects.
  • Vector Streaming: Continuously create and stream embeddings when resources are limited.
  • No Dependency on PyTorch: Easy to deploy on cloud, as it comes with a low memory footprint.

💡What is Vector Streaming

Vector Streaming lets you generate embeddings for files and stream them as they are produced. If you have a 10 GB file, it generates embeddings chunk by chunk (chunks can be segmented semantically) and stores them in the vector database of your choice, so the full set of embeddings never has to be held in RAM at once.

The embedding process runs separately from the main process, using Rust MPSC channels to maintain high performance. Because embeddings are saved directly to the vector database, they do not accumulate in memory. Find our blog.
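The idea can be sketched in plain Python with a bounded queue standing in for the channel. This is only an illustration of the producer/consumer pattern described above, not EmbedAnything's implementation: `toy_embed`, `read_chunks`, and `stream_embeddings` are hypothetical names, and the list `store` stands in for a vector database.

```python
import queue
import threading
from typing import Iterable, Iterator, List

Vector = List[float]


def read_chunks(text: str, chunk_size: int) -> Iterator[str]:
    """Yield fixed-size chunks lazily instead of splitting everything up front."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def toy_embed(chunk: str) -> Vector:
    """Hypothetical stand-in for a real embedding model."""
    return [float(len(chunk)), float(sum(map(ord, chunk)) % 97)]


def stream_embeddings(chunks: Iterable[str], sink: list, buffer_size: int = 4) -> None:
    """Embed chunks in a worker thread and write each result to `sink` as it arrives."""
    channel: queue.Queue = queue.Queue(maxsize=buffer_size)  # bounded, so memory is capped
    done = object()  # sentinel marking the end of the stream

    def producer() -> None:
        for chunk in chunks:
            channel.put((chunk, toy_embed(chunk)))  # blocks while the buffer is full
        channel.put(done)

    worker = threading.Thread(target=producer)
    worker.start()
    while True:
        item = channel.get()
        if item is done:
            break
        sink.append(item)  # a real pipeline would upsert into a vector database here
    worker.join()


store: list = []
stream_embeddings(read_chunks("a" * 25, chunk_size=10), store)
```

At most `buffer_size` embeddings wait in the queue at any moment, which is what keeps memory flat regardless of input size.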

EmbedAnythingXWeaviate

🦀 Why Embed Anything

➡️Faster execution.
➡️No PyTorch dependency, thus a low memory footprint and easy deployment on cloud.
➡️Memory Management: Rust enforces memory management at compile time, helping prevent the memory leaks and crashes that can plague other languages
➡️True multithreading
➡️Running embedding models locally and efficiently
➡️Candle allows inference on CUDA-enabled GPUs right out of the box.
➡️Lower memory usage.
➡️Supports a range of models: dense, sparse, late-interaction, reranker, ModernBERT.

🍓 Our Past Collaborations:

We have collaborated with reputed enterprises like Elastic, Weaviate, SingleStore, and Datahours.

You can get in touch with us for further collaborations.

Benchmarks

Measures embedding model inference speed only, on ONNX Runtime. Code

⭐ Supported Models

We support any Hugging Face model on Candle. We also support ONNX Runtime for BERT and ColPali.

How to add custom model on candle: from_pretrained_hf

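import embed_anything
from embed_anything import EmbeddingModel, TextEmbedConfig, WhichModel

# Load a BERT-style model by its Hugging Face model id
model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="model link from huggingface"
)
config = TextEmbedConfig(chunk_size=1000, batch_size=32)
data = embed_anything.embed_file("file_address", embedder=model, config=config)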
| Model | HF link |
| --- | --- |
| Jina | Jina Models |
| Bert | All Bert based models |
| CLIP | openai/clip-* |
| Whisper | OpenAI Whisper models |
| ColPali | starlight-ai/colpali-v1.2-merged-onnx |
| Colbert | answerdotai/answerai-colbert-small-v1, jinaai/jina-colbert-v2 and more |
| Splade | Splade Models and other Splade like models |
| Reranker | Jina Reranker Models, Xenova/bge-reranker |
| Model2Vec | model2vec, minishlab/potion-base-8M |

Splade Models:

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.SparseBert, "prithivida/Splade_PP_en_v1"
)

ONNX-Runtime: from_pretrained_onnx

BERT

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, hf_model_id="onnx_model_link"
)

ColPali

model: ColpaliModel = ColpaliModel.from_pretrained_onnx("starlight-ai/colpali-v1.2-merged-onnx", None)

Colbert

sentences = [
    "The quick brown fox jumps over the lazy dog",
    "The cat is sleeping on the mat",
    "The dog is barking at the moon",
    "I love pizza",
    "The dog is sitting in the park",
]

model = ColbertModel.from_pretrained_onnx("jinaai/jina-colbert-v2", path_in_repo="onnx/model.onnx")
embeddings = model.embed(sentences, batch_size=2)
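Late-interaction models such as ColBERT produce one vector per token rather than one per sentence, and relevance is commonly scored with a "MaxSim" sum. The sketch below shows that scoring rule on toy vectors in plain Python. It does not call EmbedAnything and makes no assumption about the shape of the `embeddings` returned above; `maxsim_score` is a hypothetical helper name.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """Sum, over query tokens, of the best dot product against any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy token embeddings: two query tokens, two candidate documents
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b = [[0.5, 0.5]]

score_a = maxsim_score(query, doc_a)  # each query token finds an exact match
score_b = maxsim_score(query, doc_b)  # each query token only finds a partial match
```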

ModernBERT

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype = Dtype.Q4F16
)

ReRankers

reranker = Reranker.from_pretrained("jinaai/jina-reranker-v1-turbo-en", dtype=Dtype.F16)

results: list[RerankerResult] = reranker.rerank(["What is the capital of France?"], ["France is a country in Europe.", "Paris is the capital of France."], 2)

Cohere Embed 4

# Initialize the model once
model: EmbeddingModel = EmbeddingModel.from_pretrained_cloud(
    WhichModel.CohereVision, model_id="embed-v4.0"
)

For Semantic Chunking

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="sentence-transformers/all-MiniLM-L12-v2"
)

# with semantic encoder
semantic_encoder = EmbeddingModel.from_pretrained_hf(WhichModel.Jina, model_id = "jinaai/jina-embeddings-v2-small-en")
config = TextEmbedConfig(chunk_size=1000, batch_size=32, splitting_strategy = "semantic", semantic_encoder=semantic_encoder)
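As a rough picture of what a semantic splitting strategy does, the sketch below starts a new chunk whenever the cosine similarity between neighbouring sentence embeddings falls below a threshold. This is a generic illustration, not the library's algorithm: `semantic_split`, the threshold value, and the toy embedding table are all assumptions made for the example.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def semantic_split(
    sentences: List[str], embed: Callable[[str], Vector], threshold: float = 0.5
) -> List[List[str]]:
    """Group consecutive sentences; break where adjacent similarity drops below threshold."""
    if not sentences:
        return []
    vectors = [embed(s) for s in sentences]
    chunks = [[sentences[0]]]
    for prev, cur, sentence in zip(vectors, vectors[1:], sentences[1:]):
        if cosine(prev, cur) < threshold:
            chunks.append([sentence])  # topic shift: start a new chunk
        else:
            chunks[-1].append(sentence)
    return chunks


# Toy embeddings standing in for a real semantic encoder
toy_vectors = {
    "Cats purr.": [1.0, 0.0],
    "Cats nap.": [0.9, 0.1],
    "Stocks fell.": [0.0, 1.0],
}
chunks = semantic_split(list(toy_vectors), toy_vectors.__getitem__)
```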

For late-chunking

config = TextEmbedConfig(
    chunk_size=1000,
    batch_size=8,
    splitting_strategy="sentence",
    late_chunking=True,
)

# Embed a single file
data: list[EmbedData] = model.embed_file("test_files/attention.pdf", config=config)
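Late chunking, as the technique is usually described, encodes the whole text first and only then pools token embeddings per chunk, so each chunk vector reflects its surrounding context. The sketch below shows just the pooling step on toy token vectors. It is a simplified illustration under that assumption, not EmbedAnything's internals, and `late_chunk` and `mean_pool` are hypothetical names.

```python
from typing import List, Tuple

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors, dimension by dimension."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def late_chunk(token_vectors: List[Vector], spans: List[Tuple[int, int]]) -> List[Vector]:
    """Pool already-contextualized token vectors over each [start, end) chunk span."""
    return [mean_pool(token_vectors[start:end]) for start, end in spans]


# Token embeddings as if produced by one pass over the full document
token_vectors = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
spans = [(0, 2), (2, 4)]  # e.g. two sentences of two tokens each
chunk_vectors = late_chunk(token_vectors, spans)
```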

🧑‍🚀 Getting Started

💚 Installation

pip install embed-anything

For GPU support and special models such as ColPali:

pip install embed-anything-gpu
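To confirm which of the two packages is present in the current environment, a small check with the standard library's `importlib.metadata` can help. This is only a convenience sketch; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


for name in ("embed-anything", "embed-anything-gpu"):
    print(name, "->", installed_version(name) or "not installed")
```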

Usage

➡️ Usage for version 0.3 and later

To use local embedding: we support Bert and Jina

model = EmbeddingModel.from_pretrained_local(
    WhichModel.Bert, model_id="Hugging_face_link"
)
data = embed_anything.embed_file("test_files/test.pdf", embedder=model)

For multimodal embedding: we support CLIP

Requirements: a directory with the pictures you want to search. For example, test_files contains images of cats, dogs, etc.

import numpy as np
from PIL import Image

import embed_anything
from embed_anything import EmbedData

# Load a CLIP model for joint image/text embeddings
model = embed_anything.EmbeddingModel.from_pretrained_local(
    embed_anything.WhichModel.Clip,
    model_id="openai/clip-vit-base-patch16",
    # revision="refs/pr/15",
)
# Embed every image in the directory
data: list[EmbedData] = embed_anything.embed_image_directory("test_files", embedder=model)
embeddings = np.array([item.embedding for item in data])

# Embed the text query and open the most similar image
query = ["Photo of a monkey?"]
query_embedding = np.array(
    embed_anything.embed_query(query, embedder=model)[0].embedding
)
similarities = np.dot(embeddings, query_embedding)
max_index = np.argmax(similarities)
Image.open(data[max_index].text).show()
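The example above ranks images by raw dot product. Whether the returned embeddings are already unit-normalized is not stated here, so if they are not, vectors with large magnitude can dominate the ranking. A hedged alternative is to rank by cosine similarity, sketched below in plain Python on toy vectors; `top_k` and `normalize` are hypothetical helpers, not part of the library.

```python
import math
from typing import List, Sequence, Tuple


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def top_k(
    query: Sequence[float], candidates: Sequence[Sequence[float]], k: int = 1
) -> List[Tuple[int, float]]:
    """Return (index, cosine similarity) for the k candidates closest to the query."""
    q = normalize(query)
    scores = [sum(a * b for a, b in zip(q, normalize(c))) for c in candidates]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:k]]


# The first vector is long but points 45 degrees away from the query
images = [[10.0, 10.0], [0.0, 1.0]]
query_vec = [0.0, 1.0]

raw_dot = [sum(a * b for a, b in zip(query_vec, img)) for img in images]
best = top_k(query_vec, images, k=1)
```

Here the raw dot product favours index 0 because of its magnitude, while cosine similarity picks index 1, the vector that actually points in the query's direction.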

Audio Embedding using Whisper

Requirements: audio .wav files.

import embed_anything
from embed_anything import (
    AudioDecoderModel,
    EmbeddingModel,
    embed_audio_file,
    TextEmbedConfig,
)
# choose any whisper or distilwhisper model from https://huggingface.co/distil-whisper or https://huggingface.co/collections/openai/whisper-release-6501bba2cf999715fd953013
audio_decoder = AudioDecoderModel.from_pretrained_hf(
    "openai/whisper-tiny.en", revision="main", model_type="tiny-en", quantized=False
)
embedder = EmbeddingModel.from_pretrained_hf(
    embed_anything.WhichModel.Bert,
    model_id="sentence-transformers/all-MiniLM-L6-v2",
    revision="main",
)
config = TextEmbedConfig(chunk_size=1000, batch_size=32)
data = embed_anything.embed_audio_file(
    "test_files/audio/samples_hp0.wav",
    audio_decoder=audio_decoder,
    embedder=embedder,
    text_embed_config=config,
)
print(data[0].metadata)

Using ONNX Models

To use ONNX models, you can either use the ONNXModel enum or the model_id from the Hugging Face model.

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Bert, model_name = ONNXModel.AllMiniLML6V2Q
)

For some models, you can also specify the dtype to use for the model.

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype = Dtype.Q4F16
)

Using the above method is the best way to ensure that the model works correctly, as these models are tested. If you want to use other models, such as fine-tuned models, you can use hf_model_id and path_in_repo to load the model, as below.

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Jina, hf_model_id = "jinaai/jina-embeddings-v2-small-en", path_in_repo="model.onnx"
)

To see all the ONNX models supported with model_name, see here

⁉️FAQ

Do I need to know rust to use or contribute to embedanything?

The answer is no. EmbedAnything provides pyo3 bindings, so you can run any function in Python without any issues. To contribute, check out our guidelines and the adapter examples in the python folder.

How is it different from fastembed?

We provide both backends, Candle and ONNX. On top of that, we offer an end-to-end pipeline: you can ingest different data types, index to any vector database, and run inference with any model. Fastembed is just an ONNX wrapper.

We've received quite a few questions about why we're using Candle.

One of the main reasons is that Candle doesn't require any specific ONNX format models, which means it can work seamlessly with any Hugging Face model. This flexibility has been a key factor for us. However, we also recognize that we’ve been compromising a bit on speed in favor of that flexibility.

🚧 Contributing to EmbedAnything

First of all, thank you for taking the time to contribute to this project. We truly appreciate your contributions, whether it's bug reports, feature suggestions, or pull requests. Your time and effort are highly valued in this project. 🚀

This document provides guidelines and best practices to help you to contribute effectively. These are meant to serve as guidelines, not strict rules. We encourage you to use your best judgment and feel comfortable proposing changes to this document through a pull request.

  • Roadmap
  • Quick Start
  • Guidelines
🏎️ RoadMap

    Accomplishments

    One of the aims of EmbedAnything is to allow AI engineers to easily use state-of-the-art embedding models on typical files and documents. A lot has already been accomplished here; these are the formats we support right now, and a few more remain to be done.

    Adding Fine-tuning

    One of the major goals for this year is to add fine-tuning of these models on your data, as a simple sentence transformer does.

    🖼️ Modalities and Source

    We’re excited to share that we've expanded our platform to support multiple modalities, including:

    • Audio files

    • Markdowns

    • Websites

    • Images

    • Videos

    • Graph

    This gives you the flexibility to work with various data types all in one place! 🌐

    ⚙️ Performance

    We now support both Candle and ONNX backends
    ➡️ Support for GGUF models

    🫐Embeddings:

    We have had multimodality in our infrastructure from day one. We have already included it for websites, images, and audio, but we want to expand it further to:

    ➡️ Graph embedding: build DeepWalk embeddings, depth-first, and word2vec
    ➡️ Video Embedding
    ➡️ Yolo Clip

    🌊Expansion to other Vector Adapters

    We currently support a wide range of vector databases for streaming embeddings, including:

    • Elastic: thanks to the amazing and active Elastic team for the contribution
    • Weaviate
    • Pinecone
    • Qdrant
    • Milvus
    • Chroma

    How to add an adapter: https://starlight-search.com/blog/2024/02/25/adapter-development-guide.md
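To give a feel for what an adapter does, here is a minimal in-memory sketch: it converts embedded items into records and upserts them into a store. The class and method names (`InMemoryAdapter`, `convert`, `upsert`) and the record layout are illustrative assumptions only; the guide linked above describes the actual interface to implement.

```python
from typing import Dict, List, Sequence


class InMemoryAdapter:
    """Hypothetical adapter that keeps records in a dict instead of a vector database."""

    def __init__(self) -> None:
        self.index: Dict[str, dict] = {}

    def convert(self, items: Sequence[dict]) -> List[dict]:
        """Turn embedded items into records with an id, vector values, and metadata."""
        start = len(self.index)  # continue numbering after what is already stored
        return [
            {
                "id": f"doc-{start + offset}",
                "values": item["embedding"],
                "metadata": {"text": item["text"]},
            }
            for offset, item in enumerate(items)
        ]

    def upsert(self, items: Sequence[dict]) -> None:
        """Insert or overwrite records; called once per streamed batch."""
        for record in self.convert(items):
            self.index[record["id"]] = record


adapter = InMemoryAdapter()
adapter.upsert([
    {"embedding": [0.1, 0.2], "text": "first chunk"},
    {"embedding": [0.3, 0.4], "text": "second chunk"},
])
adapter.upsert([{"embedding": [0.5, 0.6], "text": "third chunk"}])
```

Because `upsert` is called per batch, the adapter never needs to see all embeddings at once, which fits the streaming model described earlier.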

    💥 Create WASM demos to integrate EmbedAnything directly into the browser.

    💜 Add support for ingestion from remote sources

    ➡️ Support for S3 buckets
    ➡️ Support for Azure Storage
    ➡️ Support for Google Drive/Dropbox

    But we're not stopping there! We're actively working to expand this list.

    Want to contribute? If you’d like to add support for your favorite vector database, we’d love to have your help! Check out our contribution.md for guidelines, or feel free to reach out directly at starlight-search@proton.me. Let's build something amazing together! 💡

    A big Thank you to all our StarGazers

