Embed anything at lightning speed

Inference, Ingestion, and Indexing in Rust 🦀
Python docs »
Rust docs »
Benchmarks · FAQ · Adapters · Collaborations

EmbedAnything is a minimalist, high-performance, lightweight, multisource, multimodal, local embedding pipeline built in Rust. Whether you're working with text, images, audio, PDFs, websites, or other media, EmbedAnything streamlines generating embeddings from these sources and streaming them to a vector database with memory-efficient indexing. It supports dense, sparse, ONNX, and late-interaction embeddings, offering flexibility for a wide range of use cases.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. How to add custom model and chunk size

🚀 Key Features

  • Candle Backend: Supports BERT, Jina, ColPali, Splade, ModernBERT
  • ONNX Backend: Supports BERT, Jina, ColPali, ColBERT, Splade, Reranker, ModernBERT
  • Cloud Embedding Models: Supports OpenAI and Cohere.
  • MultiModality: Works with text sources such as PDF, .txt, and .md files, images (.jpg), and audio (.wav)
  • Rust: All file processing is done in Rust for speed and efficiency
  • GPU support: Hardware acceleration on GPU is supported as well.
  • Python Interface: Packaged as a Python library for seamless integration into your existing projects.
  • Vector Streaming: Continuously create and stream embeddings when resources are limited.

💡What is Vector Streaming

Vector Streaming lets you generate embeddings for files and stream them as they are produced. If you have a 10 GB file, it generates embeddings continuously, chunk by chunk (chunks can be segmented semantically), and stores them in the vector database of your choice. This avoids holding all embeddings in RAM at once. Embedding runs separately from the main process, using Rust MPSC (multi-producer, single-consumer) channels, to maintain high performance. Find our blog.
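The producer/consumer idea behind vector streaming can be sketched in plain Python. This is only an illustration of the pattern, not EmbedAnything's implementation (which is written in Rust): chunk_text, toy_embed, and stream_embeddings are hypothetical names, the embedding function is a toy stand-in, and a bounded queue.Queue plays the role of the channel.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator, List, Tuple

Vector = List[float]
Batch = List[Tuple[str, Vector]]


def chunk_text(text: str, chunk_size: int) -> Iterator[str]:
    """Yield fixed-size character chunks lazily, one at a time."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def toy_embed(chunk: str) -> Vector:
    """Hypothetical stand-in for a real embedding model."""
    return [float(len(chunk)), float(sum(map(ord, chunk)) % 97)]


def stream_embeddings(
    chunks: Iterable[str],
    embed: Callable[[str], Vector],
    sink: Callable[[Batch], None],
    buffer_size: int = 4,
    max_pending: int = 2,
) -> int:
    """Embed chunks on a worker thread and hand small batches to `sink`.

    At most `max_pending` finished batches wait in the channel, so memory
    use does not grow with the size of the input.
    """
    channel: queue.Queue = queue.Queue(maxsize=max_pending)
    done = object()  # sentinel that marks the end of the stream

    def producer() -> None:
        batch: Batch = []
        try:
            for chunk in chunks:
                batch.append((chunk, embed(chunk)))
                if len(batch) == buffer_size:
                    channel.put(batch)  # blocks while the consumer is behind
                    batch = []
            if batch:
                channel.put(batch)  # flush the final partial batch
        finally:
            channel.put(done)  # always unblock the consumer

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    total = 0
    while True:
        item = channel.get()
        if item is done:
            break
        sink(item)  # e.g. an upsert into a vector database
        total += len(item)
    worker.join()
    return total
```

Calling stream_embeddings(chunk_text(text, 1000), toy_embed, store.append) with a list named store collects the batches in memory; in a real pipeline the sink would write each batch to the database and let it go.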

🦀 Why Embed Anything

➡️ Faster execution.
➡️ Memory management: Rust enforces memory safety at compile time, which helps prevent the memory errors and crashes that can plague other languages.
➡️ True multithreading.
➡️ Runs embedding models locally and efficiently.
➡️ Candle allows inference on CUDA-enabled GPUs right out of the box.
➡️ Lower memory usage.
➡️ Supports a range of models: dense, sparse, late-interaction, reranker, ModernBERT.

🍓 Our Past Collaborations:

We have collaborated with reputed enterprises such as Elastic, Weaviate, SingleStore, and Datahours.

You can get in touch with us for further collaborations.

Benchmarks

The benchmarks measure only embedding model inference speed on ONNX Runtime. Code

⭐ Supported Models

We support any Hugging Face model on Candle. We also support ONNX Runtime for BERT and ColPali.

How to add custom model on candle: from_pretrained_hf

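import embed_anything
from embed_anything import EmbeddingModel, TextEmbedConfig, WhichModel

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="model link from huggingface"
)
# chunk_size and batch_size control how the file is split and batched.
config = TextEmbedConfig(chunk_size=1000, batch_size=32)
data = embed_anything.embed_file("file_address", embedder=model, config=config)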

Model      HF link
Jina       Jina Models
Bert       All Bert based models
CLIP       openai/clip-*
Whisper    OpenAI Whisper models
ColPali    starlight-ai/colpali-v1.2-merged-onnx
Colbert    answerdotai/answerai-colbert-small-v1, jinaai/jina-colbert-v2 and more
Splade     Splade Models and other Splade like models
Reranker   Jina Reranker Models, Xenova/bge-reranker
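The chunk_size and batch_size values passed to TextEmbedConfig in the example above determine how a document is cut up and how many pieces go to the model per call. As a rough illustration, assuming chunk_size counts characters (this README does not state the unit, and EmbedAnything's own splitter may behave differently), the arithmetic looks like this. All function names here are hypothetical.

```python
import math
from typing import Iterator, List, Tuple


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Cut text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Group chunks into lists of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


def plan(text_length: int, chunk_size: int, batch_size: int) -> Tuple[int, int]:
    """Return (number of chunks, number of model calls) for a document."""
    chunks = math.ceil(text_length / chunk_size)
    batches = math.ceil(chunks / batch_size)
    return chunks, batches
```

Under these assumptions, a 100,000-character document with chunk_size=1000 and batch_size=32 yields 100 chunks embedded in 4 batches. Larger chunks mean fewer vectors per document; larger batches mean fewer model calls but more memory per call.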

Splade Models:

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.SparseBert, "prithivida/Splade_PP_en_v1"
)
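SPLADE-style models produce sparse embeddings: a weight per vocabulary term, with most weights equal to zero. A minimal sketch of why that matters for storage and scoring, using toy numbers and hypothetical helper names (to_sparse, sparse_dot) rather than anything from EmbedAnything's API:

```python
from typing import Dict, Sequence

SparseVector = Dict[int, float]


def to_sparse(dense: Sequence[float], threshold: float = 0.0) -> SparseVector:
    """Keep only the entries whose weight is above the threshold."""
    return {index: weight for index, weight in enumerate(dense) if weight > threshold}


def sparse_dot(a: SparseVector, b: SparseVector) -> float:
    """Dot product that only visits indices present in both vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(weight * b[index] for index, weight in a.items() if index in b)
```

Only the non-zero entries are stored, and a query matches a document only on the terms they share.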

ONNX-Runtime: from_pretrained_onnx

BERT

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Bert, hf_model_id="onnx_model_link"
)

ColPali

model: ColpaliModel = ColpaliModel.from_pretrained_onnx("starlight-ai/colpali-v1.2-merged-onnx", None)

Colbert

sentences = [
    "The quick brown fox jumps over the lazy dog",
    "The cat is sleeping on the mat",
    "The dog is barking at the moon",
    "I love pizza",
    "The dog is sitting in the park",
]

model = ColbertModel.from_pretrained_onnx("jinaai/jina-colbert-v2", path_in_repo="onnx/model.onnx")
embeddings = model.embed(sentences, batch_size=2)
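Late-interaction models such as ColBERT keep one vector per token instead of one vector per text, and relevance is commonly scored with MaxSim: for each query token, take its best match among the document tokens, then sum those maxima. The sketch below uses toy token vectors and hypothetical function names; it does not depend on the shape of the embeddings returned by the call above.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best dot product with any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank(query_tokens: Sequence[Vector], docs: Sequence[Sequence[Vector]]) -> List[int]:
    """Return document indices ordered from most to least relevant."""
    scores = [maxsim(query_tokens, doc) for doc in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Because each query token is matched independently, a document that covers every query token scores higher than one that matches a single token many times.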

ModernBERT

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype = Dtype.Q4F16
)

ReRankers

reranker = Reranker.from_pretrained("jinaai/jina-reranker-v1-turbo-en", dtype=Dtype.F16)

results: list[RerankerResult] = reranker.rerank(["What is the capital of France?"], ["France is a country in Europe.", "Paris is the capital of France."], 2)

For Semantic Chunking

model = EmbeddingModel.from_pretrained_hf(
    WhichModel.Bert, model_id="sentence-transformers/all-MiniLM-L12-v2"
)

# with semantic encoder
semantic_encoder = EmbeddingModel.from_pretrained_hf(WhichModel.Jina, model_id = "jinaai/jina-embeddings-v2-small-en")
config = TextEmbedConfig(chunk_size=1000, batch_size=32, splitting_strategy = "semantic", semantic_encoder=semantic_encoder)
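One common way to chunk semantically is to embed each sentence and start a new chunk whenever adjacent sentences stop being similar. The sketch below illustrates that general idea only; it is not a description of how EmbedAnything's "semantic" splitting strategy works internally, and semantic_chunks, the embed callback, and the 0.5 threshold are all assumptions made for the example.

```python
import math
from typing import Callable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def semantic_chunks(
    sentences: Sequence[str],
    embed: Callable[[str], Sequence[float]],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Group consecutive sentences; break where similarity drops below threshold."""
    chunks: List[List[str]] = []
    previous = None
    for sentence in sentences:
        current = embed(sentence)
        if previous is None or cosine(previous, current) < threshold:
            chunks.append([sentence])  # topic shift: open a new chunk
        else:
            chunks[-1].append(sentence)  # same topic: extend the current chunk
        previous = current
    return chunks
```

In this picture, the semantic_encoder plays the role of the embed callback: a small, fast model decides where the boundaries go, and the main model then embeds the resulting chunks.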

🧑‍🚀 Getting Started

💚 Installation

pip install embed-anything

For GPU support and for special models such as ColPali:

pip install embed-anything-gpu

Usage

➡️ Usage for version 0.3 and later

To use local embedding, we support Bert and Jina:

model = EmbeddingModel.from_pretrained_local(
    WhichModel.Bert, model_id="Hugging_face_link"
)
data = embed_anything.embed_file("test_files/test.pdf", embedder=model)

For multimodal embedding, we support CLIP:

Requirements: a directory with the pictures you want to search. For example, test_files contains images of cats, dogs, etc.

import numpy as np
from PIL import Image

import embed_anything
from embed_anything import EmbedData

model = embed_anything.EmbeddingModel.from_pretrained_local(
    embed_anything.WhichModel.Clip,
    model_id="openai/clip-vit-base-patch16",
    # revision="refs/pr/15",
)
# Embed every image in the directory.
data: list[EmbedData] = embed_anything.embed_directory("test_files", embedder=model)
embeddings = np.array([item.embedding for item in data])

# Embed the text query with the same CLIP model.
query = ["Photo of a monkey?"]
query_embedding = np.array(
    embed_anything.embed_query(query, embedder=model)[0].embedding
)

# Rank images by dot-product similarity and open the best match.
similarities = np.dot(embeddings, query_embedding)
max_index = np.argmax(similarities)
Image.open(data[max_index].text).show()  # .text is used as the image path
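The example above ranks images by raw dot product. Dot product and cosine similarity give the same ranking only when the vectors have unit length, and this README does not say whether the returned embeddings are normalized. The standard-library sketch below, with toy vectors and hypothetical helper names, shows how the choice can change the winner.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def best_match(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    normalize: bool = True,
) -> int:
    """Index of the highest-scoring candidate (cosine if normalize, else raw dot)."""
    if normalize:
        query = l2_normalize(query)
        candidates = [l2_normalize(c) for c in candidates]
    scores = [dot(query, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)
```

With query [1, 0] and candidates [10, 10] and [0.9, 0.1], the raw dot product prefers the long first vector, while cosine similarity prefers the second, which points in nearly the same direction as the query.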

Audio Embedding using Whisper

Requirements: audio files in .wav format.

import embed_anything
from embed_anything import (
    AudioDecoderModel,
    EmbeddingModel,
    embed_audio_file,
    TextEmbedConfig,
)
# choose any whisper or distilwhisper model from https://huggingface.co/distil-whisper or https://huggingface.co/collections/openai/whisper-release-6501bba2cf999715fd953013
audio_decoder = AudioDecoderModel.from_pretrained_hf(
    "openai/whisper-tiny.en", revision="main", model_type="tiny-en", quantized=False
)
embedder = EmbeddingModel.from_pretrained_hf(
    embed_anything.WhichModel.Bert,
    model_id="sentence-transformers/all-MiniLM-L6-v2",
    revision="main",
)
config = TextEmbedConfig(chunk_size=1000, batch_size=32)
data = embed_anything.embed_audio_file(
    "test_files/audio/samples_hp0.wav",
    audio_decoder=audio_decoder,
    embedder=embedder,
    text_embed_config=config,
)
print(data[0].metadata)

Using ONNX Models

To use ONNX models, you can either use the ONNXModel enum (model_name) or the model ID from Hugging Face (hf_model_id).

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Bert, model_name = ONNXModel.AllMiniLML6V2Q
)

For some models, you can also specify the dtype to use for the model.

model = EmbeddingModel.from_pretrained_onnx(
    WhichModel.Bert, ONNXModel.ModernBERTBase, dtype = Dtype.Q4F16
)

Using the methods above is the best way to ensure that the model works correctly, as these models are tested. If you want to use other models, such as fine-tuned models, you can load them with hf_model_id and path_in_repo, as shown below.

model = EmbeddingModel.from_pretrained_onnx(
  WhichModel.Jina, hf_model_id = "jinaai/jina-embeddings-v2-small-en", path_in_repo="model.onnx"
)

To see all the ONNX models supported with model_name, see here

⁉️FAQ

Do I need to know Rust to use or contribute to EmbedAnything?

No. EmbedAnything provides PyO3 bindings, so you can run any function from Python without any issues. To contribute, check out our guidelines and the adapter examples in the python folder.

How is it different from fastembed?

We provide both backends, Candle and ONNX. On top of that, we offer an end-to-end pipeline: you can ingest different data types, run inference with any model, and index into any vector database. Fastembed is just an ONNX wrapper.

We've received quite a few questions about why we're using Candle.

One of the main reasons is that Candle doesn't require any specific ONNX format models, which means it can work seamlessly with any Hugging Face model. This flexibility has been a key factor for us. However, we also recognize that we’ve been compromising a bit on speed in favor of that flexibility.

🚧 Contributing to EmbedAnything

First of all, thank you for taking the time to contribute to this project. We truly appreciate your contributions, whether it's bug reports, feature suggestions, or pull requests. Your time and effort are highly valued in this project. 🚀

This document provides guidelines and best practices to help you to contribute effectively. These are meant to serve as guidelines, not strict rules. We encourage you to use your best judgment and feel comfortable proposing changes to this document through a pull request.

  • Roadmap
  • Quick Start
  • Guidelines

    🏎️ RoadMap

    Accomplishments

    One of the aims of EmbedAnything is to let AI engineers easily use state-of-the-art embedding models on typical files and documents. A lot has already been accomplished here; the formats we currently support are listed below, and a few more are still to come.

    Adding Fine-tuning

    One of our major goals for this year is to add fine-tuning of these models on your data, much like a simple sentence transformer does.

    🖼️ Modalities and Source

    We’re excited to share that we've expanded our platform to support multiple modalities, including:

    • Audio files

    • Markdowns

    • Websites

    • Images

    • Videos

    • Graph

    This gives you the flexibility to work with various data types all in one place! 🌐

    💜 Product

    We’ve rolled out some major updates in version 0.3 to improve both functionality and performance. Here’s what’s new:

    • Semantic Chunking: Optimized chunking strategy for better Retrieval-Augmented Generation (RAG) workflows.

    • Streaming for Efficient Indexing: We’ve introduced streaming for memory-efficient indexing in vector databases. Want to know more? Check out our article on this feature here: https://www.analyticsvidhya.com/blog/2024/09/vector-streaming/

    • Zero-Shot Applications: Explore our zero-shot application demos to see the power of these updates in action.

    • Intuitive Functions: Version 0.3 includes a complete refactor for more intuitive functions, making the platform easier to use.

    • Chunkwise Streaming: Instead of file-by-file streaming, we now support chunkwise streaming, allowing for more flexible and efficient data processing.

    Check out the latest release and see how these features can supercharge your generative AI pipeline! ✨

    🚀Coming Soon

    ⚙️ Performance

    We now support ONNX as well.

    ➡️ Support for GGUF models

    • Significantly faster performance

    Stay tuned for these exciting updates! 🚀

    🫐Embeddings:

    We have had multimodality in our infrastructure from day one. Websites, images, and audio are already included, and we want to expand further to:

    ☑️ Graph embedding: build DeepWalk embeddings (depth-first walks plus Word2Vec)
    ☑️ Video embedding
    ☑️ Yolo Clip

    🌊Expansion to other Vector Adapters

    We currently support a wide range of vector databases for streaming embeddings, including:

    • Elastic: thanks to the amazing and active Elastic team for the contribution
    • Weaviate
    • Pinecone

    How to add an adapter: https://starlight-search.com/blog/2024/02/25/adapter-development-guide.md
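An adapter's job is to turn embedding results into the record format a particular vector database expects and then write them. The sketch below shows that shape with an in-memory store. Everything in it is hypothetical: ToyEmbedData only mirrors the embedding, text, and metadata fields used elsewhere in this README, and InMemoryAdapter with its convert and upsert methods is an illustration, not the interface described in the linked guide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyEmbedData:
    """Hypothetical stand-in with the fields used in this README's examples."""
    embedding: List[float]
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


class InMemoryAdapter:
    """Hypothetical adapter that keeps records in a Python list."""

    def __init__(self) -> None:
        self.records: List[dict] = []

    def convert(self, embeddings: List[ToyEmbedData]) -> List[dict]:
        """Map embedding results to the record layout of the target store."""
        offset = len(self.records)  # continue ids after existing records
        return [
            {
                "id": str(offset + position),
                "values": list(item.embedding),
                "metadata": {"text": item.text, **item.metadata},
            }
            for position, item in enumerate(embeddings)
        ]

    def upsert(self, embeddings: List[ToyEmbedData]) -> None:
        """Convert a batch and append it to the store."""
        self.records.extend(self.convert(embeddings))
```

A real adapter would replace the list with calls to the database client, but the split between converting records and writing them stays the same, and upsert can be called once per streamed batch.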

    But we're not stopping there! We're actively working to expand this list.

    Want to contribute? If you’d like to add support for your favorite vector database, we’d love to have your help! Check out our contribution.md for guidelines, or feel free to reach out directly at starlight-search@proton.me. Let's build something amazing together! 💡

    A big Thank you to all our StarGazers
