Embed anything at lightning speed

Project description

Minimalist Framework for building local and multimodal embeddings built in Rust 🦀

EmbedAnything is a powerful Python library designed to streamline the creation and management of embedding pipelines. Whether you're working with text, images, audio, or any other type of data, EmbedAnything makes it easy to generate embeddings from multiple sources and store them efficiently in a vector database.

🦀The Benefit of Rust for Speed

By using Rust for its core functionalities, EmbedAnything offers significant speed advantages:

➡️Rust is Compiled: Unlike Python, Rust compiles directly to machine code, resulting in faster execution.
➡️Memory Management: Rust enforces memory safety at compile time, without a garbage collector, preventing many of the memory errors and crashes that can plague other languages.
➡️True Multithreading: Rust threads run in parallel without a global interpreter lock.

🚀Why Candle?...

➡️Running language models or embedding models locally can be difficult, especially when you want to deploy a product that utilizes these models.
➡️If you use the transformers library from Hugging Face in Python, you will depend on PyTorch for tensor operations.
➡️This, in turn, has a dependency on Libtorch, which means that you will need to include the entire Libtorch library with your product.
➡️Candle, a Rust machine-learning framework from Hugging Face, avoids this dependency and also allows inference on CUDA-enabled GPUs right out of the box. We will soon post on how we use Candle to increase the performance and decrease the memory usage of EmbedAnything.

Examples

  1. Image Search: Open in Colab

Watch the demo

🚀 Key Features

  • Local Embedding: Works with local embedding models like all-MiniLM.
  • MultiModality: Works with text and images, and will soon expand to audio.
  • Python Interface: Packaged as a Python library for seamless integration into your existing projects.
  • Efficient: Optimized for speed and performance, with core functionality written in Rust.
  • Scalable: Store embeddings in a vector database for easy retrieval and scalability.
  • OpenAI: Works with OpenAI embeddings as well.

💚 Installation

pip install embed-anything
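
After installing, it can be useful to confirm which version ended up in the environment, since the examples below refer to a specific release. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of EmbedAnything.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="embed-anything"):
    """Return the installed version string of `package`, or None if it is absent.

    `installed_version` is an illustrative helper, not an EmbedAnything API.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version())
```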

🧑‍🚀 Getting Started

For local models

To use local embedding: we support Bert and Jina

import numpy as np
import embed_anything

# embed_file returns a list of items, each with an .embedding attribute
data = embed_anything.embed_file("filename.pdf", embeder="Bert")
embeddings = np.array([item.embedding for item in data])
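
To embed several files in one pass, one option is a small wrapper that skips missing paths and collects the embeddings per file. This is only a sketch: `embed_many` and `fake_embed` are hypothetical names, and the embedder is passed in as a callable so the example runs without the library installed. With EmbedAnything, the callable could be `lambda p: embed_anything.embed_file(p, embeder="Bert")`, matching the call shown above.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def embed_many(paths, embed_fn):
    """Embed every existing file in `paths`; return {path: [embedding, ...]}.

    `embed_fn` is any callable that takes a path string and returns items
    with an `.embedding` attribute (an assumption mirroring embed_file).
    """
    results = {}
    for path in paths:
        p = Path(path)
        if not p.is_file():
            # Skip missing paths instead of passing them to the embedder.
            continue
        results[str(p)] = [item.embedding for item in embed_fn(str(p))]
    return results


def fake_embed(path):
    # Stand-in embedder that returns two fixed vectors for any file.
    return [
        SimpleNamespace(embedding=[0.1, 0.2]),
        SimpleNamespace(embedding=[0.3, 0.4]),
    ]


# Demo: one real (temporary) file and one path that does not exist.
with tempfile.NamedTemporaryFile(suffix=".pdf") as tmp:
    out = embed_many([tmp.name, "no_such_file_for_demo.pdf"], fake_embed)

print(len(out))
```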

For multimodal embedding: we support CLIP

Requirements: a directory with the pictures you want to search. For example, we use test_files, which contains images of cats, dogs, etc.

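import numpy as np
from PIL import Image
import embed_anything

# Embed every image in the directory
data = embed_anything.embed_directory("test_files", embeder="Clip")
embeddings = np.array([item.embedding for item in data])

# Embed the text query with the same model and score it against each image
query = "photo of a dog"
query_embedding = np.array(embed_anything.embed_query(query, embeder="Clip")[0].embedding)
similarities = np.dot(embeddings, query_embedding)
max_index = np.argmax(similarities)
# For images, the .text field holds the file path
Image.open(data[max_index].text).show()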
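
The search above ranks images by a raw dot product, which matches cosine similarity only when all vectors have unit length. The source does not say whether the returned embeddings are normalized, so a safer variant may be to normalize explicitly. The sketch below uses toy three-dimensional vectors in place of real embeddings and plain Python so it runs without dependencies; `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=3):
    """Return up to k (index, score) pairs, best match first."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, score) for score, i in scored[:k]]


# Toy vectors standing in for image embeddings; lengths differ on purpose.
vectors = [[1.0, 0.0, 0.0], [0.0, 2.0, 1.0], [3.0, 3.0, 0.0]]
query = [1.0, 1.0, 0.0]

ranking = top_k(query, vectors, k=2)
print(ranking)
```

Here the third vector points in the same direction as the query, so it ranks first regardless of its larger magnitude.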

For OpenAI

  1. Make sure your OpenAI API key is set in an environment variable.

If you are using embed-anything==0.1.7:

import numpy as np
import embed_anything

# Requires the OpenAI API key to be available in the environment
data = embed_anything.embed_file("filename.pdf", embeder="OpenAI")
embeddings = np.array([item.embedding for item in data])
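
Before calling the OpenAI embedder, a quick check that the key is present can give a clearer failure than an error from inside the library. The variable name `OPENAI_API_KEY` is an assumption: it is the name OpenAI's own client libraries read, but the text above does not say which name EmbedAnything expects. `has_openai_key` is a hypothetical helper.

```python
import os


def has_openai_key(env=None, name="OPENAI_API_KEY"):
    """Return True if `name` is set to a non-blank value in `env`.

    `name` defaults to OPENAI_API_KEY, which is an assumption about what
    the library reads; pass a different name if yours differs.
    """
    if env is None:
        env = os.environ
    return bool(env.get(name, "").strip())


if not has_openai_key():
    print("OpenAI API key not found in the environment")
```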

🚧 Contributing to EmbedAnything

First of all, thank you for taking the time to contribute to this project. We truly appreciate your contributions, whether it's bug reports, feature suggestions, or pull requests. Your time and effort are highly valued in this project. 🚀

This document provides guidelines and best practices to help you to contribute effectively. These are meant to serve as guidelines, not strict rules. We encourage you to use your best judgment and feel comfortable proposing changes to this document through a pull request.

Table of Contents:

  1. [Code of conduct]
  2. [Quick Start]
  3. [RoadMap]

RoadMap

☑️Graph embedding: build DeepWalk embeddings using depth-first walks and word2vec
☑️Add whisper for audio embeddings
☑️Zero-shot application
☑️Asynchronous chunks training

✔️ Code of Conduct:

Please read our [Code of Conduct] to understand the expectations we have for all contributors participating in this project. By participating, you agree to abide by our Code of Conduct.

🚀 Quick Start

You can quickly get started with contributing by searching for issues with the labels "Good First Issue" or "Help Needed" in the [Issues Section]. If you think you can contribute, comment on the issue and we will assign it to you.

To set up your development environment, please follow the steps below:

  1. Fork the repository and create a clone of the fork
  2. Create a branch for a feature or a bug you are working on in your fork
  3. If you are working with OpenAI make sure you have the keys

Contributing Guidelines

🔍 Reporting Bugs

  1. Write a title that describes the issue clearly and concisely, and add relevant labels.
  2. Provide a detailed description of the problem and the necessary steps to reproduce the issue.
  3. Include any relevant logs, screenshots, or other helpful information supporting the issue.

💡 New Feature or Suggesting Enhancements

☑️ ToDo

  • Vector Database: Add functionality to integrate with any vector database

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

embed_anything-0.1.19.tar.gz (14.0 MB view details)

Uploaded Source

Built Distributions

embed_anything-0.1.19-cp312-none-win_amd64.whl (10.8 MB view details)

Uploaded CPython 3.12 Windows x86-64

embed_anything-0.1.19-cp312-cp312-manylinux_2_34_x86_64.whl (14.2 MB view details)

Uploaded CPython 3.12 manylinux: glibc 2.34+ x86-64

embed_anything-0.1.19-cp312-cp312-macosx_11_0_arm64.whl (7.2 MB view details)

Uploaded CPython 3.12 macOS 11.0+ ARM64

embed_anything-0.1.19-cp312-cp312-macosx_10_12_x86_64.whl (7.4 MB view details)

Uploaded CPython 3.12 macOS 10.12+ x86-64

embed_anything-0.1.19-cp311-none-win_amd64.whl (10.8 MB view details)

Uploaded CPython 3.11 Windows x86-64

embed_anything-0.1.19-cp311-cp311-manylinux_2_34_x86_64.whl (14.2 MB view details)

Uploaded CPython 3.11 manylinux: glibc 2.34+ x86-64

embed_anything-0.1.19-cp311-cp311-macosx_11_0_arm64.whl (7.2 MB view details)

Uploaded CPython 3.11 macOS 11.0+ ARM64

embed_anything-0.1.19-cp311-cp311-macosx_10_12_x86_64.whl (7.4 MB view details)

Uploaded CPython 3.11 macOS 10.12+ x86-64

embed_anything-0.1.19-cp310-none-win_amd64.whl (10.8 MB view details)

Uploaded CPython 3.10 Windows x86-64

embed_anything-0.1.19-cp310-cp310-manylinux_2_34_x86_64.whl (14.2 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.34+ x86-64

embed_anything-0.1.19-cp310-cp310-macosx_11_0_arm64.whl (7.2 MB view details)

Uploaded CPython 3.10 macOS 11.0+ ARM64

embed_anything-0.1.19-cp39-none-win_amd64.whl (10.8 MB view details)

Uploaded CPython 3.9 Windows x86-64

embed_anything-0.1.19-cp39-cp39-manylinux_2_34_x86_64.whl (14.2 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.34+ x86-64

embed_anything-0.1.19-cp39-cp39-macosx_11_0_arm64.whl (7.2 MB view details)

Uploaded CPython 3.9 macOS 11.0+ ARM64

embed_anything-0.1.19-cp38-none-win_amd64.whl (10.8 MB view details)

Uploaded CPython 3.8 Windows x86-64
