A simple toolkit for generating vector embeddings across multiple providers and models

EmbedKit

A unified interface for text and image embeddings, supporting multiple providers.

Installation

pip install embedkit

Usage

Text Embeddings

from embedkit import EmbedKit
from embedkit.classes import Model, CohereInputType

# Initialize with ColPali
kit = EmbedKit.colpali(
    model=Model.ColPali.V1_3,
    text_batch_size=16,  # Optional: process text in batches of 16
    image_batch_size=8,  # Optional: process images in batches of 8
)

# Get embeddings
result = kit.embed_text("Hello world")
print(result.model_provider)
print(result.input_type)
print(result.objects[0].embedding.shape)
print(result.objects[0].source_b64)

# Initialize with Cohere
kit = EmbedKit.cohere(
    model=Model.Cohere.EMBED_V4_0,
    api_key="your-api-key",
    text_input_type=CohereInputType.SEARCH_QUERY,  # or SEARCH_DOCUMENT
    text_batch_size=64,  # Optional: process text in batches of 64
    image_batch_size=8,  # Optional: process images in batches of 8
)

# Get embeddings
result = kit.embed_text("Hello world")
print(result.model_provider)
print(result.input_type)
print(result.objects[0].embedding.shape)
print(result.objects[0].source_b64)
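
The text_batch_size and image_batch_size options suggest that inputs are processed in fixed-size groups. The sketch below illustrates that idea in plain Python. `chunked` is a hypothetical helper, not part of EmbedKit, and the library's internal batching may work differently.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Five inputs with a batch size of 2 give batches of 2, 2, and 1.
texts = [f"document {i}" for i in range(5)]
batches = list(chunked(texts, batch_size=2))
print([len(batch) for batch in batches])  # [2, 2, 1]
```

A smaller batch size generally lowers peak memory use at the cost of more calls to the model or API.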

Image Embeddings

from pathlib import Path

# Get embeddings for an image (reuses the `kit` instance created above)
image_path = Path("path/to/image.png")
result = kit.embed_image(image_path)

print(result.model_provider)
print(result.input_type)
print(result.objects[0].embedding.shape)
print(result.objects[0].source_b64)
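
To embed several images, one option is to loop over a directory and call the embedding method once per file. The sketch below assumes nothing about EmbedKit beyond the single-path call shown above: `embed_images_in_dir` is a hypothetical helper that accepts any callable taking a Path, and the suffix list is an assumption, since this README does not list the supported image formats.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Tuple


def embed_images_in_dir(
    embed_image: Callable[[Path], Any],
    directory: Path,
    suffixes: Tuple[str, ...] = (".png", ".jpg", ".jpeg"),  # assumed formats
) -> Dict[str, Any]:
    """Call `embed_image` on each matching file and map file name to result."""
    results: Dict[str, Any] = {}
    for path in sorted(directory.iterdir()):
        # Compare suffixes case-insensitively so "photo.JPG" also matches.
        if path.is_file() and path.suffix.lower() in suffixes:
            results[path.name] = embed_image(path)
    return results


# Hypothetical usage with the `kit` instance from the examples above:
# results = embed_images_in_dir(kit.embed_image, Path("path/to/images"))
```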

PDF Embeddings

from pathlib import Path

# Get embeddings for a PDF
pdf_path = Path("path/to/document.pdf")
result = kit.embed_pdf(pdf_path)

print(result.model_provider)
print(result.input_type)
print(result.objects[0].embedding.shape)
print(result.objects[0].source_b64)
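
Each object in a result exposes a source_b64 field. Assuming this holds a standard base64 encoding of the source content (the README does not state exactly what is encoded), it can be turned back into raw bytes with the standard library. `decode_source` is a hypothetical helper name.

```python
import base64
from typing import Optional


def decode_source(source_b64: Optional[str]) -> Optional[bytes]:
    """Decode a base64 string to bytes, passing None through unchanged."""
    if source_b64 is None:
        return None
    return base64.b64decode(source_b64)


# Round-trip demonstration with stand-in data, not real EmbedKit output.
sample = base64.b64encode(b"example bytes").decode("ascii")
print(decode_source(sample))  # b'example bytes'
print(decode_source(None))    # None
```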

Response Format

The embedding methods return an EmbeddingResponse object with the following structure:

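from typing import List, Optional

import numpy as np

# Defined first so that EmbeddingResponse can refer to it
class EmbeddingObject:
    embedding: np.ndarray
    source_b64: Optional[str]  # may be None

class EmbeddingResponse:
    model_name: str
    model_provider: str
    input_type: str
    objects: List[EmbeddingObject]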
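
As a rough illustration of working with this structure, the sketch below collects the fields of a response into a plain dictionary. `summarize_response` is a hypothetical helper, and the demonstration uses SimpleNamespace stand-ins with made-up values instead of real EmbedKit output, so it only relies on the attribute names listed above.

```python
from types import SimpleNamespace
from typing import Any, Dict


def summarize_response(response: Any) -> Dict[str, Any]:
    """Collect the documented fields of a response into a plain dict."""
    objects = list(response.objects)
    return {
        "model_name": response.model_name,
        "model_provider": response.model_provider,
        "input_type": response.input_type,
        "count": len(objects),
        "shapes": [tuple(obj.embedding.shape) for obj in objects],
        "has_source": [obj.source_b64 is not None for obj in objects],
    }


# Stand-in objects that mimic the attribute layout (values are invented).
fake = SimpleNamespace(
    model_name="example-model",
    model_provider="example-provider",
    input_type="text",
    objects=[
        SimpleNamespace(embedding=SimpleNamespace(shape=(4,)), source_b64=None),
        SimpleNamespace(embedding=SimpleNamespace(shape=(3, 4)), source_b64="aGk="),
    ],
)
summary = summarize_response(fake)
print(summary["count"], summary["shapes"])  # 2 [(4,), (3, 4)]
```

Because the helper only reads attributes, it should work with a real response as long as the field names match those documented here.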

Supported Models

ColPali

  • Model.ColPali.V1_3

Cohere

  • Model.Cohere.EMBED_V4_0

Requirements

  • Python 3.10+

License

MIT
