A simple toolkit for generating vector embeddings across multiple providers and models
EmbedKit
A unified interface for text and image embeddings, supporting multiple providers.
Installation
pip install embedkit
Quick Start
from embedkit import EmbedKit
from embedkit.classes import Model, CohereInputType, SnowflakeInputType
# Initialize a provider
kit = EmbedKit.cohere(
    model=Model.Cohere.EMBED_V4_0,
    api_key="your-api-key",
    text_input_type=CohereInputType.SEARCH_QUERY,
)
# Get text embeddings
result = kit.embed_text("Hello world")
print(result.objects[0].embedding.shape) # 1D array
# Get image embeddings
result = kit.embed_image("path/to/image.png")
print(result.objects[0].embedding.shape) # 1D array
print(result.objects[0].source_b64) # Base64 encoded image
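Embeddings are usually compared with one another rather than inspected directly. The sketch below computes cosine similarity between two 1D vectors with NumPy. The vectors are made-up stand-ins for `result.objects[0].embedding`, and `cosine_similarity` is a hypothetical helper, not part of EmbedKit.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two 1D vectors (hypothetical helper)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)


# Stand-in vectors; in practice these would come from
# kit.embed_text(...).objects[0].embedding
query_vec = np.array([1.0, 0.0, 1.0])
doc_vec = np.array([1.0, 0.0, 0.0])

similarity = cosine_similarity(query_vec, doc_vec)
print(f"{similarity:.4f}")
```

A value near 1 means the two inputs point in nearly the same direction in embedding space; a value near 0 means they are unrelated.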
Supported Providers
Cohere
kit = EmbedKit.cohere(
    model=Model.Cohere.EMBED_V4_0,  # or EMBED_ENGLISH_V3_0, EMBED_MULTILINGUAL_V3_0, etc.
    api_key="your-api-key",
    text_input_type=CohereInputType.SEARCH_QUERY,  # or SEARCH_DOCUMENT
)
Snowflake
kit = EmbedKit.snowflake(
    model=Model.Snowflake.ARCTIC_EMBED_L_V2_0,  # or ARCTIC_EMBED_M_V1_5
    text_input_type=SnowflakeInputType.QUERY,  # or DOCUMENT
)
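The query and document input types suggest the usual retrieval pattern: embed the corpus as documents, embed the search string as a query, then rank by similarity. Below is a minimal sketch of the ranking step only, assuming each embedding is a 1D NumPy array. `rank_documents` and the toy vectors are illustrative and not part of the EmbedKit API.

```python
import numpy as np


def rank_documents(query_vec: np.ndarray, doc_vecs: np.ndarray) -> list[tuple[int, float]]:
    """Return (document index, cosine score) pairs, best match first."""
    # Normalize so that a plain dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)
    return [(int(i), float(scores[i])) for i in order]


# Toy stand-ins: each row would be an embedding produced with the DOCUMENT
# input type, and query_vec one produced with the QUERY input type.
doc_vecs = np.array([
    [0.0, 1.0],
    [1.0, 0.0],
    [1.0, 1.0],
])
query_vec = np.array([2.0, 0.0])

ranking = rank_documents(query_vec, doc_vecs)
print([idx for idx, _ in ranking])
```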
ColPali
kit = EmbedKit.colpali(
    model=Model.ColPali.COLPALI_V1_3,  # or COLSMOL_256M, COLSMOL_500M
)
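ColPali-style models are late-interaction models: they produce one vector per token or image patch rather than a single vector per input. The sketch below shows the MaxSim scoring typically used with such embeddings. It assumes the ColPali embeddings arrive as 2D arrays of shape `(n_vectors, dim)`, which this README does not confirm; `maxsim_score` is a hypothetical helper.

```python
import numpy as np


def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction score: for each query vector take the best-matching
    document vector (by dot product), then sum those maxima."""
    if query_vecs.ndim != 2 or doc_vecs.ndim != 2:
        raise ValueError("expected 2D arrays of shape (n_vectors, dim)")
    similarities = query_vecs @ doc_vecs.T  # shape: (n_query, n_doc)
    return float(similarities.max(axis=1).sum())


# Toy multi-vector embeddings (assumed layout, not real model output).
query_vecs = np.array([[1.0, 0.0], [0.0, 1.0]])
page_vecs = np.array([[0.5, 0.0], [0.0, 2.0], [1.0, 1.0]])

score = maxsim_score(query_vecs, page_vecs)
print(score)
```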
Jina
kit = EmbedKit.jina(
    model=Model.Jina.CLIP_V2,
    api_key="your-api-key",
)
Response Format
import numpy as np
from typing import List, Optional

class EmbeddingResponse:
    model_name: str
    model_provider: str
    input_type: str
    objects: List["EmbeddingObject"]

class EmbeddingObject:
    embedding: np.ndarray  # 1D array for everything except ColPali
    source_b64: Optional[str]  # Base64 encoded source for images and PDFs
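To get the original image or PDF bytes back from `source_b64`, the standard library's `base64` module is enough. This is a minimal sketch, assuming the field holds plain Base64 of the raw file bytes with no data-URI prefix; `decode_source` is a hypothetical helper and the payload is a stand-in.

```python
import base64
from typing import Optional


def decode_source(source_b64: Optional[str]) -> Optional[bytes]:
    """Return the raw bytes behind source_b64, or None when no source is attached."""
    if source_b64 is None:
        return None
    # validate=True rejects characters outside the Base64 alphabet.
    return base64.b64decode(source_b64, validate=True)


# Stand-in payload: the 8-byte PNG signature, encoded the way an image source might be.
png_signature = b"\x89PNG\r\n\x1a\n"
encoded = base64.b64encode(png_signature).decode("ascii")

raw = decode_source(encoded)
print(raw == png_signature)
```

The decoded bytes can then be written to disk or handed to an image library.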
Development
Running Tests
# Run all tests
pytest
# Run tests for specific providers
pytest -m cohere # Run only Cohere tests
pytest -m colpali # Run only ColPali tests
pytest -m jina # Run only Jina tests
pytest -m snowflake # Run only Snowflake tests
# Additional options
pytest -v # Verbose output
pytest -s # Show print statements
pytest -x # Stop on first failure
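`pytest -m <name>` selects only the tests that carry that marker. The sketch below shows one common way to register such markers in `conftest.py` so that pytest does not warn about unknown marks. It is an assumption about the setup, not a copy of this project's configuration.

```python
# conftest.py (hypothetical; the project's real configuration may differ)
PROVIDER_MARKERS = ("cohere", "colpali", "jina", "snowflake")


def pytest_configure(config):
    """Register one marker per provider so `pytest -m <provider>` is recognized."""
    for name in PROVIDER_MARKERS:
        config.addinivalue_line(
            "markers", f"{name}: tests that exercise the {name} provider"
        )
```

A test would then be tagged with a decorator such as `@pytest.mark.cohere` to be picked up by `pytest -m cohere`.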
Requirements
- Python 3.10+
License
MIT