A simple toolkit for generating vector embeddings across multiple providers and models
EmbedKit
A unified interface for text and image embeddings, supporting multiple providers.
Installation
pip install embedkit
Quick Start
from embedkit import EmbedKit
from embedkit.classes import Model, CohereInputType, SnowflakeInputType

# Initialize a provider
kit = EmbedKit.cohere(
    model=Model.Cohere.EMBED_V4_0,
    api_key="your-api-key",
    text_input_type=CohereInputType.SEARCH_QUERY,
)

# Get text embeddings
result = kit.embed_text("Hello world")
print(result.objects[0].embedding.shape)  # 1D array

# Get image embeddings
result = kit.embed_image("path/to/image.png")
print(result.objects[0].embedding.shape)  # 1D array
print(result.objects[0].source_b64)  # Base64-encoded image
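A common next step is comparing two embeddings. The sketch below computes cosine similarity with NumPy. The vectors are stand-in values rather than real model output, and `cosine_similarity` is a hypothetical helper, not part of EmbedKit. In practice you would pass in `result.objects[0].embedding` from two separate calls.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1D embedding vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)


# Stand-in vectors; real ones would come from result.objects[0].embedding.
query_vec = np.array([1.0, 0.0, 1.0])
doc_vec = np.array([1.0, 1.0, 0.0])

score = cosine_similarity(query_vec, doc_vec)
```

A score near 1 means the two vectors point in almost the same direction, and a score near 0 means they are close to orthogonal.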
Supported Providers
Cohere
kit = EmbedKit.cohere(
    model=Model.Cohere.EMBED_V4_0,  # or EMBED_ENGLISH_V3_0, EMBED_MULTILINGUAL_V3_0, etc.
    api_key="your-api-key",
    text_input_type=CohereInputType.SEARCH_QUERY,  # or SEARCH_DOCUMENT
)
Snowflake
kit = EmbedKit.snowflake(
    model=Model.Snowflake.ARCTIC_EMBED_L_V2_0,  # or ARCTIC_EMBED_M_V1_5
    text_input_type=SnowflakeInputType.QUERY,  # or DOCUMENT
)
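The query and document input types reflect the usual retrieval split: stored texts are embedded with the document type, and the search string with the query type. As a rough sketch of what happens afterwards, the snippet below ranks document vectors against a query vector. The arrays are stand-in values and `top_k` is a hypothetical helper, not an EmbedKit function.

```python
import numpy as np


def top_k(query: np.ndarray, docs: np.ndarray, k: int = 2) -> list[tuple[int, float]]:
    """Return (row index, cosine score) for the k best-matching rows of docs."""
    # Normalize so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = d @ q
    # Sort by descending score and keep the first k row indices.
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]


# Stand-in data: three "document" vectors and one "query" vector.
doc_matrix = np.array([
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
])
query_vec = np.array([1.0, 0.0, 0.0])

ranked = top_k(query_vec, doc_matrix, k=2)
```

With real embeddings, `doc_matrix` would hold one row per document embedded with the document input type, and `query_vec` would come from a kit configured with the query input type.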
ColPali
kit = EmbedKit.colpali(
    model=Model.ColPali.COLPALI_V1_3,  # or COLSMOL_256M, COLSMOL_500M
)
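ColPali-style models are late-interaction models: they emit one vector per token or image patch, so an embedding is a 2D array rather than a single 1D vector. Such embeddings are commonly compared with a MaxSim score. The sketch below assumes arrays of shape `(n_vectors, dim)`; the exact shape EmbedKit returns is not stated here, and `maxsim_score` is a hypothetical helper.

```python
import numpy as np


def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction score: for each query vector take its best dot product
    against any document vector, then sum those maxima."""
    sims = query_vecs @ doc_vecs.T  # shape (n_query, n_doc)
    return float(sims.max(axis=1).sum())


# Stand-in multi-vector embeddings (assumed shape: n_vectors x dim).
query_vecs = np.array([[1.0, 0.0], [0.0, 1.0]])
doc_vecs = np.array([[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]])

score = maxsim_score(query_vecs, doc_vecs)
```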
Jina
kit = EmbedKit.jina(
    model=Model.Jina.CLIP_V2,
    api_key="your-api-key",
)
Response Format
from typing import List, Optional

import numpy as np

class EmbeddingResponse:
    model_name: str
    model_provider: str
    input_type: str
    objects: List["EmbeddingObject"]

class EmbeddingObject:
    embedding: np.ndarray  # 1D array for every provider except ColPali
    source_b64: Optional[str]  # Base64-encoded source for images and PDFs
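To illustrate how a response of this shape might be consumed, the sketch below stacks the embeddings into one matrix and decodes any Base64 sources back to bytes. The dataclasses are stand-ins that only mirror the fields listed above, so EmbedKit's real classes may differ. `to_matrix` and `decode_sources` are hypothetical helpers, and the field values are placeholders.

```python
import base64
from dataclasses import dataclass
from typing import List, Optional

import numpy as np


# Stand-ins mirroring the documented fields; not EmbedKit's actual classes.
@dataclass
class EmbeddingObject:
    embedding: np.ndarray
    source_b64: Optional[str] = None


@dataclass
class EmbeddingResponse:
    model_name: str
    model_provider: str
    input_type: str
    objects: List[EmbeddingObject]


def to_matrix(response: EmbeddingResponse) -> np.ndarray:
    """Stack 1D embeddings into an (n_objects, dim) matrix."""
    return np.vstack([obj.embedding for obj in response.objects])


def decode_sources(response: EmbeddingResponse) -> List[Optional[bytes]]:
    """Decode each Base64 source to raw bytes; None where no source is stored."""
    return [
        base64.b64decode(obj.source_b64) if obj.source_b64 is not None else None
        for obj in response.objects
    ]


response = EmbeddingResponse(
    model_name="example-model",  # placeholder value
    model_provider="example",  # placeholder value
    input_type="text",  # placeholder value
    objects=[
        EmbeddingObject(np.array([0.1, 0.2, 0.3])),
        EmbeddingObject(
            np.array([0.4, 0.5, 0.6]),
            base64.b64encode(b"raw bytes").decode("ascii"),
        ),
    ],
)

matrix = to_matrix(response)
sources = decode_sources(response)
```

Stacking only works when every embedding is 1D with the same length, so multi-vector ColPali output would need separate handling.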
Development
Running Tests
# Run all tests
pytest
# Run tests for specific providers
pytest -m cohere # Run only Cohere tests
pytest -m colpali # Run only ColPali tests
pytest -m jina # Run only Jina tests
pytest -m snowflake # Run only Snowflake tests
# Additional options
pytest -v # Verbose output
pytest -s # Show print statements
pytest -x # Stop on first failure
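`pytest -m <name>` selects tests that carry the named marker, and custom markers are normally registered so pytest does not warn about unknown marks. One way to do that is a `conftest.py` hook, sketched below. This is an assumption about how such markers could be set up, not a copy of this project's configuration, which may register them in its config file instead. The marker descriptions are invented for illustration.

```python
# conftest.py (sketch)
PROVIDER_MARKERS = ("cohere", "colpali", "jina", "snowflake")


def pytest_configure(config):
    """Register one marker per provider so `pytest -m <provider>` is recognized."""
    for name in PROVIDER_MARKERS:
        config.addinivalue_line(
            "markers", f"{name}: tests that call the {name} provider"
        )
```

A test then opts in by placing a decorator such as `@pytest.mark.cohere` above the test function.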
Requirements
- Python 3.10+
License
MIT
Download files
Source Distribution
embedkit-0.1.8.tar.gz (1.6 MB)
Built Distribution
embedkit-0.1.8-py3-none-any.whl (15.1 kB)
File details
Details for the file embedkit-0.1.8.tar.gz.
File metadata
- Download URL: embedkit-0.1.8.tar.gz
- Upload date:
- Size: 1.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8660eb44be033b1e59e376deeb5d191ca49b5bc823fa7a5c3fb209c075a7622c |
| MD5 | f4f0d4cea476fe95aa8d08cf4ff62595 |
| BLAKE2b-256 | 322f40e7268223d6f89e5f24e6f5e960e8c43d2c86953d602a1d65f1531b7b54 |
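A published digest can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library. `sha256_of_file` and `matches_published_hash` are hypothetical helper names, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, the call would look like `matches_published_hash(Path("embedkit-0.1.8.tar.gz"), "8660eb44be033b1e59e376deeb5d191ca49b5bc823fa7a5c3fb209c075a7622c")`.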
File details
Details for the file embedkit-0.1.8-py3-none-any.whl.
File metadata
- Download URL: embedkit-0.1.8-py3-none-any.whl
- Upload date:
- Size: 15.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 31f12efe865c3342cf0e8be2465cce5c8b750502f6a99989e7a914f7562d1c54 |
| MD5 | 09b2c3504ad6656e1a14dbc2572afe3f |
| BLAKE2b-256 | 042467f6242e9fff588f9234cc6b2ee7f110beb66210a0fa198c1ce45ffdf5fb |