Modal-powered embedding pipeline for Krira chunks with Pinecone upsert.
krira-embed
krira-embed provides a Modal-powered embedding pipeline for chunked text and upserts vectors to Pinecone.
Features
- Batch chunk ingestion from JSONL (text, optional metadata, optional id)
- Distributed embedding jobs with Modal
- Pinecone upsert with deterministic ID fallback
- Simple Python client API: KriraEmbedding
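The deterministic ID fallback is not specified further here. As a rough sketch of one common way to get the same behavior, the snippet below hashes the chunk text with SHA-256 when a record has no id. The helper names (fallback_id, resolve_id), the 32-character truncation, and the inclusion of the namespace in the hash are all assumptions for illustration, not the package's actual scheme.

```python
import hashlib


def fallback_id(text: str, namespace: str = "default") -> str:
    """Hypothetical helper: derive a stable vector ID from the chunk content."""
    # The same (namespace, text) pair always hashes to the same ID,
    # so re-running an ingest overwrites vectors instead of duplicating them.
    payload = f"{namespace}\x00{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:32]


def resolve_id(record: dict, namespace: str = "default") -> str:
    """Use the record's own id when present, otherwise fall back to the hash."""
    explicit = record.get("id")
    if explicit:
        return str(explicit)
    return fallback_id(record["text"], namespace)
```

With a scheme like this, re-ingesting an unchanged chunk targets the same vector ID, which is what makes repeated upserts safe.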
Requirements
- Python >=3.10,<3.13
- Modal account and token (MODAL_TOKEN_ID, MODAL_TOKEN_SECRET)
- Pinecone API key (PINECONE_API_KEY)
- Chunked JSONL file (for example, chunks.jsonl)
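For readers who do not yet have a chunk file, the following sketch writes and reads back a small JSONL file with one JSON object per line. The field names text, metadata, and id come from the feature list above. The helper names (write_chunks, read_chunks), the sample records, and the assumption that metadata is a flat JSON object are illustrative only.

```python
import json
import tempfile
from pathlib import Path

# Sample records: "text" is required; "id" and "metadata" are optional.
chunks = [
    {"id": "doc1-0", "text": "First chunk of a document.", "metadata": {"source": "doc1.md"}},
    {"text": "A record may omit both id and metadata."},
]


def write_chunks(path, records):
    """Write one JSON object per line and return the number of records written."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in records:
            if not record.get("text"):
                raise ValueError("each record needs a non-empty 'text' field")
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)


def read_chunks(path):
    """Read the file back, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Round-trip through a temporary directory so the demo leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "chunks.jsonl"
    written = write_chunks(demo_path, chunks)
    loaded = read_chunks(demo_path)
```

In real use, the path passed to write_chunks would be the same one given as chunk_file_path.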
Installation
pip install krira-embed
Quickstart
from krira_embed import KriraEmbedding

client = KriraEmbedding(
    chunk_file_path="chunks.jsonl",
    pinecone_api_key="YOUR_PINECONE_API_KEY",
    pinecone_index_name="YOUR_INDEX_NAME",
    namespace="default",
)

result = client.embed(
    worker_batch_size=12000,
    parallel_jobs=6,
    model_batch_size=768,
    upsert_batch_size=200,
)
print(result)
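The batch parameters are not defined in this description. Reading them by name only, one plausible interpretation is that worker_batch_size is the number of chunks handed to each worker job and upsert_batch_size is the number of vectors sent per Pinecone upsert request. The sketch below shows how those two sizes would partition a job under that assumption. The batched helper and the 30,000-chunk workload are hypothetical and are not part of krira-embed.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


# Hypothetical workload: 30,000 chunks, represented here by their positions.
chunk_positions = range(30_000)

# Assumed meaning of worker_batch_size=12000: chunks per worker job.
worker_batches = list(batched(chunk_positions, 12_000))

# Assumed meaning of upsert_batch_size=200: vectors per upsert request.
upsert_batches = list(batched(worker_batches[0], 200))
```

Under this reading, 30,000 chunks would produce three worker batches (12,000, 12,000, and 6,000 chunks), and a full worker batch would issue 60 upsert requests.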
Credentials model
- End users provide Pinecone credentials explicitly in code (pinecone_api_key, pinecone_index_name).
- The package does not read local .env for Pinecone credentials.
- Modal credentials can still be supplied via environment variables (MODAL_TOKEN_ID, MODAL_TOKEN_SECRET) or existing Modal auth setup.
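Because the package does not load Pinecone credentials itself, the calling code has to fetch them and pass them in. A minimal sketch of that step follows, assuming the key is stored in PINECONE_API_KEY (named under Requirements) and the index name in PINECONE_INDEX_NAME. The index-name variable and the pinecone_kwargs helper are hypothetical and are not provided by krira-embed.

```python
import os
from typing import Mapping


def pinecone_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Hypothetical helper: collect the explicit Pinecone arguments from env."""
    missing = [
        name
        for name in ("PINECONE_API_KEY", "PINECONE_INDEX_NAME")
        if not env.get(name)
    ]
    if missing:
        # Fail early with a clear message instead of passing empty credentials.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "pinecone_api_key": env["PINECONE_API_KEY"],
        "pinecone_index_name": env["PINECONE_INDEX_NAME"],
    }


# Intended use, with the constructor arguments shown in the Quickstart:
#   client = KriraEmbedding(
#       chunk_file_path="chunks.jsonl",
#       namespace="default",
#       **pinecone_kwargs(),
#   )
```

Keeping the lookup in the caller matches the credentials model above: the values still reach the client as explicit arguments.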
Modal CLI usage
When using the Modal entrypoint directly, pass the index name and the Pinecone API key explicitly:
modal run main.py --index-name YOUR_INDEX_NAME --pinecone-api-key YOUR_PINECONE_API_KEY
Use .env only for your local Modal tokens if needed (see .env.example).
Local validation (maintainers)
python -m pip install --upgrade build twine
python -m build
python -m twine check dist/*
License
MIT License. See LICENSE.
File details
Details for the file krira_embed-0.1.0.tar.gz.
File metadata
- Download URL: krira_embed-0.1.0.tar.gz
- Upload date:
- Size: 10.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3198f07b89ae172bc229f2e878c8396516481d240ebc3bbc608c290639c85bcb |
| MD5 | 6db16e5d6e4b4761fc88a914f03e7072 |
| BLAKE2b-256 | c74444108f2fd71d3b18ad1d0c43f3e6dd46eeff1b4a21fabdb148fd3fc673b0 |
File details
Details for the file krira_embed-0.1.0-py3-none-any.whl.
File metadata
- Download URL: krira_embed-0.1.0-py3-none-any.whl
- Upload date:
- Size: 11.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3aee6db9dbeacacb5300855f98c8439afc681c8e14dc50acdc1b4a885c89933f |
| MD5 | 3a578ebb408367ed657e5090c6de9594 |
| BLAKE2b-256 | 46a61a0f48ce5f5dc11910b40f4e8d009d77bdcf3179bc00977313299f58ff09 |