Minimal wrapper for Hugging Face's Text Embeddings Inference.
PyTEI
PyTEI is a minimal Python interface for Hugging Face's Text Embeddings Inference (TEI).
PyTEI supports in-memory and persistent caching for text embeddings.
Installation
First, clone the git repository by running:
git clone https://github.com/daniel-gomm/PyTEI.git
Next, install the repository as a Python package with pip by running the following command from the repository's root directory:
pip install .
Add the -e flag (editable install) if you want to modify the code.
Usage
PyTEI requires a running Text Embeddings Inference instance, for example a local Docker container running TEI. Such a container can be started by running:
docker run --gpus all -p 8080:80 \
-v $PWD/data:/data \
--pull always ghcr.io/huggingface/text-embeddings-inference:1.6 \
--model-id Alibaba-NLP/gte-Qwen2-1.5B-instruct
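The container may need some time to download and load the model before it accepts requests. The following is a minimal sketch of a readiness check, not part of PyTEI. The helper names `http_ok` and `wait_until_ready` are hypothetical, and the `/health` route on port 8080 is an assumption about the TEI instance started above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_ready(probe: Callable[[], bool], attempts: int = 30, delay: float = 1.0) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds after each failure."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False


# Assumed usage against the container started above (not executed here):
# ready = wait_until_ready(lambda: http_ok("http://127.0.0.1:8080/health"))
```

Passing the probe as a callable keeps the retry loop independent of the network call, so it can be exercised without a running server.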
TEI Client
For more details check out the Documentation.
Establish a connection to TEI through a TEIClient. The client gives you access to the text-embedding API of the TEI instance:
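from pytei import TEIClient
# Point the client at the embed endpoint of the running TEI instance
client = TEIClient(url="127.0.0.1:8080/embed")
# Request the embedding for a single text
text_embedding = client.embed("Lorem Ipsum")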
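For orientation, the sketch below shows roughly what a request to TEI's embed route looks like at the HTTP level. It illustrates the TEI endpoint, not PyTEI's internals: the helper names `build_embed_request` and `parse_embed_response` are hypothetical, and the `http://` scheme in the URL is an assumption needed by `urllib`.

```python
import json
import urllib.request
from typing import List


def build_embed_request(url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the text as JSON."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embed_response(body: bytes) -> List[float]:
    """The response is a JSON list with one vector per input; return the first."""
    vectors = json.loads(body)
    return vectors[0]


request = build_embed_request("http://127.0.0.1:8080/embed", "Lorem Ipsum")

# Sending is left out so the sketch runs without a server:
# with urllib.request.urlopen(request) as response:
#     embedding = parse_embed_response(response.read())
```

A client wrapper such as TEIClient hides these details and adds caching on top.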
The default configuration caches embeddings in memory. For persistent caching, use the DuckDBEmbeddingStore, or implement your own caching solution by extending the DataStore base class.
from pytei import TEIClient
from pytei.store import DuckDBEmbeddingStore
# Cache embeddings in a DuckDB database file so they persist across runs
persistent_data_store = DuckDBEmbeddingStore(db_path="data/embedding_database.duckdb")
client = TEIClient(embedding_store=persistent_data_store, url="127.0.0.1:8080/embed")
text_embedding = client.embed("Lorem Ipsum")
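The idea behind embedding caching can be sketched independently of PyTEI: look a text up by key and call the model only on a miss. The example below is a simplified illustration under assumptions, not PyTEI's DataStore interface. `CachedEmbedder` and `fake_embed` are hypothetical names, and a plain dict stands in for the backing store.

```python
import hashlib
from typing import Callable, Dict, List


class CachedEmbedder:
    """Hypothetical helper: wraps an embed function with a lookup-before-compute cache."""

    def __init__(self, embed_fn: Callable[[str], List[float]]) -> None:
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}

    @staticmethod
    def _key(text: str) -> str:
        # Hash the text so keys have a fixed size regardless of input length.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed(self, text: str) -> List[float]:
        key = self._key(text)
        if key not in self._store:
            # Cache miss: compute the embedding once and remember it.
            self._store[key] = self._embed_fn(text)
        return self._store[key]


calls: List[str] = []


def fake_embed(text: str) -> List[float]:
    # Stand-in for a request to TEI; records each invocation.
    calls.append(text)
    return [float(len(text))]


embedder = CachedEmbedder(fake_embed)
first = embedder.embed("Lorem Ipsum")
second = embedder.embed("Lorem Ipsum")  # served from the cache, fake_embed is not called again
```

A persistent variant would replace the dict with a table in a database file, which is the role the DuckDB-backed store plays in the example above.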
For a more detailed description and the full API reference, see the Documentation.