Config-driven launcher for Hugging Face Text Embeddings Inference services

tei-serving

tei-serving is a config-driven launcher for Hugging Face Text Embeddings Inference (TEI). It wraps the TEI router with a small Python entrypoint that reads a YAML config, validates the settings, applies compatibility patches to local embedder models when needed, and starts TEI with the generated CLI arguments.

What it does

Current capabilities:

  1. Load an embedder or reranker YAML config.
  2. Convert typed settings into TEI command-line arguments.
  3. Patch local SentenceTransformers embedder exports into a TEI-compatible temporary copy.
  4. Start the TEI router process.
  5. Leave reranker configs untouched by embedder-specific patching.
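Step 4 can be pictured with the short sketch below. It is only an illustration under stated assumptions: `build_command` and `start_router` are hypothetical helper names, not the package's API, and the only name taken from this README is the `text-embeddings-router` binary.

```python
import shutil
import subprocess

# Binary name shipped in the TEI base image (see Requirements).
ROUTER_BINARY = "text-embeddings-router"


def build_command(cli_args: list[str], binary: str = ROUTER_BINARY) -> list[str]:
    """Return the argv for the router, using an absolute path when one is found."""
    resolved = shutil.which(binary)
    # Fall back to the bare name so a missing binary fails at launch time.
    return [resolved or binary, *cli_args]


def start_router(cli_args: list[str]) -> int:
    """Run the router in the foreground and return its exit code (hypothetical helper)."""
    command = build_command(cli_args)
    return subprocess.run(command, check=False).returncode
```

Keeping command construction separate from process launch makes the generated arguments easy to inspect in unit tests without starting TEI.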

Core files are listed under Repository Layout.

Requirements

  • Python >=3.12,<3.13 for local development.
  • Docker for the production image.
  • A TEI base image containing text-embeddings-router.
  • A GPU runtime when serving with CUDA TEI images.

Installation

Production dependencies:

make install

Development environment:

make dev-install

Equivalent uv command:

uv sync --group dev --all-extras

Configuration

Embedder example:

kind: embedder

model:
  model-id: /models/testb
  dtype: float32
  pooling: cls

server:
  hostname: 0.0.0.0
  port: 8082

batching:
  max-batch-tokens: 16384
  max-client-batch-size: 32

Reranker example:

kind: reranker

model:
  model-id: BAAI/bge-reranker-base
  dtype: float16

server:
  hostname: 0.0.0.0
  port: 8081

batching:
  max-batch-tokens: 16384
  max-client-batch-size: 32
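The option names in both examples match TEI router flags, so the conversion to CLI arguments can be sketched as a simple flattening. The snippet below is a minimal sketch, not the package's implementation: `to_cli_args` is a hypothetical name, the config is written as a dict literal instead of being loaded from YAML, and it assumes that section names such as `model` and `server` are grouping only and that `kind` is consumed by the launcher instead of being passed to TEI.

```python
from typing import Any

# Mirrors the embedder example above; in practice this would come from the YAML file.
EMBEDDER_CONFIG: dict[str, Any] = {
    "kind": "embedder",
    "model": {"model-id": "/models/testb", "dtype": "float32", "pooling": "cls"},
    "server": {"hostname": "0.0.0.0", "port": 8082},
    "batching": {"max-batch-tokens": 16384, "max-client-batch-size": 32},
}


def to_cli_args(config: dict[str, Any]) -> list[str]:
    """Flatten every mapping-valued section into `--key value` pairs."""
    args: list[str] = []
    for section in config.values():
        if not isinstance(section, dict):
            continue  # top-level scalars such as `kind` are assumed launcher-only
        for key, value in section.items():
            if value is None:
                continue  # unset options are omitted
            args += [f"--{key}", str(value)]
    return args
```

For the embedder example this yields `--model-id /models/testb --dtype float32 --pooling cls --hostname 0.0.0.0 --port 8082 --max-batch-tokens 16384 --max-client-batch-size 32`. Boolean switches are not handled here and would need their own rule.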

Running

Local CLI, assuming the TEI router binary (text-embeddings-router) is available on PATH:

tei-serving --config configs/embedder.yaml

Equivalent:

python -m tei_serving.main --config configs/embedder.yaml

Docker build:

docker build -t tei-serving:local .

Docker run with a mounted config:

docker run --gpus all --rm \
  -v /absolute/path/to/config.yaml:/config/config.yaml:ro \
  -v /absolute/path/to/models:/models:ro \
  tei-serving:local \
  --config /config/config.yaml

Local Embedder Patching

When kind is embedder and model.model-id points to a local directory, the runner copies the model into a private temporary tei-serving-patched-models-* directory and patches the metadata files that TEI parses strictly:

  • modules.json: normalizes SentenceTransformers module class paths.
  • sentence_bert_config.json: adds max_seq_length when it can be inferred from config.json.
  • 1_Pooling/config.json: adds explicit legacy pooling booleans.

The original model directory is never modified.
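The copy-then-patch approach can be sketched as follows. This is an illustration under stated assumptions, not the package's code: `should_patch` and `copy_and_patch` are hypothetical names, only the sentence_bert_config.json patch is shown, and the use of `max_position_embeddings` from config.json as the source for `max_seq_length` is an assumption.

```python
import json
import shutil
import tempfile
from pathlib import Path


def should_patch(kind: str, model_id: str) -> bool:
    """Only local embedder directories are patched; hub IDs and rerankers are not."""
    return kind == "embedder" and Path(model_id).is_dir()


def copy_and_patch(model_dir: Path) -> Path:
    """Copy `model_dir` into a private temp directory and patch only the copy."""
    # mkdtemp creates a directory readable and writable only by the current user.
    tmp_root = Path(tempfile.mkdtemp(prefix="tei-serving-patched-models-"))
    patched = tmp_root / model_dir.name
    shutil.copytree(model_dir, patched)

    config_path = patched / "config.json"
    sbert_path = patched / "sentence_bert_config.json"
    if config_path.is_file() and sbert_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        sbert = json.loads(sbert_path.read_text(encoding="utf-8"))
        # Assumption: the sequence length is inferred from `max_position_embeddings`.
        inferred = config.get("max_position_embeddings")
        if "max_seq_length" not in sbert and isinstance(inferred, int):
            sbert["max_seq_length"] = inferred
            sbert_path.write_text(json.dumps(sbert, indent=2), encoding="utf-8")
    return patched
```

The returned path would then replace `model-id` in the arguments passed to TEI, which is how the original directory stays untouched.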

Development

make format
make lint
make type-check
make security
make test
make test-cov
make ci

Repository Layout

src/tei_serving/
  __init__.py      # runner and local embedder patching
  exceptions.py    # package exceptions
  main.py          # CLI entrypoint
  settings.py      # Pydantic settings and CLI serialization
configs/
  embedder.yaml    # example embedder config
  reranker.yaml    # example reranker config
tests/
  unit/            # unit tests for settings and runner behavior
