Config-driven launcher for Hugging Face Text Embeddings Inference services

tei-serving

tei-serving is a config-driven launcher for Hugging Face Text Embeddings Inference (TEI). It wraps the TEI router with a small Python entrypoint that reads YAML, validates settings, applies local embedder compatibility patches when needed, and starts TEI with the generated CLI arguments.

What it does

Current capabilities:

  1. Load an embedder or reranker YAML config.
  2. Convert typed settings into TEI command-line arguments.
  3. Patch local SentenceTransformers embedder exports into a TEI-compatible temporary copy.
  4. Start the TEI router process.
  5. Leave reranker configs untouched by embedder-specific patching.

Requirements

  • Python >=3.12,<3.13 for local development.
  • Docker for the production image.
  • A TEI base image containing text-embeddings-router.
  • GPU runtime when serving CUDA TEI images.

Installation

Production dependencies:

make install

Development environment:

make dev-install

Equivalent uv command:

uv sync --group dev --all-extras

Configuration

Embedder example:

kind: embedder

model:
  model-id: /models/testb
  dtype: float32
  pooling: cls

server:
  hostname: 0.0.0.0
  port: 8082

batching:
  max-batch-tokens: 16384
  max-client-batch-size: 32

Reranker example:

kind: reranker

model:
  model-id: BAAI/bge-reranker-base
  dtype: float16

server:
  hostname: 0.0.0.0
  port: 8081

batching:
  max-batch-tokens: 16384
  max-client-batch-size: 32
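
As a rough illustration of how a config like the ones above could become TEI command-line arguments, the sketch below flattens the model, server, and batching sections into flag/value pairs. It is not the package's actual serialization code: the function name to_cli_args and the SECTIONS tuple are hypothetical, the config is written as a plain dict instead of being loaded from YAML, and it assumes each key maps one-to-one onto a router flag of the same name.

```python
from typing import Any

# Sections whose keys are flattened into flags. "kind" is treated as
# launcher-only metadata and is not forwarded (an assumption of this sketch).
SECTIONS = ("model", "server", "batching")


def to_cli_args(config: dict[str, Any]) -> list[str]:
    """Flatten a parsed config into `--flag value` pairs."""
    args: list[str] = []
    for section in SECTIONS:
        for key, value in config.get(section, {}).items():
            if value is None:
                continue  # unset options are omitted entirely
            args += [f"--{key}", str(value)]
    return args


# The embedder example from above, written as the dict a YAML loader would return.
embedder = {
    "kind": "embedder",
    "model": {"model-id": "/models/testb", "dtype": "float32", "pooling": "cls"},
    "server": {"hostname": "0.0.0.0", "port": 8082},
    "batching": {"max-batch-tokens": 16384, "max-client-batch-size": 32},
}

print(["text-embeddings-router", *to_cli_args(embedder)])
```

Keeping the YAML keys identical to the flag names means the mapping needs no lookup table; boolean switches and list-valued options would need extra handling that this sketch leaves out.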

Running

Local CLI, assuming TEI is available on PATH:

tei-serving --config configs/embedder.yaml

Equivalent:

python -m tei_serving.main --config configs/embedder.yaml
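
The PATH assumption above can be checked before launching. The following is a minimal sketch, not the package's real entrypoint: find_router and launch are hypothetical helper names, and it assumes the router is started as a foreground child process whose exit code is passed back to the caller.

```python
import shutil
import subprocess


def find_router(binary: str = "text-embeddings-router") -> str:
    """Return the absolute path of the router binary, or fail with a clear error."""
    path = shutil.which(binary)
    if path is None:
        raise FileNotFoundError(f"{binary!r} was not found on PATH")
    return path


def launch(command: list[str]) -> int:
    """Run the command in the foreground and return its exit code."""
    return subprocess.run(command, check=False).returncode


# Example call (not executed here):
#   launch([find_router(), "--model-id", "/models/testb", "--port", "8082"])
```

Resolving the binary first turns a missing TEI install into an explicit error message instead of a less obvious failure at process start.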

Docker build:

docker build -t tei-serving:local .

Docker run with a mounted config:

docker run --gpus all --rm \
  -v /absolute/path/to/config.yaml:/config/config.yaml:ro \
  -v /absolute/path/to/models:/models:ro \
  tei-serving:local \
  --config /config/config.yaml

Local Embedder Patching

When the config sets kind: embedder, the runner prepares a temporary TEI-compatible copy of the model before startup. Local model directories are copied directly into a private tei-serving-patched-models-* temporary directory. Hugging Face model IDs such as BAAI/bge-base-en-v1.5 are first downloaded with huggingface_hub.snapshot_download, then copied into a directory of the same kind. Reranker configs skip this embedder-only patching.

The patch step updates metadata files TEI parses strictly:

  • modules.json: normalizes SentenceTransformers module class paths.
  • sentence_bert_config.json: adds max_seq_length when it can be inferred from config.json.
  • 1_Pooling/config.json: adds explicit legacy pooling booleans.

The original model directory is never modified.
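
The copy-then-patch idea can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: patched_copy is a hypothetical name, missing pooling booleans are assumed to default to false, max_seq_length is assumed to be inferred from max_position_embeddings in config.json, and the modules.json normalization is omitted.

```python
import json
import shutil
import tempfile
from pathlib import Path

# Pooling flags written by SentenceTransformers' Pooling module.
POOLING_MODES = ("cls_token", "mean_tokens", "max_tokens", "mean_sqrt_len_tokens")


def patched_copy(model_dir: Path) -> Path:
    """Copy model_dir into a fresh temp directory and patch only the copy."""
    tmp_root = Path(tempfile.mkdtemp(prefix="tei-serving-patched-models-"))
    target = tmp_root / model_dir.name
    shutil.copytree(model_dir, target)

    # 1_Pooling/config.json: make every pooling boolean explicit
    # (assumption: a missing flag means the mode is disabled).
    pooling_path = target / "1_Pooling" / "config.json"
    if pooling_path.is_file():
        pooling = json.loads(pooling_path.read_text())
        for mode in POOLING_MODES:
            pooling.setdefault(f"pooling_mode_{mode}", False)
        pooling_path.write_text(json.dumps(pooling, indent=2))

    # sentence_bert_config.json: add max_seq_length if config.json offers a limit
    # (assumption: max_position_embeddings is the value to use).
    st_path = target / "sentence_bert_config.json"
    cfg_path = target / "config.json"
    if st_path.is_file() and cfg_path.is_file():
        st_cfg = json.loads(st_path.read_text())
        limit = json.loads(cfg_path.read_text()).get("max_position_embeddings")
        if limit is not None and "max_seq_length" not in st_cfg:
            st_cfg["max_seq_length"] = limit
            st_path.write_text(json.dumps(st_cfg, indent=2))

    return target
```

Because every write targets the copy, the source directory can stay mounted read-only, as in the Docker example above.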

Development

make format
make lint
make type-check
make security
make test
make test-cov
make ci

Repository Layout

src/tei_serving/
  __init__.py      # runner and local embedder patching
  exceptions.py    # package exceptions
  main.py          # CLI entrypoint
  settings.py      # Pydantic settings and CLI serialization
configs/
  embedder.yaml    # example embedder config
  reranker.yaml    # example reranker config
tests/
  unit/            # unit tests for settings and runner behavior
