Light-weight server exposing sentence-transformer models for embeddings

Sentence Transformer Server

Lightweight FastAPI wrapper that exposes any sentence-transformers model over HTTP for easy embedding generation.

Quick Start

  1. Install the package from PyPI (Python 3.13+):
    pip install stserve
    
  2. Launch the server with the default MiniLM model (choose your preferred runner):
    stserve
    # or via uvx
    uvx stserve
    
  3. Request embeddings from another terminal:
    curl -X POST http://127.0.0.1:8501/embed \
      -H "Content-Type: application/json" \
      -d '{"texts": ["hello world", "how are you?"]}'
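
The same request can also be issued from Python. The sketch below is a minimal client using only the standard library. It assumes the server from step 2 is running on the default address; the helper names `build_embed_request` and `embed` are illustrative and not part of stserve.

```python
import json
import urllib.request

# Default bind address from the Quick Start; adjust if --host/--port were changed.
BASE_URL = "http://127.0.0.1:8501"


def build_embed_request(texts, base_url=BASE_URL):
    """Build the POST /embed request (hypothetical helper, not part of stserve)."""
    payload = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, base_url=BASE_URL, timeout=30.0):
    """Send the request and return the value of the "embeddings" key."""
    request = build_embed_request(texts, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["embeddings"]
```

With the server running, `embed(["hello world", "how are you?"])` should return a list of float vectors, mirroring the curl call above.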
    

Configuration

  • --model: sentence-transformers model name or local path (defaults to all-MiniLM-L6-v2).
  • --device: target Torch device; auto-detects CUDA.
  • --batch-size: number of texts encoded per batch; larger batches generally improve throughput.
  • --normalize: toggles L2 normalization on embeddings.
  • --show-progress: prints encode progress in the server logs.
  • --host / --port: Uvicorn bind address (defaults to 127.0.0.1:8501).

Example with GPU and normalization:

uvx stserve --model sentence-transformers/all-MiniLM-L12-v2 --device cuda --normalize
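
To illustrate what L2 normalization means for an embedding, here is a small pure-Python sketch. It is independent of how stserve implements the --normalize flag; the function names are illustrative. Each vector is scaled to unit length, after which a plain dot product equals the cosine similarity.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


a = l2_normalize([3.0, 4.0])  # approximately [0.6, 0.8]
b = l2_normalize([4.0, 3.0])  # approximately [0.8, 0.6]
print(dot(a, a))  # ~1.0: the vector has unit length
print(dot(a, b))  # ~0.96: cosine similarity of the original vectors
```

Normalized embeddings are convenient when a downstream vector store ranks results by dot product.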

API

  • POST /embed: Body { "texts": [str, ...] }; returns { "embeddings": [[float, ...], ...] }.
  • GET /health: Returns current configuration (model, device, batch size, etc.).
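
A client may want to sanity-check the /embed response before using it. The sketch below assumes the server returns one vector per input text, which the request and response shapes above suggest but do not state outright. `parse_embed_response` is a hypothetical helper, and the sample body uses made-up 3-dimensional vectors.

```python
import json


def parse_embed_response(body, expected_count):
    """Decode a /embed response body and check it holds one vector per input text."""
    data = json.loads(body)
    embeddings = data["embeddings"]
    if len(embeddings) != expected_count:
        raise ValueError(
            f"expected {expected_count} embeddings, got {len(embeddings)}"
        )
    # All vectors from one model should share the same dimension.
    dims = {len(vec) for vec in embeddings}
    if len(dims) > 1:
        raise ValueError(f"inconsistent embedding dimensions: {sorted(dims)}")
    return embeddings


# Canned example body; real models return vectors with many more dimensions.
sample = '{"embeddings": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]}'
vectors = parse_embed_response(sample, expected_count=2)
print(len(vectors), len(vectors[0]))  # 2 3
```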

Development Notes

  • The CLI entry point is stserve.app:main.
  • Embedding work happens in a thread pool so the event loop stays responsive.
  • CUDA usage requires PyTorch with GPU support.
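
The thread-pool note can be illustrated with a small asyncio sketch. This is not the stserve implementation: `fake_encode` is a hypothetical stand-in that sleeps instead of running a model, and `asyncio.to_thread` is one of several ways to offload blocking work. A heartbeat task keeps ticking while the blocking call runs, which shows the event loop stays free.

```python
import asyncio
import time


def fake_encode(texts):
    """Stand-in for a blocking encode call (hypothetical; sleeps instead of computing)."""
    time.sleep(0.2)
    return [[float(len(t))] for t in texts]


async def heartbeat(ticks):
    """Record a timestamp every 20 ms for as long as the loop schedules this task."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def main():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    # Run the blocking call in a worker thread so the loop keeps scheduling tasks.
    embeddings = await asyncio.to_thread(fake_encode, ["hello", "hi"])
    beat.cancel()
    return embeddings, len(ticks)


embeddings, tick_count = asyncio.run(main())
print(embeddings)      # [[5.0], [2.0]]
print(tick_count > 1)  # the heartbeat kept running during the blocking call
```

Calling `fake_encode` directly inside `main` would instead block the loop for the full 0.2 seconds, and the heartbeat would not tick during that time.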

Download files

Source Distribution

stserve-0.1.0.tar.gz (52.0 kB)

Uploaded Source

Built Distribution

stserve-0.1.0-py3-none-any.whl (3.4 kB)

Uploaded Python 3

File details

Details for the file stserve-0.1.0.tar.gz.

File metadata

  • Download URL: stserve-0.1.0.tar.gz
  • Size: 52.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for stserve-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       da15faedd7d3cfd556ef3b248f55d64864cf6fee034be2a7e955e249407bc9ee
MD5          03ef3de71658b82b1df7b37e654571be
BLAKE2b-256  a2f842c730dfbe14ff4beb676e6d59c3b5a54792f3cd6f2fd339099848ab412f

Provenance

The following attestation bundles were made for stserve-0.1.0.tar.gz:

Publisher: release.yaml on dschaub95/stserve

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file stserve-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: stserve-0.1.0-py3-none-any.whl
  • Size: 3.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for stserve-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       6bb04c73f0716502e6efb80e27304fbe7b67e412c9942085fc2bfd8ab63e12d2
MD5          168862d8b904ff3f24686da397ca91d1
BLAKE2b-256  a3ca76f1e973244a2d8d98f308b4d8a473bf456da653e69780459db1f9f16670

Provenance

The following attestation bundles were made for stserve-0.1.0-py3-none-any.whl:

Publisher: release.yaml on dschaub95/stserve
