vllm-tgis-adapter

vLLM adapter for a TGIS-compatible grpc server.

Install

vllm-tgis-adapter is available on PyPI:

pip install vllm-tgis-adapter
python -m vllm_tgis_adapter

HealthCheck CLI

Installing the adapter also installs a grpc healthcheck CLI that can be used to monitor the status of the grpc server:

$ grpc_healthcheck
health check...status: SERVING

See usage with

grpc_healthcheck --help
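A startup script can poll the healthcheck until the server is ready. The following is a minimal sketch, not part of the adapter: `wait_for_serving` is a hypothetical helper, and it matches on the `status: SERVING` text from the sample output above rather than on the CLI's exit code, which is not documented here.

```shell
#!/bin/sh
# Poll a health-check command until its output reports SERVING.
# Usage: wait_for_serving MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# (hypothetical helper, not shipped with vllm-tgis-adapter)
wait_for_serving() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Match the "status: SERVING" text from the sample output;
        # the command's exit code is deliberately not relied upon.
        if "$@" 2>&1 | grep -q 'status: SERVING'; then
            return 0
        fi
        # Only wait if another attempt will follow.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example: up to 30 attempts, 2 seconds apart
# wait_for_serving 30 2 grpc_healthcheck && echo "server is ready"
```

Passing the command as arguments keeps the helper independent of any particular healthcheck options.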

Build

python -m build
pip install dist/*whl
python -m vllm_tgis_adapter

Inference

Running the adapter starts a grpc server on port 8033, which can be queried with grpcurl:

bash examples/inference.sh
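For orientation, a request of this kind might look like the sketch below. It rests on assumptions: the method name `fmaas.GenerationService/Generate` and the field names `model_id`, `requests` and `text` are taken from the TGIS proto as I understand it, and the proto file path is a placeholder. `examples/inference.sh` remains the authoritative invocation.

```shell
#!/bin/sh
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Newlines and control characters are not handled by this sketch.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build a request body with one model id and one prompt.
# Field names (model_id, requests, text) are assumed from the TGIS proto.
# $1: model id, $2: prompt text
build_generate_payload() {
    printf '{"model_id": "%s", "requests": [{"text": "%s"}]}' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the request to a server listening on localhost:8033.
# $1: path to the TGIS .proto file (placeholder), $2: model id, $3: prompt
query_generate() {
    grpcurl -plaintext -proto "$1" \
        -d "$(build_generate_payload "$2" "$3")" \
        localhost:8033 fmaas.GenerationService/Generate
}

# Example (requires grpcurl and a running server):
# query_generate path/to/generation.proto my-model "Hello"
```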

Docker

Image available at quay.io/opendatahub/vllm, built from opendatahub-io/vllm's Dockerfile.ubi

docker pull quay.io/opendatahub/vllm

Inference

See examples

Contributing

Set up pre-commit for linting/style/misc fixes:

pip install pre-commit
pre-commit install
# to run on all files
pre-commit run --all-files

This project uses nox to manage test automation and uv for venv management:

pip install nox uv
nox --list  # list available sessions
nox -s tests-3.10 # run tests session for a specific python version
nox -s build-3.11 # build the wheel package
nox -s lint-3.11 -- --mypy # run linting with type checks

Testing without a GPU

The standard vllm build requires an NVIDIA GPU. When one is not available, it is possible to compile vllm from source with CPU support:

git clone https://github.com/vllm-project/vllm
cd vllm

uv venv
source .venv/bin/activate

export UV_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cpu \
    UV_INDEX_STRATEGY=unsafe-best-match

.github/scripts/install_vllm_build_deps.py pyproject.toml

env \
    VLLM_TARGET_DEVICE=cpu \
    python setup.py bdist_wheel
export VLLM_VERSION_OVERRIDE=$PWD/dist/*whl
# the nox session can now be run with the custom built vllm cpu version

This makes it possible to run the tests on most hardware. Note that the uv extra index URL is required in order to install the CPU version of torch.
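In bash, pathname expansion is not applied to the value of an assignment, so `VLLM_VERSION_OVERRIDE` as exported above holds the literal `*whl` pattern unless whatever reads it expands the glob itself. If a concrete path is needed, it can be resolved first. The following is a minimal sketch, with `resolve_single_wheel` as a hypothetical helper that is not part of this project:

```shell
#!/bin/sh
# Print the path of the single wheel found in a directory.
# Fails when there is no wheel or more than one, so that a stale
# build is not picked up silently.
# (hypothetical helper, not part of vllm or vllm-tgis-adapter)
resolve_single_wheel() {
    dist_dir=$1
    # Expand the glob into the function's positional parameters.
    set -- "$dist_dir"/*.whl
    # An unmatched glob stays as the literal pattern, so also check the file exists.
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one wheel in $dist_dir" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Example, run from the vllm checkout after the build:
# VLLM_VERSION_OVERRIDE=$(resolve_single_wheel "$PWD/dist") && export VLLM_VERSION_OVERRIDE
```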
