Haystack integration for vllm

Project description

vllm-haystack



Contributing

Refer to the general Contribution Guidelines.

To run integration tests locally, you need several vLLM servers running in parallel: the chat generator on port 8000, the embedders on port 8001, and the ranker on port 8002. Refer to the workflow file for more details.

For example, on macOS, you can install vLLM-metal and start the chat generator server with:

# chat generator server (port 8000)
source ~/.venv-vllm-metal/bin/activate && vllm serve Qwen/Qwen3-0.6B --reasoning-parser qwen3 --max-model-len 1024 --enforce-eager --enable-auto-tool-choice --tool-call-parser hermes
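Once the chat generator server is up, it can be smoke-tested through vLLM's OpenAI-compatible API. The sketch below only builds the request body and checks it locally; actually POSTing it to http://localhost:8000/v1/chat/completions requires the server started above. The `max_tokens` value is an arbitrary choice for illustration.

```python
import json

def build_chat_request(prompt: str, model: str = "Qwen/Qwen3-0.6B") -> str:
    """Build an OpenAI-compatible chat-completions request body for the
    local vLLM chat generator server (assumed to listen on port 8000)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return json.dumps(payload)

body = build_chat_request("Say hello in one word.")
# POST `body` to http://localhost:8000/v1/chat/completions with
# header `Content-Type: application/json` once the server is running.
print(body)
```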

vLLM-metal does not support embedding models. On macOS, you can run the embedding server via the CPU Docker image:

# embedders server (port 8001)
docker run --rm -p 8001:8000 -e VLLM_CPU_OMP_THREADS_BIND=0-3 vllm/vllm-openai-cpu:latest \
    --model sentence-transformers/all-MiniLM-L6-v2 --enforce-eager
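The embedders server also speaks the OpenAI-compatible API. Downstream, embedding vectors are usually compared with cosine similarity; here is a minimal, self-contained sketch of that comparison (the vectors are invented stand-ins for what the embeddings endpoint would return):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings returned by the server.
query_vec = [1.0, 0.0]
doc_vec = [1.0, 0.0]      # identical direction -> similarity 1.0
other_vec = [0.0, 1.0]    # orthogonal -> similarity 0.0
print(cosine_similarity(query_vec, doc_vec))
print(cosine_similarity(query_vec, other_vec))
```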

To run the ranker server, use the CPU Docker image:

# ranker server (port 8002)
docker run --rm -p 8002:8000 -e VLLM_CPU_OMP_THREADS_BIND=0-3 vllm/vllm-openai-cpu:latest \
    --model BAAI/bge-reranker-base --enforce-eager
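The ranker server scores query–document pairs with a cross-encoder such as BAAI/bge-reranker-base. The exact rerank route and field names vary by vLLM version, so treat them here as assumptions; the sketch below only builds a hypothetical request body and shows how returned relevance scores would typically reorder documents (the scores are invented):

```python
def build_rerank_request(query, documents, model="BAAI/bge-reranker-base"):
    """Hypothetical request body for a rerank endpoint (e.g. on port 8002);
    the field names are assumptions, not verified against a vLLM version."""
    return {"model": model, "query": query, "documents": documents}

def reorder_by_score(documents, scores):
    """Sort documents by descending relevance score, as a ranker would."""
    return [doc for _, doc in sorted(zip(scores, documents), reverse=True)]

docs = ["vLLM serves LLMs.", "Haystack builds pipelines.", "Bananas are yellow."]
fake_scores = [0.9, 0.7, 0.1]  # invented scores standing in for the server's reply
print(reorder_by_score(docs, fake_scores))
```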

Project details


Download files

Download the file for your platform.

Source Distribution

vllm_haystack-1.1.0.tar.gz (25.8 kB)

Uploaded Source

Built Distribution


vllm_haystack-1.1.0-py3-none-any.whl (23.9 kB)

Uploaded Python 3

File details

Details for the file vllm_haystack-1.1.0.tar.gz.

File metadata

  • Download URL: vllm_haystack-1.1.0.tar.gz
  • Upload date:
  • Size: 25.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for vllm_haystack-1.1.0.tar.gz
  • SHA256: 1403f7d3a8d3cac910e81fd44fb772c49bde1c0aaa6d674837174baf66f6e20e
  • MD5: 68694e984a6d48b07d8d07d8aebc14ea
  • BLAKE2b-256: affc016901887553ad8ed18aff492d7dd147d84d846cdeafe9b9c3f78e86ba01


Provenance

The following attestation bundles were made for vllm_haystack-1.1.0.tar.gz:

Publisher: CI_pypi_release.yml on deepset-ai/haystack-core-integrations

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vllm_haystack-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: vllm_haystack-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 23.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for vllm_haystack-1.1.0-py3-none-any.whl
  • SHA256: ae9ebe74b238ab1f790beaf9283be614ad4c627d6893a304802e9e03dfeab7e2
  • MD5: 2d0b18465bfb24651a87efb286eab32c
  • BLAKE2b-256: 00120c8c3c0886528878ff7b15ed33ee43a8446fa9ccc1cab17c75f8753232ba


Provenance

The following attestation bundles were made for vllm_haystack-1.1.0-py3-none-any.whl:

Publisher: CI_pypi_release.yml on deepset-ai/haystack-core-integrations

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
