Haystack integration for vllm

Project description

vllm-haystack

Contributing

Refer to the general Contribution Guidelines.

To run integration tests locally, you need two vLLM servers running in parallel: one for the chat generator on port 8000 and one for the embedders on port 8001. Refer to the workflow file for more details.

For example, on macOS, you can install vLLM-metal and start the chat generator server with:

# chat generator server (port 8000)
source ~/.venv-vllm-metal/bin/activate && vllm serve Qwen/Qwen3-0.6B --reasoning-parser qwen3 --max-model-len 1024 --enforce-eager --enable-auto-tool-choice --tool-call-parser hermes

vLLM-metal does not support embedding models. On macOS, you can instead run the embedding server via the CPU Docker image:

# embedders server (port 8001)
docker run --rm -p 8001:8000 -e VLLM_CPU_OMP_THREADS_BIND=0-3 vllm/vllm-openai-cpu:latest \
    --model sentence-transformers/all-MiniLM-L6-v2 --enforce-eager
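Both servers expose vLLM's OpenAI-compatible API. As a minimal sketch of what the integration tests talk to, here are the request payloads for the two local servers; the ports and model names are taken from the example commands above, and the helper names (`chat_payload`, `embed_payload`) are illustrative, not part of this package:

```python
# Request payloads for the two local vLLM servers started above.
# Ports and model names follow the example commands; adjust to your setup.

CHAT_URL = "http://localhost:8000/v1/chat/completions"
EMBED_URL = "http://localhost:8001/v1/embeddings"


def chat_payload(prompt: str) -> dict:
    """Chat completion request body for the Qwen server on port 8000."""
    return {
        "model": "Qwen/Qwen3-0.6B",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }


def embed_payload(texts: list[str]) -> dict:
    """Embedding request body for the MiniLM server on port 8001."""
    return {
        "model": "sentence-transformers/all-MiniLM-L6-v2",
        "input": texts,
    }
```

With both servers running, you could send these with any HTTP client, e.g. `requests.post(CHAT_URL, json=chat_payload("Hello"))`.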

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vllm_haystack-1.0.0.tar.gz (22.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

vllm_haystack-1.0.0-py3-none-any.whl (20.2 kB view details)

Uploaded Python 3

File details

Details for the file vllm_haystack-1.0.0.tar.gz.

File metadata

  • Download URL: vllm_haystack-1.0.0.tar.gz
  • Upload date:
  • Size: 22.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for vllm_haystack-1.0.0.tar.gz
Algorithm Hash digest
SHA256 70399daffcfb62a4414b567133fb150cadc6eeef83c4f0b3ed55abd0c926b8ce
MD5 902e7ea237614f6fc4b0534462168300
BLAKE2b-256 b8516eeecc07c1e0614f023201fcd54d1b25699b1ee7237872d1d94a48dfe5b4

See more details on using hashes here.

Provenance

The following attestation bundles were made for vllm_haystack-1.0.0.tar.gz:

Publisher: CI_pypi_release.yml on deepset-ai/haystack-core-integrations

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vllm_haystack-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: vllm_haystack-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 20.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for vllm_haystack-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6a2a69269d6bce58c64e5cd70188250fcb1c550044892e4dd7d01ff42964a671
MD5 b92f32e4ba56fd60da8cff3fd08d27a9
BLAKE2b-256 98c4423c22a97ceb8ad77abbc48a006142bf63f04640f778fc429e992e0b35f8

See more details on using hashes here.

Provenance

The following attestation bundles were made for vllm_haystack-1.0.0-py3-none-any.whl:

Publisher: CI_pypi_release.yml on deepset-ai/haystack-core-integrations

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
