# vLLM-haystack-adapter

Simply use vLLM in your Haystack pipeline to utilize fast, self-hosted LLMs.
## Installation

Install the wrapper via pip: `pip install vllm-haystack`
## Usage

This integration provides two invocation layers:

- `vLLMInvocationLayer`: To use models hosted on a vLLM server (or any other OpenAI-compatible server)
- `vLLMLocalInvocationLayer`: To use locally hosted vLLM models
### Use a Model Hosted on a vLLM Server

To use a model hosted on a vLLM server, the `vLLMInvocationLayer` has to be used. Here is a simple example of how a `PromptNode` can be created with the wrapper:
```python
from haystack.nodes import PromptNode, PromptModel
from vllm_haystack import vLLMInvocationLayer

model = PromptModel(
    model_name_or_path="",  # Can stay empty; the model name is inferred from the server
    invocation_layer_class=vLLMInvocationLayer,
    max_length=256,
    api_key="EMPTY",
    model_kwargs={
        "api_base": API,  # Replace this with your API URL
        "maximum_context_length": 2048,
    },
)
prompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)
```
The model name is inferred from the model served on the vLLM server. For more configuration examples, take a look at the unit tests.
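For illustration, here is a hypothetical query against the node; the prompt text is made up, but `PromptNode` instances in Haystack v1 are callable and return a list of generated strings:

```python
# Hypothetical usage sketch: call the PromptNode directly.
result = prompt_node("What is the capital of Germany?")
print(result[0])  # First generated completion
```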
#### Hosting a vLLM Server

To create an OpenAI-compatible server via vLLM, you can follow the steps in the Quickstart section of their documentation.
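As a rough sketch, at the time of writing the vLLM Quickstart launches the server via its OpenAI-compatible entrypoint; the model name below is only an example:

```bash
# Serves the given model behind an OpenAI-compatible API (port 8000 by default).
python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m
```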
### Use a Model Hosted Locally

⚠️ To run vLLM locally you need to have `vllm` installed and a supported GPU.

If you don't want to use an API server, this wrapper also provides a `vLLMLocalInvocationLayer`, which runs vLLM on the same machine Haystack is running on. Here is a simple example of how a `PromptNode` can be created with the `vLLMLocalInvocationLayer`:
```python
from haystack.nodes import PromptNode, PromptModel
from vllm_haystack import vLLMLocalInvocationLayer

model = PromptModel(
    model_name_or_path=MODEL,  # Replace with the model to load, e.g. a Hugging Face model name
    invocation_layer_class=vLLMLocalInvocationLayer,
    max_length=256,
    model_kwargs={"maximum_context_length": 2048},
)
prompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)
```
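Since the adapter targets Haystack pipelines, here is a minimal sketch of wiring the node into a Haystack v1 `Pipeline`; the query string is made up:

```python
from haystack import Pipeline

# Minimal pipeline sketch: the PromptNode is the only component.
pipe = Pipeline()
pipe.add_node(component=prompt_node, name="prompt_node", inputs=["Query"])

output = pipe.run(query="Summarize vLLM in one sentence.")
print(output["results"])  # Generated completions
```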