vLLM-haystack-adapter
A simple adapter to use a hosted vLLM API in your Haystack pipelines: simply connect your Haystack pipeline to a self-hosted vLLM API server.
Installation
Install the wrapper via pip:

```bash
pip install vllm-haystack
```
Usage
To use the wrapper, the `vLLMInvocationLayer` has to be used. Here is a simple example of how a `PromptNode` can be created with the wrapper:
```python
from haystack.nodes import PromptNode, PromptModel
from vllm_haystack import vLLMInvocationLayer

API = "http://localhost:8000/v1"  # replace this with your API URL

model = PromptModel(model_name_or_path="", invocation_layer_class=vLLMInvocationLayer, max_length=256, api_key="EMPTY", model_kwargs={
    "api_base": API,  # base URL of the OpenAI-compatible vLLM server
    "maximum_context_length": 2048,
})

prompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)
```
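Once created, the `prompt_node` can be queried like any other Haystack `PromptNode`. A minimal usage sketch follows; the prompt itself is only an illustrative example:

```python
# Send a prompt through the adapter to the vLLM server and print the generations.
result = prompt_node("What is the capital of France?")
print(result)
```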
For more configuration examples, take a look at the unit tests.
Hosting a vLLM Server
To create an OpenAI-compatible server via vLLM, you can follow the steps in the Quickstart section of their documentation.
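As a minimal sketch, assuming vLLM is installed and following its quickstart at the time of writing (check the vLLM documentation for the current entrypoint and flags), such a server can be started like this; `facebook/opt-125m` is only a placeholder for whichever model you want to serve:

```bash
# Start an OpenAI-compatible API server (listens on port 8000 by default).
python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m
```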