Integration: TitanML
Use TitanML's Takeoff server to serve local models efficiently with Haystack 2.0
Introduction
You can use the Takeoff inference server to deploy local models efficiently in your Haystack 2.0 pipelines. Takeoff is a state-of-the-art inference server focused on deploying openly available language models at scale. It can run LLMs on local machines with consumer GPUs as well as on cloud infrastructure.
The TakeoffGenerator component in Haystack 2.0 is a wrapper around the Takeoff server API, and can be used to serve Takeoff-deployed models efficiently in Haystack pipelines.
Installation
pip install takeoff_haystack
Usage
You can interact with Takeoff-deployed models using the TakeoffGenerator component in Haystack. To do so, you must have a model deployed with Takeoff; see the Takeoff documentation for instructions.
The following example uses Takeoff to deploy the TheBloke/Llama-2-7B-Chat-AWQ model locally on port 3000.
docker run --gpus all -e TAKEOFF_MODEL_NAME=TheBloke/Llama-2-7B-Chat-AWQ \
-e TAKEOFF_DEVICE=cuda \
-e TAKEOFF_MAX_SEQUENCE_LENGTH=256 \
-it \
-p 3000:3000 tytn/takeoff-pro:0.11.0-gpu
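Before wiring the server into a pipeline, it can help to confirm that something is listening on the mapped port. The sketch below uses only the Python standard library. `is_port_open` is a hypothetical helper, not part of Takeoff or Haystack, and it only checks that a TCP connection succeeds, not that the model has finished loading.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Host and port match the docker command above (-p 3000:3000).
    if is_port_open("localhost", 3000):
        print("Something is listening on port 3000")
    else:
        print("Nothing is listening on port 3000 yet")
```

A successful connection does not guarantee that the model is ready to answer requests, so a first generation call may still fail while the container is starting up.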
TextGeneration
Below is an example of using takeoff models in a Haystack RAG pipeline. It summarizes headlines from popular news sites in technology.
from typing import Dict, List
from haystack import Document, Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
import feedparser
from takeoff_haystack import TakeoffGenerator
# Dict of website RSS feeds
urls = {
    'theverge': 'https://www.theverge.com/rss/frontpage/',
    'techcrunch': 'https://techcrunch.com/feed',
    'mashable': 'https://mashable.com/feeds/rss/all',
    'cnet': 'https://cnet.com/rss/news',
    'engadget': 'https://engadget.com/rss.xml',
    'zdnet': 'https://zdnet.com/news/rss.xml',
    'venturebeat': 'https://feeds.feedburner.com/venturebeat/SZYF',
    'readwrite': 'https://readwrite.com/feed/',
    'wired': 'https://wired.com/feed/rss',
    'gizmodo': 'https://gizmodo.com/rss',
}
# Configurable parameters
NUM_WEBSITES = 3
NUM_TITLES = 1
def get_titles(urls: Dict[str, str], num_sites: int, num_titles: int) -> List[str]:
    # Collect up to num_titles headlines from each of the first num_sites feeds
    titles: List[str] = []
    sites = list(urls.keys())[:num_sites]
    for site in sites:
        feed = feedparser.parse(urls[site])
        entries = feed.entries[:num_titles]
        for entry in entries:
            titles.append(entry.title)
    return titles
titles = get_titles(urls, NUM_WEBSITES, NUM_TITLES)
titles_string = " - ".join(titles)
document_store = InMemoryDocumentStore()
document_store.write_documents([Document(content=titles_string)])
template = """
HEADLINES:
{% for document in documents %}
{{ document.content }}
{% endfor %}
REQUEST: {{ query }}
"""
pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", TakeoffGenerator(base_url="http://localhost", port="3000"))
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")
query = f"Summarize each of the {NUM_WEBSITES * NUM_TITLES} provided headlines in three words."
response = pipe.run({"prompt_builder": {"query": query}, "retriever": {"query": query}})
print(response["llm"]["replies"])
You should see a response like the following:
['\n\n\nANSWER:\n\n1. Poker Roguelike - Exciting gameplay\n2. AI-powered news reader - Personalized feed\n3. Best laptops MWC 2024 - Powerful devices']
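The generator returns a list of raw strings, so any structure in the reply has to be recovered by the caller. As a minimal sketch, assuming the model answers with a numbered list as in the sample output above, the hypothetical helper `extract_numbered_items` pulls out the individual summaries with a regular expression. Model output is not guaranteed to follow this format, so an empty result should be treated as a normal case.

```python
import re
from typing import List

# Matches lines such as "1. some text" and captures the text after the number.
_NUMBERED_LINE = re.compile(r"^[ \t]*\d+\.[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def extract_numbered_items(reply: str) -> List[str]:
    """Return the text of every numbered line found in reply, in order."""
    return _NUMBERED_LINE.findall(reply)


# Sample reply copied from the output shown above.
sample_reply = (
    "\n\n\nANSWER:\n\n1. Poker Roguelike - Exciting gameplay\n"
    "2. AI-powered news reader - Personalized feed\n"
    "3. Best laptops MWC 2024 - Powerful devices"
)

for item in extract_numbered_items(sample_reply):
    print(item)
```

With the pipeline above, the helper could be applied to the first reply, for example `extract_numbered_items(response["llm"]["replies"][0])`.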