LlamaIndex Llms Integration: llamafile
Setup Steps
1. Download a LlamaFile
Use the following command to download a LlamaFile from Hugging Face:
```bash
wget https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile
```
2. Make the File Executable
On Unix-like systems, run the following command:
```bash
chmod +x TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile
```
On Windows, simply rename the file so that it ends with .exe.
3. Start the Model Server
Run the following command to start the model server, which will listen on http://localhost:8080 by default:
```bash
./TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile --server --nobrowser --embedding
```
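Before wiring up LlamaIndex, it can help to confirm the server is actually reachable. A minimal stdlib-only sketch (the `check_llamafile_server` helper name is invented for illustration, and the URL assumes the default port above):

```python
import urllib.request
import urllib.error

def check_llamafile_server(base_url="http://localhost:8080", timeout=2):
    """Return True if something answers HTTP at base_url, else False."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if check_llamafile_server():
    print("llamafile server is up")
else:
    print("no server at http://localhost:8080 -- did you start the llamafile?")
```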
Using LlamaIndex
If you are using Google Colab or want to interact with LlamaIndex, you will need to install the necessary packages:
```shell
%pip install llama-index-llms-llamafile
!pip install llama-index
```
Import Required Libraries
```python
from llama_index.llms.llamafile import Llamafile
from llama_index.core.llms import ChatMessage
```
Initialize the LLM
Create an instance of the LlamaFile LLM:
```python
llm = Llamafile(temperature=0, seed=0)
```
Generate Completions
To generate a completion for a prompt, use the `complete` method:
```python
resp = llm.complete("Who is Octavia Butler?")
print(resp)
```
Call Chat with a List of Messages
You can also interact with the LLM using a list of messages:
```python
messages = [
    ChatMessage(
        role="system",
        content="Pretend you are a pirate with a colorful personality.",
    ),
    ChatMessage(role="user", content="What is your name?"),
]
resp = llm.chat(messages)
print(resp)
```
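Under the hood, a chat message is just a role/content pair. As a hedged illustration (assuming your llamafile build exposes the llama.cpp OpenAI-compatible `/v1/chat/completions` endpoint), the same conversation could be expressed as a raw JSON payload:

```python
import json

# The same system/user conversation as a raw request body for the server's
# OpenAI-compatible chat endpoint (assumption: /v1/chat/completions is enabled;
# the "model" value is a hypothetical label that llamafile may ignore).
payload = {
    "model": "TinyLlama-1.1B-Chat-v1.0",
    "messages": [
        {
            "role": "system",
            "content": "Pretend you are a pirate with a colorful personality.",
        },
        {"role": "user", "content": "What is your name?"},
    ],
    "temperature": 0,
}
body = json.dumps(payload).encode("utf-8")
```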
Streaming Responses
To use the streaming capabilities, call the `stream_complete` method:
```python
response = llm.stream_complete("Who is Octavia Butler?")
for r in response:
    print(r.delta, end="")
```
You can also stream chat responses:
```python
messages = [
    ChatMessage(
        role="system",
        content="Pretend you are a pirate with a colorful personality.",
    ),
    ChatMessage(role="user", content="What is your name?"),
]
resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")
```
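Both streaming loops above follow the same accumulate-the-deltas pattern. A self-contained sketch of that pattern with a stand-in generator (the `FakeDelta` class and the canned chunks are invented for illustration; no server needed):

```python
class FakeDelta:
    """Stand-in for the response objects yielded by stream_complete/stream_chat."""
    def __init__(self, delta):
        self.delta = delta

def fake_stream():
    # Canned chunks in place of tokens arriving from the model server.
    for piece in ["Arr, ", "they ", "call ", "me ", "Captain ", "Quickwit!"]:
        yield FakeDelta(piece)

full_text = ""
for r in fake_stream():
    print(r.delta, end="")   # print each chunk as it "arrives"
    full_text += r.delta     # and keep a running transcript
print()
```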
LLM Implementation example
https://docs.llamaindex.ai/en/stable/examples/llm/llamafile/
Hashes for llama_index_llms_llamafile-0.2.2.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | 7fac777d9571eb379d1ae35d0a9e41085963768a4b27cd021b73856637da7874 |
| MD5 | 5beede2fcad47bf2080e907c26e858cc |
| BLAKE2b-256 | 74283d900d1e8d10ab7b0181a5326c7610de028e28e646194223307af1d461c7 |
Hashes for llama_index_llms_llamafile-0.2.2-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 190245bee6420ba9f990df188b48714aa87b4ad3dc234f18e216c207c5e3c8bd |
| MD5 | 9a07af4c342ac7c94e8e585f4c5bbf4a |
| BLAKE2b-256 | 0a8b32132bc886fd021d8436a83f3bacf4501f14a4b7c3e6f00a9a415b335841 |