# LlamaIndex LLMs Integration: llamafile
## Setup Steps
### 1. Download a LlamaFile
Use the following command to download a LlamaFile from Hugging Face:
```bash
wget https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile
```
### 2. Make the File Executable
On Unix-like systems, run the following command:
```bash
chmod +x TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile
```
On Windows, simply rename the file to end with `.exe`.
### 3. Start the Model Server
Run the following command to start the model server, which listens on `http://localhost:8080` by default:
```bash
./TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile --server --nobrowser --embedding
```
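Before wiring up LlamaIndex, it can help to confirm the server is reachable. A minimal sketch using only the Python standard library, assuming the default address from the command above:

```python
import urllib.request

# The llamafile server serves its web UI at the root URL, so any
# HTTP 200 response means the server is up and listening.
with urllib.request.urlopen("http://localhost:8080", timeout=5) as resp:
    print("server reachable:", resp.status == 200)
```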
## Using LlamaIndex
If you are using Google Colab or another fresh environment, install the necessary packages first:
```bash
%pip install llama-index-llms-llamafile
!pip install llama-index
```
### Import Required Libraries
```python
from llama_index.llms.llamafile import Llamafile
from llama_index.core.llms import ChatMessage
```
### Initialize the LLM
Create an instance of the `Llamafile` LLM:
```python
llm = Llamafile(temperature=0, seed=0)
```
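If your server is not listening on the default address, you can point the client at it explicitly. A minimal sketch, assuming the `base_url` keyword accepted by the constructor (the port 9090 below is just an example):

```python
# base_url defaults to http://localhost:8080; override it if your
# llamafile server runs elsewhere. Port 9090 is a hypothetical example.
llm = Llamafile(
    base_url="http://localhost:9090",
    temperature=0,
    seed=0,
)
```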
### Generate Completions
To generate a completion for a prompt, use the `complete` method:
```python
resp = llm.complete("Who is Octavia Butler?")
print(resp)
```
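`print(resp)` prints the whole response object; to get just the generated string, the `CompletionResponse` exposes a `text` attribute:

```python
# resp is a CompletionResponse; .text holds only the generated text.
print(resp.text)
```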
### Call Chat with a List of Messages
You can also interact with the LLM using a list of messages:
```python
messages = [
    ChatMessage(
        role="system",
        content="Pretend you are a pirate with a colorful personality.",
    ),
    ChatMessage(role="user", content="What is your name?"),
]
resp = llm.chat(messages)
print(resp)
```
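Likewise, the chat call returns a `ChatResponse` whose `message` field is the assistant's `ChatMessage`; to print only the reply text:

```python
# resp.message is the assistant's ChatMessage; .content is its text.
print(resp.message.content)
```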
### Streaming Responses
To use the streaming capabilities, call the `stream_complete` method:
```python
response = llm.stream_complete("Who is Octavia Butler?")
for r in response:
    print(r.delta, end="")
```
You can also stream chat responses:
```python
messages = [
    ChatMessage(
        role="system",
        content="Pretend you are a pirate with a colorful personality.",
    ),
    ChatMessage(role="user", content="What is your name?"),
]
resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")
```
## LLM Implementation Example

https://docs.llamaindex.ai/en/stable/examples/llm/llamafile/
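For reference, here is a minimal end-to-end script combining the calls above, assuming the TinyLlama server from the setup steps is running on the default port:

```python
from llama_index.llms.llamafile import Llamafile
from llama_index.core.llms import ChatMessage

# Connects to the llamafile server started in the setup steps
# (http://localhost:8080 by default).
llm = Llamafile(temperature=0, seed=0)

# One-shot completion.
print(llm.complete("Who is Octavia Butler?").text)

# Chat-style interaction.
messages = [
    ChatMessage(role="system", content="Pretend you are a pirate."),
    ChatMessage(role="user", content="What is your name?"),
]
print(llm.chat(messages).message.content)
```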