
LlamaIndex Llms Integration: llamafile

Setup Steps

1. Download a llamafile

Use the following command to download a llamafile from Hugging Face:

wget https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile

2. Make the File Executable

On Unix-like systems, run the following command:

chmod +x TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile

On Windows, simply rename the file so that it ends with .exe.

3. Start the Model Server

Run the following command to start the model server, which will listen on http://localhost:8080 by default:

./TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile --server --nobrowser --embedding
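Before moving on, you can confirm the server is reachable. The sketch below polls it over HTTP using only the standard library; the /health route and the default port are assumptions carried over from the llama.cpp server that llamafile embeds, so adjust them if your build differs.

```python
from urllib.request import urlopen
from urllib.error import URLError

def server_is_up(base_url="http://localhost:8080", timeout=2.0):
    """Return True if a llamafile server answers at base_url.

    Assumes the llama.cpp-style /health endpoint (an assumption,
    not part of the llama-index API shown below).
    """
    try:
        with urlopen(base_url + "/health", timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False
```

If this returns False, check that the llamafile process is still running and that nothing else is bound to port 8080.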

Using LlamaIndex

To use this integration with LlamaIndex (for example, in Google Colab), install the necessary packages:

%pip install llama-index-llms-llamafile
!pip install llama-index

Import Required Libraries

from llama_index.llms.llamafile import Llamafile
from llama_index.core.llms import ChatMessage

Initialize the LLM

Create an instance of the Llamafile LLM:

llm = Llamafile(temperature=0, seed=0)

Generate Completions

To generate a completion for a prompt, use the complete method:

resp = llm.complete("Who is Octavia Butler?")
print(resp)
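Under the hood, the complete call goes through the llamafile server's HTTP API. For debugging, you can hit that API directly with the standard library. The /completion route and the payload fields below are assumptions based on the llama.cpp server that llamafile wraps, not part of the llama-index API.

```python
import json
from urllib.request import Request, urlopen

def build_completion_request(prompt, base_url="http://localhost:8080",
                             n_predict=128, temperature=0.0):
    """Build a POST request for the server's llama.cpp-style /completion route."""
    payload = json.dumps({
        "prompt": prompt,
        "n_predict": n_predict,   # maximum number of tokens to generate
        "temperature": temperature,
    }).encode("utf-8")
    return Request(base_url + "/completion", data=payload,
                   headers={"Content-Type": "application/json"})

# With a server running (see the setup steps above), send it like this:
# with urlopen(build_completion_request("Who is Octavia Butler?")) as resp:
#     print(json.loads(resp.read())["content"])
```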

Call Chat with a List of Messages

You can also interact with the LLM using a list of messages:

messages = [
    ChatMessage(
        role="system",
        content="Pretend you are a pirate with a colorful personality.",
    ),
    ChatMessage(role="user", content="What is your name?"),
]
resp = llm.chat(messages)
print(resp)

Streaming Responses

To stream a completion for a prompt, call the stream_complete method:

response = llm.stream_complete("Who is Octavia Butler?")
for r in response:
    print(r.delta, end="")

You can also stream chat responses:

messages = [
    ChatMessage(
        role="system",
        content="Pretend you are a pirate with a colorful personality.",
    ),
    ChatMessage(role="user", content="What is your name?"),
]
resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")
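As the loops above show, each streamed chunk carries only the newly generated text in its delta attribute, so reassembling the full response is a simple join. The sketch below demonstrates this with stand-in chunk objects, since any object exposing a delta attribute behaves the same way.

```python
from types import SimpleNamespace

def accumulate_deltas(chunks):
    """Concatenate the .delta fields of streamed chunks into the full text."""
    return "".join(chunk.delta for chunk in chunks)

# Stand-in chunks; in practice these come from stream_complete / stream_chat.
fake_stream = [SimpleNamespace(delta=part) for part in ("Ahoy", ", ", "matey!")]
full_text = accumulate_deltas(fake_stream)  # → "Ahoy, matey!"
```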

LLM Implementation Example

https://docs.llamaindex.ai/en/stable/examples/llm/llamafile/

