
LlamaIndex Llms Integration: llamafile

Setup Steps

1. Download a llamafile

Use the following command to download a llamafile from Hugging Face:

wget https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile

2. Make the File Executable

On Unix-like systems, run the following command:

chmod +x TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile

On Windows, rename the file so that its name ends with .exe.

3. Start the Model Server

Run the following command to start the model server, which will listen on http://localhost:8080 by default:

./TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile --server --nobrowser --embedding
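Once running, the server speaks an HTTP API that the LlamaIndex client calls for you. As a rough sketch of what a raw request body might look like (the /completion endpoint path and field names are assumptions based on the llama.cpp server that llamafile embeds, not taken from this package's docs):

```python
import json

# Build a request body for the server's completion endpoint.
# NOTE: the endpoint path ("/completion") and field names below are
# assumptions based on the llama.cpp server bundled inside llamafile.
payload = {
    "prompt": "Who is Octavia Butler?",
    "temperature": 0,   # deterministic sampling, matching the examples below
    "n_predict": 128,   # cap on the number of tokens to generate
}
body = json.dumps(payload)
print(body)

# To actually send it (requires the server started above):
#   curl -X POST http://localhost:8080/completion \
#        -H "Content-Type: application/json" -d "$body"
```

In practice you will not build these requests by hand; the Llamafile class below does it for you.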

Using LlamaIndex

Install the integration package along with llama-index. The % and ! prefixes are for notebook environments such as Google Colab; drop them when running pip in a regular shell:

%pip install llama-index-llms-llamafile
!pip install llama-index

Import Required Libraries

from llama_index.llms.llamafile import Llamafile
from llama_index.core.llms import ChatMessage

Initialize the LLM

Create an instance of the Llamafile LLM, which talks to the server started above:

llm = Llamafile(temperature=0, seed=0)

Generate Completions

To generate a completion for a prompt, use the complete method:

resp = llm.complete("Who is Octavia Butler?")
print(resp)

Call Chat with a List of Messages

You can also interact with the LLM using a list of messages:

messages = [
    ChatMessage(
        role="system",
        content="Pretend you are a pirate with a colorful personality.",
    ),
    ChatMessage(role="user", content="What is your name?"),
]
resp = llm.chat(messages)
print(resp)
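Each chat message pairs a role with its content, and the request carries them as an ordered list. A minimal stand-in sketch of that shape, using only the standard library (ChatMessage here is a hypothetical stand-in, not the llama-index class, and the list-of-dicts wire format is an assumption about the transport):

```python
from dataclasses import dataclass

# Hypothetical stand-in for llama_index.core.llms.ChatMessage,
# used only to illustrate the role/content structure of a chat request.
@dataclass
class ChatMessage:
    role: str
    content: str

messages = [
    ChatMessage(role="system", content="Pretend you are a pirate with a colorful personality."),
    ChatMessage(role="user", content="What is your name?"),
]

# Serialize to an ordered list of role/content dicts.
wire = [{"role": m.role, "content": m.content} for m in messages]
print(wire)
```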

Streaming Responses

To use the streaming capabilities, you can call the stream_complete method:

response = llm.stream_complete("Who is Octavia Butler?")
for r in response:
    print(r.delta, end="")

You can also stream chat responses:

messages = [
    ChatMessage(
        role="system",
        content="Pretend you are a pirate with a colorful personality.",
    ),
    ChatMessage(role="user", content="What is your name?"),
]
resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")
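In both streaming loops, each chunk carries only the newly generated text in its delta attribute, so printing the deltas in order reconstructs the full response. A self-contained sketch of that accumulation pattern, with stand-in chunks in place of a live server (Chunk is a hypothetical stand-in for the streamed response objects):

```python
from dataclasses import dataclass

# Hypothetical stand-in for a streamed response chunk; the real objects
# come from llm.stream_complete / llm.stream_chat.
@dataclass
class Chunk:
    delta: str

stream = [Chunk("Octavia"), Chunk(" Butler"), Chunk(" was a novelist.")]

# Print each delta as it "arrives" and accumulate the full text.
full_text = ""
for chunk in stream:
    print(chunk.delta, end="")
    full_text += chunk.delta
```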

LLM Implementation Example

https://docs.llamaindex.ai/en/stable/examples/llm/llamafile/
