llama-index llms optimum intel integration

Project description

LlamaIndex Llms Integration: Optimum Intel IPEX backend

Installation

To install the required packages, run the following in a notebook cell (in a terminal, drop the leading % or !):

%pip install llama-index-llms-optimum-intel
!pip install llama-index
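
After installing, it can be useful to confirm that the integration is visible to the interpreter before loading a model. The snippet below is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and not part of LlamaIndex.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if module_name can be located by the import system."""
    try:
        # find_spec imports parent packages but not the target module itself.
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (for example llama_index) is missing.
        return False


# Module path taken from the import statement used in this README.
print(is_installed("llama_index.llms.optimum_intel"))
```

If this prints False, the install step most likely ran against a different Python environment than the one executing the notebook.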

Setup

Define Functions for Prompt Handling

You will need two functions: one that converts a list of chat messages into a single prompt string, and one that wraps a plain completion string in the same template:

from llama_index.llms.optimum_intel import OptimumIntelLLM


def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        if message.role == "system":
            prompt += f"<|system|>\n{message.content}</s>\n"
        elif message.role == "user":
            prompt += f"<|user|>\n{message.content}</s>\n"
        elif message.role == "assistant":
            prompt += f"<|assistant|>\n{message.content}</s>\n"

    # Ensure we start with a system prompt, insert blank if needed
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt

    # Add final assistant prompt
    prompt = prompt + "<|assistant|>\n"

    return prompt


def completion_to_prompt(completion):
    return f"<|system|>\n</s>\n<|user|>\n{completion}</s>\n<|assistant|>\n"
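
To make the template concrete, the sketch below runs both helpers on a one-message conversation. Msg is a hypothetical stand-in for ChatMessage so the snippet runs without LlamaIndex installed; the two function bodies repeat the logic defined above.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    """Hypothetical stand-in for ChatMessage; only role and content are used."""

    role: str
    content: str


def messages_to_prompt(messages):
    # Same logic as the helper defined above.
    prompt = ""
    for message in messages:
        if message.role == "system":
            prompt += f"<|system|>\n{message.content}</s>\n"
        elif message.role == "user":
            prompt += f"<|user|>\n{message.content}</s>\n"
        elif message.role == "assistant":
            prompt += f"<|assistant|>\n{message.content}</s>\n"
    # Insert an empty system turn when the conversation has none.
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt
    # End with an open assistant turn for the model to complete.
    return prompt + "<|assistant|>\n"


def completion_to_prompt(completion):
    return f"<|system|>\n</s>\n<|user|>\n{completion}</s>\n<|assistant|>\n"


demo_prompt = messages_to_prompt([Msg("user", "Hello")])
print(repr(demo_prompt))
# A lone user message yields the same prompt as the completion helper.
print(demo_prompt == completion_to_prompt("Hello"))
```

A conversation with no system message therefore gets an empty system turn, which keeps chat and completion prompts in the same shape.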

Model Loading

Models are loaded by passing parameters to the OptimumIntelLLM class constructor:

oi_llm = OptimumIntelLLM(
    model_name="Intel/neural-chat-7b-v3-3",
    tokenizer_name="Intel/neural-chat-7b-v3-3",
    context_window=3900,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.7, "top_k": 50, "top_p": 0.95},
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    device_map="cpu",
)

response = oi_llm.complete("What is the meaning of life?")
print(str(response))

Streaming Responses

To receive output incrementally as it is generated, use the stream_complete and stream_chat methods:

Using stream_complete

response = oi_llm.stream_complete("Who is Mother Teresa?")
for r in response:
    print(r.delta, end="")
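
If the full text is also needed after streaming, the deltas can be collected while they are printed. The sketch below assumes only that each streamed item exposes a delta attribute, as in the loop above; fake_stream and collect_stream are hypothetical names, and fake_stream stands in for oi_llm.stream_complete(...) so the snippet runs without a model.

```python
from types import SimpleNamespace


def fake_stream():
    """Hypothetical stand-in for oi_llm.stream_complete(...)."""
    for piece in ["Mother ", "Teresa ", "was a nun."]:
        yield SimpleNamespace(delta=piece)


def collect_stream(stream):
    """Print each delta as it arrives and return the concatenated text."""
    parts = []
    for chunk in stream:
        # Treat a missing delta as an empty string (a defensive assumption).
        piece = chunk.delta or ""
        print(piece, end="", flush=True)
        parts.append(piece)
    print()  # Finish the streamed line.
    return "".join(parts)


full_text = collect_stream(fake_stream())
```

With a loaded model, the same helper would be called as collect_stream(oi_llm.stream_complete("Who is Mother Teresa?")).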

Using stream_chat

from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(
        role="system",
        content="You are an American chef in a small restaurant in New Orleans",
    ),
    ChatMessage(role="user", content="What is your dish of the day?"),
]

resp = oi_llm.stream_chat(messages)

for r in resp:
    print(r.delta, end="")

LLM Implementation example

https://docs.llamaindex.ai/en/stable/examples/llm/optimum_intel/
