

llama-stack-provider-lmeval

Llama Stack Remote Eval Provider for TrustyAI LM-Eval

About

This repository implements TrustyAI's LM-Eval as an out-of-tree Llama Stack remote provider.

It also includes end-to-end instructions demonstrating how to use LM-Eval on Llama Stack to run DK-Bench benchmark evaluations against a Phi-3-mini-4k-instruct model deployed on OpenShift.

Use

Prerequisites

  • Admin access to an OpenShift cluster with RHOAI installed
  • Installation of uv
  • Installation of oc cli tool
  • Installation of llama stack cli tool
  1. Clone this repository

    git clone https://github.com/trustyai-explainability/llama-stack-provider-lmeval.git
    
  2. Set llama-stack-provider-lmeval/demo as your working directory.

    cd llama-stack-provider-lmeval/demo
    
  3. Deploy microsoft/Phi-3-mini-4k-instruct on vLLM Serving Runtime

    a. Create a namespace with a name of your choice

    TEST_NS=<NAMESPACE>
    oc create ns $TEST_NS
    oc get ns $TEST_NS
    

    b. Deploy the model via vLLM

    oc apply -k resources
    
  4. Before continuing, perform a sanity check to make sure the model was successfully deployed

    oc get pods | grep "predictor"
    

    Expected output:

    phi-3-predictor-00002-deployment-794fb6b4b-clhj7   3/3     Running   0          5h55m
    
  5. Retrieve the model route

    VLLM_URL=$(oc get $(oc get ksvc -o name | grep predictor) --template={{.status.url}})
    
  6. Create and activate a virtual environment

    uv venv .llamastack-venv
    
    source .llamastack-venv/bin/activate
    
  7. Install the required libraries

    uv pip install -e .
    
  8. In run.yaml, make the following changes:

    a. Replace the remote::vllm url

    providers:
      inference:
      - provider_id: vllm-0
        provider_type: remote::vllm
        config:
          url: ${env.VLLM_URL:https://phi-3-predictor-llama-test.apps.rosa.p2i7w2k6p6w7t7e.3emk.p3.openshiftapps.com/v1/completions}
    

    b. Replace the remote::lmeval base_url and namespace

    - provider_id: lmeval-1
      provider_type: remote::lmeval
      config:
        use_k8s: True
        base_url: https://vllm-test.apps.rosa.p2i7w2k6p6w7t7e.3emk.p3.openshiftapps.com/v1/completions
        namespace: "llama-test"
    
  9. Start the Llama Stack server in the virtual environment

    llama stack run run.yaml --image-type venv
    

    Expected output:

    INFO:     Application startup complete.
    INFO:     Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
    
  10. Open demo.ipynb to run the evaluation
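As an optional check after step 5, the route stored in VLLM_URL should accept an OpenAI-style completions request. A minimal sketch of building such a request (the model name "phi-3" is an assumption matching the predictor name above; adjust it to your deployment):

```python
import json
import urllib.request


def completion_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style /v1/completions request for the deployed model.

    The model name "phi-3" is an assumption based on the predictor name above;
    change it if your InferenceService uses a different one.
    """
    body = json.dumps({"model": "phi-3", "prompt": prompt, "max_tokens": 16}).encode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )


# Example (requires network access to the cluster):
#   import os
#   req = completion_request(os.environ["VLLM_URL"], "Hello")
#   print(json.load(urllib.request.urlopen(req))["choices"][0]["text"])
```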
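In step 8, the `${env.VLLM_URL:...}` value uses Llama Stack's environment-variable substitution: if VLLM_URL is set (as in step 5), its value is used; otherwise the default after the colon applies. An illustrative sketch of that resolution rule (not the actual Llama Stack implementation):

```python
import os
import re

# Matches ${env.VAR} or ${env.VAR:default} placeholders as used in run.yaml.
_PLACEHOLDER = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def resolve_env(value: str, environ=os.environ) -> str:
    """Replace each placeholder with the env var if set, else its default."""
    def sub(m: re.Match) -> str:
        name, default = m.group(1), m.group(2)
        return environ.get(name, default if default is not None else "")
    return _PLACEHOLDER.sub(sub, value)
```

With this rule, exporting VLLM_URL before step 9 makes the server target the freshly retrieved route instead of the hard-coded default.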
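The notebook drives the evaluation through the Llama Stack client. A hedged sketch of the flow, assuming `llama-stack-client` is installed and the server from step 9 is running; the benchmark id, model name, and config fields below are placeholders (see demo.ipynb for the exact values), and call signatures may differ across client versions:

```python
def run_dk_bench(base_url: str = "http://localhost:8321"):
    """Register a benchmark with the remote::lmeval provider and start an eval job.

    All ids below are placeholders; see demo.ipynb for the exact values.
    """
    from llama_stack_client import LlamaStackClient  # needs the running server

    client = LlamaStackClient(base_url=base_url)
    client.benchmarks.register(
        benchmark_id="trustyai_lmeval::dk-bench",  # placeholder id
        dataset_id="trustyai_lmeval::dk-bench",    # placeholder id
        scoring_functions=["string"],
        provider_id="lmeval-1",                    # matches run.yaml above
    )
    # Kick off the evaluation job; its status is polled from the notebook.
    return client.eval.run_eval(
        benchmark_id="trustyai_lmeval::dk-bench",
        benchmark_config={
            "eval_candidate": {
                "type": "model",
                "model": "phi-3",                  # the model deployed in step 3
                "sampling_params": {"max_tokens": 256},
            },
        },
    )
```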
