
Project description

ramalama-stack


An external provider for Llama Stack allowing for the use of RamaLama for inference.

Installing

You can install ramalama-stack from PyPI via `pip install ramalama-stack`

This will also install Llama Stack and RamaLama if they are not already installed.

Usage

[!WARNING] The following workaround is currently needed to run this provider - see https://github.com/containers/ramalama-stack/issues/53 for more details

```shell
curl --create-dirs --output ~/.llama/providers.d/remote/inference/ramalama.yaml \
  https://raw.githubusercontent.com/containers/ramalama-stack/refs/tags/v0.2.0/src/ramalama_stack/providers.d/remote/inference/ramalama.yaml
curl --create-dirs --output ~/.llama/distributions/ramalama/ramalama-run.yaml \
  https://raw.githubusercontent.com/containers/ramalama-stack/refs/tags/v0.2.0/src/ramalama_stack/ramalama-run.yaml
```
  1. You will need a RamaLama server running - see the RamaLama project docs for more information.

  2. Set your `INFERENCE_MODEL` environment variable to the name of the model you have running via RamaLama.

  3. Run the RamaLama external provider via `llama stack run ~/.llama/distributions/ramalama/ramalama-run.yaml`
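Taken together, a minimal session might look like the sketch below. The model name `llama3.2:3b` is an assumption for illustration; use whichever model your RamaLama server is actually serving.

```shell
# Step 2: name the model the provider should use (model name is illustrative).
export INFERENCE_MODEL=llama3.2:3b

# Step 1, in a separate terminal: serve that model with RamaLama.
#   ramalama serve "$INFERENCE_MODEL"
# Step 3, back in this terminal: start Llama Stack with the RamaLama provider.
#   llama stack run ~/.llama/distributions/ramalama/ramalama-run.yaml
echo "INFERENCE_MODEL=$INFERENCE_MODEL"
```

Steps 1 and 3 each run a long-lived server process, which is why they belong in separate terminals (or behind a process manager).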

[!NOTE] You can also run the RamaLama external provider inside a container via Podman:

```shell
podman run \
  --net=host \
  --env RAMALAMA_URL=http://0.0.0.0:8080 \
  --env INFERENCE_MODEL=$INFERENCE_MODEL \
  quay.io/ramalama/llama-stack
```

This will start a Llama Stack server, which uses port 8321 by default. You can verify it works by configuring the Llama Stack Client to run against this server and sending a test request.

  • If your client is running on the same machine as the server, you can run `llama-stack-client configure --endpoint http://0.0.0.0:8321 --api-key none`
  • If your client is running on a different machine, you can run `llama-stack-client configure --endpoint http://<hostname>:8321 --api-key none`
  • The client should give you a message similar to `Done! You can now use the Llama Stack Client CLI with endpoint <endpoint>`
  • You can then test the server by running `llama-stack-client inference chat-completion --message "tell me a joke"`, which should return something like
```
ChatCompletionResponse(
    completion_message=CompletionMessage(
        content='A man walked into a library and asked the librarian, "Do you have any books on Pavlov\'s dogs
and Schrödinger\'s cat?" The librarian replied, "It rings a bell, but I\'m not sure if it\'s here or not."',
        role='assistant',
        stop_reason='end_of_turn',
        tool_calls=[]
    ),
    logprobs=None,
    metrics=[
        Metric(metric='prompt_tokens', value=14.0, unit=None),
        Metric(metric='completion_tokens', value=63.0, unit=None),
        Metric(metric='total_tokens', value=77.0, unit=None)
    ]
)
```
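Independently of the client CLI, a quick reachability probe can be sketched with curl. This assumes the default port and that the server exposes Llama Stack's `/v1/health` route; it prints a status either way rather than failing.

```shell
# Probe the Llama Stack server started above.
base="http://0.0.0.0:8321"
if curl -sf "$base/v1/health" >/dev/null 2>&1; then
  echo "server is up at $base"
else
  echo "server not reachable at $base"
fi
```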

Llama Stack User Interface

Llama Stack includes an experimental user interface; check it out here.

To deploy the UI, run this:

```shell
podman run -d --rm --network=container:ramalama --name=streamlit quay.io/redhat-et/streamlit_client:0.1.0
```

[!NOTE] If running on macOS (not Linux), `--network=host` doesn't work. You'll need to publish the additional ports 8321:8321 and 8501:8501 with the ramalama serve command, then run the UI container with `--network=container:ramalama`.

If running on Linux, use `--network=host` or `-p 8501:8501` instead. The streamlit container will be able to reach the ramalama endpoint either way.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ramalama_stack-0.2.1.tar.gz (192.1 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ramalama_stack-0.2.1-py3-none-any.whl (17.3 kB)

Uploaded Python 3

File details

Details for the file ramalama_stack-0.2.1.tar.gz.

File metadata

  • Download URL: ramalama_stack-0.2.1.tar.gz
  • Upload date:
  • Size: 192.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ramalama_stack-0.2.1.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 87a6851be8b4885f8164c6e76f9f026e09ae005d729a898d10f39249ef8fe3fc |
| MD5 | c2aae4ed384382d923bbeccac3d0bb0c |
| BLAKE2b-256 | dab415c888bc205771521101b79d8ff21966958fa62109ec57abb5fe2cc03c54 |

See more details on using hashes here.
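One way to use these digests: check a downloaded sdist against the published SHA256 before installing it. The sketch below assumes the file sits in the current directory; the file name and digest are those of the 0.2.1 sdist above.

```shell
# Verify the downloaded sdist against the SHA256 digest published on PyPI.
expected="87a6851be8b4885f8164c6e76f9f026e09ae005d729a898d10f39249ef8fe3fc"
file="ramalama_stack-0.2.1.tar.gz"
if [ -f "$file" ]; then
  actual="$(sha256sum "$file" | cut -d' ' -f1)"
  if [ "$actual" = "$expected" ]; then echo "hash ok"; else echo "hash MISMATCH"; fi
else
  echo "download $file first, e.g. with: pip download --no-deps ramalama-stack==0.2.1"
fi
```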

Provenance

The following attestation bundles were made for ramalama_stack-0.2.1.tar.gz:

Publisher: pypi.yml on containers/ramalama-stack

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ramalama_stack-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: ramalama_stack-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 17.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ramalama_stack-0.2.1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 50f6955b5ecae45da525c29d9ebb35e71092cadc12a04d3526f1212cbf852981 |
| MD5 | 9f0aae1d76a546a59b5da19b4d893a8a |
| BLAKE2b-256 | ecff0a8556135aaaf7e282d667f42c339ec862da2fd35faf9fbd6d7416d99ddc |

See more details on using hashes here.

Provenance

The following attestation bundles were made for ramalama_stack-0.2.1-py3-none-any.whl:

Publisher: pypi.yml on containers/ramalama-stack

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
