dora-llama-cpp-python

Project description

dora-llama-cpp-python

A Dora node that provides access to LLaMA models using llama-cpp-python for efficient CPU/GPU inference.

Features

GPU acceleration support (CUDA on Linux, Metal on macOS)
Easy integration with speech-to-text and text-to-speech pipelines
Configurable system prompts and activation words
Lightweight CPU inference with GGUF models
Thread-level CPU optimization
Adjustable context window size

Getting started

Installation

uv venv -p 3.11 --seed
uv pip install -e .

Usage

The node can be configured in your dataflow YAML file:

# Using a HuggingFace model
- id: dora-llama-cpp-python
  build: pip install -e path/to/dora-llama-cpp-python
  path: dora-llama-cpp-python
  inputs:
    text: source_node/text  # Input text to generate response for
  outputs:
    - text  # Generated response text
  env:
    MODEL_NAME_OR_PATH: "TheBloke/Llama-2-7B-Chat-GGUF"
    MODEL_FILE_PATTERN: "*Q4_K_M.gguf"
    SYSTEM_PROMPT: "You're a very succinct AI assistant with short answers."
    ACTIVATION_WORDS: "what how who where you"
    MAX_TOKENS: "512"
    N_GPU_LAYERS: "35"     # Enable GPU acceleration
    N_THREADS: "4"         # CPU threads
    CONTEXT_SIZE: "4096"   # Maximum context window

Configuration Options

MODEL_NAME_OR_PATH: Path to local model file or HuggingFace repo id (default: "TheBloke/Llama-2-7B-Chat-GGUF")
MODEL_FILE_PATTERN: Pattern to match model file when downloading from HF (default: "*Q4_K_M.gguf")
SYSTEM_PROMPT: Customize the AI assistant's personality/behavior
ACTIVATION_WORDS: Space-separated list of words that trigger model response
MAX_TOKENS: Maximum number of tokens to generate (default: 512)
N_GPU_LAYERS: Number of layers to offload to GPU (default: 0, set to 35 for GPU acceleration)
N_THREADS: Number of CPU threads to use (default: 4)
CONTEXT_SIZE: Maximum context window size (default: 4096)

Example: Speech Assistant Pipeline

This example shows how to create a conversational AI pipeline that:

Captures audio from microphone
Converts speech to text
Generates AI responses
Converts responses back to speech

nodes:
  - id: dora-microphone
    build: pip install dora-microphone
    path: dora-microphone
    inputs:
      tick: dora/timer/millis/2000
    outputs:
      - audio

  - id: dora-vad
    build: pip install dora-vad
    path: dora-vad
    inputs:
      audio: dora-microphone/audio
    outputs:
      - audio
      - timestamp_start

  - id: dora-whisper
    build: pip install dora-distil-whisper
    path: dora-distil-whisper
    inputs:
      input: dora-vad/audio
    outputs:
      - text

  - id: dora-llama-cpp-python
    build: pip install -e .
    path: dora-llama-cpp-python
    inputs:
      text: dora-whisper/text
    outputs:
      - text
    env:
      MODEL_NAME: "TheBloke/Llama-2-7B-Chat-GGUF"
      MODEL_FILE_PATTERN: "*Q4_K_M.gguf"
      SYSTEM_PROMPT: "You're a helpful assistant."
      ACTIVATION_WORDS: "hey help what how"
      MAX_TOKENS: "512"
      N_GPU_LAYERS: "35"
      N_THREADS: "4"
      CONTEXT_SIZE: "4096"

  - id: dora-tts
    build: pip install dora-kokoro-tts
    path: dora-kokoro-tts
    inputs:
      text: dora-llama-cpp-python/text
    outputs:
      - audio

Running the Example

dora build example.yml
dora run example.yml

Contribution Guide

Format with ruff:

uv pip install ruff
uv run ruff check . --fix

Lint with ruff:

uv run ruff check .

Test with pytest

uv pip install pytest
uv run pytest . # Test

License

dora-llama-cpp-python is released under the MIT License

Project details

Release history Release notifications | RSS feed

This version

1.0.1

Apr 6, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dora_llama_cpp_python-1.0.1.tar.gz (4.6 kB view details)

Uploaded Apr 6, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

dora_llama_cpp_python-1.0.1-py3-none-any.whl (5.1 kB view details)

Uploaded Apr 6, 2025 Python 3

File details

Details for the file dora_llama_cpp_python-1.0.1.tar.gz.

File metadata

Download URL: dora_llama_cpp_python-1.0.1.tar.gz
Upload date: Apr 6, 2025
Size: 4.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.12

File hashes

Hashes for dora_llama_cpp_python-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`f876b2efeb8ea108a18e6fae600adc81083014eb8bcb98e562221ae2c31236c9`
MD5	`73894d55130aa27ebf3d3c2f42f86a01`
BLAKE2b-256	`1de80adf73ce0e31a20f41aa701af4ff39b18c78ad50f20a23797c5d713a3263`

See more details on using hashes here.

File details

Details for the file dora_llama_cpp_python-1.0.1-py3-none-any.whl.

File metadata

Download URL: dora_llama_cpp_python-1.0.1-py3-none-any.whl
Upload date: Apr 6, 2025
Size: 5.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.12

File hashes

Hashes for dora_llama_cpp_python-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`406cf0f68bd07df08e96020b6676dc75c3dc9b57b77e7f47a58957ab20c609f6`
MD5	`b18d7acf8fa929a0f787764036927474`
BLAKE2b-256	`93eeeb276ef7458e7a68a7870de274d86ed908824d5a036fe6f61b3791989edb`

See more details on using hashes here.

dora-llama-cpp-python 1.0.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

dora-llama-cpp-python

Features

Getting started

Installation

Usage

Configuration Options

Example: Speech Assistant Pipeline

Running the Example

Contribution Guide

License

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes