dora-llama-cpp-python
Project description
dora-llama-cpp-python
A Dora node that provides access to LLaMA models using llama-cpp-python for efficient CPU/GPU inference.
Features
- GPU acceleration support (CUDA on Linux, Metal on macOS)
- Easy integration with speech-to-text and text-to-speech pipelines
- Configurable system prompts and activation words
- Lightweight CPU inference with GGUF models
- Thread-level CPU optimization
- Adjustable context window size
Getting started
Installation
uv venv -p 3.11 --seed
uv pip install -e .
Usage
The node can be configured in your dataflow YAML file:
# Using a HuggingFace model
- id: dora-llama-cpp-python
build: pip install -e path/to/dora-llama-cpp-python
path: dora-llama-cpp-python
inputs:
text: source_node/text # Input text to generate response for
outputs:
- text # Generated response text
env:
MODEL_NAME_OR_PATH: "TheBloke/Llama-2-7B-Chat-GGUF"
MODEL_FILE_PATTERN: "*Q4_K_M.gguf"
SYSTEM_PROMPT: "You're a very succinct AI assistant with short answers."
ACTIVATION_WORDS: "what how who where you"
MAX_TOKENS: "512"
N_GPU_LAYERS: "35" # Enable GPU acceleration
N_THREADS: "4" # CPU threads
CONTEXT_SIZE: "4096" # Maximum context window
Configuration Options
MODEL_NAME_OR_PATH: Path to local model file or HuggingFace repo id (default: "TheBloke/Llama-2-7B-Chat-GGUF")MODEL_FILE_PATTERN: Pattern to match model file when downloading from HF (default: "*Q4_K_M.gguf")SYSTEM_PROMPT: Customize the AI assistant's personality/behaviorACTIVATION_WORDS: Space-separated list of words that trigger model responseMAX_TOKENS: Maximum number of tokens to generate (default: 512)N_GPU_LAYERS: Number of layers to offload to GPU (default: 0, set to 35 for GPU acceleration)N_THREADS: Number of CPU threads to use (default: 4)CONTEXT_SIZE: Maximum context window size (default: 4096)
Example: Speech Assistant Pipeline
This example shows how to create a conversational AI pipeline that:
- Captures audio from microphone
- Converts speech to text
- Generates AI responses
- Converts responses back to speech
nodes:
- id: dora-microphone
build: pip install dora-microphone
path: dora-microphone
inputs:
tick: dora/timer/millis/2000
outputs:
- audio
- id: dora-vad
build: pip install dora-vad
path: dora-vad
inputs:
audio: dora-microphone/audio
outputs:
- audio
- timestamp_start
- id: dora-whisper
build: pip install dora-distil-whisper
path: dora-distil-whisper
inputs:
input: dora-vad/audio
outputs:
- text
- id: dora-llama-cpp-python
build: pip install -e .
path: dora-llama-cpp-python
inputs:
text: dora-whisper/text
outputs:
- text
env:
MODEL_NAME: "TheBloke/Llama-2-7B-Chat-GGUF"
MODEL_FILE_PATTERN: "*Q4_K_M.gguf"
SYSTEM_PROMPT: "You're a helpful assistant."
ACTIVATION_WORDS: "hey help what how"
MAX_TOKENS: "512"
N_GPU_LAYERS: "35"
N_THREADS: "4"
CONTEXT_SIZE: "4096"
- id: dora-tts
build: pip install dora-kokoro-tts
path: dora-kokoro-tts
inputs:
text: dora-llama-cpp-python/text
outputs:
- audio
Running the Example
dora build example.yml
dora run example.yml
Contribution Guide
- Format with ruff:
uv pip install ruff
uv run ruff check . --fix
- Lint with ruff:
uv run ruff check .
- Test with pytest
uv pip install pytest
uv run pytest . # Test
License
dora-llama-cpp-python is released under the MIT License
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file dora_llama_cpp_python-1.0.1.tar.gz.
File metadata
- Download URL: dora_llama_cpp_python-1.0.1.tar.gz
- Upload date:
- Size: 4.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f876b2efeb8ea108a18e6fae600adc81083014eb8bcb98e562221ae2c31236c9
|
|
| MD5 |
73894d55130aa27ebf3d3c2f42f86a01
|
|
| BLAKE2b-256 |
1de80adf73ce0e31a20f41aa701af4ff39b18c78ad50f20a23797c5d713a3263
|
File details
Details for the file dora_llama_cpp_python-1.0.1-py3-none-any.whl.
File metadata
- Download URL: dora_llama_cpp_python-1.0.1-py3-none-any.whl
- Upload date:
- Size: 5.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
406cf0f68bd07df08e96020b6676dc75c3dc9b57b77e7f47a58957ab20c609f6
|
|
| MD5 |
b18d7acf8fa929a0f787764036927474
|
|
| BLAKE2b-256 |
93eeeb276ef7458e7a68a7870de274d86ed908824d5a036fe6f61b3791989edb
|