dora-transformers
A Dora node that provides access to Hugging Face transformer models for efficient text generation and chat completion.
Features
- Multi-platform device support (CUDA, MPS, CPU)
- Memory-efficient model loading with 8-bit quantization
- Seamless integration with speech-to-text and text-to-speech pipelines
- Configurable system prompts and activation words
- Support for various transformer models
- Conversation history management
- Optimized for both small and large language models
Getting Started
Installation
uv venv -p 3.11 --seed
uv pip install -e .
Usage
Configure the node in your dataflow YAML file:
- id: dora-transformers
  build: pip install -e path/to/dora-transformers
  path: dora-transformers
  inputs:
    text: source_node/text # Input text to generate response for
  outputs:
    - text # Generated response text
  env:
    MODEL_NAME: "Qwen/Qwen2.5-0.5B-Instruct" # Model from Hugging Face
    SYSTEM_PROMPT: "You're a very succinct AI assistant with short answers."
    ACTIVATION_WORDS: "what how who where you"
    MAX_TOKENS: "128" # Reduced for concise responses
    DEVICE: "cuda" # Use "cpu" for CPU, "cuda" for NVIDIA GPU, "mps" for Apple Silicon
    ENABLE_MEMORY_EFFICIENT: "true" # Enable 8-bit quantization and memory optimizations
    TORCH_DTYPE: "float16" # Use half precision for better memory efficiency
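The ACTIVATION_WORDS value above is a space-separated list of trigger words. The sketch below shows one way such a filter could work. It is not the node's actual implementation: the function name `should_respond`, the whole-word and case-insensitive matching, and the "empty list means always respond" rule are all assumptions made for illustration.

```python
def should_respond(text: str, activation_words: str) -> bool:
    """Return True if any activation word occurs as a whole word in text.

    Hypothetical helper: the real node may match differently
    (for example by substring or with case sensitivity).
    """
    triggers = {word.lower() for word in activation_words.split()}
    if not triggers:
        # Assumption: no activation words configured means "always respond".
        return True
    # Strip common punctuation so that "What?" still matches "what".
    words = {word.strip(".,!?;:\"'").lower() for word in text.split()}
    return not triggers.isdisjoint(words)


if __name__ == "__main__":
    activation = "what how who where you"
    print(should_respond("What time is it?", activation))
    print(should_respond("Turn left.", activation))
```

With a filter like this, transcribed speech that contains none of the trigger words would be dropped before it reaches the model, which avoids generating a reply for background chatter.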
Configuration Options
- MODEL_NAME: Hugging Face model identifier (default: "Qwen/Qwen2.5-0.5B-Instruct")
- SYSTEM_PROMPT: Customize the AI assistant's personality/behavior
- ACTIVATION_WORDS: Space-separated list of words that trigger model response
- MAX_TOKENS: Maximum number of tokens to generate (default: 128)
- DEVICE: Computation device to use (default: "cpu")
- ENABLE_MEMORY_EFFICIENT: Enable memory optimizations (default: "true")
- TORCH_DTYPE: Model precision type (default: "auto")
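As an illustration of how these options and their documented defaults fit together, here is a minimal sketch that reads them from environment variables. The `NodeConfig` class and `load_config` function are hypothetical names, not part of the package. The empty defaults for SYSTEM_PROMPT and ACTIVATION_WORDS are assumptions, since the documentation does not state them.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class NodeConfig:
    """Hypothetical container for the documented options."""
    model_name: str
    system_prompt: str
    activation_words: Tuple[str, ...]
    max_tokens: int
    device: str
    enable_memory_efficient: bool
    torch_dtype: str


def load_config(env: Optional[Mapping[str, str]] = None) -> NodeConfig:
    """Build a NodeConfig from env, falling back to the documented defaults."""
    env = os.environ if env is None else env
    memory_flag = env.get("ENABLE_MEMORY_EFFICIENT", "true")
    return NodeConfig(
        model_name=env.get("MODEL_NAME", "Qwen/Qwen2.5-0.5B-Instruct"),
        # Assumption: no documented default, so fall back to an empty prompt.
        system_prompt=env.get("SYSTEM_PROMPT", ""),
        # Assumption: no documented default, so fall back to no trigger words.
        activation_words=tuple(env.get("ACTIVATION_WORDS", "").split()),
        max_tokens=int(env.get("MAX_TOKENS", "128")),
        device=env.get("DEVICE", "cpu"),
        enable_memory_efficient=memory_flag.strip().lower() == "true",
        torch_dtype=env.get("TORCH_DTYPE", "auto"),
    )


if __name__ == "__main__":
    print(load_config({"DEVICE": "cuda", "MAX_TOKENS": "64"}))
```

Note that every value arrives as a string, which is why the YAML examples quote "128" and "true"; numeric and boolean options need explicit conversion.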
Memory Management
The node includes several memory optimization features:
- 8-bit quantization for CUDA devices
- Automatic CUDA cache clearing
- Memory-efficient model loading
- Half-precision support
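The interaction between these features and the DEVICE, ENABLE_MEMORY_EFFICIENT, and TORCH_DTYPE settings can be sketched as a small decision function. This is only an assumption about how the options might combine: `plan_model_loading` and the keys of the returned dictionary are hypothetical, and tying cache clearing to the CUDA device alone is a guess rather than documented behavior.

```python
def plan_model_loading(device: str, memory_efficient: bool, torch_dtype: str) -> dict:
    """Decide which memory optimizations to apply (illustrative only)."""
    on_cuda = device == "cuda"
    return {
        # The feature list names 8-bit quantization for CUDA devices only.
        "quantize_8bit": memory_efficient and on_cuda,
        # "float16" requests half precision; "auto" defers to the model's default.
        "torch_dtype": torch_dtype,
        # Assumption: CUDA cache clearing only applies when running on CUDA.
        "clear_cuda_cache": on_cuda,
    }


if __name__ == "__main__":
    print(plan_model_loading("cuda", True, "float16"))
    print(plan_model_loading("cpu", True, "auto"))
```

In a Hugging Face transformers loader, 8-bit quantization is commonly requested by passing `BitsAndBytesConfig(load_in_8bit=True)` as `quantization_config` to `from_pretrained`, and the CUDA cache is released with `torch.cuda.empty_cache()`. Whether this node uses exactly those calls is not stated here.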
Example: Voice Assistant Pipeline
Create a conversational AI pipeline that:
- Captures audio from microphone
- Converts speech to text
- Generates AI responses using transformers
- Converts responses back to speech
nodes:
  - id: dora-microphone
    build: pip install -e ../../node-hub/dora-microphone
    path: dora-microphone
    inputs:
      tick: dora/timer/millis/2000
    outputs:
      - audio

  - id: dora-vad
    build: pip install -e ../../node-hub/dora-vad
    path: dora-vad
    inputs:
      audio: dora-microphone/audio
    outputs:
      - audio
      - timestamp_start

  - id: dora-distil-whisper
    build: pip install -e ../../node-hub/dora-distil-whisper
    path: dora-distil-whisper
    inputs:
      input: dora-vad/audio
    outputs:
      - text
    env:
      TARGET_LANGUAGE: english

  - id: dora-transformers
    build: pip install -e ../../node-hub/dora-transformers
    path: dora-transformers
    inputs:
      text: dora-distil-whisper/text
    outputs:
      - text
    env:
      MODEL_NAME: "Qwen/Qwen2.5-0.5B-Instruct"
      SYSTEM_PROMPT: "You are an AI assistant that gives extremely concise responses, never more than one or two sentences. Always be direct and to the point."
      ACTIVATION_WORDS: "what how who where you"
      MAX_TOKENS: "128"
      DEVICE: "cuda"
      ENABLE_MEMORY_EFFICIENT: "true"
      TORCH_DTYPE: "float16"

  - id: dora-kokoro-tts
    build: pip install -e ../../node-hub/dora-kokoro-tts
    path: dora-kokoro-tts
    inputs:
      text: dora-transformers/text
    outputs:
      - audio
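Each input in the dataflow above refers to an upstream output as `node-id/output-name`. A quick way to catch wiring mistakes before running the pipeline is to check that every such reference points at a declared output. The sketch below does this on a hand-written Python copy of the pipeline. `find_unresolved_inputs` and `PIPELINE` are hypothetical names, and the check assumes inputs are plain strings and that sources starting with `dora/` (such as the timer) are built in.

```python
def find_unresolved_inputs(nodes: list) -> list:
    """Return (node id, input name, source) for inputs with no matching output."""
    declared = {
        f"{node['id']}/{output}"
        for node in nodes
        for output in node.get("outputs", [])
    }
    problems = []
    for node in nodes:
        for name, source in node.get("inputs", {}).items():
            if source.startswith("dora/"):
                # Assumption: built-in sources such as dora/timer/millis/2000.
                continue
            if source not in declared:
                problems.append((node["id"], name, source))
    return problems


# Hand-written mirror of the dataflow above (ids, inputs, and outputs only).
PIPELINE = [
    {"id": "dora-microphone",
     "inputs": {"tick": "dora/timer/millis/2000"},
     "outputs": ["audio"]},
    {"id": "dora-vad",
     "inputs": {"audio": "dora-microphone/audio"},
     "outputs": ["audio", "timestamp_start"]},
    {"id": "dora-distil-whisper",
     "inputs": {"input": "dora-vad/audio"},
     "outputs": ["text"]},
    {"id": "dora-transformers",
     "inputs": {"text": "dora-distil-whisper/text"},
     "outputs": ["text"]},
    {"id": "dora-kokoro-tts",
     "inputs": {"text": "dora-transformers/text"},
     "outputs": ["audio"]},
]

if __name__ == "__main__":
    print(find_unresolved_inputs(PIPELINE))
```

An empty result means every input is wired to an output that some node declares; a misspelled node id or output name shows up as a tuple in the returned list.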
Running the Example
dora build test.yml
dora run test.yml
Troubleshooting
If you encounter CUDA out of memory errors:
- Set DEVICE: "cpu" for CPU-only inference
- Enable memory optimizations with ENABLE_MEMORY_EFFICIENT: "true"
- Use a smaller model or reduce MAX_TOKENS
- Set TORCH_DTYPE: "float16" for reduced memory usage
Contribution Guide
Format with ruff:
uv pip install ruff
uv run ruff check . --fix
Lint with ruff:
uv run ruff check .
Test with pytest:
uv pip install pytest
uv run pytest . # Test
License
dora-transformers is released under the MIT License.