Liqui-Speak 🎤
One-command setup for real-time audio transcription using LFM2.5-Audio-1.5B
Liqui-Speak automates the entire setup process for audio transcription: it handles system dependencies, model downloads, and format conversions for you.
🚀 Quick Start
# Install the package
uv tool install liqui-speak
# Run one-time setup (installs everything)
liqui-speak config
# Transcribe any audio file
liqui-speak audio.m4a
✨ Features
- 🔄 Auto-setup: Single command installs all dependencies
- 📁 Format support: M4A, AAC, WAV, MP3, FLAC, and more
- ⚡ Fast conversion: PyDub-based in-memory processing
- 🎯 Cross-platform: macOS, Linux, Windows support
- 📦 Complete automation: Downloads models, binaries, libraries
- 🔧 Zero configuration: Works out of the box
- 📱 macOS Shortcut: Voice-to-clipboard with one keystroke
📋 Installation
Prerequisites
- Python >= 3.12
- libmagic (for audio format detection)
- Package manager: Homebrew (macOS/Linux), apt/yum/pacman (Linux), or Chocolatey (Windows)
Installing libmagic:
# macOS
brew install libmagic
# Ubuntu/Debian
sudo apt-get install libmagic1
# Fedora/RHEL/CentOS
sudo dnf install file-libs
# Arch Linux
sudo pacman -S file
# Windows
pip install python-magic-bin
Install Package
uv tool install liqui-speak
First-time Setup
liqui-speak config
This will:
- Install PortAudio and FFmpeg system dependencies
- Download LFM2.5-Audio-1.5B model files
- Download platform-specific llama.cpp binary
- Install macOS Shortcut for voice transcription (macOS only)
- Verify installation
Quantization Options
Choose model size vs. quality trade-off:
# F16 - Full precision (default, ~3.4GB, best quality)
liqui-speak config
# Q8_0 - 8-bit quantization (~1.8GB)
liqui-speak config --quant Q8_0
# Q4_0 - 4-bit quantization (~1GB, smallest)
liqui-speak config --quant Q4_0
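If disk space is the deciding factor, the choice above can be scripted. The following is a minimal sketch, not part of liqui-speak: the sizes are the approximate figures listed above, and `pick_quant`, `free_disk_gb`, and the `headroom_gb` margin are hypothetical names introduced for illustration.

```python
import shutil
from pathlib import Path
from typing import Optional

# Approximate download sizes in GB, taken from the options listed above.
# Ordered from best quality (largest) to smallest.
QUANT_SIZES_GB = {"F16": 3.4, "Q8_0": 1.8, "Q4_0": 1.0}


def pick_quant(free_gb: float, headroom_gb: float = 0.5) -> Optional[str]:
    """Return the highest-quality quantization that fits, or None if none fit.

    headroom_gb is a hypothetical safety margin, not a liqui-speak setting.
    """
    for name, size_gb in QUANT_SIZES_GB.items():
        if size_gb + headroom_gb <= free_gb:
            return name
    return None


def free_disk_gb(path: Path = Path.home()) -> float:
    """Free space on the volume holding `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


# Print the config command that matches the available space.
quant = pick_quant(free_disk_gb())
if quant is None:
    print("Not enough free disk space for any quantization")
elif quant == "F16":
    print("liqui-speak config")  # F16 is the default, so no flag is needed
else:
    print(f"liqui-speak config --quant {quant}")
```

The script only prints a suggested command; running it is left to you.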
🎤 Usage
Basic Transcription
# Transcribe any audio file (both forms work)
liqui-speak audio.m4a # Short form
liqui-speak transcribe audio.m4a # Explicit subcommand
# Or with different file types
liqui-speak recording.wav
liqui-speak podcast.mp3
Advanced Options
# Play audio during transcription
liqui-speak audio.m4a --play-audio
# Verbose output
liqui-speak audio.mp3 --verbose
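The same flags can be passed when calling the CLI from a script. This sketch uses only the flags shown above; `build_command` and `run_transcription` are hypothetical helpers, and it assumes the transcript is written to stdout, which you should verify for your installed version.

```python
import subprocess


def build_command(audio_path: str, play_audio: bool = False, verbose: bool = False) -> list[str]:
    """Build the liqui-speak argument list for one audio file."""
    cmd = ["liqui-speak", audio_path]
    if play_audio:
        cmd.append("--play-audio")
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_transcription(audio_path: str, play_audio: bool = False, verbose: bool = False) -> str:
    """Run the CLI and return what it printed to stdout.

    Assumption: the transcript goes to stdout. check=True raises
    subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(
        build_command(audio_path, play_audio, verbose),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.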
Python API
from liqui_speak import transcribe
# Transcribe audio file
text = transcribe("audio.m4a")
print(text)
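For more than one file, a small loop around `transcribe` may be enough. The sketch below is an illustration only: `transcribe_many` is a hypothetical helper, and it takes the transcription function as an argument so it makes no assumption about how `transcribe` reports errors beyond raising an exception.

```python
from typing import Callable, Iterable


def transcribe_many(
    paths: Iterable[str],
    transcribe_fn: Callable[[str], str],
) -> tuple[dict[str, str], dict[str, str]]:
    """Transcribe each path, collecting results and failures separately."""
    transcripts: dict[str, str] = {}
    failures: dict[str, str] = {}
    for path in paths:
        try:
            transcripts[path] = transcribe_fn(path)
        except Exception as exc:
            # Record the error and continue with the remaining files.
            failures[path] = str(exc)
    return transcripts, failures


# Usage with the real API (requires liqui-speak to be installed and configured):
# from liqui_speak import transcribe
# transcripts, failures = transcribe_many(["audio.m4a", "podcast.mp3"], transcribe)
```

One failing file does not stop the batch; inspect `failures` afterwards to see what went wrong.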
📱 macOS Shortcut
During liqui-speak config, a macOS Shortcut is automatically installed that:
- Records audio - Start speaking immediately
- Transcribes - Runs liqui-speak on the recording
- Copies to clipboard - Ready to paste anywhere
First run permissions: macOS will ask for microphone access, file access, and shell script execution permissions.
🔧 Configuration
Environment Variables
export LIQUI_SPEAK_MODEL_DIR="/custom/path"
export LIQUI_SPEAK_SAMPLE_RATE="44100"
Setup Directory
Configuration and models are stored in ~/.liqui_speak/
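A script that wants to honor the same settings could read them as follows. This is a sketch under stated assumptions: falling back to `~/.liqui_speak` when `LIQUI_SPEAK_MODEL_DIR` is unset is a guess based on the setup directory above, the sample rate is left as `None` when unset because the built-in default is not documented here, and `resolve_settings` is a hypothetical helper.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the two documented environment variables.

    Assumptions: the model directory falls back to ~/.liqui_speak, and the
    sample rate is None when the variable is unset or empty.
    """
    model_dir = Path(env.get("LIQUI_SPEAK_MODEL_DIR", "~/.liqui_speak")).expanduser()
    raw_rate = env.get("LIQUI_SPEAK_SAMPLE_RATE")
    sample_rate = int(raw_rate) if raw_rate else None
    return {"model_dir": model_dir, "sample_rate": sample_rate}


print(resolve_settings())
```

Environment variables are always strings, so the sample rate is converted with `int()`; a non-numeric value raises `ValueError`.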
📊 Supported Formats
✅ Direct support: WAV (no conversion needed)
✅ Auto-converted: M4A, AAC, MP3, FLAC, OGG, WMA, ALAC
❌ Not supported: DRM-protected files
Auto-converted formats are converted to WAV internally for optimal transcription performance.
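The format list above can be expressed as a small lookup. This sketch classifies by file extension only, which is a simplification: liqui-speak itself uses libmagic for format detection. `conversion_plan` is a hypothetical helper, and ALAC is omitted from the table because it usually ships inside `.m4a` files.

```python
from pathlib import Path

# Extension sets derived from the list above.
DIRECT = {".wav"}
CONVERTED = {".m4a", ".aac", ".mp3", ".flac", ".ogg", ".wma"}


def conversion_plan(filename: str) -> str:
    """Return 'direct', 'convert', or 'unknown' based on the file extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in DIRECT:
        return "direct"
    if suffix in CONVERTED:
        return "convert"
    return "unknown"


for name in ["recording.wav", "podcast.MP3", "notes.txt"]:
    print(name, "->", conversion_plan(name))
```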
🏗️ Development
Setup Development Environment
# Clone repository
git clone https://github.com/abhishekbhakat/liqui-speak.git
cd liqui-speak
# Install with dev dependencies
make install-dev
# Run quality checks
make lint
make type-check
make test
🧪 Tests
"Tests? Where we're going, we don't need tests." — Doc Brown, probably
The code works on my machine. Ship it. 🚀
🔍 Troubleshooting
"Format not recognized" error
Your file may be an M4A with the wrong extension. Run:
liqui-speak config # Will detect and convert automatically
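To check whether a file's extension matches its content, you can look at the first bytes of the file. The sketch below is a hand-rolled substitute for libmagic, not what liqui-speak does internally. It relies on well-known signatures: `RIFF....WAVE` for WAV, `ftyp` at byte offset 4 for MP4-family files such as M4A, `fLaC` for FLAC, `OggS` for Ogg, and `ID3` for MP3 files that carry an ID3v2 tag. `sniff_container` is a hypothetical name.

```python
def sniff_container(path: str) -> str:
    """Guess the container format from the first 12 bytes of the file."""
    with open(path, "rb") as f:
        header = f.read(12)
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[4:8] == b"ftyp":
        return "mp4-family"  # M4A, MP4, and related ISO base media files
    if header[:4] == b"fLaC":
        return "flac"
    if header[:4] == b"OggS":
        return "ogg"
    if header[:3] == b"ID3":
        return "mp3"  # MP3 files without an ID3v2 tag are not detected here
    return "unknown"
```

If `sniff_container("audio.wav")` returns `"mp4-family"`, the file is probably an M4A that was renamed.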
Missing system dependencies
Run setup again:
liqui-speak config --verbose
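Before re-running setup, it can help to see which command-line tools are actually on your PATH. This is a minimal sketch using the standard library's `shutil.which`; `missing_tools` is a hypothetical helper. PortAudio is a library rather than an executable, so this check does not cover it.

```python
import shutil
from typing import Callable, Iterable, Optional


def missing_tools(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


missing = missing_tools(["ffmpeg", "liqui-speak"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("ffmpeg and liqui-speak are both on PATH")
```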
Model download fails
Check your internet connection and available disk space. The model alone takes roughly 1GB (Q4_0), 1.8GB (Q8_0), or 3.4GB (F16, the default).
Permission errors
Make sure you have admin/sudo access for system dependency installation.
🚀 Performance
- Setup time: < 5 minutes (first run)
- Conversion time: < 10% of audio duration
- Memory usage: ~2GB during transcription
- Model size: ~1GB (Q4_0) to ~3.4GB (F16, default), depending on quantization
🔗 Dependencies
Python Packages
- pydub - Audio conversion
- huggingface-hub - Model downloads
- python-magic - Format detection
System Dependencies
- portaudio - Audio I/O library
- ffmpeg - Audio format support
📄 License
MIT License - see LICENSE file for details.
🤝 Contributing
- Fork the repository
- Create feature branch: git checkout -b feature-name
- Make changes and test: make quality
- Commit changes: git commit -am 'Add feature'
- Push to branch: git push origin feature-name
- Submit pull request
📞 Support
- Issues: GitHub Issues
🙏 Acknowledgments
- LFM2.5-Audio-1.5B model: LiquidAI team
- llama.cpp: Georgi Gerganov
- PyDub: James Robert
- Hugging Face: Model hosting platform