Liqui-Speak 🎤

One-command setup for real-time audio transcription using LFM2.5-Audio-1.5B

Liqui-Speak automates the setup for audio transcription: it installs system dependencies, downloads the model, and converts audio formats for you.

🚀 Quick Start

# Install the package
uv tool install liqui-speak

# Run one-time setup (installs everything)
liqui-speak config

# Transcribe any audio file
liqui-speak audio.m4a

✨ Features

  • 🔄 Auto-setup: Single command installs all dependencies
  • 📁 Format support: M4A, AAC, WAV, MP3, FLAC, and more
  • ⚡ Fast conversion: PyDub-based in-memory processing
  • 🎯 Cross-platform: macOS, Linux, Windows support
  • 📦 Complete automation: Downloads models, binaries, libraries
  • 🔧 Zero configuration: Works out of the box
  • 📱 macOS Shortcut: Voice-to-clipboard with one keystroke

📋 Installation

Prerequisites

  • Python >= 3.12
  • libmagic (for audio format detection)
  • Package manager: Homebrew (macOS/Linux), apt/yum/pacman (Linux), or Chocolatey (Windows)

Installing libmagic:

# macOS
brew install libmagic

# Ubuntu/Debian
sudo apt-get install libmagic1

# Fedora/RHEL/CentOS
sudo dnf install file-libs

# Arch Linux
sudo pacman -S file

# Windows
pip install python-magic-bin
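
To confirm that libmagic is visible before running setup, a check along these lines may help. It is a minimal sketch using only the Python standard library; `libmagic_available` is a hypothetical helper, not part of liqui-speak. On Windows, python-magic-bin bundles its own DLL inside the package, so this lookup may report False there even when format detection works.

```python
from ctypes.util import find_library


def libmagic_available() -> bool:
    """Return True if the platform's library lookup can locate libmagic."""
    # find_library returns a library name or path, or None when nothing is found.
    return find_library("magic") is not None
```

Calling `libmagic_available()` from a Python prompt after the install command above should return True on macOS and Linux if the library landed in a standard location.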

Install Package

uv tool install liqui-speak

First-time Setup

liqui-speak config

This will:

  • Install PortAudio and FFmpeg system dependencies
  • Download LFM2.5-Audio-1.5B model files
  • Download platform-specific llama.cpp binary
  • Install macOS Shortcut for voice transcription (macOS only)
  • Verify installation

Quantization Options

Choose model size vs. quality trade-off:

# F16 - Full precision (default, ~3.4GB, best quality)
liqui-speak config

# Q8_0 - 8-bit quantization (~1.8GB)
liqui-speak config --quant Q8_0

# Q4_0 - 4-bit quantization (~1GB, smallest)
liqui-speak config --quant Q4_0

🎤 Usage

Basic Transcription

# Transcribe any audio file (both command forms work)
liqui-speak audio.m4a                    # Short form
liqui-speak transcribe audio.m4a         # Explicit subcommand

# Or with different file types
liqui-speak recording.wav
liqui-speak podcast.mp3

Advanced Options

# Play audio during transcription
liqui-speak audio.m4a --play-audio

# Verbose output
liqui-speak audio.mp3 --verbose
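
To drive the CLI from a script, the commands above can be assembled and run with `subprocess`. This is a sketch: `build_command` and `run_transcription` are hypothetical helpers, only the flags shown above are used, and it assumes the transcript is written to standard output.

```python
import subprocess


def build_command(audio_path: str, play_audio: bool = False, verbose: bool = False) -> list[str]:
    """Assemble a liqui-speak command line using only the flags shown above."""
    cmd = ["liqui-speak", audio_path]
    if play_audio:
        cmd.append("--play-audio")
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_transcription(audio_path: str, **flags: bool) -> str:
    """Run the CLI and return its standard output (assumed to hold the transcript)."""
    result = subprocess.run(
        build_command(audio_path, **flags),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.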

Python API

from liqui_speak import transcribe

# Transcribe audio file
text = transcribe("audio.m4a")
print(text)
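
Building on the call above, a folder of recordings could be processed in one pass. This is a minimal sketch: `transcribe_folder` is a hypothetical helper, the extension list is an assumption drawn from the formats this document names, and the transcription function is passed in so the helper can be tried without the model installed.

```python
from collections.abc import Callable, Iterable
from pathlib import Path

# Assumption: extensions for the formats this document lists as supported.
AUDIO_EXTENSIONS = {".wav", ".m4a", ".aac", ".mp3", ".flac", ".ogg", ".wma"}


def transcribe_folder(
    folder: Path,
    transcribe_fn: Callable[[str], str],
    extensions: Iterable[str] = AUDIO_EXTENSIONS,
) -> dict[str, str]:
    """Transcribe every matching file in `folder`, keyed by file name."""
    wanted = {ext.lower() for ext in extensions}
    results: dict[str, str] = {}
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.suffix.lower() in wanted:
            results[path.name] = transcribe_fn(str(path))
    return results


# Usage with the real package (requires liqui-speak to be installed and configured):
#   from liqui_speak import transcribe
#   transcripts = transcribe_folder(Path("recordings"), transcribe)
```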

📱 macOS Shortcut

During liqui-speak config, a macOS Shortcut is automatically installed that:

  1. Records audio - Start speaking immediately
  2. Transcribes - Runs liqui-speak on the recording
  3. Copies to clipboard - Ready to paste anywhere

First run permissions: macOS will ask for microphone access, file access, and shell script execution permissions.
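
Steps 2 and 3 can be approximated in Python for anyone who wants the same behavior outside the Shortcuts app. This is a sketch, not the Shortcut's actual implementation: the helper names are hypothetical, and it relies on the `pbcopy` command that ships with macOS.

```python
import subprocess
from collections.abc import Callable


def copy_to_clipboard(text: str) -> None:
    """Copy text to the macOS clipboard via the built-in pbcopy command."""
    subprocess.run(["pbcopy"], input=text, text=True, check=True)


def transcribe_to_clipboard(
    audio_path: str,
    transcribe_fn: Callable[[str], str],
    copy_fn: Callable[[str], None] = copy_to_clipboard,
) -> str:
    """Transcribe a recording, place the text on the clipboard, and return it."""
    text = transcribe_fn(audio_path).strip()
    if text:  # leave the clipboard untouched when nothing was recognized
        copy_fn(text)
    return text
```

With the package installed, `transcribe_to_clipboard("recording.m4a", transcribe)` would use the `transcribe` function from the Python API section; the file name here is only an example.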

🔧 Configuration

Environment Variables

export LIQUI_SPEAK_MODEL_DIR="/custom/path"
export LIQUI_SPEAK_SAMPLE_RATE="44100"
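
For scripts that need to honor the same variables, the sketch below shows one way to read and validate them. It is not liqui-speak's own loader: `Settings` and `load_settings` are hypothetical, the model directory default assumes the ~/.liqui_speak/ location named in this document, and the 16000 Hz fallback is a placeholder rather than a documented default.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    model_dir: Path
    sample_rate: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the two variables shown above, falling back to assumed defaults."""
    # Assumption: default to the setup directory this document names.
    model_dir = Path(env.get("LIQUI_SPEAK_MODEL_DIR", "~/.liqui_speak")).expanduser()
    # Assumption: 16000 Hz is a placeholder default, not a documented value.
    raw_rate = env.get("LIQUI_SPEAK_SAMPLE_RATE", "16000")
    try:
        sample_rate = int(raw_rate)
    except ValueError:
        raise ValueError(
            f"LIQUI_SPEAK_SAMPLE_RATE must be an integer, got {raw_rate!r}"
        ) from None
    if sample_rate <= 0:
        raise ValueError("LIQUI_SPEAK_SAMPLE_RATE must be positive")
    return Settings(model_dir=model_dir, sample_rate=sample_rate)
```

Accepting a mapping instead of reading `os.environ` directly keeps the function easy to exercise with a plain dictionary.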

Setup Directory

Configuration and models are stored in ~/.liqui_speak/

📊 Supported Formats

  • ✅ Direct support: WAV (no conversion needed)
  • ✅ Auto-converted: M4A, AAC, MP3, FLAC, OGG, WMA, ALAC
  • ❌ Not supported: DRM-protected files

All supported non-WAV formats are automatically converted to WAV internally for optimal transcription performance.
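
The conversion step can be pictured with the sketch below. It is an illustration under assumptions, not liqui-speak's internal code: `conversion_plan` and `to_wav_bytes` are hypothetical names, matching is by file extension only, and ALAC is left out of the extension set because it is normally stored in .m4a files. The PyDub calls used (`AudioSegment.from_file` and `export`) need FFmpeg for non-WAV input.

```python
import io
from pathlib import Path

# Format groups as described above; matching is by file extension only.
DIRECT = {".wav"}
CONVERTED = {".m4a", ".aac", ".mp3", ".flac", ".ogg", ".wma"}


def conversion_plan(path: str) -> str:
    """Return 'direct' for WAV input and 'convert' for formats converted to WAV."""
    suffix = Path(path).suffix.lower()
    if suffix in DIRECT:
        return "direct"
    if suffix in CONVERTED:
        return "convert"
    raise ValueError(f"Unsupported or unknown audio extension: {suffix or '(none)'}")


def to_wav_bytes(path: str) -> bytes:
    """Decode `path` with PyDub and return the WAV data held in memory."""
    from pydub import AudioSegment  # third-party; imported lazily

    segment = AudioSegment.from_file(path)
    buffer = io.BytesIO()
    segment.export(buffer, format="wav")
    return buffer.getvalue()
```

A caller would invoke `to_wav_bytes` only when `conversion_plan` returns "convert" and pass WAV files through unchanged.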

🏗️ Development

Setup Development Environment

# Clone repository
git clone https://github.com/abhishekbhakat/liqui-speak.git
cd liqui-speak

# Install with dev dependencies
make install-dev

# Run quality checks
make lint
make type-check
make test

🧪 Tests

"Tests? Where we're going, we don't need tests." — Doc Brown, probably

The code works on my machine. Ship it. 🚀

🔍 Troubleshooting

"Format not recognized" error

Your file might be an M4A file with the wrong extension. Use:

liqui-speak config  # Will detect and convert automatically

Missing system dependencies

Run setup again:

liqui-speak config --verbose

Model download fails

Check your internet connection and available disk space (~2GB needed; allow more for the default F16 model, which is listed above at ~3.4GB).

Permission errors

Make sure you have admin/sudo access for system dependency installation.

🚀 Performance

  • Setup time: < 5 minutes (first run)
  • Conversion time: < 10% of audio duration
  • Memory usage: ~2GB during transcription
  • Model size: ~1.5GB

🔗 Dependencies

Python Packages

  • pydub - Audio conversion
  • huggingface-hub - Model downloads
  • python-magic - Format detection

System Dependencies

  • portaudio - Audio I/O library
  • ffmpeg - Audio format support

📄 License

MIT License - see LICENSE file for details.

🤝 Contributing

  1. Fork the repository
  2. Create feature branch: git checkout -b feature-name
  3. Make changes and test: make quality
  4. Commit changes: git commit -am 'Add feature'
  5. Push to branch: git push origin feature-name
  6. Submit pull request

📞 Support

🙏 Acknowledgments

  • LFM2.5-Audio-1.5B model: LiquidAI team
  • llama.cpp: Georgi Gerganov
  • PyDub: James Robert
  • Hugging Face: Model hosting platform
