GPU-powered transcription API with speaker diarization
MurmurAI
GPU-powered transcription API in one command
Features • Quick Start • API • Config • Security • Development
Turn any audio into text with speaker labels. No cloud. No limits. Just run:
uvx murmurai
MurmurAI wraps murmurai-core (our WhisperX fork) in a REST API with speaker diarization, word-level timestamps, and multiple export formats. Self-hosted alternative to AssemblyAI, Deepgram, and Rev.ai.
Features
- Speaker Diarization - Identify who said what with pyannote
- Word-Level Timestamps - Precise alignment for every word
- Multiple Export Formats - SRT, WebVTT, TXT, JSON
- Webhook Callbacks - Get notified when transcription completes
- GPU Model Caching - Fast subsequent transcriptions
- Background Processing - Non-blocking async jobs
- Progress Tracking - Poll for real-time status
Quick Start
Prerequisites
- NVIDIA GPU with 6GB+ VRAM (or CPU mode for testing)
- CUDA 12.x drivers installed
Option A: One-Liner Install (Recommended)
curl -fsSL https://raw.githubusercontent.com/namastexlabs/murmurai/main/get-murmurai.sh | bash
This installs Python 3.12 and uv, checks CUDA, and sets up murmurai.
Option B: Direct Run (if dependencies met)
uvx murmurai
Option C: pip install
pip install murmurai
murmurai
Option D: Docker (GPU required)
# Clone and run with docker compose
git clone https://github.com/namastexlabs/murmurai.git
cd murmurai
docker compose up
Requires NVIDIA Container Toolkit. Set MURMURAI_API_KEY in environment for production.
The API starts at http://localhost:8880. Swagger docs at /docs.
First Transcription
# Default API key is "namastex888" - works out of the box
curl -X POST http://localhost:8880/v1/transcript \
-H "Authorization: namastex888" \
-F "file=@audio.mp3"
# Check status (replace {id} with returned transcript ID)
curl http://localhost:8880/v1/transcript/{id} \
-H "Authorization: namastex888"
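The submit-then-check flow above can also be scripted. The following is a minimal polling sketch in Python, assuming the status values documented in this README (queued, processing, completed, error). The names `fetch_json` and `wait_for_transcript`, the 2-second interval, and the 10-minute timeout are illustrative choices, not part of MurmurAI.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "http://localhost:8880"  # default address from this README
API_KEY = "namastex888"             # default key; replace it outside local testing


def fetch_json(transcript_id: str) -> dict:
    """GET /v1/transcript/{id} and decode the JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/transcript/{transcript_id}",
        headers={"Authorization": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_transcript(
    transcript_id: str,
    fetch: Callable[[str], dict] = fetch_json,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status (completed or error)."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(transcript_id)
        if job.get("status") in ("completed", "error"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"transcript {transcript_id} still {job.get('status')!r}"
            )
        sleep(interval)
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a running server. The generous timeout leaves room for the first request, which loads the model.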
API Reference
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/transcript | Submit transcription job |
| GET | /v1/transcript/{id} | Get transcript status/result |
| GET | /v1/transcript/{id}/srt | Export as SRT subtitles |
| GET | /v1/transcript/{id}/vtt | Export as WebVTT |
| GET | /v1/transcript/{id}/txt | Export as plain text |
| GET | /v1/transcript/{id}/json | Export as JSON |
| DELETE | /v1/transcript/{id} | Delete transcript |
| GET | /health | Health check (no auth) |
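The export endpoints in the table above differ only in their final path segment. A small helper like the one below could build those URLs and reject unknown formats before any request is sent. This is a sketch: `export_url` and `EXPORT_FORMATS` are hypothetical names, and the base URL is the default from this README.

```python
from urllib.parse import quote

# Suffixes taken from the endpoint table above.
EXPORT_FORMATS = ("srt", "vtt", "txt", "json")


def export_url(
    transcript_id: str,
    fmt: str,
    base_url: str = "http://localhost:8880",
) -> str:
    """Build the URL of an export endpoint, rejecting unknown formats early."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    # Percent-encode the ID so it cannot add extra path segments.
    safe_id = quote(transcript_id, safe="")
    return f"{base_url.rstrip('/')}/v1/transcript/{safe_id}/{fmt}"
```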
Submit Transcription
File upload:
curl -X POST http://localhost:8880/v1/transcript \
-H "Authorization: namastex888" \
-F "file=@audio.mp3"
URL download:
curl -X POST http://localhost:8880/v1/transcript \
-H "Authorization: namastex888" \
-F "audio_url=https://example.com/audio.mp3"
With speaker diarization:
curl -X POST http://localhost:8880/v1/transcript \
-H "Authorization: namastex888" \
-F "file=@audio.mp3" \
-F "speaker_labels=true" \
-F "speakers_expected=2"
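For callers that prefer Python over curl, the same multipart request can be assembled with the standard library alone. The sketch below mirrors the form fields used in the curl examples (`file`, `speaker_labels`, `speakers_expected`). The helper names `encode_multipart` and `submit_file` are illustrative, and the shape of the JSON response is assumed to match the Response Format section.

```python
import json
import os
import urllib.request
import uuid
from typing import Optional


def encode_multipart(fields: dict, filename: str, data: bytes) -> tuple:
    """Encode text fields plus one part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(header.encode() + str(value).encode() + b"\r\n")
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode() + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())  # closing boundary
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def submit_file(
    path: str,
    speakers_expected: Optional[int] = None,
    base_url: str = "http://localhost:8880",
    api_key: str = "namastex888",
) -> dict:
    """POST an audio file to /v1/transcript, optionally enabling diarization."""
    fields = {}
    if speakers_expected is not None:
        fields = {"speaker_labels": "true", "speakers_expected": speakers_expected}
    with open(path, "rb") as fh:
        body, content_type = encode_multipart(fields, os.path.basename(path), fh.read())
    req = urllib.request.Request(
        f"{base_url}/v1/transcript",
        data=body,
        method="POST",
        headers={"Authorization": api_key, "Content-Type": content_type},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `submit_file("audio.mp3", speakers_expected=2)` corresponds to the diarization curl command above.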
Response Format
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"text": "Hello world, this is a transcription.",
"words": [
{"text": "Hello", "start": 0, "end": 500, "confidence": 0.98, "speaker": "A"}
],
"utterances": [
{"speaker": "A", "text": "Hello world...", "start": 0, "end": 3000}
],
"language_code": "en"
}
Status values: queued → processing → completed (or error)
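To show how the `utterances` array can be consumed, the sketch below renders it as SRT-style cues with speaker labels. It assumes `start` and `end` are integer milliseconds, which the sample values suggest but this README does not state. The function names are illustrative, and the server's own /srt export may format its output differently.

```python
def ms_to_srt(ms: int) -> str:
    """Format integer milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def utterances_to_srt(transcript: dict) -> str:
    """Render each utterance as one numbered cue, prefixed with its speaker."""
    cues = []
    for index, utt in enumerate(transcript.get("utterances", []), start=1):
        cues.append(
            f"{index}\n"
            f"{ms_to_srt(utt['start'])} --> {ms_to_srt(utt['end'])}\n"
            f"Speaker {utt['speaker']}: {utt['text']}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

Only call this once `status` is `completed`; a queued or processing job is not expected to carry utterances yet.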
Configuration
All settings are read from environment variables with the MURMURAI_ prefix. Everything has sensible defaults, so no .env file is needed for local use.
| Variable | Default | Description |
|---|---|---|
| MURMURAI_API_KEY | namastex888 | API authentication key |
| MURMURAI_HOST | 0.0.0.0 | Server bind address |
| MURMURAI_PORT | 8880 | Server port |
| MURMURAI_MODEL | large-v3-turbo | Whisper model |
| MURMURAI_DATA_DIR | ./data | SQLite database location |
| MURMURAI_HF_TOKEN | - | HuggingFace token (for diarization) |
| MURMURAI_DEVICE | 0 | GPU device index |
| MURMURAI_LOG_FORMAT | text | Logging format (text or json) |
| MURMURAI_LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
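As an illustration of how prefixed environment variables map onto defaults, here is a minimal settings loader using the variable names and default values from the table above. It is a sketch only: the `Settings` dataclass and `load_settings` function are hypothetical and do not describe how MurmurAI's own config.py is written.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    # Defaults copied from the configuration table above.
    api_key: str = "namastex888"
    host: str = "0.0.0.0"
    port: int = 8880
    model: str = "large-v3-turbo"
    data_dir: str = "./data"
    hf_token: Optional[str] = None
    device: int = 0
    log_format: str = "text"
    log_level: str = "INFO"


INT_FIELDS = ("port", "device")  # values that must be parsed as integers


def load_settings(
    env: Mapping[str, str] = os.environ,
    prefix: str = "MURMURAI_",
) -> Settings:
    """Override defaults with any PREFIX + FIELD_NAME variables found in env."""
    overrides = {}
    for field in fields(Settings):
        raw = env.get(prefix + field.name.upper())
        if raw is None:
            continue  # keep the default
        overrides[field.name] = int(raw) if field.name in INT_FIELDS else raw
    return Settings(**overrides)
```

For example, running with `MURMURAI_PORT=9000` set would yield `port == 9000` while every other field keeps its default.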
Speaker Diarization Setup
To enable speaker_labels=true:
- Accept license at pyannote/speaker-diarization
- Get token at huggingface.co/settings/tokens
- Add to config:
echo "MURMURAI_HF_TOKEN=hf_xxx" >> ~/.config/murmurai/.env
Security
Default API Key Warning
MurmurAI ships with a default API key (namastex888) for zero-config local use. This key is publicly known.
For any network-exposed deployment, set a secure key:
# Generate a secure random key
export MURMURAI_API_KEY=$(openssl rand -hex 32)
# Or add to your .env file
echo "MURMURAI_API_KEY=$(openssl rand -hex 32)" >> .env
The server will display a security warning at startup if using the default key.
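The same key can be generated from Python when openssl is not available. The sketch below uses the standard `secrets` module, which produces a key of the same length and format as `openssl rand -hex 32`. The `is_default_key` check is a hypothetical helper for deployment scripts, not MurmurAI's startup logic.

```python
import secrets

DEFAULT_API_KEY = "namastex888"  # the publicly known default from this README


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a random hex key (64 hex characters for 32 bytes)."""
    return secrets.token_hex(num_bytes)


def is_default_key(api_key: str) -> bool:
    """True when the configured key is still the public default."""
    return api_key == DEFAULT_API_KEY


if __name__ == "__main__":
    # Print a line that can be appended to a .env file.
    print(f"MURMURAI_API_KEY={generate_api_key()}")
```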
Network Exposure
- Local-only (default): Safe to use default key for localhost testing
- LAN/Docker: Change the API key before exposing to your network
- Internet: Always use a strong API key + consider a reverse proxy with HTTPS
SSRF Protection
The API validates all audio_url parameters to prevent Server-Side Request Forgery:
- Blocks internal IPs (127.0.0.1, 10.x.x.x, 192.168.x.x, etc.)
- Blocks cloud metadata endpoints (169.254.169.254)
- Only allows HTTP/HTTPS schemes
- Resolves DNS and validates the resolved IP
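The checks listed above can be sketched in a few lines of Python. This is an illustration of the general technique (scheme allow-list, DNS resolution, then rejecting any non-public address), not MurmurAI's actual validation code. The names `resolve_host` and `is_safe_audio_url` are hypothetical, and using `ipaddress`'s `is_global` property as the single test is a simplifying assumption.

```python
import ipaddress
import socket
from typing import Callable, Iterable
from urllib.parse import urlsplit


def resolve_host(host: str) -> list:
    """Resolve a hostname to the IP address strings it maps to."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_safe_audio_url(
    url: str,
    resolve: Callable[[str], Iterable[str]] = resolve_host,
) -> bool:
    """Allow only http(s) URLs whose host resolves solely to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addresses = list(resolve(parts.hostname))
    except OSError:
        return False  # unresolvable hosts are rejected
    if not addresses:
        return False
    for raw in addresses:
        try:
            ip = ipaddress.ip_address(raw)
        except ValueError:
            return False
        # is_global is False for loopback, private (10.x, 192.168.x),
        # and link-local ranges such as 169.254.169.254.
        if not ip.is_global:
            return False
    return True
```

A validator like this only helps if the later download connects to the address that was checked; resolving the name a second time at fetch time would reopen the door to DNS rebinding.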
Troubleshooting
CUDA not available:
# Check NVIDIA driver
nvidia-smi
# Check PyTorch CUDA
python -c "import torch; print(torch.cuda.is_available())"
Out of VRAM:
- Use smaller model: MURMURAI_MODEL=medium
- Reduce batch size: MURMURAI_BATCH_SIZE=8
Diarization fails:
- Verify HF token: echo $MURMURAI_HF_TOKEN
- Accept the pyannote license on HuggingFace (see Speaker Diarization Setup)
Built On
This project uses murmurai-core - our maintained fork of WhisperX with modern dependency support (PyTorch 2.6+, Pyannote 4.x, Python 3.10-3.13).
Development
Setup
git clone https://github.com/namastexlabs/murmurai.git
cd murmurai
uv sync
Run Tests
uv run pytest tests/ -v
Code Quality
uv run ruff check .
uv run ruff format .
uv run mypy src/
Project Structure
murmurai/
├── src/murmurai/
│ ├── server.py # FastAPI application
│ ├── transcriber.py # Transcription pipeline
│ ├── model_manager.py # GPU model caching
│ ├── database.py # SQLite persistence
│ ├── config.py # Settings management
│ ├── auth.py # API authentication
│ ├── models.py # Pydantic schemas
│ ├── deps.py # Dependency checks
│ └── main.py # CLI entry point
├── tests/ # Test suite
├── get-murmurai.sh # One-liner installer
└── pyproject.toml # Project config
CI/CD
- CI: Runs on every push (lint, typecheck, test)
Performance Notes
- First request: ~60-90s (model loading)
- Subsequent requests: roughly real time (processing takes about as long as the audio itself)
- VRAM usage: ~5-6GB for large-v3-turbo
Made with ❤️ by Namastex Labs