GPU-powered transcription API with speaker diarization

MurmurAI

GPU-powered transcription API in one command

PyPI | CI | Python 3.12 | MIT License

Features | Quick Start | API | Config | Security | Development


Turn any audio into text with speaker labels. No cloud. No limits. Just run:

uvx murmurai

MurmurAI wraps murmurai-core (our WhisperX fork) in a REST API with speaker diarization, word-level timestamps, and multiple export formats. Self-hosted alternative to AssemblyAI, Deepgram, and Rev.ai.

Features

  • Speaker Diarization - Identify who said what with pyannote
  • Word-Level Timestamps - Precise alignment for every word
  • Multiple Export Formats - SRT, WebVTT, TXT, JSON
  • Webhook Callbacks - Get notified when transcription completes
  • GPU Model Caching - Fast subsequent transcriptions
  • Background Processing - Non-blocking async jobs
  • Progress Tracking - Poll for real-time status

Quick Start

Prerequisites

  • NVIDIA GPU with 6GB+ VRAM (or CPU mode for testing)
  • CUDA 12.x drivers installed

Option A: One-Liner Install (Recommended)

curl -fsSL https://raw.githubusercontent.com/namastexlabs/murmurai/main/get-murmurai.sh | bash

This installs Python 3.12 and uv, checks for CUDA, and sets up murmurai.

Option B: Direct Run (if dependencies met)

uvx murmurai

Option C: pip install

pip install murmurai
murmurai

Option D: Docker (GPU required)

# Clone and run with docker compose
git clone https://github.com/namastexlabs/murmurai.git
cd murmurai
docker compose up

Requires the NVIDIA Container Toolkit. For production, set MURMURAI_API_KEY in the environment.

The API starts at http://localhost:8880. Swagger docs are served at /docs.

First Transcription

# Default API key is "namastex888" - works out of the box
curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "file=@audio.mp3"

# Check status (replace {id} with returned transcript ID)
curl http://localhost:8880/v1/transcript/{id} \
  -H "Authorization: namastex888"
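
The status check above can be scripted as a polling loop. The following is a minimal Python sketch, not part of MurmurAI: `fetch_transcript` and `wait_for_transcript` are hypothetical helper names, the interval and timeout values are arbitrary, and the loop assumes the job ends in either `completed` or `error` as listed under Response Format.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8880"  # default address from this README
API_KEY = "namastex888"             # default key; change it for real deployments


def fetch_transcript(transcript_id):
    """GET /v1/transcript/{id} and decode the JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/transcript/{transcript_id}",
        headers={"Authorization": API_KEY},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_transcript(transcript_id, fetch=fetch_transcript,
                        interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll until the status is 'completed' or 'error' (hypothetical helper)."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(transcript_id)
        if result.get("status") in ("completed", "error"):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"transcript {transcript_id} still {result.get('status')!r}"
            )
        sleep(interval)
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a running server.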

API Reference

Method  Endpoint                  Description
POST    /v1/transcript            Submit transcription job
GET     /v1/transcript/{id}       Get transcript status/result
GET     /v1/transcript/{id}/srt   Export as SRT subtitles
GET     /v1/transcript/{id}/vtt   Export as WebVTT
GET     /v1/transcript/{id}/txt   Export as plain text
GET     /v1/transcript/{id}/json  Export as JSON
DELETE  /v1/transcript/{id}       Delete transcript
GET     /health                   Health check (no auth)
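
As a small illustration of the export endpoints listed above, the sketch below builds the URL for a given transcript ID and format. `export_url` and `EXPORT_FORMATS` are hypothetical names introduced here; only the paths themselves come from the table.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:8880"               # default address from this README
EXPORT_FORMATS = ("srt", "vtt", "txt", "json")   # export suffixes from the table


def export_url(transcript_id, fmt, base_url=BASE_URL):
    """Return the export URL for one transcript (hypothetical helper)."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    # Percent-encode the ID so it cannot alter the path structure.
    safe_id = quote(transcript_id, safe="")
    return f"{base_url.rstrip('/')}/v1/transcript/{safe_id}/{fmt}"
```

Requests to these URLs need the same Authorization header as the other endpoints, except /health.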

Submit Transcription

File upload:

curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "file=@audio.mp3"

URL download:

curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "audio_url=https://example.com/audio.mp3"

With speaker diarization:

curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "file=@audio.mp3" \
  -F "speaker_labels=true" \
  -F "speakers_expected=2"
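
The same submissions can be made from Python. The sketch below uses only the standard library and encodes the multipart form by hand; `build_multipart` and `submit` are hypothetical helpers, and the field names (`file`, `audio_url`, `speaker_labels`, `speakers_expected`) are the ones used in the curl examples above.

```python
import json
import urllib.request
import uuid
from pathlib import Path
from typing import Optional


def build_multipart(fields: dict, file_path: Optional[str] = None):
    """Encode form fields, plus an optional upload named 'file', as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    if file_path is not None:
        path = Path(file_path)
        header = (
            f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
            f'filename="{path.name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        parts.append(header + path.read_bytes() + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def submit(fields, file_path=None,
           base_url="http://localhost:8880", api_key="namastex888"):
    """POST /v1/transcript and return the decoded JSON response."""
    body, content_type = build_multipart(fields, file_path)
    req = urllib.request.Request(
        f"{base_url}/v1/transcript",
        data=body,
        method="POST",
        headers={"Authorization": api_key, "Content-Type": content_type},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

For example, `submit({"speaker_labels": "true", "speakers_expected": 2}, "audio.mp3")` mirrors the diarization request, and `submit({"audio_url": "https://example.com/audio.mp3"})` mirrors the URL download.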

Response Format

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "text": "Hello world, this is a transcription.",
  "words": [
    {"text": "Hello", "start": 0, "end": 500, "confidence": 0.98, "speaker": "A"}
  ],
  "utterances": [
    {"speaker": "A", "text": "Hello world...", "start": 0, "end": 3000}
  ],
  "language_code": "en"
}

Status values: queued -> processing -> completed (or error)
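
A completed response can be turned into a readable speaker-by-speaker listing. The sketch below is illustrative only: `format_utterances` is a hypothetical helper, and it assumes that `start` and `end` are in milliseconds, which the sample values suggest but this README does not state.

```python
def format_utterances(transcript: dict) -> list:
    """Render each utterance as '[start-end] Speaker X: text'."""
    lines = []
    for utt in transcript.get("utterances", []):
        # ASSUMPTION: timestamps are milliseconds (e.g. 3000 means 3 seconds).
        start_s = utt["start"] / 1000
        end_s = utt["end"] / 1000
        lines.append(
            f"[{start_s:.2f}s-{end_s:.2f}s] Speaker {utt['speaker']}: {utt['text']}"
        )
    return lines


sample = {
    "status": "completed",
    "utterances": [
        {"speaker": "A", "text": "Hello world...", "start": 0, "end": 3000}
    ],
}
for line in format_utterances(sample):
    print(line)
```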

Configuration

All settings are read from environment variables with the MURMURAI_ prefix. Everything has sensible defaults, so no .env file is needed for local use.

Variable             Default         Description
MURMURAI_API_KEY     namastex888     API authentication key
MURMURAI_HOST        0.0.0.0         Server bind address
MURMURAI_PORT        8880            Server port
MURMURAI_MODEL       large-v3-turbo  Whisper model
MURMURAI_DATA_DIR    ./data          SQLite database location
MURMURAI_HF_TOKEN    -               HuggingFace token (for diarization)
MURMURAI_DEVICE      0               GPU device index
MURMURAI_LOG_FORMAT  text            Logging format (text or json)
MURMURAI_LOG_LEVEL   INFO            Logging level (DEBUG, INFO, WARNING, ERROR)
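
To illustrate how prefixed environment variables with defaults behave, here is a minimal sketch using only the standard library. It is not the project's actual config.py: `Settings` and `load_settings` are hypothetical names, and only the variable names and defaults are taken from the table above.

```python
import os
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Settings:
    # Defaults copied from the configuration table.
    api_key: str = "namastex888"
    host: str = "0.0.0.0"
    port: int = 8880
    model: str = "large-v3-turbo"
    data_dir: str = "./data"
    hf_token: Optional[str] = None
    device: int = 0
    log_format: str = "text"
    log_level: str = "INFO"


def load_settings(env=os.environ, prefix="MURMURAI_") -> Settings:
    """Overlay MURMURAI_* variables from env onto the defaults."""
    overrides = {}
    for f in fields(Settings):
        raw = env.get(prefix + f.name.upper())
        if raw is None:
            continue  # variable not set: keep the default
        # Integer-valued settings (port, device) are parsed from their string form.
        overrides[f.name] = int(raw) if isinstance(f.default, int) else raw
    return Settings(**overrides)
```

With no variables set, `load_settings()` returns the defaults; setting `MURMURAI_PORT=9000` changes only the port.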

Speaker Diarization Setup

To enable speaker_labels=true:

  1. Accept license at pyannote/speaker-diarization
  2. Get token at huggingface.co/settings/tokens
  3. Add to config:
    echo "MURMURAI_HF_TOKEN=hf_xxx" >> ~/.config/murmurai/.env
    

Security

Default API Key Warning

MurmurAI ships with a default API key (namastex888) for zero-config local use. This key is publicly known.

For any network-exposed deployment, set a secure key:

# Generate a secure random key
export MURMURAI_API_KEY=$(openssl rand -hex 32)

# Or add to your .env file
echo "MURMURAI_API_KEY=$(openssl rand -hex 32)" >> .env

The server will display a security warning at startup if using the default key.
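
Where openssl is not available, an equivalent key can be generated in Python. This is a sketch under assumptions: `generate_api_key` and `is_safe_key` are hypothetical helpers, and the 32-character minimum is an arbitrary threshold chosen for illustration, not a MurmurAI rule.

```python
import secrets

DEFAULT_KEY = "namastex888"  # the publicly known default from this README


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a random hex key, comparable to `openssl rand -hex 32`."""
    return secrets.token_hex(num_bytes)


def is_safe_key(key: str) -> bool:
    """Reject the default key and very short keys (illustrative policy only)."""
    return key != DEFAULT_KEY and len(key) >= 32


if __name__ == "__main__":
    print(f"MURMURAI_API_KEY={generate_api_key()}")
```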

Network Exposure

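  • Local-only: The default key is acceptable for localhost testing. Note that the default MURMURAI_HOST (0.0.0.0) binds to all network interfaces; set MURMURAI_HOST=127.0.0.1 to accept local connections only
  • LAN/Docker: Change the API key before exposing to your network
  • Internet: Always use a strong API key, and consider a reverse proxy with HTTPS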

SSRF Protection

The API validates all audio_url parameters to prevent Server-Side Request Forgery:

  • Blocks internal IPs (127.0.0.1, 10.x.x.x, 192.168.x.x, etc.)
  • Blocks cloud metadata endpoints (169.254.169.254)
  • Only allows HTTP/HTTPS schemes
  • Resolves DNS and validates the resolved IP
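
The checks listed above can be sketched in a few lines of Python. This is an illustration of the general technique, not MurmurAI's actual validation code: `is_public_ip` and `validate_audio_url` are hypothetical names, and the exact set of blocked ranges is an assumption.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_ip(ip: str) -> bool:
    """True if the address is not loopback, private, link-local, or otherwise special."""
    addr = ipaddress.ip_address(ip.split("%")[0])  # drop any IPv6 scope suffix
    return not (
        addr.is_private or addr.is_loopback or addr.is_link_local
        or addr.is_reserved or addr.is_multicast or addr.is_unspecified
    )


def validate_audio_url(url: str, resolve=socket.getaddrinfo) -> str:
    """Raise ValueError unless url is http(s) and every resolved IP is public."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("only http and https URLs are allowed")
    if not parsed.hostname:
        raise ValueError("URL has no host")
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    # Resolve first, then check each resulting address
    # (169.254.169.254 is rejected as link-local).
    for info in resolve(parsed.hostname, port, type=socket.SOCK_STREAM):
        ip = info[4][0]
        if not is_public_ip(ip):
            raise ValueError(f"{parsed.hostname} resolves to blocked address {ip}")
    return url
```

A check like this is only effective if the later download connects to the same address that was validated; otherwise a second DNS lookup could return a different, internal IP.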

Troubleshooting

CUDA not available:

# Check NVIDIA driver
nvidia-smi

# Check PyTorch CUDA
python -c "import torch; print(torch.cuda.is_available())"

Out of VRAM:

  • Use smaller model: MURMURAI_MODEL=medium
  • Reduce batch size: MURMURAI_BATCH_SIZE=8

Diarization fails:

  • Verify HF token: echo $MURMURAI_HF_TOKEN
  • Accept license at HuggingFace (link above)

Built On

This project uses murmurai-core - our maintained fork of WhisperX with modern dependency support (PyTorch 2.6+, Pyannote 4.x, Python 3.10-3.13).


Development

Setup

git clone https://github.com/namastexlabs/murmurai.git
cd murmurai
uv sync

Run Tests

uv run pytest tests/ -v

Code Quality

uv run ruff check .
uv run ruff format .
uv run mypy src/

Project Structure

murmurai/
├── src/murmurai/
│   ├── server.py          # FastAPI application
│   ├── transcriber.py     # Transcription pipeline
│   ├── model_manager.py   # GPU model caching
│   ├── database.py        # SQLite persistence
│   ├── config.py          # Settings management
│   ├── auth.py            # API authentication
│   ├── models.py          # Pydantic schemas
│   ├── deps.py            # Dependency checks
│   └── main.py            # CLI entry point
├── tests/                 # Test suite
├── get-murmurai.sh        # One-liner installer
└── pyproject.toml         # Project config

CI/CD

  • CI: Runs on every push (lint, typecheck, test)

Performance Notes

  • First request: ~60-90s (model loading)
  • Subsequent requests: processing time is roughly equal to the audio duration
  • VRAM usage: ~5-6GB for large-v3-turbo

Made with ❤️ by Namastex Labs
