GPU-powered transcription API with speaker diarization

MurmurAI

GPU-powered transcription API in one command

PyPI CI Python 3.12 MIT License

Features | Quick Start | API | Config | Security | Development


Turn any audio into text with speaker labels. No cloud. No limits. Just run:

uvx murmurai

MurmurAI wraps murmurai-core (our WhisperX fork) in a REST API with speaker diarization, word-level timestamps, and multiple export formats. Self-hosted alternative to AssemblyAI, Deepgram, and Rev.ai.

Features

  • Speaker Diarization - Identify who said what with pyannote
  • Word-Level Timestamps - Precise alignment for every word
  • Multiple Export Formats - SRT, WebVTT, TXT, JSON
  • Webhook Callbacks - Get notified when transcription completes
  • GPU Model Caching - Fast subsequent transcriptions
  • Background Processing - Non-blocking async jobs
  • Progress Tracking - Poll for real-time status

Quick Start

Prerequisites

  • NVIDIA GPU with 6GB+ VRAM (or CPU mode for testing)
  • CUDA 12.x drivers installed

Option A: One-Liner Install (Recommended)

curl -fsSL https://raw.githubusercontent.com/namastexlabs/murmurai/main/get-murmurai.sh | bash

This installs Python 3.12 and uv, checks for CUDA, and sets up murmurai.

Option B: Direct Run (if dependencies met)

uvx murmurai

Option C: pip install

pip install murmurai
murmurai

Option D: Docker (GPU required)

# Clone and run with docker compose
git clone https://github.com/namastexlabs/murmurai.git
cd murmurai
docker compose up

Requires the NVIDIA Container Toolkit. Set MURMURAI_API_KEY in the environment for production.

The API starts at http://localhost:8880. Swagger docs at /docs.

First Transcription

# Default API key is "namastex888" - works out of the box
curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "file=@audio.mp3"

# Check status (replace {id} with returned transcript ID)
curl http://localhost:8880/v1/transcript/{id} \
  -H "Authorization: namastex888"

API Reference

Method  Endpoint                   Description
POST    /v1/transcript             Submit transcription job
GET     /v1/transcript/{id}        Get transcript status/result
GET     /v1/transcript/{id}/srt    Export as SRT subtitles
GET     /v1/transcript/{id}/vtt    Export as WebVTT
GET     /v1/transcript/{id}/txt    Export as plain text
GET     /v1/transcript/{id}/json   Export as JSON
DELETE  /v1/transcript/{id}        Delete transcript
GET     /health                    Health check (no auth)
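The export endpoints in the table share one URL pattern, so a client can build them from a transcript ID and a format name. A minimal sketch, assuming a server at http://localhost:8880; the helper name export_url is hypothetical and not part of MurmurAI:

```python
from urllib.parse import quote

# Export formats listed in the endpoint table above.
EXPORT_FORMATS = ("srt", "vtt", "txt", "json")


def export_url(base_url: str, transcript_id: str, fmt: str) -> str:
    """Build the export URL for a transcript (hypothetical client-side helper)."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    # Percent-encode the ID so it cannot introduce extra path segments.
    safe_id = quote(transcript_id, safe="")
    return f"{base_url.rstrip('/')}/v1/transcript/{safe_id}/{fmt}"


print(export_url("http://localhost:8880", "550e8400-e29b-41d4-a716-446655440000", "srt"))
# http://localhost:8880/v1/transcript/550e8400-e29b-41d4-a716-446655440000/srt
```

Send the resulting URL with the same Authorization header used in the curl examples.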

Submit Transcription

File upload:

curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "file=@audio.mp3"

URL download:

curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "audio_url=https://example.com/audio.mp3"

With speaker diarization:

curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "file=@audio.mp3" \
  -F "speaker_labels=true" \
  -F "speakers_expected=2"
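For clients that cannot shell out to curl, the same multipart request can be assembled with the Python standard library. This is a sketch only: build_submit_request is a hypothetical helper, the form field names are taken from the curl examples above, and the request is built but not sent so the snippet runs without a server.

```python
import urllib.request
import uuid


def build_submit_request(base_url, api_key, filename, audio_bytes, fields=None):
    """Build (but do not send) a multipart POST to /v1/transcript."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields such as speaker_labels and speakers_expected.
    for name, value in (fields or {}).items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    # The audio file itself, sent under the "file" field.
    parts.append(
        (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            'Content-Type: application/octet-stream\r\n\r\n'
        ).encode()
        + audio_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/transcript",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


req = build_submit_request(
    "http://localhost:8880",
    "namastex888",
    "audio.mp3",
    b"fake-audio-bytes",  # placeholder; read a real file in practice
    fields={"speaker_labels": "true", "speakers_expected": "2"},
)
print(req.get_method(), req.full_url)
```

With the server running, urllib.request.urlopen(req) sends the request; the response body is the JSON described under Response Format.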

Response Format

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "text": "Hello world, this is a transcription.",
  "words": [
    {"text": "Hello", "start": 0, "end": 500, "confidence": 0.98, "speaker": "A"}
  ],
  "utterances": [
    {"speaker": "A", "text": "Hello world...", "start": 0, "end": 3000}
  ],
  "language_code": "en"
}
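A response like this can be turned into a speaker-labelled script with a few lines of Python. The sketch below assumes that start and end are expressed in milliseconds, which the sample values suggest but the text does not state; format_timestamp and speaker_lines are hypothetical helper names.

```python
import json

response_json = """
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "text": "Hello world, this is a transcription.",
  "words": [
    {"text": "Hello", "start": 0, "end": 500, "confidence": 0.98, "speaker": "A"}
  ],
  "utterances": [
    {"speaker": "A", "text": "Hello world...", "start": 0, "end": 3000}
  ],
  "language_code": "en"
}
"""


def format_timestamp(ms: int) -> str:
    """Render a time as HH:MM:SS.mmm (assumes the input is in milliseconds)."""
    seconds, millis = divmod(int(ms), 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def speaker_lines(transcript: dict) -> list[str]:
    """One line per utterance: time range, speaker label, text."""
    return [
        f"[{format_timestamp(u['start'])} - {format_timestamp(u['end'])}] "
        f"Speaker {u['speaker']}: {u['text']}"
        for u in transcript.get("utterances", [])
    ]


transcript = json.loads(response_json)
for line in speaker_lines(transcript):
    print(line)
# [00:00:00.000 - 00:00:03.000] Speaker A: Hello world...
```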

Status values: queued → processing → completed (or error)
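Because jobs run in the background, a client typically polls the status endpoint until the job reaches completed or error. The following is a sketch of such a loop, not MurmurAI client code: wait_for_transcript is a hypothetical helper, and the fetch argument stands in for a real GET /v1/transcript/{id} call so the example runs offline.

```python
import time
from typing import Callable

# Statuses after which polling should stop.
TERMINAL_STATUSES = {"completed", "error"}


def wait_for_transcript(
    fetch: Callable[[str], dict],
    transcript_id: str,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        transcript = fetch(transcript_id)
        if transcript.get("status") in TERMINAL_STATUSES:
            return transcript
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"transcript {transcript_id} still {transcript.get('status')!r}"
            )
        sleep(interval)


# Stand-in for the HTTP call: walks through the documented status values.
_fake_states = iter(["queued", "processing", "completed"])


def fake_fetch(transcript_id: str) -> dict:
    return {"id": transcript_id, "status": next(_fake_states)}


result = wait_for_transcript(fake_fetch, "demo-id", interval=0.0)
print(result["status"])  # completed
```

Injecting fetch and sleep keeps the loop testable; a real client would pass a function that performs the authenticated GET and parses the JSON body.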

Configuration

All settings are read from environment variables with the MURMURAI_ prefix. Everything has sensible defaults, so no .env file is needed for local use.

Variable              Default          Description
MURMURAI_API_KEY      namastex888      API authentication key
MURMURAI_HOST         0.0.0.0          Server bind address
MURMURAI_PORT         8880             Server port
MURMURAI_MODEL        large-v3-turbo   Whisper model
MURMURAI_DATA_DIR     ./data           SQLite database location
MURMURAI_HF_TOKEN     -                HuggingFace token (for diarization)
MURMURAI_DEVICE       0                GPU device index
MURMURAI_LOG_FORMAT   text             Logging format (text or json)
MURMURAI_LOG_LEVEL    INFO             Logging level (DEBUG, INFO, WARNING, ERROR)
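To illustrate how prefixed environment variables map onto defaults, here is a small standalone sketch. It mirrors the table above but is not the project's config.py; the Settings class and load_settings function are hypothetical names chosen for the example.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    # Defaults copied from the configuration table above.
    api_key: str = "namastex888"
    host: str = "0.0.0.0"
    port: int = 8880
    model: str = "large-v3-turbo"
    data_dir: str = "./data"
    hf_token: Optional[str] = None
    device: int = 0
    log_format: str = "text"
    log_level: str = "INFO"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read MURMURAI_* variables, falling back to the defaults above."""
    env = os.environ if env is None else env
    overrides = {}
    for f in fields(Settings):
        raw = env.get(f"MURMURAI_{f.name.upper()}")
        if raw is None:
            continue
        # Integer fields (port, device) are converted; the rest stay strings.
        overrides[f.name] = int(raw) if f.type is int else raw
    return Settings(**overrides)


settings = load_settings({"MURMURAI_PORT": "9000", "MURMURAI_MODEL": "medium"})
print(settings.port, settings.model, settings.host)
# 9000 medium 0.0.0.0
```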

Speaker Diarization Setup

To enable speaker_labels=true:

  1. Accept the model license at pyannote/speaker-diarization on HuggingFace
  2. Create an access token at huggingface.co/settings/tokens
  3. Add the token to your config:
    echo "MURMURAI_HF_TOKEN=hf_xxx" >> ~/.config/murmurai/.env

Security

Default API Key Warning

MurmurAI ships with a default API key (namastex888) for zero-config local use. This key is publicly known.

For any network-exposed deployment, set a secure key:

# Generate a secure random key
export MURMURAI_API_KEY=$(openssl rand -hex 32)

# Or add to your .env file
echo "MURMURAI_API_KEY=$(openssl rand -hex 32)" >> .env

The server will display a security warning at startup if using the default key.
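Where openssl is not available, Python's secrets module produces an equivalent key. A minimal sketch; generate_api_key and is_default_key are hypothetical helper names, and only the default key value comes from this document:

```python
import secrets

DEFAULT_API_KEY = "namastex888"  # publicly known default from this README


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a random hex key, comparable to `openssl rand -hex 32`."""
    return secrets.token_hex(num_bytes)


def is_default_key(key: str) -> bool:
    """True if the key is the shipped default and should be replaced."""
    return secrets.compare_digest(key.encode(), DEFAULT_API_KEY.encode())


key = generate_api_key()
print(f"MURMURAI_API_KEY={key}")
print("default key in use:", is_default_key(key))
```

A 32-byte key is rendered as 64 hexadecimal characters.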

Network Exposure

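  • Local-only: The default key is acceptable for localhost testing. The default bind address (MURMURAI_HOST=0.0.0.0) listens on all interfaces, so set MURMURAI_HOST=127.0.0.1 to keep the server local
  • LAN/Docker: Change the API key before exposing the server to your network
  • Internet: Always use a strong API key and consider a reverse proxy with HTTPS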

SSRF Protection

The API validates all audio_url parameters to prevent Server-Side Request Forgery:

  • Blocks internal IPs (127.0.0.1, 10.x.x.x, 192.168.x.x, etc.)
  • Blocks cloud metadata endpoints (169.254.169.254)
  • Only allows HTTP/HTTPS schemes
  • Resolves DNS and validates the resolved IP
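The checks listed above can be sketched with the standard library as follows. This illustrates the general technique under stated assumptions and is not MurmurAI's actual validator: validate_audio_url is a hypothetical name, and the sketch rejects any address that Python's ipaddress module does not classify as global, which covers loopback, private, and link-local ranges including 169.254.169.254.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def validate_audio_url(url: str) -> str:
    """Return the URL if it looks safe to fetch, otherwise raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        # Resolve the host; IP literals are returned as-is without a DNS query.
        infos = socket.getaddrinfo(parts.hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        raise ValueError(f"cannot resolve host: {parts.hostname}") from exc
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        # Reject loopback, private, link-local, and other non-public ranges.
        if not ip.is_global:
            raise ValueError(f"blocked address: {ip}")
    return url


for candidate in (
    "http://8.8.8.8/audio.mp3",
    "http://127.0.0.1/audio.mp3",
    "http://169.254.169.254/latest/meta-data",
    "ftp://8.8.8.8/audio.mp3",
):
    try:
        validate_audio_url(candidate)
        print("allowed:", candidate)
    except ValueError as exc:
        print("blocked:", candidate, "->", exc)
```

One design caveat: validating the resolved address and then letting the HTTP client resolve the name again leaves a window for DNS rebinding, so a robust implementation connects to the address it validated.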

Troubleshooting

CUDA not available:

# Check NVIDIA driver
nvidia-smi

# Check PyTorch CUDA
python -c "import torch; print(torch.cuda.is_available())"

Out of VRAM:

  • Use smaller model: MURMURAI_MODEL=medium
  • Reduce batch size: MURMURAI_BATCH_SIZE=8

Diarization fails:

  • Verify the HF token is set: echo $MURMURAI_HF_TOKEN
  • Accept the pyannote/speaker-diarization license on HuggingFace (see Speaker Diarization Setup)

Built On

This project uses murmurai-core - our maintained fork of WhisperX with modern dependency support (PyTorch 2.6+, Pyannote 4.x, Python 3.10-3.13).


Development

Setup

git clone https://github.com/namastexlabs/murmurai.git
cd murmurai
uv sync

Run Tests

uv run pytest tests/ -v

Code Quality

uv run ruff check .
uv run ruff format .
uv run mypy src/

Project Structure

murmurai/
├── src/murmurai/
│   ├── server.py          # FastAPI application
│   ├── transcriber.py     # Transcription pipeline
│   ├── model_manager.py   # GPU model caching
│   ├── database.py        # SQLite persistence
│   ├── config.py          # Settings management
│   ├── auth.py            # API authentication
│   ├── models.py          # Pydantic schemas
│   ├── deps.py            # Dependency checks
│   └── main.py            # CLI entry point
├── tests/                 # Test suite
├── get-murmurai.sh        # One-liner installer
└── pyproject.toml         # Project config

CI/CD

  • CI: Runs on every push (lint, typecheck, test)

Performance Notes

  • First request: ~60-90s (model loading)
  • Subsequent requests: roughly real time (processing takes about as long as the audio lasts)
  • VRAM usage: ~5-6GB for large-v3-turbo

Made with ❤️ by Namastex Labs

Star us on GitHub
