Native Apple-Silicon ASR service powered by whispermlx (MLX) with FastAPI

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

kalebjs

These details have not been verified by PyPI

Project description

Whispermlx ASR Service

A native Apple-Silicon ASR API service powered by whispermlx (MLX) with FastAPI.

Runs natively on macOS with Apple Silicon (M1/M2/M3/M4). MLX Whisper inference runs on the Metal GPU automatically. No CUDA, no Docker, no Ray Serve.

What This Does

Transcribes audio files using OpenAI Whisper models via the MLX backend
Identifies speakers ("Who spoke when") using Pyannote.audio
Returns word-level timestamps via wav2vec2 alignment
Supports 90+ languages
Outputs JSON, SRT, VTT, TSV, and plain text formats
OpenAI-compatible API (/v1/audio/transcriptions, /v1/audio/translations, /v1/models)
Runs natively on Apple Silicon with uv and Python 3.13

Limitations

Not production-grade: Basic error handling, no authentication
Apple Silicon only: Requires an M-series Mac. No NVIDIA/CUDA support.
File size limits: Large audio files (>1GB) can cause out-of-memory errors
Memory usage: RAM consumption increases with file size and diarization. Peak ~2.3 GB for a small-model full pipeline on a 16 GB M1.
Alpha software: Expect bugs and breaking changes

How It Works

Audio --> MLX Whisper (transcription, Metal GPU) --> Wav2Vec2 (alignment) --> Pyannote (speaker ID) --> Output

The service runs as a single-process uvicorn server with an async queue. Requests are serialized through a semaphore so only one pipeline runs on the Metal GPU at a time. This is suitable for single-device, low-traffic, or development use.

Device semantics: MLX Whisper ASR always runs on the Metal GPU automatically. The DEVICE environment variable (default mps) only controls where the VAD, wav2vec2 alignment, and pyannote diarization (torch-based stages) run. COMPUTE_TYPE and BATCH_SIZE are accepted for API compatibility but have no effect on the MLX backend.

Prerequisites

Hardware Requirements

Apple Silicon Mac (M1, M2, M3, or M4)
RAM: 16 GB recommended (8 GB may work with tiny/base models)
Storage: 50 GB SSD for model caching

Memory requirements vary by model size:

Whisper Model	RAM (full pipeline*)	Notes
`tiny`, `base`	~2 GB	Fast, low quality
`small`	~2.3 GB	Good balance of speed and quality
`medium`	~5 GB	Good quality, slower
`large-v3-turbo`, `turbo`	~5 GB	Fast, high quality
`large-v3`	~10+ GB	Best quality, slowest

*Full pipeline = Whisper model + alignment model + pyannote speaker diarization. Measured on M1 16 GB.

Software Requirements

macOS with Apple Silicon
uv (Python package manager)
Python 3.13 (installed via uv; system Python 3.14 is incompatible with whispermlx)
FFmpeg (for audio decoding; brew install ffmpeg)
Hugging Face Account (for speaker diarization models)

Quick Start

1. Install uv and Set Up Python 3.13

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone the repository
git clone https://github.com/KalebJS/whispermlx-asr-service.git
cd whispermlx-asr-service

# Create a Python 3.13 virtual environment and install dependencies
uv venv --python 3.13
uv sync

2. Get Hugging Face Token (for Speaker Diarization)

Speaker diarization requires a Hugging Face token and model access:

a) Create a Hugging Face Account:

Visit https://huggingface.co/join and sign up

b) Accept the Model User Agreement:

pyannote/speaker-diarization-community-1 - Click "Agree and access repository"

c) Generate an Access Token:

Visit https://huggingface.co/settings/tokens
Click "New token", name it (e.g., "whispermlx-diarization")
Select "Read" permission and generate
Copy the token (starts with hf_...)

Without the token and accepted agreement, diarization is gracefully skipped and transcription still works, but no speaker labels will be assigned.

3. Configure Environment

# Copy example environment file
cp .env.example .env

# Edit .env and add your Hugging Face token
nano .env

Minimal .env:

HF_TOKEN=hf_your_token_here
DEVICE=mps
PRELOAD_MODEL=large-v3
PORT=9001

4. Run the Service

# Export your .env vars first (entrypoint.sh does NOT auto-load .env)
set -a; source .env; set +a

# Start the service (binds 0.0.0.0:9001)
./entrypoint.sh

# Or start directly with uvicorn (binds localhost only)
uv run uvicorn app.main:app --host 127.0.0.1 --port 9001

# Or load .env and start in one step
uv run uvicorn app.main:app --host 127.0.0.1 --port 9001 --env-file .env

The service will be available at http://localhost:9001.

Note: entrypoint.sh hardcodes port 9001 and binds to 0.0.0.0 (all interfaces). The PORT env var is only respected when launching uvicorn directly with --port $PORT. Since the service has no authentication, prefer --host 127.0.0.1 unless you need remote access.

Port 9001 is the default. Port 9000 may be in use by other services (e.g., php-fpm on some macOS setups). The reserved port range for this service is 9001-9010.

5. Test the Service

# Health check
curl http://localhost:9001/health

# Test transcription
curl -X POST http://localhost:9001/asr \
  -F "audio_file=@your_audio.mp3" \
  -F "language=en"

A smoke test script is included:

./test-api.sh localhost 9001 path/to/audio.wav

API Documentation

Once running, visit http://localhost:9001/docs for interactive API documentation.

Main Endpoint: POST /asr

Parameters:

Parameter	Type	Default	Description
`audio_file`	File	Required	Audio file to transcribe
`task`	String	`transcribe`	Task type: `transcribe` or `translate`
`language`	String	Auto-detect	Language code (e.g., `en`, `es`, `fr`)
`model`	String	`large-v3`	Whisper model (see Model Selection)
`initial_prompt`	String	None	Context or spelling guide to steer the model
`hotwords`	String	None	Accepted but ignored by the MLX backend (see Hotwords)
`output_format`	String	`json`	Output format: `json`, `text`, `srt`, `vtt`, `tsv`
`output`	String	None	Legacy alias for `output_format`
`word_timestamps`	Boolean	`true`	Return word-level timestamps
`diarize`	Boolean	`true`	Enable speaker diarization
`enable_diarization`	Boolean	None	Alias for `diarize`
`num_speakers`	Integer	Auto	Exact number of speakers (overrides min/max)
`min_speakers`	Integer	Auto	Minimum number of speakers
`max_speakers`	Integer	Auto	Maximum number of speakers
`return_speaker_embeddings`	Boolean	`false`	Return 256-dimensional speaker embedding vectors

Example Request (JSON output):

curl -X POST http://localhost:9001/asr \
  -F "audio_file=@meeting.mp3" \
  -F "language=en" \
  -F "model=large-v3" \
  -F "output_format=json" \
  -F "diarize=true" \
  -F "min_speakers=2" \
  -F "max_speakers=5"

Example Request (SRT subtitles):

curl -X POST http://localhost:9001/asr \
  -F "audio_file=@video.mp4" \
  -F "language=en" \
  -F "output_format=srt" \
  -F "diarize=false"

Example Response (JSON):

The text field is a JSON array mirroring the segments array (legacy drop-in shape from the original whisper-asr-webservice):

{
  "text": [
    {
      "start": 0.5,
      "end": 2.3,
      "text": " Hello, welcome to the meeting.",
      "speaker": "SPEAKER_00",
      "words": [
        {"word": "Hello", "start": 0.5, "end": 0.8, "score": 0.95},
        {"word": "welcome", "start": 0.9, "end": 1.2, "score": 0.93}
      ]
    }
  ],
  "language": "en",
  "segments": [...],
  "word_segments": [...]
}

Hotwords (No-Op)

The hotwords parameter is accepted for API compatibility but is a no-op on the MLX backend. The whispermlx library has no hotwords mechanism. When you supply hotwords, the service logs a warning ("hotwords is ignored by the MLX backend") and proceeds with normal transcription. The parameter never causes an error.

To bias transcription toward specific spellings, use initial_prompt instead, which provides context that primes the model to expect certain terms:

# Use initial_prompt to guide spelling
curl -X POST "http://localhost:9001/asr?language=en&initial_prompt=Speakr+is+a+transcription+app." \
  -F "audio_file=@meeting.mp3"

Speaker Diarization

Speaker diarization assigns SPEAKER_NN labels to segments and words. It is enabled by default when HF_TOKEN is set.

Requirements:

HF_TOKEN must be set in your .env file
You must have accepted the model agreement for pyannote/speaker-diarization-community-1

When HF_TOKEN is missing or diarization fails: The service gracefully skips diarization and returns the transcription without speaker labels (HTTP 200, no crash). This ensures transcription is never blocked by a missing token.

Exact Speaker Count:

curl -X POST http://localhost:9001/asr \
  -F "audio_file=@interview.mp3" \
  -F "num_speakers=2" \
  -F "diarize=true"

num_speakers overrides min_speakers and max_speakers.

Speaker Embeddings:

curl -X POST http://localhost:9001/asr \
  -F "audio_file=@meeting.mp3" \
  -F "return_speaker_embeddings=true" \
  -F "diarize=true"

Returns a speaker_embeddings object keyed by speaker label, with one numeric vector per detected speaker. Embeddings are only included in json output format.

OpenAI-Compatible Endpoints

The service provides drop-in OpenAI API compatibility:

POST /v1/audio/transcriptions

curl -X POST http://localhost:9001/v1/audio/transcriptions \
  -F "file=@audio.mp3" \
  -F "model=whisper-1"

Supports response_format: json (default, returns {"text": "..."}), text, srt, vtt, verbose_json (full object with segments, optional word timestamps via timestamp_granularities[]).

The model field accepts OpenAI-style aliases (whisper-1, whisper-tiny, whisper-large-v3) and raw MLX model names (tiny, base, large-v3-turbo, etc.).

POST /v1/audio/translations

curl -X POST http://localhost:9001/v1/audio/translations \
  -F "file=@spanish_audio.mp3" \
  -F "model=whisper-1"

Translates non-English audio into English text. Same response formats as transcriptions. verbose_json reports task: "translate".

GET /v1/models

curl http://localhost:9001/v1/models

Returns an OpenAI-style list of available models, built from the MLX model map. Includes the whisper-1 alias and all canonical MLX model names.

GET /v1/models/{model_id}

curl http://localhost:9001/v1/models/large-v3

Returns the matching model object, or a 404 OpenAI error for unknown ids.

Health and Metrics

# Health check
curl http://localhost:9001/health
# {"status": "healthy", "device": "mps", "loaded_models": ["large-v3"], "serve_mode": "simple"}

# Root endpoint
curl http://localhost:9001/

# Prometheus metrics
curl http://localhost:9001/metrics

# Queue metrics (JSON)
curl http://localhost:9001/queue-metrics

Prometheus Metrics:

Metric	Type	Notes
`whisperx_requests_total{endpoint,status}`	Counter	`status` is `ok`, `http_<code>`, or `error`
`whisperx_request_duration_seconds{endpoint}`	Histogram	End-to-end handler time
`whisperx_active_transcriptions`	Gauge	In-flight `/asr` requests
`whisperx_loaded_models`	Gauge	Whisper models currently in cache
`whisperx_model_evictions_total{model}`	Counter	Models unloaded by the idle-eviction sweep
`whisperx_audio_duration_seconds`	Histogram	Submitted audio duration
`whisperx_audio_size_megabytes`	Histogram	Submitted file size
`whisperx_vram_allocated_bytes`	Gauge	MLX active memory (or 0)
`whisperx_service_info`	Info	Static labels: version, device, compute_type, serve_mode

The whisperx_vram_allocated_bytes gauge reports MLX active memory via mlx.core.get_active_memory() when available, or 0 otherwise. No torch.cuda is used.

Model Selection

Available MLX Whisper models (speed vs accuracy tradeoff):

Model	Parameters	Speed	Quality
`tiny`, `tiny.en`	39M	Fastest	Lowest
`base`, `base.en`	74M	Very Fast	Low
`small`, `small.en`	244M	Fast	Medium
`medium`, `medium.en`	769M	Moderate	Good
`large`, `large-v1`	1550M	Slow	Excellent
`large-v2`	1550M	Slow	Excellent
`large-v3`	1550M	Slow	Best
`large-v3-turbo`, `turbo`	809M	Fast	High

OpenAI-style aliases (whisper-1, whisper-tiny, whisper-large-v3, etc.) are also accepted and resolve to the corresponding MLX model.

Recommendation:

Use large-v3 for best quality
Use small or base for speed and lower memory usage
Use large-v3-turbo for a good balance of speed and quality

Models are downloaded on first use and cached in CACHE_DIR (default ~/.cache/whisperx-asr).

Configuration

Environment Variables

Edit .env to customize:

# Device for torch-based stages (VAD, alignment, diarization).
# MLX Whisper ASR always runs on the Metal GPU regardless of this setting.
DEVICE=mps              # mps (default, recommended) or cpu (fallback, slower diarization)

# Compute type and batch size (accepted but INERT under MLX — no effect on inference)
# Code defaults: COMPUTE_TYPE=int8, BATCH_SIZE=2 (leftover from CUDA era, unused)
#COMPUTE_TYPE=int8
#BATCH_SIZE=2

# Hugging Face token for diarization (REQUIRED for speaker labels)
HF_TOKEN=hf_xxx...

# Model preloading (optional, reduces first-request latency)
# Also sets the default model for /asr requests when no model= param is given
PRELOAD_MODEL=large-v3   # Leave empty to disable

# Override which model the OpenAI "whisper-1" alias resolves to
# (defaults to the PRELOAD_MODEL value, or large-v3 if unset)
#OPENAI_WHISPER1_MODEL=large-v3

# Service port (default 9001)
PORT=9001

# Model cache directories
CACHE_DIR=~/.cache/whisperx-asr
HF_HOME=~/.cache/whisperx-asr

# Maximum file size in MB (prevents out-of-memory errors)
MAX_FILE_SIZE_MB=1000

# GPU concurrency (Metal GPU semaphore; default 1)
#GPU_CONCURRENCY=1

# Maximum queued requests before rejecting with 503 (default 32)
#MAX_QUEUE_SIZE=32

# Idle model eviction (default disabled). When > 0, Whisper models that have
# not served a request in this many seconds are unloaded from memory.
#MODEL_KEEP_ALIVE_SECONDS=0
#MODEL_EVICTION_INTERVAL_SECONDS=60

# Offline mode (optional): set to 1 to prevent network requests after models are cached
#HF_HUB_OFFLINE=1

Idle Model Eviction

Set MODEL_KEEP_ALIVE_SECONDS to unload Whisper models that have been idle longer than the configured window. The next request that needs the model reloads it transparently:

MODEL_KEEP_ALIVE_SECONDS=3600          # unload models idle for 1 hour
MODEL_EVICTION_INTERVAL_SECONDS=60     # sweep cadence (floor 30 seconds)

Default is 0 (disabled; models stay loaded).

Integration with Speakr

To use this service with Speakr:

Update Speakr's .env file:

USE_ASR_ENDPOINT=true
ASR_BASE_URL=http://localhost:9001

If the service is on a different machine, replace localhost with the IP address and ensure the port is accessible through your firewall.

If Speakr runs in Docker: localhost inside a container refers to the container itself, not the host. Use host.docker.internal instead:

ASR_BASE_URL=http://host.docker.internal:9001

Model compatibility: distil-* models (e.g., distil-large-v2) are no longer available on the MLX backend. If Speakr was configured to use a distil-* model, switch to a standard model name such as large-v3, small, or large-v3-turbo. The hotwords parameter is accepted but silently ignored; use initial_prompt for spelling bias instead.

Running the Service

# Using entrypoint.sh (exports .env first; binds 0.0.0.0:9001)
set -a; source .env; set +a
./entrypoint.sh

# Or directly with uvicorn (localhost only, auto-loads .env)
uv run uvicorn app.main:app --host 127.0.0.1 --port 9001 --env-file .env

# With environment variables inline (no .env needed)
DEVICE=mps PRELOAD_MODEL=base uv run uvicorn app.main:app --host 127.0.0.1 --port 9001

Offline Use

The service can run completely offline after an initial setup with internet access:

Start the service with internet access

Run at least one transcription request with diarization enabled to cache all models:

curl -X POST http://localhost:9001/asr \
  -F "audio_file=@test.mp3" \
  -F "diarize=true"

Set HF_HUB_OFFLINE=1 in your .env file
Restart the service

The service will now operate without any network requests to Hugging Face.

Monitoring and Logs

View Logs

When running in the foreground, logs appear in the terminal. When running in the background:

# Export .env first, then start in the background
set -a; source .env; set +a
./entrypoint.sh &> service.log &
tail -f service.log

Health Check

curl http://localhost:9001/health
# {"status": "healthy", "device": "mps", "loaded_models": ["large-v3"], "serve_mode": "simple"}

Supported Audio Formats

The service supports formats decodable by FFmpeg:

Audio: MP3, WAV, M4A, FLAC, AAC, OGG, WMA
Video: MP4, AVI, MOV, MKV, WebM (audio track extracted)
Other: AMR, 3GP, 3GPP

Troubleshooting

Speaker Diarization Not Working

Symptom: No speaker labels in output

Solutions:

Verify HF_TOKEN is set correctly in .env
Accept the model agreement at pyannote/speaker-diarization-community-1
Check logs for diarization errors
Ensure diarize=true in request (diarization defaults to true when HF_TOKEN is set)

Without HF_TOKEN, diarization is silently skipped and transcription proceeds without speaker labels.

Out of Memory Errors

Solutions:

Reduce MAX_FILE_SIZE_MB in .env
Use a smaller model (small or base instead of large-v3)
Disable diarization for very large files: diarize=false
Split large audio files into smaller chunks before uploading

Slow Processing

Solutions:

Use a smaller model for faster processing
Disable diarization if not needed: diarize=false
Use large-v3-turbo for a good speed/quality balance

API Returns 500 Errors

Check logs for error details. Common causes:

Invalid audio format (use FFmpeg to convert)
Model download failure (check internet access)
Incorrect parameters (check API docs at /docs)

Stress Testing

A stress test script is included to measure throughput and latency under concurrent load:

# Default: 4 concurrent workers, all files in testfiles/
uv run python tests/stress_test.py

# 8 concurrent workers, 3 rounds
uv run python tests/stress_test.py --workers 8 --rounds 3

# Test OpenAI-compat endpoint
uv run python tests/stress_test.py --endpoint openai

# Without diarization
uv run python tests/stress_test.py --no-diarize

Place audio files in the tests/testfiles/ directory (gitignored).

Security Notes

This service has NO built-in authentication or security features.

If exposing to a network:

Use firewall rules to restrict access
Consider putting behind a reverse proxy
Store HF_TOKEN securely in the .env file (never hardcode)

License

This project is MIT licensed. See LICENSE for details.

WhisperX is licensed under BSD-4-Clause. See WhisperX repository for details.

Credits

whispermlx: MLX fork of WhisperX for Apple Silicon
WhisperX: m-bain/whisperX
OpenAI Whisper: openai/whisper
MLX: ml-explore/mlx
Pyannote.audio: pyannote/pyannote-audio

Changelog

See CHANGELOG.md for version history.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

kalebjs

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.5.1

Jun 24, 2026

0.5.0

Jun 23, 2026

0.4.0

Jun 23, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

whispermlx_asr_service-0.5.1.tar.gz (195.6 kB view details)

Uploaded Jun 24, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

whispermlx_asr_service-0.5.1-py3-none-any.whl (29.2 kB view details)

Uploaded Jun 24, 2026 Python 3

File details

Details for the file whispermlx_asr_service-0.5.1.tar.gz.

File metadata

Download URL: whispermlx_asr_service-0.5.1.tar.gz
Upload date: Jun 24, 2026
Size: 195.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for whispermlx_asr_service-0.5.1.tar.gz
Algorithm	Hash digest
SHA256	`1d7450087f38b28ab4c519e819d7b00f4e9d0333dd5df0c69be0b757dcc4c5f6`
MD5	`38d311ef15aac143f515e67695d61a97`
BLAKE2b-256	`cf3df686c24b19efb26931016831a32d72411b455a1d3b931adf527b716f5d08`

See more details on using hashes here.

Provenance

The following attestation bundles were made for whispermlx_asr_service-0.5.1.tar.gz:

Publisher: publish.yml on KalebJS/whispermlx-asr-service

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: whispermlx_asr_service-0.5.1.tar.gz
- Subject digest: 1d7450087f38b28ab4c519e819d7b00f4e9d0333dd5df0c69be0b757dcc4c5f6
- Sigstore transparency entry: 1933604951
- Sigstore integration time: Jun 24, 2026
Source repository:
- Permalink: KalebJS/whispermlx-asr-service@c8237e15544406147d5fce78c0d47146348fc256
- Branch / Tag: refs/tags/v0.5.1
- Owner: https://github.com/KalebJS
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c8237e15544406147d5fce78c0d47146348fc256
- Trigger Event: push

File details

Details for the file whispermlx_asr_service-0.5.1-py3-none-any.whl.

File metadata

Download URL: whispermlx_asr_service-0.5.1-py3-none-any.whl
Upload date: Jun 24, 2026
Size: 29.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for whispermlx_asr_service-0.5.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f0375ff619d0ec090691c88d2e3bffce9957044cca02e48a23b9bc7a9a028930`
MD5	`f08e8841f595012538c39d2c9976d24b`
BLAKE2b-256	`6975ebc0e41058e7cb5d3d0bad4229a014faffc469a6c440be8f5c735f2f05a9`

See more details on using hashes here.

Provenance

The following attestation bundles were made for whispermlx_asr_service-0.5.1-py3-none-any.whl:

Publisher: publish.yml on KalebJS/whispermlx-asr-service

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: whispermlx_asr_service-0.5.1-py3-none-any.whl
- Subject digest: f0375ff619d0ec090691c88d2e3bffce9957044cca02e48a23b9bc7a9a028930
- Sigstore transparency entry: 1933605076
- Sigstore integration time: Jun 24, 2026
Source repository:
- Permalink: KalebJS/whispermlx-asr-service@c8237e15544406147d5fce78c0d47146348fc256
- Branch / Tag: refs/tags/v0.5.1
- Owner: https://github.com/KalebJS
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c8237e15544406147d5fce78c0d47146348fc256
- Trigger Event: push

whispermlx-asr-service 0.5.1

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

Whispermlx ASR Service

What This Does

Limitations

How It Works

Prerequisites

Hardware Requirements

Software Requirements

Quick Start

1. Install uv and Set Up Python 3.13

2. Get Hugging Face Token (for Speaker Diarization)

3. Configure Environment

4. Run the Service

5. Test the Service

API Documentation

Main Endpoint: POST /asr

Hotwords (No-Op)

Speaker Diarization

OpenAI-Compatible Endpoints

POST /v1/audio/transcriptions

POST /v1/audio/translations

GET /v1/models

GET /v1/models/{model_id}

Health and Metrics

Model Selection

Configuration

Environment Variables

Idle Model Eviction

Integration with Speakr

Running the Service

Offline Use

Monitoring and Logs

View Logs

Health Check

Supported Audio Formats

Troubleshooting

Speaker Diarization Not Working

Out of Memory Errors

Slow Processing

API Returns 500 Errors

Stress Testing

Security Notes

License

Credits

Changelog

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance