GLM-ASR - All-in-One Speech Recognition Service based on GLM-ASR-Nano

GLM-ASR

Web UI • REST API • SSE Streaming • Swagger Docs


🖥️ Screenshot

[Image: Web UI]


✨ Features

  • 🎯 High Accuracy - Based on GLM-ASR-Nano-2512 (1.5B), outperforms Whisper V3
  • 🌍 17 Languages - Chinese, English, Cantonese, Japanese, Korean, and more
  • 🎤 Long Audio - VAD smart segmentation for unlimited audio length
  • 🚀 SSE Streaming - Real-time progress and results for long audio
  • 🖥️ Web UI - Modern dark-mode interface available in 4 languages
  • 🔌 REST API - Full API with Swagger documentation
  • 💾 GPU Management - Manual load/unload for memory control
  • 🐳 Docker Ready - One-command deployment with pre-loaded model

🚀 Quick Start

Docker (Recommended)

docker run -d --gpus all -p 7860:7860 neosun/glm-asr:v2.0.1

Access: http://localhost:7860

Docker Compose

git clone https://github.com/neosun100/glm-asr.git
cd glm-asr
docker compose up -d

📖 API Reference

Base URL

http://localhost:7860

Endpoints

Health Check

GET /health
{"status": "ok", "model_loaded": true}
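
A deployment script may want to wait until the model is loaded before sending audio. The following is a minimal sketch using only the Python standard library; the helper names `is_ready` and `fetch_health` are hypothetical, and it assumes the response has exactly the shape shown above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # base URL given in this README


def is_ready(payload: dict) -> bool:
    """Return True when the service reports ok and the model is loaded."""
    return payload.get("status") == "ok" and payload.get("model_loaded") is True


def fetch_health(base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """GET /health and decode the JSON body (needs a running service)."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the container running, `is_ready(fetch_health())` can be polled in a loop until it returns `True`.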

Transcribe (Sync) - For short audio

POST /api/transcribe
Content-Type: multipart/form-data

Parameter       Type  Default   Description
file            File  required  Audio file (wav/mp3/flac/m4a/ogg/webm)
max_new_tokens  int   512       Max output tokens (1-2048)

curl -X POST http://localhost:7860/api/transcribe \
  -F "file=@audio.mp3" \
  -F "max_new_tokens=512"
{"status": "success", "text": "Transcribed text here..."}
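
The same request can be made from Python without third-party packages. This is a sketch under stated assumptions, not the project's client: `build_multipart` and `transcribe` are hypothetical helpers, the multipart body is assembled by hand, and the response is assumed to carry a `text` field as in the example above.

```python
import json
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(fields: dict, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one "file" part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: {mime}\r\n\r\n'.encode("utf-8")
        + data
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, max_new_tokens: int = 512,
               base_url: str = "http://localhost:7860") -> str:
    """POST an audio file to /api/transcribe and return the "text" field."""
    if not 1 <= max_new_tokens <= 2048:  # range documented in the table above
        raise ValueError("max_new_tokens must be between 1 and 2048")
    with open(path, "rb") as fh:
        body, content_type = build_multipart(
            {"max_new_tokens": max_new_tokens}, os.path.basename(path), fh.read()
        )
    req = urllib.request.Request(
        f"{base_url}/api/transcribe",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))["text"]
```

Calling `transcribe("audio.mp3")` mirrors the curl command above; for long recordings the streaming endpoint is the better fit.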

Transcribe (SSE Stream) - For long audio

POST /api/transcribe/stream
Content-Type: multipart/form-data

Returns Server-Sent Events with real-time progress:

Event Type  Description         Example
start       Processing started  {"type": "start"}
progress    Segment progress    {"type": "progress", "current": 3, "total": 10, "duration": 22.5}
partial     Segment result      {"type": "partial", "text": "Segment text..."}
done        Complete            {"type": "done", "text": "Full transcription..."}
error       Error occurred      {"type": "error", "message": "Error details"}

curl -X POST http://localhost:7860/api/transcribe/stream \
  -F "file=@long_audio.mp3"
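
A client has to split the stream into events and react to each `type`. The sketch below assumes standard SSE framing (each event arrives as one or more `data:` lines followed by a blank line) and that every payload is a JSON object like the examples in the table; the function names `iter_sse_events` and `final_text` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per SSE event.

    `data:` lines accumulate until a blank line dispatches the event.
    Other SSE fields and comment lines are ignored, and an unterminated
    event at the end of the stream is dropped.
    """
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data_lines.append(value[1:] if value.startswith(" ") else value)


def final_text(events: Iterable[dict]) -> str:
    """Return the text of the `done` event, raising on an `error` event."""
    for event in events:
        if event.get("type") == "error":
            raise RuntimeError(event.get("message", "unknown error"))
        if event.get("type") == "done":
            return event["text"]
    raise RuntimeError("stream ended without a done event")
```

With a `urllib.request.urlopen` response object `resp`, the decoded lines can be passed in as `iter_sse_events(line.decode("utf-8") for line in resp)`; `progress` and `partial` events can be handled inside the same loop to update a progress display.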

GPU Status

GET /gpu/status
{
  "model_loaded": true,
  "device": "cuda",
  "gpu_memory_used_mb": 4320.5,
  "gpu_memory_total_mb": 24576.0
}
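
The raw megabyte figures are easier to act on as free memory and a usage percentage. A small sketch, assuming the field names shown above; `gpu_memory_summary` is a hypothetical helper, not part of the service.

```python
def gpu_memory_summary(status: dict) -> dict:
    """Derive free memory and usage percentage from a /gpu/status payload."""
    used = float(status["gpu_memory_used_mb"])
    total = float(status["gpu_memory_total_mb"])
    return {
        "free_mb": total - used,
        # Guard against a zero total, e.g. when no GPU is reported.
        "used_percent": round(100.0 * used / total, 1) if total > 0 else 0.0,
    }
```

For the example response above this gives 20255.5 MB free and 17.6 percent used.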

Load/Unload Model

POST /gpu/load
POST /gpu/unload

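Interactive Documentation

Swagger UI and ReDoc are served by the running service; with FastAPI's default routes these would be at /docs and /redoc.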

⚙️ Configuration

Environment Variables

Variable          Default                    Description
MODEL_CHECKPOINT  zai-org/GLM-ASR-Nano-2512  HuggingFace model path
PORT              7860                       Service port
HF_HOME           /app/cache                 Model cache directory
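
To illustrate how these variables and their defaults combine, here is a hedged sketch of reading them in Python. It is not the service's actual startup code; the `Settings` class and `load_settings` function are hypothetical names introduced for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    model_checkpoint: str
    port: int
    hf_home: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        model_checkpoint=env.get("MODEL_CHECKPOINT", "zai-org/GLM-ASR-Nano-2512"),
        port=int(env.get("PORT", "7860")),  # raises ValueError if not numeric
        hf_home=env.get("HF_HOME", "/app/cache"),
    )
```

Passing a plain dict instead of `os.environ` makes the function easy to check without touching the real environment, for example `load_settings({"PORT": "8000"})`.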

docker-compose.yml

services:
  glm-asr:
    image: neosun/glm-asr:v2.0.1
    container_name: glm-asr
    ports:
      - "7860:7860"
    volumes:
      - ./cache:/app/cache
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

🏗️ Tech Stack

Component  Technology
Model      GLM-ASR-Nano-2512 (1.5B)
Backend    FastAPI + Uvicorn
Streaming  Server-Sent Events (SSE)
Frontend   HTML5 + Vanilla JS
Container  Docker + NVIDIA CUDA
API Docs   Swagger / ReDoc

📊 Benchmark

GLM-ASR-Nano achieves the lowest average error rate (4.10) among comparable models.

[Figure: Benchmark]


📝 Changelog

v2.0.1 (2025-12-28)

  • ✅ Migrated to FastAPI async framework
  • ✅ SSE streaming for real-time progress
  • ✅ Complete Swagger API documentation
  • ✅ Dual API mode: sync + streaming
  • ✅ Fixed browser timeout for long audio
  • ✅ Modern dark UI with progress display

v1.1.0 (2025-12-15)

  • ✅ VAD smart segmentation (silero-vad)
  • ✅ Support unlimited audio length

v1.0.0 (2025-12-14)

  • ✅ Initial release
  • ✅ Web UI with support for 4 languages
  • ✅ REST API with Swagger docs
  • ✅ Docker all-in-one image

📄 License

Apache License 2.0

