# GLM-ASR

All-in-One Speech Recognition Service based on GLM-ASR-Nano

Web UI • REST API • SSE Streaming • Swagger Docs
## ✨ Features
- 🎯 High Accuracy - Based on GLM-ASR-Nano-2512 (1.5B), outperforms Whisper V3
- 🌍 17 Languages - Chinese, English, Cantonese, Japanese, Korean, and more
- 🎤 Long Audio - VAD smart segmentation for unlimited audio length
- 🚀 SSE Streaming - Real-time progress and results for long audio
- 🖥️ Web UI - Modern dark-mode interface with support for 4 UI languages
- 🔌 REST API - Full API with Swagger documentation
- 💾 GPU Management - Manual load/unload for memory control
- 🐳 Docker Ready - One-command deployment with pre-loaded model
## 🚀 Quick Start

### Docker (Recommended)

```bash
docker run -d --gpus all -p 7860:7860 neosun/glm-asr:v2.0.1
```
Access:
- Web UI: http://localhost:7860
- Swagger Docs: http://localhost:7860/docs
- ReDoc: http://localhost:7860/redoc
### Docker Compose

```bash
git clone https://github.com/neosun100/glm-asr.git
cd glm-asr
docker compose up -d
```
## 📖 API Reference

### Base URL

```
http://localhost:7860
```

### Endpoints
#### Health Check

```http
GET /health
```

Response:

```json
{"status": "ok", "model_loaded": true}
```
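Because the model can be unloaded at runtime (see GPU Management below), a readiness probe should check both fields, not just the HTTP status. A minimal stdlib-only sketch — the helper names `is_ready` and `check_health` are ours, not part of the service:

```python
import json
import urllib.request


def is_ready(payload: dict) -> bool:
    """Interpret the /health response: ready only when the service
    is up AND the model is actually loaded."""
    return payload.get("status") == "ok" and payload.get("model_loaded") is True


def check_health(base_url: str = "http://localhost:7860") -> bool:
    """Probe the health endpoint once; returns False if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
            return is_ready(json.load(resp))
    except OSError:
        return False
```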
#### Transcribe (Sync) - For short audio

```http
POST /api/transcribe
Content-Type: multipart/form-data
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| file | File | required | Audio file (wav/mp3/flac/m4a/ogg/webm) |
| max_new_tokens | int | 512 | Max output tokens (1-2048) |
```bash
curl -X POST http://localhost:7860/api/transcribe \
  -F "file=@audio.mp3" \
  -F "max_new_tokens=512"
```

Response:

```json
{"status": "success", "text": "Transcribed text here..."}
```
#### Transcribe (SSE Stream) - For long audio

```http
POST /api/transcribe/stream
Content-Type: multipart/form-data
```

Returns Server-Sent Events with real-time progress:
| Event Type | Description | Example |
|---|---|---|
| `start` | Processing started | `{"type": "start"}` |
| `progress` | Segment progress | `{"type": "progress", "current": 3, "total": 10, "duration": 22.5}` |
| `partial` | Segment result | `{"type": "partial", "text": "Segment text..."}` |
| `done` | Complete | `{"type": "done", "text": "Full transcription..."}` |
| `error` | Error occurred | `{"type": "error", "message": "Error details"}` |
```bash
curl -X POST http://localhost:7860/api/transcribe/stream \
  -F "file=@long_audio.mp3"
```
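On the client side, each event above arrives as an SSE `data:` line carrying one JSON payload. A minimal consumer can be sketched as follows — it assumes one `data: <json>` line per event (the server's exact framing, e.g. extra `event:` fields, may differ), and the function names are ours:

```python
import json


def parse_sse(stream_lines):
    """Yield decoded event dicts from an iterable of SSE lines."""
    for line in stream_lines:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


def collect_transcript(events):
    """Accumulate partial segments; prefer the final 'done' payload."""
    partials = []
    for ev in events:
        if ev["type"] == "partial":
            partials.append(ev["text"])
        elif ev["type"] == "done":
            return ev["text"]
        elif ev["type"] == "error":
            raise RuntimeError(ev["message"])
    # Stream ended without 'done': fall back to the joined partials.
    return "".join(partials)
```

In practice you would feed `parse_sse` the response lines of the streaming POST as they arrive, and update a progress bar from the `progress` events.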
#### GPU Status

```http
GET /gpu/status
```

```json
{
  "model_loaded": true,
  "device": "cuda",
  "gpu_memory_used_mb": 4320.5,
  "gpu_memory_total_mb": 24576.0
}
```
#### Load/Unload Model

```http
POST /gpu/load
POST /gpu/unload
```
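For batch jobs that should not hold GPU memory between runs, the load/unload pair fits a context-manager pattern: load on entry, always unload on exit. A sketch under that assumption (`model_loaded` and `_post` are our names; `post` is injectable so the flow can be tested without a server):

```python
import contextlib
import urllib.request


def _post(url: str) -> None:
    """Empty-body POST to a control endpoint."""
    urllib.request.urlopen(urllib.request.Request(url, data=b"", method="POST"))


@contextlib.contextmanager
def model_loaded(base_url: str = "http://localhost:7860", post=_post):
    """Load the model on entry, unload on exit to free GPU memory."""
    post(f"{base_url}/gpu/load")
    try:
        yield
    finally:
        post(f"{base_url}/gpu/unload")
```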
### Interactive Documentation
- Swagger UI: http://localhost:7860/docs
- ReDoc: http://localhost:7860/redoc
## ⚙️ Configuration

### Environment Variables
| Variable | Default | Description |
|---|---|---|
| `MODEL_CHECKPOINT` | `zai-org/GLM-ASR-Nano-2512` | HuggingFace model path |
| `PORT` | `7860` | Service port |
| `HF_HOME` | `/app/cache` | Model cache directory |
### docker-compose.yml

```yaml
services:
  glm-asr:
    image: neosun/glm-asr:v2.0.1
    container_name: glm-asr
    ports:
      - "7860:7860"
    volumes:
      - ./cache:/app/cache
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
## 🏗️ Tech Stack
| Component | Technology |
|---|---|
| Model | GLM-ASR-Nano-2512 (1.5B) |
| Backend | FastAPI + Uvicorn |
| Streaming | Server-Sent Events (SSE) |
| Frontend | HTML5 + Vanilla JS |
| Container | Docker + NVIDIA CUDA |
| API Docs | Swagger / ReDoc |
## 📊 Benchmark

GLM-ASR-Nano achieves the lowest average error rate (4.10) among comparable models.
## 📝 Changelog

### v2.0.1 (2025-12-28)
- ✅ Migrated to FastAPI async framework
- ✅ SSE streaming for real-time progress
- ✅ Complete Swagger API documentation
- ✅ Dual API mode: sync + streaming
- ✅ Fixed browser timeout for long audio
- ✅ Modern dark UI with progress display
### v1.1.0 (2025-12-15)
- ✅ VAD smart segmentation (silero-vad)
- ✅ Support unlimited audio length
### v1.0.0 (2025-12-14)
- ✅ Initial release
- ✅ Web UI with 4 language support
- ✅ REST API with Swagger docs
- ✅ Docker all-in-one image
## 📄 License
## File details

Details for the file `iflow_mcp_neosun100_glm_asr-1.0.0.tar.gz`.

### File metadata
- Download URL: iflow_mcp_neosun100_glm_asr-1.0.0.tar.gz
- Upload date:
- Size: 14.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.28
### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `2c247f4567594cc0c21fc9b3703bef1ac712fc93665cf6f80bd191d914075852` |
| MD5 | `4ff4578c323c62d77bdee9c9b253228c` |
| BLAKE2b-256 | `5186b974f58db1ff4a470a7a849de9f1bcbf33049606bd4d1f1324d7c9b9f3e1` |
## File details

Details for the file `iflow_mcp_neosun100_glm_asr-1.0.0-py3-none-any.whl`.

### File metadata
- Download URL: iflow_mcp_neosun100_glm_asr-1.0.0-py3-none-any.whl
- Upload date:
- Size: 15.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.28
### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `76dcd03013bdf00d9d88e477b882bb6bf1ca0f8126edb4a13ca5af44e13dfe84` |
| MD5 | `2bdba77d26b6cff1a1deb58dbadfaa5a` |
| BLAKE2b-256 | `f093bdb0452511bb4406edb400e1943dd6a82638c04b11094c182ea7f9d00e3b` |