Qwen3 ASR model for fasr

fasr-asr-qwen3asr

Chinese documentation

Qwen3-ASR speech recognition for fasr. The plugin uses the bundled Qwen3-ASR vLLM backend and supports both batch transcription and cumulative streaming transcription on the same loaded engine.

Install

```shell
pip install fasr-asr-qwen3asr
```

Registered Model

| Registry name | Class selected by `size` | Best for |
| --- | --- | --- |
| `qwen3asr` | `Qwen3ASRSmall` or `Qwen3ASRLarge` | GPU ASR without word timestamps |

Use `size="small"` for `Qwen/Qwen3-ASR-0.6B` and `size="large"` for `Qwen/Qwen3-ASR-1.7B`.
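The size-to-checkpoint mapping above can be sketched as a plain helper. This is illustrative only; `resolve_checkpoint` is not part of the plugin API, it just encodes the documented mapping:

```python
# Illustrative mapping from the plugin's `size` argument to the
# Hugging Face checkpoint it selects, as documented above.
_SIZE_TO_CHECKPOINT = {
    "small": "Qwen/Qwen3-ASR-0.6B",
    "large": "Qwen/Qwen3-ASR-1.7B",
}

def resolve_checkpoint(size: str) -> str:
    """Return the checkpoint id for a given `size`, rejecting unknown values."""
    if size not in _SIZE_TO_CHECKPOINT:
        raise ValueError(f"size must be 'small' or 'large', got {size!r}")
    return _SIZE_TO_CHECKPOINT[size]
```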

Pipeline Usage

```python
from fasr import AudioPipeline

pipeline = (
    AudioPipeline()
    .add_pipe("detector", model="fsmn")
    .add_pipe(
        "recognizer",
        model="qwen3asr",
        size="small",
        gpu_memory_utilization=0.7,
        max_new_tokens=2048,
    )
    .add_pipe("sentencizer", model="ct_transformer")
)
```

Quick choices:

| Goal | Use | Result |
| --- | --- | --- |
| Lower VRAM | `size="small"` | Uses the 0.6B checkpoint |
| Better accuracy | `size="large"` | Uses the 1.7B checkpoint; needs more GPU memory |
| Leave GPU headroom | `gpu_memory_utilization=0.6` | vLLM reserves less memory |
| Long-form output | `max_new_tokens=4096` or higher | Allows longer generation |
| Bias vocabulary | `context="..."` | Adds prompt context for names or rare terms |
| Force language | `language="zh"` | Uses a fixed language hint |
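The table reads as a recipe for constructor keyword arguments. A hypothetical helper (not part of fasr; the option names `low_vram`, `long_form`, and `vocabulary` are invented here) that assembles them might look like:

```python
def qwen3asr_kwargs(low_vram=False, long_form=False, vocabulary=None, language=None):
    """Build keyword arguments for the `qwen3asr` model per the table above.

    Only the produced keys mirror documented parameters; the helper itself
    and its option names are illustrative.
    """
    kwargs = {"size": "small" if low_vram else "large"}
    if low_vram:
        kwargs["gpu_memory_utilization"] = 0.6  # leave headroom for other processes
    if long_form:
        kwargs["max_new_tokens"] = 4096  # allow longer generation
    if vocabulary:
        kwargs["context"] = vocabulary  # bias decoding toward rare terms
    if language:
        kwargs["language"] = language  # fixed language hint
    return kwargs
```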

Confection Config

```ini
[asr_model]
@asr_models = "qwen3asr"
size = "small"
gpu_memory_utilization = 0.7
max_new_tokens = 2048
language = "zh"
context = "product names: fasr, Qwen3-ASR"
```

Inside a pipeline:

```ini
[pipeline]
@pipelines = "AudioPipeline.v1"
pipe_order = ["recognizer"]

[pipeline.pipes]

[pipeline.pipes.recognizer]
@pipes = "thread_pipe"
batch_size = 1

[pipeline.pipes.recognizer.component]
@components = "recognizer"

[pipeline.pipes.recognizer.component.model]
@asr_models = "qwen3asr"
size = "small"
gpu_memory_utilization = 0.7
max_new_tokens = 2048
language = "zh"
context = "product names: fasr, Qwen3-ASR"
```

Direct Model Usage

```python
from fasr.config import registry

model = registry.asr_models.get("qwen3asr")(
    size="small",
    gpu_memory_utilization=0.7,
)

spans = model.transcribe(audio_spans)
for span in spans:
    print(span.text)
```

Streaming uses the same model instance and returns cumulative text. Consumers should overwrite the displayed partial text instead of appending deltas:

```python
model = registry.asr_models.get("qwen3asr")(
    size="small",
    chunk_size_ms=2000,
    language="zh",
)

for chunk in audio_chunks:
    span = model.push_chunk(chunk)
    if span is not None:
        print(span.text)
```
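Because each streaming result is the full cumulative text, and earlier tokens may be revised, a display loop should redraw the line rather than append. A minimal pure-Python sketch of that consumer logic, independent of fasr:

```python
def render_cumulative(previous: str, current: str) -> str:
    """Return a terminal control string that replaces `previous` with `current`.

    A carriage return plus trailing padding ensures a shorter revision fully
    overwrites the old partial text. Illustrative consumer logic only; it is
    not part of the plugin.
    """
    pad = max(0, len(previous) - len(current))
    return "\r" + current + " " * pad
```

In a real loop, `print(render_cumulative(prev, span.text), end="")` would keep a single updating line instead of stacking stale partials.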

Parameters

| Parameter | Type / range | Default | Higher / true | Lower / false | Change when |
| --- | --- | --- | --- | --- | --- |
| `size` | `"small"` or `"large"` | `"small"` | `"large"` improves capacity, needs more VRAM | `"small"` is cheaper | Accuracy or resource budget changes |
| `gpu_memory_utilization` | float, (0, 1] | 0.8 | More KV-cache headroom, more VRAM reserved | Leaves more VRAM for other processes | vLLM OOMs or underuses GPU |
| `max_new_tokens` | int >= 1 | 4096 | Longer outputs, more decode work | Shorter cap, less compute | Text is truncated or memory is tight |
| `max_inference_batch_size` | int, -1 or positive | -1 | -1 lets the backend choose | Smaller cap reduces peak memory | Batch inference OOMs |
| `max_model_len` | int or None | None | Longer prompt+generation context | Shorter context, less memory | Long prompts or OOMs |
| `language` | str or None | None | Fixed language hint | Automatic language behavior | Language is known |
| `context` | str | `""` | More biasing context | Less prompt bias | Rare terms or names are missed |
| `chunk_size_ms` | int or None | None | Less frequent streaming decode | More responsive streaming | Streaming latency/throughput tuning |
| `unfixed_chunk_num` | int or None | None | More initial chunks without prefix fixing | Earlier prefix use | Early streaming text is unstable |
| `unfixed_token_num` | int or None | None | More rollback, more correction room | Less rollback, more stable prefix | Streaming revisions are too aggressive or too sticky |
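To make `chunk_size_ms` concrete for whatever feeds the streaming loop, here is a standalone sketch that slices a sample buffer into fixed-duration chunks. The slicing itself is generic arithmetic; how the plugin buffers audio internally may differ:

```python
def chunk_samples(samples, sample_rate_hz, chunk_size_ms):
    """Split a 1-D sample sequence into consecutive chunks of `chunk_size_ms`.

    The final chunk may be shorter. Generic slicing for illustration,
    not the plugin's internal buffering.
    """
    step = sample_rate_hz * chunk_size_ms // 1000  # samples per chunk
    return [samples[i:i + step] for i in range(0, len(samples), step)]
```

For example, 5 s of 16 kHz audio with `chunk_size_ms=2000` yields two full 32 000-sample chunks plus one 16 000-sample remainder.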

Generic checkpoint fields such as checkpoint, cache_dir, endpoint, revision, and force_download are inherited from the base model.
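For example, the inherited fields can sit alongside the plugin's own parameters in a confection block (the values below are illustrative, not defaults):

```ini
[asr_model]
@asr_models = "qwen3asr"
size = "small"
checkpoint = "Qwen/Qwen3-ASR-0.6B"
cache_dir = "./models"
revision = "main"
force_download = false
```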

Output

  • Batch mode writes full text to span.raw_text.
  • Streaming mode returns cumulative AudioSpan(raw_text=...).
  • Word or character timestamps are not returned by this plugin.

Dependencies

  • fasr
  • transformers == 4.57.6
  • vllm == 0.14.0
  • accelerate == 1.12.0
  • librosa
  • soundfile
  • Python 3.10-3.12
