# fasr-asr-qwen3asr

Qwen3-ASR speech recognition for fasr. The plugin uses the bundled Qwen3-ASR vLLM backend and supports both batch transcription and cumulative streaming transcription on the same loaded engine.
## Install

```shell
pip install fasr-asr-qwen3asr
```
## Registered Model

| Registry name | Class selected by `size` | Best for |
|---|---|---|
| `qwen3asr` | `Qwen3ASRSmall` or `Qwen3ASRLarge` | GPU ASR without word timestamps |

Use `size="small"` for `Qwen/Qwen3-ASR-0.6B` and `size="large"` for `Qwen/Qwen3-ASR-1.7B`.
## Pipeline Usage

```python
from fasr import AudioPipeline

pipeline = (
    AudioPipeline()
    .add_pipe("detector", model="fsmn")
    .add_pipe(
        "recognizer",
        model="qwen3asr",
        size="small",
        gpu_memory_utilization=0.7,
        max_new_tokens=2048,
    )
    .add_pipe("sentencizer", model="ct_transformer")
)
```
Quick choices:

| Goal | Use | Result |
|---|---|---|
| Lower VRAM | `size="small"` | Uses the 0.6B checkpoint |
| Better accuracy | `size="large"` | Uses the 1.7B checkpoint, needs more GPU memory |
| Leave GPU headroom | `gpu_memory_utilization=0.6` | vLLM reserves less memory |
| Long-form output | `max_new_tokens=4096` or higher | Allows longer generation |
| Bias vocabulary | `context="..."` | Adds prompt context for names or rare terms |
| Force language | `language="zh"` | Uses a fixed language hint |
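`gpu_memory_utilization` is a fraction of total device memory that vLLM is allowed to pre-allocate for weights and KV cache. As quick sizing arithmetic (illustrative only; `reserved_vram_gb` is a hypothetical helper, not a fasr or vLLM function):

```python
def reserved_vram_gb(total_gb: float, utilization: float) -> float:
    """VRAM vLLM may pre-allocate at the given utilization fraction."""
    return total_gb * utilization

# On a 24 GB GPU, utilization 0.7 leaves roughly 7 GB for other processes.
print(round(reserved_vram_gb(24, 0.7), 1))  # ~16.8 GB reserved
```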
## Confection Config

```ini
[asr_model]
@asr_models = "qwen3asr"
size = "small"
gpu_memory_utilization = 0.7
max_new_tokens = 2048
language = "zh"
context = "product names: fasr, Qwen3-ASR"
```
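fasr resolves this block through its registry at load time. As a rough structural illustration only (not the fasr loader), the fragment is close enough to INI that the standard-library `configparser` can read its sections and keys; note that confection additionally JSON-parses quoted values, which `configparser` does not:

```python
import configparser

# Raw text of the [asr_model] block above; configparser keeps values as
# literal strings (quotes included), unlike confection's typed parsing.
cfg_text = """
[asr_model]
@asr_models = "qwen3asr"
size = "small"
gpu_memory_utilization = 0.7
max_new_tokens = 2048
"""

parser = configparser.ConfigParser()
parser.read_string(cfg_text)
print(parser["asr_model"]["gpu_memory_utilization"])  # prints 0.7 (as a string)
```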
Inside a pipeline:

```ini
[pipeline]
@pipelines = "AudioPipeline.v1"
pipe_order = ["recognizer"]

[pipeline.pipes]

[pipeline.pipes.recognizer]
@pipes = "thread_pipe"
batch_size = 1

[pipeline.pipes.recognizer.component]
@components = "recognizer"

[pipeline.pipes.recognizer.component.model]
@asr_models = "qwen3asr"
size = "small"
gpu_memory_utilization = 0.7
max_new_tokens = 2048
language = "zh"
context = "product names: fasr, Qwen3-ASR"
```
## Direct Model Usage

```python
from fasr.config import registry

model = registry.asr_models.get("qwen3asr")(
    size="small",
    gpu_memory_utilization=0.7,
)
spans = model.transcribe(audio_spans)
for span in spans:
    print(span.text)
```
Streaming uses the same model instance and returns cumulative text. Consumers should overwrite the displayed partial text instead of appending deltas:

```python
model = registry.asr_models.get("qwen3asr")(
    size="small",
    chunk_size_ms=2000,
    language="zh",
)
for chunk in audio_chunks:
    span = model.push_chunk(chunk)
    if span is not None:
        print(span.text)
```
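Because each partial result is the full transcript so far, a terminal consumer should rewrite the current line rather than append. A minimal sketch of that display logic (`overwrite_line` is a hypothetical helper, not part of fasr; the `partials` list stands in for successive `span.text` values):

```python
import sys

def overwrite_line(text: str, prev_len: int) -> int:
    """Redraw the current terminal line, padding over any leftover characters."""
    pad = max(prev_len - len(text), 0)
    sys.stdout.write("\r" + text + " " * pad)
    sys.stdout.flush()
    return len(text)

# Cumulative partials, as returned by push_chunk: each one replaces the last.
partials = ["he", "hello wor", "hello world"]
prev = 0
for text in partials:
    prev = overwrite_line(text, prev)
sys.stdout.write("\n")
```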
## Parameters

| Parameter | Type / range | Default | Higher / true | Lower / false | Change when |
|---|---|---|---|---|---|
| `size` | `"small"` or `"large"` | `"small"` | `"large"` improves capacity, needs more VRAM | `"small"` is cheaper | Accuracy or resource budget changes |
| `gpu_memory_utilization` | float, (0, 1] | 0.8 | More KV-cache headroom, more VRAM reserved | Leaves more VRAM for other processes | vLLM OOMs or underuses GPU |
| `max_new_tokens` | int >= 1 | 4096 | Longer outputs, more decode work | Shorter cap, less compute | Text is truncated or memory is tight |
| `max_inference_batch_size` | int, -1 or positive | -1 | -1 lets backend choose | Smaller cap reduces peak memory | Batch inference OOMs |
| `max_model_len` | int or None | None | Longer prompt+generation context | Shorter context, less memory | Long prompts or OOMs |
| `language` | str or None | None | Fixed language hint | Auto language behavior | Language is known |
| `context` | str | `""` | More biasing context | Less prompt bias | Rare terms or names are missed |
| `chunk_size_ms` | int or None | None | Less frequent streaming decode | More responsive streaming | Streaming latency/throughput tuning |
| `unfixed_chunk_num` | int or None | None | More initial chunks without prefix fixing | Earlier prefix use | Early streaming text is unstable |
| `unfixed_token_num` | int or None | None | More rollback, more correction room | Less rollback, more stable prefix | Streaming revisions are too aggressive or too sticky |
Generic checkpoint fields such as `checkpoint`, `cache_dir`, `endpoint`, `revision`, and `force_download` are inherited from the base model.
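`chunk_size_ms` is specified in milliseconds, so the number of samples per streaming chunk depends on the sample rate. A small sketch of the slicing arithmetic (`chunk_audio` is a hypothetical helper, not a fasr API; assumes mono audio as a flat sample sequence):

```python
def chunk_audio(samples, sample_rate_hz, chunk_ms):
    """Slice mono audio samples into consecutive chunk_ms-long pieces."""
    step = sample_rate_hz * chunk_ms // 1000  # samples per chunk
    return [samples[i:i + step] for i in range(0, len(samples), step)]

five_seconds = [0.0] * (16000 * 5)             # 5 s of silence at 16 kHz
chunks = chunk_audio(five_seconds, 16000, 2000)
print(len(chunks))                             # 3 chunks: 2 s + 2 s + 1 s
```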
## Output

- Batch mode writes full text to `span.raw_text`.
- Streaming mode returns cumulative `AudioSpan(raw_text=...)`.
- Word or character timestamps are not returned by this plugin.
## Dependencies

- fasr
- transformers == 4.57.6
- vllm == 0.14.0
- accelerate == 1.12.0
- librosa
- soundfile
- Python 3.10-3.12