# fasr-asr-qwen3asr

Qwen3-ASR speech recognition for fasr. The plugin uses the bundled Qwen3-ASR vLLM backend and supports both batch transcription and cumulative streaming transcription on the same loaded engine.
## Install

```shell
pip install fasr-asr-qwen3asr
```
## Registered Model

| Registry name | Class selected by `size` | Best for |
|---|---|---|
| `qwen3asr` | `Qwen3ASRSmall` or `Qwen3ASRLarge` | GPU ASR without word timestamps |

Use `size="small"` for `Qwen/Qwen3-ASR-0.6B` and `size="large"` for `Qwen/Qwen3-ASR-1.7B`.
## Pipeline Usage

```python
from fasr import AudioPipeline

pipeline = (
    AudioPipeline()
    .add_pipe("detector", model="fsmn")
    .add_pipe(
        "recognizer",
        model="qwen3asr",
        size="small",
        gpu_memory_utilization=0.7,
        max_new_tokens=2048,
    )
    .add_pipe("sentencizer", model="ct_transformer")
)
```
Quick choices:

| Goal | Use | Result |
|---|---|---|
| Lower VRAM | `size="small"` | Uses the 0.6B checkpoint |
| Better accuracy | `size="large"` | Uses the 1.7B checkpoint, needs more GPU memory |
| Leave GPU headroom | `gpu_memory_utilization=0.6` | vLLM reserves less memory |
| Long-form output | `max_new_tokens=4096` or higher | Allows longer generation |
| Bias vocabulary | `context="..."` | Adds prompt context for names or rare terms |
| Force language | `language="zh"` | Uses a fixed language hint |
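`gpu_memory_utilization` is a fraction of total device memory that vLLM is allowed to pre-allocate for weights and KV cache. As quick sizing arithmetic (illustrative only; `reserved_vram_gb` is a hypothetical helper, not a fasr or vLLM function):

```python
def reserved_vram_gb(total_gb: float, utilization: float) -> float:
    """VRAM vLLM may pre-allocate at the given utilization fraction."""
    return total_gb * utilization

# On a 24 GB GPU, utilization 0.7 leaves roughly 7 GB for other processes.
print(round(reserved_vram_gb(24, 0.7), 1))  # ~16.8 GB reserved
```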
## Confection Config

```ini
[asr_model]
@asr_models = "qwen3asr"
size = "small"
gpu_memory_utilization = 0.7
max_new_tokens = 2048
language = "zh"
context = "product names: fasr, Qwen3-ASR"
```
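fasr resolves this block through its registry at load time. As a rough structural illustration only (not the fasr loader), the fragment is close enough to INI that the standard-library `configparser` can read its sections and keys; note that confection additionally JSON-parses quoted values, which `configparser` does not:

```python
import configparser

# Raw text of the [asr_model] block above; configparser keeps values as
# literal strings (quotes included), unlike confection's typed parsing.
cfg_text = """
[asr_model]
@asr_models = "qwen3asr"
size = "small"
gpu_memory_utilization = 0.7
max_new_tokens = 2048
"""

parser = configparser.ConfigParser()
parser.read_string(cfg_text)
print(parser["asr_model"]["gpu_memory_utilization"])  # prints 0.7 (as a string)
```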
Inside a pipeline:

```ini
[pipeline]
@pipelines = "AudioPipeline.v1"
pipe_order = ["recognizer"]

[pipeline.pipes]

[pipeline.pipes.recognizer]
@pipes = "thread_pipe"
batch_size = 1

[pipeline.pipes.recognizer.component]
@components = "recognizer"

[pipeline.pipes.recognizer.component.model]
@asr_models = "qwen3asr"
size = "small"
gpu_memory_utilization = 0.7
max_new_tokens = 2048
language = "zh"
context = "product names: fasr, Qwen3-ASR"
```
## Direct Model Usage

```python
from fasr.config import registry

model = registry.asr_models.get("qwen3asr")(
    size="small",
    gpu_memory_utilization=0.7,
)
spans = model.transcribe(audio_spans)
for span in spans:
    print(span.text)
```
Streaming uses the same model instance and returns cumulative text. Consumers should overwrite the displayed partial text instead of appending deltas:

```python
model = registry.asr_models.get("qwen3asr")(
    size="small",
    chunk_size_ms=2000,
    language="zh",
)
for chunk in audio_chunks:
    span = model.push_chunk(chunk)
    if span is not None:
        print(span.text)
```
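Because each partial result is the full transcript so far, a terminal consumer should rewrite the current line rather than append. A minimal sketch of that display logic (`overwrite_line` is a hypothetical helper, not part of fasr; the `partials` list stands in for successive `span.text` values):

```python
import sys

def overwrite_line(text: str, prev_len: int) -> int:
    """Redraw the current terminal line, padding over any leftover characters."""
    pad = max(prev_len - len(text), 0)
    sys.stdout.write("\r" + text + " " * pad)
    sys.stdout.flush()
    return len(text)

# Cumulative partials, as returned by push_chunk: each one replaces the last.
partials = ["he", "hello wor", "hello world"]
prev = 0
for text in partials:
    prev = overwrite_line(text, prev)
sys.stdout.write("\n")
```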
## Parameters

| Parameter | Type / range | Default | Higher / true | Lower / false | Change when |
|---|---|---|---|---|---|
| `size` | `"small"` or `"large"` | `"small"` | `"large"` improves capacity, needs more VRAM | `"small"` is cheaper | Accuracy or resource budget changes |
| `gpu_memory_utilization` | float, (0, 1] | 0.8 | More KV-cache headroom, more VRAM reserved | Leaves more VRAM for other processes | vLLM OOMs or underuses GPU |
| `max_new_tokens` | int >= 1 | 4096 | Longer outputs, more decode work | Shorter cap, less compute | Text is truncated or memory is tight |
| `max_inference_batch_size` | int, -1 or positive | -1 | -1 lets backend choose | Smaller cap reduces peak memory | Batch inference OOMs |
| `max_model_len` | int or None | None | Longer prompt+generation context | Shorter context, less memory | Long prompts or OOMs |
| `language` | str or None | None | Fixed language hint | Auto language behavior | Language is known |
| `context` | str | `""` | More biasing context | Less prompt bias | Rare terms or names are missed |
| `chunk_size_ms` | int or None | None | Less frequent streaming decode | More responsive streaming | Streaming latency/throughput tuning |
| `unfixed_chunk_num` | int or None | None | More initial chunks without prefix fixing | Earlier prefix use | Early streaming text is unstable |
| `unfixed_token_num` | int or None | None | More rollback, more correction room | Less rollback, more stable prefix | Streaming revisions are too aggressive or too sticky |
Generic checkpoint fields such as `checkpoint`, `cache_dir`, `endpoint`, `revision`, and `force_download` are inherited from the base model.
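`chunk_size_ms` is specified in milliseconds, so the number of samples per streaming chunk depends on the sample rate. A small sketch of the slicing arithmetic (`chunk_audio` is a hypothetical helper, not a fasr API; assumes mono audio as a flat sample sequence):

```python
def chunk_audio(samples, sample_rate_hz, chunk_ms):
    """Slice mono audio samples into consecutive chunk_ms-long pieces."""
    step = sample_rate_hz * chunk_ms // 1000  # samples per chunk
    return [samples[i:i + step] for i in range(0, len(samples), step)]

five_seconds = [0.0] * (16000 * 5)             # 5 s of silence at 16 kHz
chunks = chunk_audio(five_seconds, 16000, 2000)
print(len(chunks))                             # 3 chunks: 2 s + 2 s + 1 s
```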
## Output

- Batch mode writes full text to `span.raw_text`.
- Streaming mode returns cumulative `AudioSpan(raw_text=...)`.
- Word or character timestamps are not returned by this plugin.
## Dependencies

- fasr
- transformers == 4.57.6
- vllm == 0.14.0
- accelerate == 1.12.0
- librosa
- soundfile
- Python 3.10-3.12