FireRed ASR for fasr (bundled fireredasr2 inference)
fasr-asr-firered
FireRedASR2 speech recognition for fasr. The plugin exposes both AED decoding and LLM decoding. AED can return token timestamps; LLM focuses on full-text accuracy without timestamps.
Install
pip install fasr-asr-firered
Registered Models
| Registry name | Class | Best for |
|---|---|---|
| firered | FireRedAEDForASR | Default alias for AED mode |
| firered_aed | FireRedAEDForASR | Timestamped AED recognition |
| firered_llm | FireRedLLMForASR | LLM decoding, no timestamps |
Default checkpoints:
| Model | Checkpoint |
|---|---|
| firered_aed | FireRedTeam/FireRedASR2-AED |
| firered_llm | FireRedTeam/FireRedASR2-LLM |
Pipeline Usage
from fasr import AudioPipeline

pipeline = (
    AudioPipeline()
    .add_pipe("detector", model="fsmn")
    .add_pipe(
        "recognizer",
        model="firered_aed",
        device="cuda",
        beam_size=3,
        return_timestamp=True,
    )
    .add_pipe("sentencizer", model="ct_transformer")
)
Quick choices:
| Goal | Use | Result |
|---|---|---|
| Token timestamps | model="firered_aed", return_timestamp=True | Populates span.tokens |
| Full-text decoding | model="firered_llm" | Populates span.raw_text, no timestamps |
| Lower VRAM for AED | use_half=True | FP16 inference on GPU |
| CPU inference | device="cpu" | Runs without CUDA, slower |
| Wider search | beam_size=5 | Potentially better accuracy, slower |
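The choices in the table can be collected into keyword arguments for the recognizer pipe. The following is a minimal sketch, not part of the plugin: recognizer_kwargs is a hypothetical helper, and turning FP16 off on CPU is an assumption based on the table describing use_half as GPU inference.

```python
def recognizer_kwargs(timestamps, device="cuda", half=False, beam_size=3):
    """Build recognizer keyword arguments from the goals in the table.

    Hypothetical helper for illustration; only the parameter names
    (model, device, beam_size, return_timestamp, use_half) come from the README.
    """
    if beam_size < 1:
        raise ValueError("beam_size must be >= 1")

    if timestamps:
        # Token timestamps are only available in AED mode.
        kwargs = {"model": "firered_aed", "return_timestamp": True}
        # Assumption: FP16 is only useful on GPU, so it is disabled on CPU.
        kwargs["use_half"] = bool(half and device == "cuda")
    else:
        # Full-text decoding without timestamps uses the LLM decoder.
        kwargs = {"model": "firered_llm"}

    kwargs["device"] = device
    kwargs["beam_size"] = beam_size
    return kwargs


aed_on_cpu = recognizer_kwargs(timestamps=True, device="cpu", half=True, beam_size=5)
llm_default = recognizer_kwargs(timestamps=False)
```

The resulting dictionary could then be unpacked into the call shown under Pipeline Usage, for example add_pipe("recognizer", **aed_on_cpu).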
Confection Config
[asr_model]
@asr_models = "firered_aed"
device = "cuda"
beam_size = 3
return_timestamp = true
use_half = true
Inside a pipeline:
[pipeline]
@pipelines = "AudioPipeline.v1"
pipe_order = ["recognizer"]
[pipeline.pipes]
[pipeline.pipes.recognizer]
@pipes = "thread_pipe"
batch_size = 2
[pipeline.pipes.recognizer.component]
@components = "recognizer"
[pipeline.pipes.recognizer.component.model]
@asr_models = "firered_aed"
device = "cuda"
beam_size = 3
return_timestamp = true
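The dotted section names in the config above describe nesting: each dot opens a block inside its parent. As a rough illustration of that structure only, the sketch below rebuilds the nesting with the standard library. It is not how fasr or confection loads configs, it does not resolve registry references such as @asr_models, and to_nested is a hypothetical helper.

```python
import configparser
import json

CONFIG_TEXT = """
[pipeline]
@pipelines = "AudioPipeline.v1"
pipe_order = ["recognizer"]

[pipeline.pipes]

[pipeline.pipes.recognizer]
@pipes = "thread_pipe"
batch_size = 2

[pipeline.pipes.recognizer.component]
@components = "recognizer"

[pipeline.pipes.recognizer.component.model]
@asr_models = "firered_aed"
device = "cuda"
beam_size = 3
return_timestamp = true
"""


def to_nested(text):
    """Turn dotted section names into nested dicts (illustration only)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys exactly as written
    parser.read_string(text)

    root = {}
    for section in parser.sections():
        node = root
        # Walk or create one dict level per dotted name part.
        for part in section.split("."):
            node = node.setdefault(part, {})
        # Values in this config are JSON-compatible literals.
        for key, raw in parser.items(section):
            node[key] = json.loads(raw)
    return root


config = to_nested(CONFIG_TEXT)
model_block = config["pipeline"]["pipes"]["recognizer"]["component"]["model"]
```

Here model_block holds the same four settings as the standalone [asr_model] block, which is why the two forms are interchangeable for the recognizer's model.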
Direct Model Usage
from fasr.config import registry

model = registry.asr_models.get("firered_aed")(
    device="cuda",
    beam_size=3,
    return_timestamp=True,
)

# audio_spans is assumed to be prepared beforehand (not shown here).
spans = model.transcribe(audio_spans)
for span in spans:
    print(span.text)
Use local weights:
model.load_checkpoint("/path/to/FireRedASR2-AED")
Shared Parameters
| Parameter | Type / range | Default | Higher / true | Lower / false | Change when |
|---|---|---|---|---|---|
| device | str or None | None | "cuda" uses GPU | "cpu" uses CPU | Deployment target changes |
| beam_size | int >= 1 | 3 | Wider search, slower, more memory | Faster, possibly lower accuracy | Accuracy/speed tradeoff |
| decode_max_len | int >= 0 | 0 | Allows longer outputs | Shorter cap; 0 lets backend decide | Output is truncated or too long |
AED Parameters
| Parameter | Type / range | Default | Higher / true | Lower / false | Change when |
|---|---|---|---|---|---|
| use_half | bool | True | Lower VRAM, faster on GPU | FP32, more stable | GPU memory or numeric stability matters |
| nbest | int >= 1 | 1 | More hypotheses | Single best result | You need alternative hypotheses |
| softmax_smoothing | float | 1.25 | Smoother distribution | Sharper distribution | Beam search needs tuning |
| aed_length_penalty | float | 0.6 | Favors different output lengths | Less length adjustment | Output is too short or too long |
| eos_penalty | float | 1.0 | Discourages ending too early | Easier EOS | Decoding ends too early or too late |
| return_timestamp | bool | True | Returns token timestamps | Text only | You need word/character timing |
| elm_weight | float | 0.0 | More external LM influence | 0.0 disables external LM | You provide elm_dir |
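To make the "smoother" versus "sharper" wording for softmax_smoothing concrete, here is a small sketch. It assumes the value acts as a divisor on the decoder logits before the softmax, which is a common convention; this README does not show the plugin's actual implementation, and smoothed_softmax is a hypothetical name.

```python
import math


def smoothed_softmax(logits, smoothing=1.25):
    """Softmax over logits divided by a smoothing factor (assumed semantics)."""
    if smoothing <= 0:
        raise ValueError("smoothing must be positive")
    scaled = [x / smoothing for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = smoothed_softmax(logits, smoothing=1.0)    # plain softmax
smooth = smoothed_softmax(logits, smoothing=1.25)  # flatter distribution
```

Under that assumption, a larger value moves probability mass from the top candidate to the alternatives, so beam search keeps lower-ranked hypotheses competitive for longer.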
LLM Parameters
| Parameter | Type / range | Default | Higher value | Lower value | Change when |
|---|---|---|---|---|---|
| decode_min_len | int >= 0 | 0 | Forces longer minimum output | Allows shorter output | Output ends too early |
| repetition_penalty | float | 1.2 | Stronger repetition suppression | Allows more repetition | Repeated phrases appear |
| llm_length_penalty | float | 0.0 | Adjusts length preference | Less length adjustment | Output length is biased |
| temperature | float >= 0 | 1.0 | More diverse, less deterministic | More deterministic | You need stability or diversity |
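As an illustration of what repetition_penalty does to decoding scores, the sketch below applies the penalty in the way commonly used by text-generation libraries: scores of tokens that were already generated are divided by the penalty when positive and multiplied by it when negative. Whether the plugin follows exactly this rule is an assumption, and apply_repetition_penalty is a hypothetical name.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Lower the scores of tokens that already appear in the output.

    logits: list of per-token scores for the next step.
    generated_ids: token indices produced so far.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Both branches push the score down when penalty > 1.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


step_logits = [2.0, -1.0, 0.5]
penalized = apply_repetition_penalty(step_logits, generated_ids=[0, 1, 0])
```

With penalty=1.0 the scores are unchanged, which matches the table's reading that lower values allow more repetition.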
Generic checkpoint fields such as checkpoint, cache_dir, endpoint,
revision, and force_download are inherited from the base model.
Output
- AED writes span.raw_text.
- AED also fills span.tokens when return_timestamp=True.
- LLM writes span.raw_text and leaves span.tokens empty.
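Downstream code can use these rules to handle both decoders uniformly. The sketch below uses stand-in objects instead of real fasr spans; only the attribute names raw_text and tokens come from this README, while summarize and the token placeholders are hypothetical.

```python
from types import SimpleNamespace


def summarize(span):
    """Report the text and whether token timestamps are present."""
    tokens = getattr(span, "tokens", None) or []
    return {
        "text": span.raw_text,
        "has_timestamps": len(tokens) > 0,
        "num_tokens": len(tokens),
    }


# Stand-ins for recognizer output; real spans come from the pipeline.
aed_span = SimpleNamespace(raw_text="hello world", tokens=["hello", "world"])
llm_span = SimpleNamespace(raw_text="hello world", tokens=[])

aed_summary = summarize(aed_span)
llm_summary = summarize(llm_span)
```

Checking for an empty span.tokens, rather than checking which model was configured, keeps the calling code independent of the decoder choice.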
Dependencies
- fasr
- torch >= 2.0.0
- torchaudio
- transformers >= 4.36
- librosa >= 0.10.0
- kaldiio >= 2.18.0
- kaldi-native-fbank >= 1.19.0
- Python 3.10-3.12