ct-transformer punctuation model for fasr
fasr-punc-ct-transformer
CT-Transformer punctuation restoration for fasr. Use it as the sentencizer
stage after ASR to split raw recognized text into punctuated AudioSpan
sentences.
Install
pip install fasr-punc-ct-transformer
Registered Model
| Registry name | Class | Best for |
|---|---|---|
| ct_transformer | CTTransformerForPunc | Chinese and mixed Chinese-English punctuation restoration |
The default checkpoint is
iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch.
Pipeline Usage
from fasr import AudioPipeline

pipeline = (
    AudioPipeline()
    .add_pipe("detector", model="fsmn")
    .add_pipe("recognizer", model="paraformer")
    .add_pipe(
        "sentencizer",
        model="ct_transformer",
        disable_log=True,
        disable_pbar=True,
    )
)
Confection Config
[punc_model]
@punc_models = "ct_transformer"
disable_update = true
disable_log = true
disable_pbar = true
Inside a pipeline:
[pipeline]
@pipelines = "AudioPipeline.v1"
pipe_order = ["sentencizer"]
[pipeline.pipes]
[pipeline.pipes.sentencizer]
@pipes = "thread_pipe"
[pipeline.pipes.sentencizer.component]
@components = "sentencizer"
[pipeline.pipes.sentencizer.component.model]
@punc_models = "ct_transformer"
disable_update = true
disable_log = true
disable_pbar = true
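The dotted section names above describe a nested tree: each dot descends one level, and the `@`-prefixed key names the registry entry to resolve at that level. The sketch below makes that nesting visible using only the Python standard library. It is an approximation for inspection, not how fasr or confection loads the file: `nest_sections` is a hypothetical helper, and decoding values with `json.loads` is an assumption that happens to fit the values shown here.

```python
import configparser
import json

CONFIG_TEXT = """
[pipeline]
@pipelines = "AudioPipeline.v1"
pipe_order = ["sentencizer"]

[pipeline.pipes]

[pipeline.pipes.sentencizer]
@pipes = "thread_pipe"

[pipeline.pipes.sentencizer.component]
@components = "sentencizer"

[pipeline.pipes.sentencizer.component.model]
@punc_models = "ct_transformer"
disable_update = true
disable_log = true
disable_pbar = true
"""


def nest_sections(text: str) -> dict:
    """Turn dotted section names into a nested dict (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(text)
    tree: dict = {}
    for section in parser.sections():
        node = tree
        # Each dot in the section name descends one level.
        for part in section.split("."):
            node = node.setdefault(part, {})
        for key, raw in parser[section].items():
            # Assumption: values are JSON-like ("str", true, [..]).
            node[key] = json.loads(raw)
    return tree


tree = nest_sections(CONFIG_TEXT)
model_cfg = tree["pipeline"]["pipes"]["sentencizer"]["component"]["model"]
print(model_cfg)
```

Reading the tree this way shows that the punctuation model settings sit three levels below the sentencizer pipe, which is where the `disable_*` flags need to go.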
Direct Model Usage
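from fasr.config import registry

# Look up the registered class and instantiate it with default settings.
model = registry.punc_models.get("ct_transformer")()

# restore() splits the unpunctuated text into punctuated sentences.
sentences = model.restore("今天天气真好我想出去玩你觉得呢")
for sentence in sentences:
    print(sentence.text)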
Use local weights:
model.load_checkpoint("/path/to/ct-transformer")
Parameters
| Parameter | Type / range | Default | Effect when true | Effect when false | Change when |
|---|---|---|---|---|---|
| disable_update | bool | True | Skips FunASR checkpoint update checks | Lets FunASR check for updates | You need reproducible startup or want update checks |
| disable_log | bool | True | Suppresses backend logs | Shows backend logs | Debugging model loading or inference |
| disable_pbar | bool | True | Hides progress bars | Shows progress bars | Interactive scripts where progress output is useful |
Generic checkpoint fields such as checkpoint, cache_dir, endpoint,
revision, and force_download are inherited from the base model.
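One way to keep the three flags consistent is to derive them from the situation you are in rather than set each by hand. The sketch below is illustrative only: `punc_flags` is a hypothetical helper, not part of fasr, and it only builds a plain dict with the parameter names from the table.

```python
def punc_flags(debug: bool = False, check_updates: bool = False) -> dict:
    """Build the three disable_* flags from two intentions (hypothetical helper).

    debug=True          -> show backend logs and progress bars
    check_updates=True  -> let FunASR check for checkpoint updates
    """
    return {
        "disable_update": not check_updates,
        "disable_log": not debug,
        "disable_pbar": not debug,
    }


# Defaults match the table: everything disabled.
print(punc_flags())
# Debugging a model-loading problem: logs and progress bars come back.
print(punc_flags(debug=True))
```

The resulting dict could be splatted into the sentencizer's `add_pipe(...)` call, for example `**punc_flags(debug=True)`. That call is shown earlier with `disable_log` and `disable_pbar`; accepting `disable_update` as a keyword there is an assumption.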
Notes
- restore(text) returns an AudioSpanList, not a plain string.
- Input text should already be recognized text. This plugin does not run ASR.
- For pipeline usage, put this model on the sentencizer component.
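Because the result is a list of sentence spans rather than a string, code that wants a single punctuated string has to join the pieces itself. A minimal sketch, relying only on the `.text` attribute used in the direct usage example: `spans_to_text` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins with made-up text so the snippet runs without fasr installed. They are not real model output.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def spans_to_text(spans: Iterable[Any], separator: str = "") -> str:
    """Join the .text of each sentence span into one string (hypothetical helper)."""
    return separator.join(span.text for span in spans)


# Stand-ins for the spans returned by model.restore(...); text is illustrative.
demo_spans = [
    SimpleNamespace(text="今天天气真好，"),
    SimpleNamespace(text="我想出去玩。"),
]

print(spans_to_text(demo_spans))
```

An empty separator suits Chinese text. For mixed Chinese-English output a space or newline separator may read better, depending on how the sentences are displayed.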
Dependencies
- fasr
- funasr
- Python 3.10-3.12