A benchmarking and analysis framework for Russian ASR models

Maintainers

AKatsnelson

Project description

🌱 plantain2asr

Benchmarking and analysis framework for Russian ASR models.

plantain2asr lets you compare ASR backends, normalize transcripts, compute metrics, inspect errors, benchmark latency, and export research artifacts without losing the underlying composable pipeline model.

Start Here

There are three entry points, ordered from simplest to most flexible:

Interactive Constructor: open the docs constructor and assemble a ready-made chain visually.
Experiment facade: run common research scenarios with a few high-level calls.
>> pipeline API: build custom chains from datasets, models, normalizers, metrics, reports, and analyzers.

Docs: akatsnelson.github.io/plantain2asr

Install

# Core only: datasets, normalization, metrics, reports
pip install plantain2asr

# Common CPU-only local stack
# (extras are quoted so that shells such as zsh do not expand the brackets)
pip install "plantain2asr[asr-cpu]"

# Common GPU-ready local stack
pip install "plantain2asr[asr-gpu]"

# Individual model families
pip install "plantain2asr[gigaam]"
pip install "plantain2asr[whisper]"
pip install "plantain2asr[vosk]"
pip install "plantain2asr[canary]"

# T-One needs the tone extra plus the pinned T-One source archive
pip install "plantain2asr[tone]"
pip install "tone @ https://github.com/voicekit-team/T-one/archive/3c5b6c015038173840e62cea99e10cdb1c759116.tar.gz"

# Research analysis tools
pip install "plantain2asr[analysis]"

# Everything
pip install "plantain2asr[all]"

Device selection is automatic where supported: NVIDIA GPU first, then MPS, then CPU.
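
The fallback order above can be sketched as a small helper. This is only an illustration of the priority rule: `pick_device` and its boolean parameters are hypothetical and not part of plantain2asr. In a PyTorch-based stack the two flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then MPS, then CPU.

    Hypothetical helper for illustration; not the library's own code.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

On an Apple Silicon laptop without an NVIDIA GPU, `pick_device(False, True)` would select `"mps"`.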

Recommended Quick Start

For most research workflows, start with the >> pipeline:

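from plantain2asr import GolosDataset, Models, SimpleNormalizer, Metrics

# Point the dataset at the local Golos data directory
ds = GolosDataset("data/golos")

# Transcribe the dataset with each model
ds >> Models.GigaAM_v3()
ds >> Models.Whisper()

# Normalization returns a new dataset view; ds itself is not mutated
norm = ds >> SimpleNormalizer()
norm >> Metrics.composite()

# Average the metrics per model and rank by WER, lowest first
df = norm.to_pandas()
print(df.groupby("model")[["WER", "CER", "Accuracy"]].mean().sort_values("WER"))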
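
For readers new to the WER and CER columns used above, here is a minimal sketch of the textbook definitions: edit distance between reference and hypothesis, divided by the reference length, over words for WER and over characters for CER. The function names are hypothetical, and this is not necessarily how `Metrics.composite()` computes its values. In particular, counting spaces as characters in CER is an assumption of this sketch.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

With the reference "мама мыла раму" and the hypothesis "мама мыла рану", one of three words is wrong, so `wer` returns 1/3, while `cer` returns 1/14 because only one of fourteen characters differs. Normalization matters here: without it, differences in case or punctuation would count as errors too.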

If you want ready-made research scenarios on top of the same building blocks, use Experiment:

Experiment.compare_on_corpus() for straightforward model comparison
Experiment.prepare_thesis_tables() for publication-ready aggregate tables
Experiment.export_appendix_bundle() for a full appendix bundle with exports and optional static report
Experiment.benchmark_models() for latency, throughput, and RTF measurement
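
As background for the last item, the three benchmark quantities can be sketched in plain Python. Real-time factor (RTF) is processing time divided by audio duration, so values below 1 mean faster than real time. The `benchmark` function, its arguments, and `fake_transcribe` are hypothetical stand-ins for illustration; they do not reflect the signature or internals of `Experiment.benchmark_models()`.

```python
import time


def fake_transcribe(audio):
    """Stand-in for a real model call; sleeps briefly to simulate work."""
    time.sleep(0.01)
    return "text"


def benchmark(transcribe, clips):
    """Measure latency, throughput, and RTF.

    clips is a list of (audio, duration_in_seconds) pairs.
    """
    if not clips:
        raise ValueError("clips must not be empty")
    latencies = []
    for audio, _duration in clips:
        started = time.perf_counter()
        transcribe(audio)
        latencies.append(time.perf_counter() - started)
    processing_seconds = sum(latencies)
    audio_seconds = sum(duration for _audio, duration in clips)
    return {
        "mean_latency_s": processing_seconds / len(clips),
        "throughput_clips_per_s": len(clips) / processing_seconds,
        "rtf": processing_seconds / audio_seconds,
    }
```

Calling `benchmark(fake_transcribe, [("clip-1", 1.0), ("clip-2", 1.0)])` returns a dict whose `rtf` is far below 1, since each simulated call takes about ten milliseconds for one second of audio.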

Advanced Pipeline API

If you want full composability, the canonical chain is still:

from plantain2asr import GolosDataset, Models, DagrusNormalizer, Metrics, ReportServer

ds = GolosDataset("data/golos")
ds >> Models.GigaAM_v3()
ds >> Models.Whisper()

# Normalize into a new view, then score that view
norm = ds >> DagrusNormalizer()
norm >> Metrics.composite()

# Start the report server for the scored view
ReportServer(norm, audio_dir="data/golos").serve()

Pipeline rules:

Every step returns a dataset or processor-compatible object.
Normalization creates a new dataset view and does not mutate the original.
Model results are cached and safe to resume.
You can branch at any point with filter(), take(), or cloned views.
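
To make these rules concrete, here is a toy model of a `>>` pipeline built on Python's `__rshift__` hook. Every class below (`ToyDataset`, `ToyModel`, `LowercaseNormalizer`) is hypothetical and far simpler than the real library: results live in an in-memory dict, whereas the caching in plantain2asr may well be persistent. The sketch only shows how the first three rules can hold together.

```python
class ToyDataset:
    """Holds samples plus per-model results; `ds >> step` delegates to the step."""

    def __init__(self, samples, results=None):
        self.samples = list(samples)
        self.results = dict(results or {})  # model name -> list of hypotheses

    def __rshift__(self, step):
        # Every step returns a dataset-compatible object
        return step.apply(self)


class ToyModel:
    def __init__(self, name, transcribe):
        self.name = name
        self.transcribe = transcribe
        self.runs = 0

    def apply(self, ds):
        # Cached results make a repeated run a no-op
        if self.name not in ds.results:
            ds.results[self.name] = [self.transcribe(s) for s in ds.samples]
            self.runs += 1
        return ds


class LowercaseNormalizer:
    def apply(self, ds):
        # Build a new view instead of mutating the original dataset
        normalized = {
            name: [hyp.lower() for hyp in hyps]
            for name, hyps in ds.results.items()
        }
        return ToyDataset(ds.samples, normalized)


ds = ToyDataset(["clip-1", "clip-2"])
model = ToyModel("echo", lambda sample: sample.upper())
ds >> model
ds >> model  # second pass reuses the cached results
norm = ds >> LowercaseNormalizer()
```

After this runs, `ds.results["echo"]` still holds the upper-case output, `norm.results["echo"]` holds the lower-cased copy, and `model.runs` is 1.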

Supported Models

| Call | Stored name | Install extra | Device |
| --- | --- | --- | --- |
| `Models.GigaAM_v3()` | `GigaAM-v3-e2e_rnnt` | `gigaam` | CUDA / MPS / CPU |
| `Models.GigaAM_v3(model_name="e2e_ctc")` | `GigaAM-v3-e2e_ctc` | `gigaam` | CUDA / MPS / CPU |
| `Models.GigaAM_v3(model_name="rnnt")` | `GigaAM-v3-rnnt` | `gigaam` | CUDA / MPS / CPU |
| `Models.GigaAM_v3(model_name="ctc")` | `GigaAM-v3-ctc` | `gigaam` | CUDA / MPS / CPU |
| `Models.GigaAM_v2(model_name="v2_rnnt")` | `GigaAM-v2_rnnt` | `gigaam` | CUDA / MPS / CPU |
| `Models.GigaAM_v2(model_name="v2_ctc")` | `GigaAM-v2_ctc` | `gigaam` | CUDA / MPS / CPU |
| `Models.Whisper()` | `Whisper-whisper-large-v3-ru-podlodka` | `whisper` | CUDA / MPS / CPU |
| `Models.Tone()` | `T-One` | `tone` + T-One source archive | CUDA / CPU |
| `Models.Vosk(model_path=...)` | `Vosk` | `vosk` | CPU |
| `Models.Canary()` | `Canary-1B` | `canary` | CUDA |
| `Models.SaluteSpeech()` | `SaluteSpeech` | none | cloud |

You can also resolve models by user-facing names with Models.create(...), including case and separator variants such as "gigaam_v3", "GigaAM-v3", or "tone".
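
One common way to accept case and separator variants is to reduce each name to a canonical key before looking it up. The sketch below illustrates that idea only; `canonical_key`, `resolve`, and the small registry are hypothetical and do not describe how `Models.create(...)` is actually implemented.

```python
import re

# Hypothetical registry: canonical key -> factory name from the table above
_REGISTRY = {
    "gigaamv3": "GigaAM_v3",
    "whisper": "Whisper",
    "tone": "Tone",
}


def canonical_key(name: str) -> str:
    """Lower-case the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def resolve(name: str) -> str:
    """Map a user-facing name to a registered factory name."""
    key = canonical_key(name)
    if key not in _REGISTRY:
        raise ValueError(f"Unknown model name: {name!r}")
    return _REGISTRY[key]
```

Under this scheme `"gigaam_v3"` and `"GigaAM-v3"` both reduce to the key `gigaamv3` and resolve to the same entry.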

Typical Research Outputs

Metrics tables as Python dicts or pandas DataFrames
Leaderboards sorted by a chosen metric
Error-case tables and CSV exports
Static HTML reports for sharing without a running server
Appendix bundles with thesis-ready artifacts
Benchmark summaries for CPU, CUDA, or MPS

Extending

plantain2asr keeps the "plantain" idea of modular composition. If the built-in stack is not enough, extend one of the four base types:

BaseASRModel
BaseNormalizer
BaseMetric
BaseSection

See the extending guides in the docs for custom components and implementation patterns.

License

MIT — Artem Katsnelson

Release history

1.0.4 (this version): Apr 1, 2026
0.1.4: Feb 22, 2026
0.1.3: Feb 22, 2026
0.1.2: Feb 22, 2026
0.1.1: Feb 22, 2026
0.1.0: Feb 22, 2026

Download files

Source Distribution

plantain2asr-1.0.4.tar.gz (137.3 kB)

Uploaded Apr 1, 2026 Source

Built Distribution

plantain2asr-1.0.4-py3-none-any.whl (163.1 kB)

Uploaded Apr 1, 2026 Python 3

File details

Details for the file plantain2asr-1.0.4.tar.gz.

File metadata

Download URL: plantain2asr-1.0.4.tar.gz
Upload date: Apr 1, 2026
Size: 137.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for plantain2asr-1.0.4.tar.gz
Algorithm	Hash digest
SHA256	`e36527a629fb9a99f8ed1390aa9c92d0f2d9e8bf14cbaf1956639fe97948f9b6`
MD5	`874cb2af41d7373b8104eb7c4c6b3ec3`
BLAKE2b-256	`2db74d1fec335ea8dc26a50385cfa7811da114d29df06dd27ac694f665f97454`

File details

Details for the file plantain2asr-1.0.4-py3-none-any.whl.

File metadata

Download URL: plantain2asr-1.0.4-py3-none-any.whl
Upload date: Apr 1, 2026
Size: 163.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for plantain2asr-1.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d383092af203298184668909725de2494014287595532f010d61063d9b528d95`
MD5	`e081e5940ac85f53db3af3eeb5de9843`
BLAKE2b-256	`4e8da088ce7e6574c25985865a5087a0b16d840709346eab9d248fe7e841d574`
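
The SHA256 digests listed above can be checked against a downloaded file with Python's standard `hashlib` module. The helper names `sha256_of` and `matches_digest` are hypothetical, and the local file path is whatever location you saved the download to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest("plantain2asr-1.0.4.tar.gz", "e36527a629fb9a99f8ed1390aa9c92d0f2d9e8bf14cbaf1956639fe97948f9b6")` should return `True` for an intact copy of the source distribution.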
