Turn language-learning audio into Anki-ready study materials.

Project description

audawispr

Split audio files into high-quality sentence-based learning materials.

Features

Transcription — Local speech-to-text via faster-whisper; no API keys required.
Segmentation — Splits transcriptions into sentence-level segments using punctuation, pauses, and duration bounds.
Enrichment — French IPA transcription (optional). Translation scaffolding is in place but not yet implemented.
Clipping — Extracts audio snippets for each segment using FFmpeg (bundled via static-ffmpeg).
Export — Outputs Anki-compatible CSV or native .apkg packages with embedded audio.
One-shot CLI — Runs the full pipeline with a single command.

Requirements

Python 3.11+
FFmpeg is bundled via the static-ffmpeg dependency — no separate installation is needed. Run audawispr doctor to verify availability.
uv is only required for local development (see Setup).

Setup

Install audawispr from PyPI:

pip install audawispr

Or with uv:

uv pip install audawispr

After installing, run audawispr directly. Use uv run audawispr only when working in a cloned repository.

For local development, install runtime and development dependencies:

uv sync --dev

Quickstart

Turn an audio file into an Anki deck with one command:

audawispr lesson.mp3 --output deck.apkg --language fr --ipa

Or use the Python API:

from pathlib import Path
from audawispr import Pipeline

Pipeline(
    output=Path("deck.apkg"),
    language="fr",
    ipa=True,
).run(Path("lesson.mp3"))

The one-shot command runs transcription, segmentation, enrichment, clipping, and export in sequence.

Usage

The commands in this section and in Quickstart assume an installed audawispr. In a cloned repository, prefix each command with uv run.

Show the CLI help:

audawispr --help

Show the installed package version:

audawispr --version

Check local runtime readiness:

audawispr doctor

audawispr doctor reports the package version, Python version, and whether FFmpeg and FFprobe are available from AUDAWISPR_FFMPEG, AUDAWISPR_FFPROBE, PATH, or the static-ffmpeg fallback.

Transcribe audio locally into a transcript manifest:

audawispr transcribe lesson.mp3 --output out/transcript.json --language fr

Validate an existing transcript manifest:

audawispr validate out/transcript.json

Segment a transcript manifest and write an inspection TSV:

audawispr segment out/transcript.json --output out/segments.json

Enrich a segmented French manifest with IPA:

audawispr enrich out/segments.json --ipa --output out/enriched.json

transcribe defaults to French, the small faster-whisper model, automatic device selection, int8 compute, VAD enabled, and required word timestamps. The first real transcription may download model files, but no API key is required. Tests and CI use fakes and do not download models.
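
For readers who know faster-whisper, the defaults listed above correspond roughly to the following calls. This is a sketch under assumptions: TranscribeDefaults and transcribe_words are hypothetical names, and whether audawispr passes the options in exactly this way is not confirmed here. Only WhisperModel and its transcribe method belong to faster-whisper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscribeDefaults:
    """Hypothetical container for the documented defaults."""

    language: str = "fr"
    model: str = "small"
    device: str = "auto"
    compute_type: str = "int8"
    vad_filter: bool = True
    word_timestamps: bool = True


def transcribe_words(
    audio_path: str,
    opts: TranscribeDefaults = TranscribeDefaults(),
) -> list[dict]:
    """Transcribe a file and return one dict per word with timestamps."""
    # Imported lazily so the sketch loads without faster-whisper installed.
    from faster_whisper import WhisperModel

    # The first call may download the model files.
    model = WhisperModel(
        opts.model, device=opts.device, compute_type=opts.compute_type
    )
    segments, _info = model.transcribe(
        audio_path,
        language=opts.language,
        vad_filter=opts.vad_filter,
        word_timestamps=opts.word_timestamps,
    )
    # segments is a generator; iterating it runs the actual decoding.
    return [
        {"word": w.word, "start": w.start, "end": w.end}
        for seg in segments
        for w in (seg.words or [])
    ]
```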

segment preserves the transcript manifest schema and rebuilds only the segment list. It splits on sentence punctuation, pauses, and duration bounds. By default, it writes the inspection TSV next to the JSON output (out/segments.tsv for the command above); use --inspection-tsv path/to/review.tsv to choose a different TSV path.
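
The three splitting rules can be illustrated with a minimal sketch over word timestamps. Everything here is an assumption made for illustration: the Word type, the split_segments function, and the thresholds (a 0.7 s pause and a 10 s maximum duration) are not audawispr's actual API or default values.

```python
from dataclasses import dataclass

# Characters treated as sentence-final punctuation (illustrative choice).
SENTENCE_END = (".", "!", "?", "…")


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float  # seconds


def split_segments(
    words: list[Word],
    max_pause: float = 0.7,  # assumed threshold, in seconds
    max_duration: float = 10.0,  # assumed upper bound, in seconds
) -> list[list[Word]]:
    """Group words into segments by punctuation, pauses, and duration."""
    segments: list[list[Word]] = []
    current: list[Word] = []
    for i, word in enumerate(words):
        current.append(word)
        nxt = words[i + 1] if i + 1 < len(words) else None

        # Rule 1: the word closes a sentence.
        ends_sentence = word.text.strip().endswith(SENTENCE_END)
        # Rule 2: the silence before the next word is long enough.
        long_pause = nxt is not None and nxt.start - word.end >= max_pause
        # Rule 3: adding the next word would exceed the duration bound.
        too_long = nxt is not None and nxt.end - current[0].start > max_duration

        if nxt is None or ends_sentence or long_pause or too_long:
            segments.append(current)
            current = []
    return segments
```

A real segmenter would likely also enforce a minimum duration and merge very short fragments; the sketch leaves that out to keep the three rules visible.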

enrich preserves timestamps, words, and source metadata while adding optional study fields. IPA (--ipa) is available for French. Translation is not yet implemented; --translate defaults to none, which skips it. The rest of the pipeline works with any language faster-whisper supports; IPA is the only French-specific feature.

Clip audio snippets from a segmented or enriched manifest:

audawispr clip out/enriched.json --output out/clipped.json --output-dir out/media

clip reads a segmented or enriched manifest, extracts each segment's audio from the source file using FFmpeg, and writes the clipped manifest with audio_file paths. By default it reuses existing snippets; use --force to re-clip. Padding (--padding-before-ms, --padding-after-ms), format (--format), and bitrate (--bitrate) are configurable.
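
How padding turns into an FFmpeg invocation can be sketched as follows. The build_clip_command function, its parameter names, and the default values (no padding, 64k bitrate) are illustrative assumptions, not the command audawispr actually runs. The FFmpeg options used (-y, -ss, -i, -t, -b:a) are standard.

```python
def build_clip_command(
    source: str,
    dest: str,
    start_ms: int,
    end_ms: int,
    padding_before_ms: int = 0,  # assumed default
    padding_after_ms: int = 0,  # assumed default
    bitrate: str = "64k",  # assumed default
    ffmpeg: str = "ffmpeg",
) -> list[str]:
    """Build an FFmpeg argument list that extracts one padded snippet."""
    # Widen the window by the padding, but never seek before the file start.
    start = max(0, start_ms - padding_before_ms)
    end = end_ms + padding_after_ms
    if end <= start:
        raise ValueError("segment has no positive duration")

    return [
        ffmpeg,
        "-y",  # overwrite the destination if it exists
        "-ss", f"{start / 1000:.3f}",  # seek in the input, in seconds
        "-i", source,
        "-t", f"{(end - start) / 1000:.3f}",  # snippet duration, in seconds
        "-b:a", bitrate,  # audio bitrate
        dest,
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True). Skipping the call when dest already exists would approximate the documented reuse behaviour, with --force corresponding to always running it.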

Export a clipped manifest for Anki import:

audawispr export out/clipped.json --format anki-csv --output out/anki-csv

export reads a clipped manifest, copies audio snippets, and writes out/anki-csv/cards.csv with columns Sentence, Audio, IPA, Translation, SourceFile, TimestampRange, and SegmentId. Audio references use Anki's [sound:...] syntax.
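
A minimal sketch of a CSV with these columns, using only the standard library. The column names and the [sound:...] syntax come from the description above; the segment keys, the header row, and the TimestampRange format are assumptions, and write_cards is a hypothetical name.

```python
import csv
from pathlib import Path
from typing import TextIO

COLUMNS = [
    "Sentence", "Audio", "IPA", "Translation",
    "SourceFile", "TimestampRange", "SegmentId",
]


def write_cards(segments: list[dict], out: TextIO) -> None:
    """Write one CSV row per segment, with an Anki sound reference."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()  # assumption: the real file may omit the header
    for seg in segments:
        writer.writerow({
            "Sentence": seg["text"],
            # Anki resolves media by file name inside collection.media.
            "Audio": f"[sound:{Path(seg['audio_file']).name}]",
            "IPA": seg.get("ipa", ""),
            "Translation": seg.get("translation", ""),
            "SourceFile": seg["source_file"],
            # Assumed format: start and end in seconds.
            "TimestampRange": f"{seg['start']:.2f}-{seg['end']:.2f}",
            "SegmentId": seg["id"],
        })
```

When writing to a real file, open it with newline="" and encoding="utf-8" so that the csv module controls line endings and IPA characters survive.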

Manual import in Anki Desktop: File → Import → select cards.csv, set "Fields separated by: Comma", and copy the media/ folder contents into your Anki collection.media folder.

Export as a native Anki package (.apkg) with embedded audio:

audawispr export out/clipped.json --output deck.apkg --deck-name "My French Deck"

When the output path ends in .apkg, the apkg format is inferred automatically. Use --deck-name to set the deck name; the default is audawispr::{language} (e.g. audawispr::fr). The resulting .apkg file can be opened directly in Anki Desktop via File → Import.

Development Checks

uv run pytest
uv run ruff check .
uv run ruff format --check .
uv run ty check src tests

Project details

Release history

0.1.4 (this version): May 11, 2026
0.1.3: May 10, 2026
0.1.2: May 9, 2026
0.1.1: Apr 30, 2026

Download files

Source Distribution

audawispr-0.1.4.tar.gz (150.1 kB)

Uploaded May 11, 2026 Source

Built Distribution

audawispr-0.1.4-py3-none-any.whl (37.2 kB)

Uploaded May 11, 2026 Python 3

File details

Details for the file audawispr-0.1.4.tar.gz.

File metadata

Download URL: audawispr-0.1.4.tar.gz
Upload date: May 11, 2026
Size: 150.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for audawispr-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`98b662271ebba3c7b1da17877a1cb5b8988ea9cab1cf492aa2e06a12b99257a6`
MD5	`51235b0a54efb7f7fbcbadba3c57c1af`
BLAKE2b-256	`06bd1b80ce145dfe66c215f18d8840ff8fd3a42501f7f8dcd0fe3e894491d745`

Provenance

The following attestation bundles were made for audawispr-0.1.4.tar.gz:

Publisher: release.yml on audawispr/audawispr

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: audawispr-0.1.4.tar.gz
- Subject digest: 98b662271ebba3c7b1da17877a1cb5b8988ea9cab1cf492aa2e06a12b99257a6
- Sigstore transparency entry: 1508774423
- Sigstore integration time: May 11, 2026
Source repository:
- Permalink: audawispr/audawispr@f9a45d4e793649278ba305998ff0bfc0710baf95
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/audawispr
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f9a45d4e793649278ba305998ff0bfc0710baf95
- Trigger Event: push

File details

Details for the file audawispr-0.1.4-py3-none-any.whl.

File metadata

Download URL: audawispr-0.1.4-py3-none-any.whl
Upload date: May 11, 2026
Size: 37.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for audawispr-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ab312d934c32e64609eaa7b11cc57f63fda150ab0d2d230d033e1f426cc4d088`
MD5	`0101da8b08e35e5b6837c98b13df169e`
BLAKE2b-256	`52cf48c74c66a613922662180afe2f863e3cf878b142e5f85d6b1da06f151d5a`

Provenance

The following attestation bundles were made for audawispr-0.1.4-py3-none-any.whl:

Publisher: release.yml on audawispr/audawispr

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: audawispr-0.1.4-py3-none-any.whl
- Subject digest: ab312d934c32e64609eaa7b11cc57f63fda150ab0d2d230d033e1f426cc4d088
- Sigstore transparency entry: 1508774562
- Sigstore integration time: May 11, 2026
Source repository:
- Permalink: audawispr/audawispr@f9a45d4e793649278ba305998ff0bfc0710baf95
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/audawispr
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f9a45d4e793649278ba305998ff0bfc0710baf95
- Trigger Event: push
