Audio-to-Markdown transcription optimized for AI consumption

Audium

🎧 Audio → AI‑optimized Markdown
Transcribe MP3/WAV/FLAC into clean, token‑efficient Markdown, ready for any LLM.

✨ Why Audium?

Feed audio to an LLM. Get answers. Simple.

But raw transcripts burn tokens on noise: long timestamps, filler words, silent segments, markup that adds nothing.

Audium turns speech into the minimum viable Markdown: every character counts, nothing wasted.

🎯 3 formats: compact, minimal, structured
⚡ GPU‑accelerated: 2–10× real‑time on CUDA
🪙 Token‑aware: `[MM:SS]` + VAD + filler‑strip
👁️ Watch mode: drop files → auto‑transcribe
🌍 ~97 languages: tiny to large‑v3

📦 Install

pip install audium-md

Requires ffmpeg on your system. Install it with `sudo apt install ffmpeg` (Debian/Ubuntu) or `brew install ffmpeg` (macOS).

🚀 Quick Start

# Process a folder
audium run ./my-recordings/

# Single file
audium run lecture.mp3

# Watch folder — auto‑transcribe new files
audium watch ./incoming/

# See what you've transcribed
audium list

# Change model
audium config set model large-v3

📝 Formats

compact (default)

# lecture.mp3 (01:23:45)

[00:00] Neural networks learn hierarchical representations
[00:04] Each layer detects increasingly abstract features
[00:08] Early layers find edges and textures
[00:12] Later layers detect objects and scenes
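
To make the compact layout concrete, here is a minimal Python sketch of how timed segments could be rendered into it. The `Segment` class and both function names are hypothetical and not part of Audium's API. The sketch also assumes that minutes keep counting past 59 for recordings longer than an hour, which the sample above does not confirm.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical container for one transcribed segment."""
    start: float  # segment start time in seconds
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time offset as [MM:SS], without hours or milliseconds."""
    # Assumption: minutes are not wrapped at 60, so 1h05m becomes [65:00].
    minutes, secs = divmod(int(seconds), 60)
    return f"[{minutes:02d}:{secs:02d}]"


def render_compact(title: str, segments: list[Segment]) -> str:
    """One heading line, a blank line, then one line per segment."""
    lines = [f"# {title}", ""]
    lines += [f"{format_timestamp(s.start)} {s.text.strip()}" for s in segments]
    return "\n".join(lines)


segments = [
    Segment(0.0, "Neural networks learn hierarchical representations"),
    Segment(4.2, "Each layer detects increasingly abstract features"),
]
print(render_compact("lecture.mp3 (01:23:45)", segments))
```

Dropping the timestamp prefix from each line in `render_compact` would give the shape of the minimal format.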

minimal

Neural networks learn hierarchical representations
Each layer detects increasingly abstract features
Early layers find edges and textures
Later layers detect objects and scenes

structured (requires speaker diarization)

# interview.mp3 (00:45:12)

## Alice [00:00-00:30]
Neural networks are a powerful tool. It's important to understand their limitations.

## Bob [00:30-01:15]
I completely agree. Let me walk through an example to make this concrete.
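
The structured layout groups consecutive speech by speaker. The sketch below shows one way such output could be assembled once diarization has labeled each segment. `Turn`, `mmss`, and `render_structured` are hypothetical names used for illustration only, and the merging rule (join adjacent segments from the same speaker) is an assumption about how the format is built.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Turn:
    """Hypothetical diarized segment: who spoke, when, and what."""
    speaker: str
    start: float  # seconds
    end: float    # seconds
    text: str


def mmss(seconds: float) -> str:
    """Format seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def render_structured(title: str, turns: list[Turn]) -> str:
    """Merge adjacent turns by the same speaker under one '##' heading."""
    blocks = [f"# {title}"]
    for speaker, group in groupby(turns, key=lambda t: t.speaker):
        run = list(group)
        heading = f"## {speaker} [{mmss(run[0].start)}-{mmss(run[-1].end)}]"
        body = " ".join(t.text.strip() for t in run)
        blocks.append(f"{heading}\n{body}")
    return "\n\n".join(blocks)


turns = [
    Turn("Alice", 0, 12, "Neural networks are a powerful tool."),
    Turn("Alice", 12, 30, "It's important to understand their limitations."),
    Turn("Bob", 30, 75, "I completely agree."),
]
print(render_structured("interview.mp3 (00:45:12)", turns))
```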

⚙️ Commands

Command	Description
`audium run <path>`	Transcribe audio files or folders
`audium watch <path>`	Watch folder and auto‑process new files
`audium list [dir]`	Show processed transcripts with file sizes
`audium config`	Show current configuration
`audium config set <key> <value>`	Change a setting
`audium config reset`	Reset to factory defaults
`audium config path`	Show config file location

Common flags for `run` and `watch`

Flag	Default	Description
`-o, --output-dir`	`./transcripts`	Where to save .md files
`-f, --format`	`compact`	`compact` / `minimal` / `structured`
`-r, --recursive`	off	Search subdirectories
`--model`	`small`	`tiny` / `base` / `small` / `medium` / `large-v3`
`--language`	`auto`	Force language code: `ru`, `en`, `zh`, ...
`--strip-fillers`	off	Remove "um", "uh", "like", "мм", "ээ", etc.
`--no-vad`	off	Disable voice activity detection
`--no-progress`	off	Hide the progress bar
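
If you drive Audium from a script, the flags in the table can be assembled programmatically. The following is a sketch, not an official Python API: `build_run_command` and `transcribe` are hypothetical helpers that only shell out to the `audium run` command and the flags documented above.

```python
import shlex
import subprocess


def build_run_command(path, output_dir=None, fmt=None, model=None,
                      language=None, recursive=False, strip_fillers=False):
    """Build an argv list for `audium run` from the documented flags."""
    cmd = ["audium", "run", str(path)]
    if output_dir:
        cmd += ["--output-dir", str(output_dir)]
    if fmt:
        cmd += ["--format", fmt]
    if model:
        cmd += ["--model", model]
    if language:
        cmd += ["--language", language]
    if recursive:
        cmd.append("--recursive")
    if strip_fillers:
        cmd.append("--strip-fillers")
    return cmd


def transcribe(path, **options):
    """Run the CLI; requires `audium` to be installed and on PATH."""
    subprocess.run(build_run_command(path, **options), check=True)


cmd = build_run_command("./my-recordings/", fmt="minimal",
                        model="medium", recursive=True)
print(shlex.join(cmd))  # shows the command line without executing it
```

Passing an argv list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.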

🔧 Configuration

Settings are merged in order of precedence, highest first: CLI flags > .audium.yaml (project) > ~/.config/audium/config.yaml (user) > defaults

# Set default model
audium config set model large-v3

# Always strip filler words
audium config set strip_fillers true

# Custom output folder
audium config set output_dir ~/Documents/transcripts

# See what you changed
audium config

# Example .audium.yaml (place in project root)
model: medium
language: ru
format: minimal
output_dir: ./transcripts
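
The precedence order in this section can be illustrated with a short Python sketch. This is an assumption about how layered lookup might work, not Audium's actual implementation: `resolve_settings` is a hypothetical name, and the default values are taken from the flags table.

```python
from collections import ChainMap

# Defaults as listed in the flags table.
DEFAULTS = {
    "model": "small",
    "format": "compact",
    "language": "auto",
    "output_dir": "./transcripts",
    "strip_fillers": False,
}


def resolve_settings(cli: dict, project: dict, user: dict) -> dict:
    """Return effective settings; earlier layers win over later ones."""
    # Drop CLI options that were not given (None) so they do not
    # shadow values from the config files.
    cli_set = {k: v for k, v in cli.items() if v is not None}
    return dict(ChainMap(cli_set, project, user, DEFAULTS))


user_config = {"model": "large-v3", "strip_fillers": True}    # ~/.config/audium/config.yaml
project_config = {"model": "medium", "language": "ru",        # .audium.yaml
                  "format": "minimal"}
cli_flags = {"format": "structured", "model": None}           # only --format was passed

settings = resolve_settings(cli_flags, project_config, user_config)
print(settings)
```

Here the project file overrides the user's `model`, the CLI overrides the project's `format`, and `strip_fillers` falls through to the user config because no higher layer sets it.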

🪙 Token Optimization

Audium is built to minimize LLM token cost:

Technique	Savings
`[MM:SS]` instead of `[HH:MM:SS.mmm]`	~30% on timestamps
VAD filtering (skip silence)	15–40% on meeting recordings
Filler‑word stripping	5–10% on conversational speech
`min_segment_duration` threshold	skip noise fragments
One line per segment, no blank lines	~8% vs paragraph output
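
Two of the techniques in the table, filler‑word stripping and the `min_segment_duration` threshold, can be sketched in a few lines of Python. Everything here is illustrative: the `Segment` class, the filler list, the regular expression, and the 0.5 second default are assumptions, not Audium's real values. The word "like" is deliberately left out of the list because removing it blindly would damage sentences such as "I like it".

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical transcribed segment with start/end times in seconds."""
    start: float
    end: float
    text: str


# Assumed filler list; a real one would likely be language-specific.
FILLERS = ("um", "uh", "мм", "ээ")
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, FILLERS)) + r")\b[,.]?\s*",
    re.IGNORECASE,
)


def strip_fillers(text: str) -> str:
    """Remove whole-word fillers plus a trailing comma or period."""
    cleaned = _FILLER_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def clean(segments, min_segment_duration=0.5):
    """Drop very short segments, strip fillers, skip segments left empty."""
    result = []
    for seg in segments:
        if seg.end - seg.start < min_segment_duration:
            continue  # likely a noise fragment
        text = strip_fillers(seg.text)
        if text:
            result.append(Segment(seg.start, seg.end, text))
    return result


raw = [
    Segment(0.0, 3.5, "So um the model uh converges quickly"),
    Segment(3.5, 3.6, "uh"),                      # too short, dropped
    Segment(4.0, 6.0, "Ээ, это работает"),
]
for seg in clean(raw):
    print(seg.text)
```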

📊 Model Sizes

Model	Parameters	Speed (GPU)	Best for
tiny	39M	~32× real‑time	Quick drafts, low‑resource
base	74M	~16× real‑time	Dictation, clean audio
small	244M	~6× real‑time	General purpose
medium	769M	~2× real‑time	Accents, noisy audio
large‑v3	1.5B	~1× real‑time	Maximum accuracy

All multilingual models support the same ~97 languages. Larger models are more accurate but slower.
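
As a rough worked example, the speed column can be turned into a processing‑time estimate by dividing the audio duration by the speed factor. The sketch below treats the table's approximate GPU figures as exact, which is an assumption, since real throughput depends on hardware and audio content. The function names are hypothetical.

```python
# Approximate GPU speed factors from the table (× real-time).
SPEED_FACTOR = {"tiny": 32, "base": 16, "small": 6, "medium": 2, "large-v3": 1}


def parse_duration(hms: str) -> int:
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def estimate_seconds(hms: str, model: str) -> float:
    """Estimated processing time = audio duration / speed factor."""
    return parse_duration(hms) / SPEED_FACTOR[model]


for model in SPEED_FACTOR:
    estimate = estimate_seconds("01:23:45", model)
    print(f"{model:>8}: ~{estimate / 60:.0f} min")
```

For the 01:23:45 lecture used in the format samples (5025 seconds), `small` at ~6× works out to about 14 minutes, while `large-v3` at ~1× takes roughly as long as the recording itself.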

📄 License

MIT — do whatever you want. Attribution appreciated.

Project details

Release history

0.2.6, 0.2.5, 0.2.4, 0.2.3, 0.2.2, 0.2.1, 0.2.0, 0.1.4, 0.1.3, 0.1.2, 0.1.1: Jul 1, 2026
0.1.0 (this version): Jul 1, 2026
