Blazingly fast, open-source audio-to-text transcription library with parallel processing - turn podcasts, videos, and lectures into AI-ready text at 120x lower cost than cloud services.
Welcome to audinota Documentation
Audinota (Latin for “taking notes from audio”) is a lightweight, high-performance Python library designed for fast audio-to-text transcription. Built specifically for extracting textual information from audio content, Audinota enables you to leverage AI-powered text analysis, summarization, and processing on your audio data.
The library is built on top of the proven faster-whisper open-source framework and features intelligent automatic audio segmentation with parallel processing capabilities. By automatically chunking large audio files and utilizing multiple CPU cores, Audinota delivers exceptional transcription speed while maintaining accuracy.
Audinota follows a “dead simple” philosophy: it focuses exclusively on audio-to-text conversion, without the complexity of subtitle generation or timestamp management. This streamlined approach makes it ideal for information-dense audio content such as research videos, podcasts, lectures, and educational materials.
The project was inspired by real-world research workflows where rapid consumption and analysis of valuable audio content from YouTube videos, podcasts, and other sources is essential. Whether you’re a researcher, content creator, or data analyst, Audinota helps you quickly transform audio insights into actionable text for further AI-powered processing and analysis.
💰 Massive Cost Savings: While Amazon Transcribe costs $0.024/minute ($1.44/hour), Audinota can be deployed on AWS Lambda for approximately $0.0002/minute ($0.012/hour), making it roughly 120x cheaper at those rates. This cost reduction lets researchers, content creators, and businesses extract insights from extensive audio archives and YouTube content libraries without breaking the budget.
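The 120x figure follows directly from the two per-minute prices quoted above. Here is a quick check of the arithmetic using only those numbers; the 100-hour archive is a hypothetical workload chosen for illustration, and actual Lambda costs will depend on memory settings and model size.

```python
# Per-minute prices quoted in the text above (USD).
AWS_TRANSCRIBE_PER_MIN = 0.024
AUDINOTA_LAMBDA_PER_MIN = 0.0002  # approximate, per the text


def cost_usd(minutes: float, price_per_min: float) -> float:
    """Total cost for transcribing `minutes` of audio at a flat per-minute price."""
    return minutes * price_per_min


ratio = AWS_TRANSCRIBE_PER_MIN / AUDINOTA_LAMBDA_PER_MIN

# Hypothetical workload: a 100-hour audio archive.
archive_minutes = 100 * 60
aws_cost = cost_usd(archive_minutes, AWS_TRANSCRIBE_PER_MIN)
lambda_cost = cost_usd(archive_minutes, AUDINOTA_LAMBDA_PER_MIN)

print(f"ratio: {ratio:.0f}x")                      # ratio: 120x
print(f"Managed service: ${aws_cost:.2f}")         # Managed service: $144.00
print(f"Audinota on Lambda: ${lambda_cost:.2f}")   # Audinota on Lambda: $1.20
```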
Quick Start
Audinota makes audio transcription incredibly simple with just a few lines of code:
import io
from pathlib import Path
from audinota.api import transcribe_audio_in_parallel
# Transcribe any audio file to text
text = transcribe_audio_in_parallel(
    audio=io.BytesIO(Path("podcast_episode.mp3").read_bytes()),
)
print(text)
What happens under the hood:
Automatic Format Detection: Audinota automatically handles popular audio formats including MP3, MP4, WAV, M4A, FLAC, OGG, and more
Language Detection: The system automatically detects the spoken language without requiring manual specification
Smart Segmentation: Large audio files are intelligently chunked into optimal segments for processing
Parallel Processing: Multiple CPU cores work simultaneously on different audio segments for maximum speed
Text Assembly: All transcribed segments are seamlessly combined into a single, coherent text output
The entire process is optimized for speed and accuracy, typically processing hours of audio content in just minutes while maintaining high transcription quality across different languages and audio conditions.
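The segment, transcribe-in-parallel, and assemble steps listed above can be pictured with a small sketch. This is not Audinota's implementation: `split_into_chunks`, `transcribe_chunks_in_parallel`, and `fake_transcribe` are hypothetical names, the transcriber is a stand-in rather than a faster-whisper model, and a thread pool is used only to keep the sketch self-contained (a CPU-bound workload would more likely use worker processes).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_into_chunks(samples: Sequence[float], chunk_size: int) -> List[Sequence[float]]:
    """Cut a sample sequence into consecutive chunks of at most `chunk_size` items."""
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]


def transcribe_chunks_in_parallel(
    chunks: List[Sequence[float]],
    transcribe_one: Callable[[Sequence[float]], str],
    max_workers: int = 4,
) -> str:
    """Run `transcribe_one` on every chunk concurrently and join the text in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so the text stays in sequence.
        texts = list(pool.map(transcribe_one, chunks))
    return " ".join(t.strip() for t in texts if t.strip())


def fake_transcribe(chunk: Sequence[float]) -> str:
    """Stand-in for a real speech-to-text call (assumption for illustration)."""
    return f"[{len(chunk)} samples]"


samples = [0.0] * 10
chunks = split_into_chunks(samples, chunk_size=4)
result = transcribe_chunks_in_parallel(chunks, fake_transcribe)
print(result)  # [4 samples] [4 samples] [2 samples]
```

Keeping results in input order is the part that matters for text assembly: chunks may finish at different times, but the joined output must follow the original audio.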
Command Line Interface
Audinota provides a powerful command-line interface for easy audio transcription without writing code:
Basic Usage
# Simple transcription - output saved next to input file
$ audinota transcribe --input="podcast.mp3"
# Specify output directory
$ audinota transcribe --input="lecture.mp4" --output="/path/to/transcripts/"
# Specify exact output file
$ audinota transcribe --input="interview.wav" --output="result.txt"
# Overwrite existing files
$ audinota transcribe --input="audio.m4a" --output="existing.txt" --overwrite
Parameters
- --input (required)
Path to the input audio file. Supports popular formats including:
MP3, MP4, M4A (most common)
WAV (uncompressed), FLAC (lossless), OGG (typically lossy Vorbis or Opus)
And many more formats supported by faster-whisper
- --output (optional)
Controls where the transcription is saved:
Not specified: Creates a .txt file next to the input file
$ audinota transcribe --input="podcast.mp3"
# Creates: podcast.txt
Directory path: Creates a .txt file in the specified directory
$ audinota transcribe --input="podcast.mp3" --output="/transcripts/"
# Creates: /transcripts/podcast.txt
File path: Uses the exact specified file path
$ audinota transcribe --input="podcast.mp3" --output="my_transcript.txt"
# Creates: my_transcript.txt
- --overwrite (optional, default: False)
Boolean flag that controls file overwriting behavior:
False (default): If output file exists, shows error and stops
True: Overwrites existing output files without asking
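The three output rules described for the output parameter can be summarized in a few lines of Python. This is a sketch of the documented behavior, not the CLI's actual code: `resolve_output_path` is a hypothetical helper, and treating a trailing slash or an existing directory as "directory path" is an assumption about how the CLI tells the two cases apart.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(input_path: str, output: Optional[str] = None) -> Path:
    """Return the transcript path for the three documented cases."""
    src = Path(input_path)
    if output is None:
        # Not specified: .txt file next to the input file.
        return src.with_suffix(".txt")
    # Assumption: a trailing separator or an existing directory means "directory path".
    if output.endswith(("/", "\\")) or Path(output).is_dir():
        return Path(output) / f"{src.stem}.txt"
    # Otherwise the value is taken as the exact output file path.
    return Path(output)


print(resolve_output_path("podcast.mp3"))
print(resolve_output_path("podcast.mp3", "/transcripts/"))
print(resolve_output_path("podcast.mp3", "my_transcript.txt"))
```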
File Conflict Resolution
Audinota intelligently handles file name conflicts:
- Automatic Numbering
When output goes to a directory and files already exist:
$ audinota transcribe --input="audio.mp3" --output="/transcripts/"
# If /transcripts/audio.txt exists, creates /transcripts/audio_01.txt
# If both exist, creates /transcripts/audio_02.txt, etc.
- File Path Conflicts
When --output specifies an existing file:
$ audinota transcribe --input="audio.mp3" --output="existing.txt"
# Error: Output file 'existing.txt' already exists. Use --overwrite
$ audinota transcribe --input="audio.mp3" --output="existing.txt" --overwrite
# Overwrites existing.txt
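The two conflict rules above can be sketched as follows. Both helpers (`next_free_path`, `check_explicit_file`) are hypothetical and only mirror the documented behavior; the error text is copied from the example above, and how the real CLI combines --overwrite with directory output is not covered here.

```python
from pathlib import Path


def next_free_path(candidate: Path) -> Path:
    """Directory output: return `candidate`, or the first free name_01, name_02, ... variant."""
    if not candidate.exists():
        return candidate
    n = 1
    while True:
        numbered = candidate.with_name(f"{candidate.stem}_{n:02d}{candidate.suffix}")
        if not numbered.exists():
            return numbered
        n += 1


def check_explicit_file(path: Path, overwrite: bool = False) -> Path:
    """Explicit file output: refuse to clobber an existing file unless overwrite is set."""
    if path.exists() and not overwrite:
        raise FileExistsError(f"Output file '{path}' already exists. Use --overwrite")
    return path
```

For example, if `transcripts/audio.txt` and `transcripts/audio_01.txt` both exist, `next_free_path(Path("transcripts/audio.txt"))` returns `transcripts/audio_02.txt`.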
Real-World Examples
# Transcribe a podcast episode
$ audinota transcribe --input="episode_042.mp3"
# Output: episode_042.txt
# Batch processing to organized directory
$ mkdir transcripts
$ audinota transcribe --input="meeting_2024_01.m4a" --output="transcripts/"
$ audinota transcribe --input="meeting_2024_02.m4a" --output="transcripts/"
# Output: transcripts/meeting_2024_01.txt, transcripts/meeting_2024_02.txt
# Process lecture with custom naming
$ audinota transcribe --input="cs101_lecture.mp4" --output="notes/week1_lecture.txt"
# Replace previous transcription
$ audinota transcribe --input="revised_audio.wav" --output="final_transcript.txt" --overwrite
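The batch example above runs the CLI once per file. The same loop can be scripted from Python with the `transcribe_audio_in_parallel` function shown in the Quick Start. This is a sketch under assumptions: `transcribe_folder` is a hypothetical helper, the function is assumed to return a plain string (as the Quick Start's `print(text)` suggests), and unlike the CLI this loop overwrites existing transcripts without numbering them.

```python
import io
from pathlib import Path
from typing import Callable, List, Optional, Union


def transcribe_folder(
    src_dir: Union[str, Path],
    out_dir: Union[str, Path],
    pattern: str = "*.m4a",
    transcribe: Optional[Callable[[io.BytesIO], str]] = None,
) -> List[Path]:
    """Transcribe every file matching `pattern` in `src_dir` into `out_dir`/<stem>.txt."""
    if transcribe is None:
        # Imported lazily so the helper can be exercised without audinota installed.
        from audinota.api import transcribe_audio_in_parallel

        def transcribe(stream: io.BytesIO) -> str:
            return transcribe_audio_in_parallel(audio=stream)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for audio_path in sorted(Path(src_dir).glob(pattern)):
        text = transcribe(io.BytesIO(audio_path.read_bytes()))
        target = out / f"{audio_path.stem}.txt"
        target.write_text(text, encoding="utf-8")  # overwrites an existing transcript
        written.append(target)
    return written
```

Calling `transcribe_folder("recordings", "transcripts")` would process every `.m4a` file in a hypothetical `recordings` directory in name order; passing a custom `transcribe` callable makes the loop easy to test without loading a model.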
Performance Features
The CLI automatically provides:
🚀 Parallel Processing: Utilizes all CPU cores for maximum speed
🧠 Smart Segmentation: Automatically splits large files for optimal processing
🌍 Language Detection: Automatically detects spoken language
📊 Progress Feedback: Real-time status updates with emoji indicators
🔍 Format Detection: Handles various audio formats without configuration
$ audinota transcribe --input="long_podcast.mp3"
🎵 Transcribing audio file: long_podcast.mp3
📝 Output will be saved to: long_podcast.txt
🔄 Loading audio data...
🚀 Starting parallel transcription...
💾 Saving transcription...
✅ Transcription completed successfully!
📄 Output saved to: file:///path/to/long_podcast.txt
📊 Text length: 15,847 characters
Install
audinota is released on PyPI, so all you need is to:
$ pip install audinota
To upgrade to the latest version:
$ pip install --upgrade audinota
Download files
File details
Details for the file audinota-0.1.1.tar.gz.
File metadata
- Download URL: audinota-0.1.1.tar.gz
- Upload date:
- Size: 15.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 36a9d71a6edff1af543ee893b00e1a0598651aaecf317b75879c6c085499a9cf |
| MD5 | 698256e94ff8088ced2730d36efaa9dc |
| BLAKE2b-256 | 2238572b8e4ed8b4c79b810e993907049b388fc84f8e830a1e25d34ff3b100c8 |
File details
Details for the file audinota-0.1.1-py3-none-any.whl.
File metadata
- Download URL: audinota-0.1.1-py3-none-any.whl
- Upload date:
- Size: 14.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ec43e99700be8192b6f1142902e43ef3a50a9ef8e550a46b877d6ff25278ffbc |
| MD5 | d185d03eee037a199c44f6eb4e7ec01a |
| BLAKE2b-256 | fd0dc07873be3744d9e26e80a7b2cdf27202ab4c6ffbe56f71af5802bc1af9ae |
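The SHA256 digests listed for each file can be checked locally after downloading. A minimal sketch using only the standard library; `sha256_of_file` is a hypothetical helper name, and the file is assumed to be in the current directory.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of_file(path: Union[str, Path], chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling read() until it returns b"" at end of file.
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

For audinota-0.1.1.tar.gz, the result of `sha256_of_file("audinota-0.1.1.tar.gz")` should equal the SHA256 value listed above for that file (the digest beginning `36a9d71a`); a mismatch means the download is corrupted or not the published file.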