
Extract audio from media files, transcribe speech, and produce documented meeting notes


mnemofy


mnemofy extracts audio from media files, transcribes speech using faster-whisper, and produces structured meeting notes with topics, decisions, action items, and mentions with timestamps.

Features

  • 🎵 Audio Extraction: Automatically extracts audio from video files using ffmpeg
  • 🎤 Speech Transcription: Fast local transcription using faster-whisper (no API keys needed)
  • 📝 Structured Notes: Generates Markdown notes with:
    • Topics discussed with timestamps
    • Decisions made with timestamps
    • Action items with timestamps and @mentions
    • Full transcript with timestamps
  • 🎯 Supported Formats: aac, mp3, wav, mkv, mp4
  • 🚀 Production Ready: Clean modular architecture, type hints, error handling
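
The source of audio.py is not reproduced here, but the kind of ffmpeg invocation involved can be sketched in Python. This is an illustration, not mnemofy's actual code; the function names are hypothetical, and the 16 kHz mono WAV target is assumed because Whisper models expect that input format:

```python
import subprocess

def build_extract_cmd(src: str, dst: str) -> list:
    """Build an ffmpeg command that extracts mono 16 kHz WAV audio.

    Whisper models expect 16 kHz mono input, so we downmix and resample.
    """
    return [
        "ffmpeg", "-y",   # overwrite output if it exists
        "-i", src,        # input media file
        "-vn",            # drop the video stream
        "-ac", "1",       # downmix to mono
        "-ar", "16000",   # resample to 16 kHz
        dst,
    ]

def extract_audio(src: str, dst: str) -> None:
    """Run ffmpeg and raise if extraction fails."""
    subprocess.run(build_extract_cmd(src, dst), check=True)
```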

Installation

Prerequisites

  1. Python 3.9–3.13 is required (Python 3.14+ is not yet supported by dependencies)
  2. ffmpeg must be installed:
    # Ubuntu/Debian
    sudo apt install ffmpeg
    
    # macOS
    brew install ffmpeg
    
    # Windows
    # Download from https://ffmpeg.org/download.html
    

Install mnemofy

# Clone the repository
git clone https://github.com/tiroq/mnemofy.git
cd mnemofy

# Install the package
pip install -e .

# Or install with development dependencies
pip install -e ".[dev]"

Usage

Basic Usage

Transcribe an audio or video file:

mnemofy transcribe meeting.mp4

This will create meeting_notes.md in the same directory.

Automatic Model Selection

mnemofy automatically detects your system resources (CPU, RAM, GPU) and selects the best Whisper model that fits in your available memory:

  • Tiny (1.0 GB): Fastest, suitable for low-RAM systems
  • Base (1.5 GB): Good balance of speed and accuracy
  • Small (2.5 GB): Better accuracy, requires 8GB+ RAM
  • Medium (5.0 GB): High accuracy, requires 16GB+ RAM
  • Large-v3 (10.0 GB): Best accuracy, requires 32GB+ RAM with GPU
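
The selection logic can be pictured as "pick the largest model whose footprint fits in available memory." The sketch below mirrors the table above; it is illustrative only, since mnemofy's actual selector also weighs GPU VRAM and CPU:

```python
from typing import Optional

# Approximate model memory footprints from the table above (illustrative).
MODEL_RAM_GB = [
    ("large-v3", 10.0),
    ("medium", 5.0),
    ("small", 2.5),
    ("base", 1.5),
    ("tiny", 1.0),
]

def pick_model(available_gb: float) -> Optional[str]:
    """Return the largest model whose footprint fits in available memory."""
    for name, needed in MODEL_RAM_GB:
        if available_gb >= needed:
            return name
    return None  # not even tiny fits
```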

Default behavior (recommended):

mnemofy transcribe meeting.mp4
# ✓ Detects your system (RAM, GPU, CPU)
# ✓ Shows interactive menu (if in terminal)
# ✓ You select desired model with ↑↓ arrow keys
# ✓ Falls back to auto-selection on headless systems

Skip auto-detection with explicit model:

mnemofy transcribe meeting.mp4 --model tiny
# ✓ Uses tiny model directly (no detection/menu)

Headless mode (CI/automated environments):

mnemofy transcribe meeting.mp4 --auto
# ✓ Detects resources
# ✓ Auto-selects best model
# ✓ No interactive menu (suitable for cron, CI/CD)

CPU-only mode (disable GPU):

mnemofy transcribe meeting.mp4 --no-gpu
# ✓ Forces CPU-based transcription
# ✓ Useful if GPU causes issues

View available models:

mnemofy transcribe --list-models
# Shows model comparison table with your system specs

Advanced Options

# Specify output directory for all files
mnemofy transcribe meeting.mp4 --outdir outputs/

# Specify output file for notes only
mnemofy transcribe meeting.mp4 -o notes/meeting_summary.md

# Set a custom title for the notes
mnemofy transcribe meeting.mp4 -t "Team Sprint Planning"

# Specify transcription language (ISO 639-1 code)
mnemofy transcribe meeting.mp4 --lang es  # Spanish
mnemofy transcribe meeting.mp4 --lang fr  # French

# Choose notes generation mode
mnemofy transcribe meeting.mp4 --notes basic  # Deterministic extraction (default)
# mnemofy transcribe meeting.mp4 --notes llm  # AI-enhanced (coming in v0.9.0)

# Keep the extracted audio file
mnemofy transcribe video.mkv --keep-audio

Output Files

mnemofy generates 4 output files from each transcription:

meeting.mp4  (input)
├── meeting.transcript.txt   # Timestamped plain text
├── meeting.transcript.srt   # SubRip subtitle format
├── meeting.transcript.json  # Structured JSON with metadata
└── meeting.notes.md         # Structured meeting notes

File Descriptions

  1. .transcript.txt - Timestamped text transcript

    • Format: [HH:MM:SS–HH:MM:SS] text
    • Best for: Reading, searching, printing
  2. .transcript.srt - SubRip subtitle file

    • Format: Standard SRT (sequence number, timing, text)
    • Best for: Video subtitles in VLC, subtitle editors
  3. .transcript.json - Structured JSON

    • Contains: Metadata (engine, model, language) + segments
    • Best for: Programmatic access, data analysis
  4. .notes.md - Structured meeting notes

    • Sections: Metadata, Topics, Decisions, Actions, Mentions, Risks, File Links
    • Best for: Quick review, sharing with team
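
Since the JSON transcript is meant for programmatic access, here is a sketch of consuming it in Python. The segment schema (`start`/`end` in seconds plus `text`) is an assumption based on the description above, not a documented contract; check your own `.transcript.json` before relying on it:

```python
def fmt_ts(seconds: float) -> str:
    """Format a position in seconds as HH:MM:SS."""
    s = int(seconds)
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}"

def segments_to_txt(segments) -> str:
    """Render segments as '[HH:MM:SS–HH:MM:SS] text' lines,
    matching the .transcript.txt format described above."""
    return "\n".join(
        f"[{fmt_ts(seg['start'])}–{fmt_ts(seg['end'])}] {seg['text'].strip()}"
        for seg in segments
    )
```

With a real file you would first do `segments = json.load(open(path))["segments"]` (key name assumed), then pass the list to `segments_to_txt`.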

Output Location

By default, files are created in the same directory as the input file:

mnemofy transcribe ~/Videos/meeting.mp4
# Creates: ~/Videos/meeting.transcript.{txt,srt,json}
#          ~/Videos/meeting.notes.md

Use --outdir to specify a different location:

mnemofy transcribe meeting.mp4 --outdir ./transcripts/
# Creates: ./transcripts/meeting.transcript.{txt,srt,json}
#          ./transcripts/meeting.notes.md

Get Help

# Show all options
mnemofy transcribe --help

# Show version
mnemofy version

Example Output

Given an audio file with meeting content, mnemofy generates structured Markdown (.notes.md) like:

# Meeting Notes: meeting

**Date**: 2026-02-10
**Source**: meeting.mp4 (45m 30s)
**Language**: en
**Engine**: faster-whisper (base)
**Generated**: 2026-02-10T15:30:00Z

## Topics

- **[05:00–10:00]** Discussion about project roadmap and milestones
- **[15:30–20:15]** Review of last sprint deliverables
- **[30:00–35:45]** Planning for upcoming feature releases

## Decisions

- **[08:30]** We decided to use Python for the backend
- **[18:45]** Agreed to launch the MVP in Q2
- **[32:10]** Approved budget increase for infrastructure

## Action Items

- **[10:15]** John needs to create the API documentation
- **[22:30]** Sarah will follow up with the design team
- **[40:00]** Team should review security audit by Friday

## Concrete Mentions

### Names
- John (10:15, 12:30)
- Sarah (22:30, 25:00)

### Numbers  
- Q2 (18:45)
- $50,000 (32:10)

### URLs
- https://github.com/project/repo (15:00)

## Risks & Open Questions

### Open Questions
- How should we handle database migrations? **[25:30]**
- What's the timeline for QA testing? **[38:00]**

### Risks
- Potential delay due to third-party API integration **[28:15]**

## Transcript Files

- Full Transcript (TXT): meeting.transcript.txt
- Subtitle Format (SRT): meeting.transcript.srt 
- Structured Data (JSON): meeting.transcript.json
- Audio (WAV): meeting.mnemofy.wav

Architecture

mnemofy follows a clean, modular architecture:

mnemofy/
├── src/mnemofy/
│   ├── __init__.py       # Package initialization
│   ├── audio.py          # Audio extraction using ffmpeg
│   ├── transcriber.py    # Speech transcription using Whisper
│   ├── notes.py          # Structured note generation
│   └── cli.py            # Command-line interface with Typer
├── tests/                # Test suite
├── pyproject.toml        # Modern Python packaging (PEP 621)
└── README.md

Development

Running Tests

pytest

Linting and Type Checking

# Run ruff for linting
ruff check src/

# Run mypy for type checking
mypy src/

Whisper Models

mnemofy supports all Whisper model sizes:

| Model  | Parameters | Speed   | Accuracy  |
|--------|------------|---------|-----------|
| tiny   | 39M        | Fastest | Good      |
| base   | 74M        | Fast    | Better    |
| small  | 244M       | Medium  | Great     |
| medium | 769M       | Slow    | Excellent |
| large  | 1550M      | Slowest | Best      |

The default base model offers a good balance of speed and accuracy. Use tiny for quick tests or medium/large for maximum accuracy.

Requirements

  • Python 3.9-3.13 (Python 3.14+ not yet supported by dependencies)
  • ffmpeg
  • Dependencies (automatically installed):
    • typer
    • faster-whisper
    • rich
    • pydantic

Troubleshooting

Model Selection Issues

"No Whisper model fits in available RAM"

This means even the smallest model (tiny, 1.0 GB) requires more memory than available.

Solutions:

  1. Close other applications to free memory
  2. Use explicit tiny model: mnemofy transcribe file.mp4 --model tiny
  3. Process shorter audio files
  4. Upgrade system RAM
  5. Use a machine with more RAM (cloud VM option)

CPU/GPU Not Detected

If model selection falls back to "base" when you expect GPU acceleration:

  1. Check GPU availability:

    mnemofy transcribe --list-models
    # Look for "VRAM" row - should show your GPU memory if available
    
  2. Verify GPU drivers (NVIDIA/Metal/ROCm) are installed:

    • NVIDIA: nvidia-smi should work
    • macOS (Metal): Currently supported on macOS with Apple Silicon (ARM64); Intel Macs will fall back to CPU
    • AMD (ROCm): Not yet implemented (planned for future release)
  3. Force CPU mode if GPU causes issues:

    mnemofy transcribe file.mp4 --no-gpu
    

Interactive Menu Not Showing

If you expect the interactive menu but it skips to auto-selection:

  • Menu requires a terminal (TTY), not suitable for pipes/redirects
  • Use --auto explicitly for headless environments
  • In CI/cron: model selection works automatically with --auto

OpenMP Library Conflict (macOS)

If you encounter an error about libiomp5.dylib already initialized, set this environment variable:

export KMP_DUPLICATE_LIB_OK=TRUE
mnemofy transcribe your_file.mp4

Or run it inline:

KMP_DUPLICATE_LIB_OK=TRUE mnemofy transcribe your_file.mp4

This is a known issue with multiple OpenMP runtimes being linked (common with ctranslate2/faster-whisper).
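
If you drive mnemofy's internals from your own Python code, the same workaround can be applied programmatically. The key point is ordering: the variable must be set before the conflicting OpenMP runtime is loaded (i.e. before importing faster-whisper/ctranslate2):

```python
import os

# Must run before any library that bundles its own OpenMP runtime is
# imported (faster-whisper / ctranslate2 on macOS).
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
```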

Audio Extraction Issues

ffmpeg Not Found

If you see ffmpeg: command not found:

Solution: Install ffmpeg:

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

# Windows (using Chocolatey)
choco install ffmpeg

Verify installation:

ffmpeg -version

Unsupported Video Codec

If audio extraction fails with codec errors:

  1. Check if the video file is corrupted:

    ffplay your_video.mp4  # Should play without errors
    
  2. Try converting to a standard format first:

    ffmpeg -i input.mkv -c:v libx264 -c:a aac output.mp4
    mnemofy transcribe output.mp4
    
  3. Check container format is supported (mp4, mkv, mov, avi, webm)

Extracted Audio Quality Issues

If transcription accuracy is poor, the audio extraction might have issues:

  • Verify audio channel: mnemofy extracts the first audio track
  • For multi-audio files, extract specific track manually:
    ffmpeg -i video.mkv -map 0:a:1 audio.wav  # Extract 2nd audio track
    mnemofy transcribe audio.wav
    

License

MIT License - see LICENSE file for details.



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mnemofy-1.0.0.tar.gz (61.0 kB)


Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mnemofy-1.0.0-py3-none-any.whl (33.7 kB)


File details

Details for the file mnemofy-1.0.0.tar.gz.

File metadata

  • Download URL: mnemofy-1.0.0.tar.gz
  • Upload date:
  • Size: 61.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for mnemofy-1.0.0.tar.gz:

  • SHA256: 16f5aa81075913a96c639bde3d722d2492acef462cd486a77af9422019ea1c6d
  • MD5: be024aacf65c017b38a52c64a0897cda
  • BLAKE2b-256: 84062a68fa4b96214aa59145322d08e5033eb9837d47b41d2e49377d13a19735


File details

Details for the file mnemofy-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: mnemofy-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 33.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for mnemofy-1.0.0-py3-none-any.whl:

  • SHA256: 40e99d163e42b7cf15c5ae510f8fee17f5b418ffc5a2f64519798b19457f84af
  • MD5: 2a2baef5d315a8f325b61e14bad363f6
  • BLAKE2b-256: 7353819dc2cafbc7f6b7a47bc46c5165beb4bcfbc3d478248eba53c3d3fd7e84

