Convert PDFs to MP3 audiobooks using Microsoft Edge neural voices.

PDF2MP3 – Convert PDFs into Audiobooks

A Python command-line utility (CLI) that converts PDF files into MP3 audiobooks using high-quality, natural-sounding female neural voices.
Supports English (en) and Brazilian Portuguese (pt-br).

Features

- Extracts text from PDFs using pdfminer.six.
- Converts text to speech using edge-tts (Microsoft Neural Voices).
- Default high-quality female voices:
  - en-US-AriaNeural
  - pt-BR-ThalitaNeural
- Smart chunk splitting for fluent narration (default: 1600 characters per chunk).
- Generates a single continuous MP3 file (via pydub + ffmpeg).
- Configurable language, voice, speaking rate, volume, and chunk size.
- Built-in retry system with exponential backoff for long or unstable requests.
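
The retry behavior listed above can be pictured with a small helper. This is a minimal sketch of exponential backoff in general, not the project's actual code: the function name `retry_with_backoff`, its parameters, and the jitter amount are all assumptions made for illustration.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    The wait before retry n is base_delay * 2**(n - 1), capped at max_delay,
    plus up to 10% random jitter. All names here are hypothetical.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

A text-to-speech request for one chunk would be wrapped as `retry_with_backoff(lambda: synthesize(chunk))`, where `synthesize` stands in for whatever function performs the request. The `sleep` parameter exists only so the waiting can be replaced in tests.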

Installation

1. Clone the repository

```
git clone https://github.com/byraphaelmedeiros/pdf2mp3.git
cd pdf2mp3
```

2. Create a virtual environment (recommended)

```
python -m venv .venv
.venv\Scripts\activate   # Windows
# source .venv/bin/activate  # Linux/Mac
```

3. Install dependencies

```
pip install --upgrade pip
pip install -r requirements.txt
```

4. Install FFmpeg

pydub requires ffmpeg to export audio to MP3.

Windows (via winget):

```
winget install --id=Gyan.FFmpeg -e
```

macOS (via Homebrew):

```
brew install ffmpeg
```

Linux (Debian/Ubuntu):

```
sudo apt-get update && sudo apt-get install ffmpeg
```

Verify the installation:

```
ffmpeg -version
```
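
The same check can be made from Python before a long conversion starts. The sketch below uses only the standard library; the helper names `find_ffmpeg` and `ffmpeg_version_line` are illustrative and are not part of pdf2mp3.

```python
import shutil
import subprocess


def find_ffmpeg(name="ffmpeg"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(name)


def ffmpeg_version_line(path):
    """Return the first line printed by `<path> -version`."""
    result = subprocess.run([path, "-version"], capture_output=True,
                            text=True, check=True)
    return result.stdout.splitlines()[0]
```

If `find_ffmpeg()` returns `None`, FFmpeg is either not installed or not on `PATH`, and the MP3 export step would be expected to fail.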

Usage

Basic usage requires only the input PDF. Other parameters have defaults.

```
python pdf2mp3.py <input.pdf> [options]
```

Parameters

Required
- input → Input PDF file
Optional
- -o, --output → Output MP3 file (default: same name as PDF with .mp3)
- -l, --lang → Language (en or pt-br, default: pt-br)
- --voice → Voice name (default: en-US-AriaNeural for en, pt-BR-ThalitaNeural for pt-br)
- --rate → Speaking rate (e.g., +5% or -5%; default: +0%)
- --volume → Speaking volume (e.g., +0% or +5%; default: +0%)
- --max-chars → Maximum characters per chunk (default: 1600)
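
The parameters above map naturally onto an `argparse` parser. The following is a sketch that mirrors the documented options and defaults; it assumes, without confirmation from the source, that the tool is built on `argparse`, and the names `DEFAULT_VOICES`, `build_parser`, and `parse_args` are hypothetical.

```python
import argparse
from pathlib import Path

# Default voice per language, as listed under Features.
DEFAULT_VOICES = {"en": "en-US-AriaNeural", "pt-br": "pt-BR-ThalitaNeural"}


def build_parser():
    parser = argparse.ArgumentParser(prog="pdf2mp3")
    parser.add_argument("input", help="Input PDF file")
    parser.add_argument("-o", "--output", help="Output MP3 file")
    parser.add_argument("-l", "--lang", choices=sorted(DEFAULT_VOICES),
                        default="pt-br", help="Language")
    parser.add_argument("--voice", help="Voice name")
    parser.add_argument("--rate", default="+0%", help="Speaking rate")
    parser.add_argument("--volume", default="+0%", help="Speaking volume")
    parser.add_argument("--max-chars", type=int, default=1600,
                        help="Maximum characters per chunk")
    return parser


def parse_args(argv):
    """Parse argv and fill in the defaults that depend on other options."""
    args = build_parser().parse_args(argv)
    if args.output is None:
        # Same name as the PDF, with the extension replaced by .mp3.
        args.output = str(Path(args.input).with_suffix(".mp3"))
    if args.voice is None:
        args.voice = DEFAULT_VOICES[args.lang]
    return args
```

With this sketch, `parse_args(["document.pdf"])` resolves to the pt-br language, the pt-BR-ThalitaNeural voice, and `document.mp3` as the output.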

Examples

Convert a PDF in Portuguese (minimal usage):

```
python pdf2mp3.py "document.pdf"
```

Output: document.mp3, narrated in pt-br with the pt-BR-ThalitaNeural voice.

Convert a PDF in English:

```
python pdf2mp3.py "book.pdf" -l en
```

Specify a custom output file:

```
python pdf2mp3.py "document.pdf" -o "output.mp3"
```

Adjust speaking rate and volume:

```
python pdf2mp3.py "document.pdf" --rate +5% --volume +0%
```
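
Every rate and volume value shown in this document is a signed percentage such as `+5%` or `-5%`. A small validator for that shape might look like the sketch below. It assumes the sign is mandatory, which the examples suggest but the source does not state, and `is_percent_offset` is a hypothetical name.

```python
import re

# A sign, one or more digits, then a percent sign: "+5%", "-10%", "+0%".
_PERCENT_OFFSET = re.compile(r"[+-]\d+%")


def is_percent_offset(value):
    """Return True if value looks like a signed percentage offset."""
    return _PERCENT_OFFSET.fullmatch(value) is not None
```

If the CLI is built on `argparse` (an assumption), a negative value passed as `--rate -5%` may be mistaken for an option flag. Writing it as `--rate=-5%` avoids that ambiguity.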

Use smaller chunks:

```
python pdf2mp3.py "document.pdf" --max-chars 1200
```
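
To make the effect of `--max-chars` concrete, here is one way chunk splitting could work. This is a sketch of a sentence-aware splitter written for illustration, not the project's actual algorithm; `split_into_chunks` and its splitting rules are assumptions.

```python
import re


def split_into_chunks(text, max_chars=1600):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    # Collapse runs of whitespace, including line breaks left by PDF layout.
    text = " ".join(text.split())
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at a word boundary.
        while len(sentence) > max_chars:
            cut = sentence.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars  # no space available: hard cut
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut])
            sentence = sentence[cut:].lstrip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate  # sentence still fits in the current chunk
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Under this sketch, a smaller limit yields more and shorter chunks, so each speech request carries less text.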

Notes

This program does not work with scanned (image-based) PDFs, because they contain no extractable text.
Run OCR first, for example:
```
pip install ocrmypdf
ocrmypdf --force-ocr input.pdf output_ocr.pdf
```
Narration quality depends on how cleanly pdfminer.six extracts the text.
An active internet connection is required (the edge-tts library uses Microsoft’s neural voices).
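
One way to spot a scanned PDF before converting it is to check how much text the extraction step returned. The heuristic below is a sketch only: `looks_scanned` and its threshold are assumptions, and the input string is expected to be whatever text was extracted from the PDF (for example with pdfminer.six).

```python
def looks_scanned(extracted_text, min_chars=50):
    """Guess that a PDF is image-based when almost no text was extracted.

    min_chars is an arbitrary illustrative threshold, not a tuned value.
    """
    # Drop all whitespace (spaces, newlines, form feeds) before counting.
    visible = "".join(extracted_text.split())
    return len(visible) < min_chars
```

When this returns `True`, running the OCR step shown in the notes above and converting the OCR output is the likely remedy.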

Quick Checklist

Install Python dependencies:
```
pip install -r requirements.txt
```

Install FFmpeg:

```
winget install --id=Gyan.FFmpeg -e   # Windows
```

Run the program:
```
python pdf2mp3.py input.pdf
```

You will get a ready-to-play MP3 audiobook.

Contributing

Contributions are welcome!
Please read the Contributing Guidelines for details.

Code of Conduct

This project follows a Code of Conduct.
By participating, you are expected to uphold this standard.

Security

If you discover a security vulnerability, please report it by emailing:
pdf2mp3@byraphaelmedeiros.com
See SECURITY.md for more details.

License

This project is released under the terms of the MIT License.
See the LICENSE file for details.

Maintainer

Raphael Medeiros

GitHub: @byraphaelmedeiros
Contact: pdf2mp3@byraphaelmedeiros.com

Project details

Release history

This version

1.0.0

Oct 1, 2025

Download files

Source Distribution

pdf2mp3-1.0.0.tar.gz (11.5 kB)

Uploaded Oct 1, 2025 Source

Built Distribution

pdf2mp3-1.0.0-py3-none-any.whl (11.0 kB)

Uploaded Oct 1, 2025 Python 3

File details

Details for the file pdf2mp3-1.0.0.tar.gz.

File metadata

Download URL: pdf2mp3-1.0.0.tar.gz
Upload date: Oct 1, 2025
Size: 11.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pdf2mp3-1.0.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `a6317d2735d0c6da04f48029c7470abf7053a95ea4e8a29a04e1af388b21212d` |
| MD5 | `8c842cb21e89ce0afd7acbf1e3d4e34e` |
| BLAKE2b-256 | `efea124765e01d7489f3ad210f82f0c32c0a3a4bc5b1ec2e2a980c775688422c` |
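
A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch; the helper name `sha256_of_file` is illustrative.

```python
import hashlib


def sha256_of_file(path, block_size=1 << 16):
    """Return the hex SHA256 digest of a file, reading it in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

For the source distribution, the download is intact if `sha256_of_file("pdf2mp3-1.0.0.tar.gz")` equals the SHA256 value listed above.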

File details

Details for the file pdf2mp3-1.0.0-py3-none-any.whl.

File metadata

Download URL: pdf2mp3-1.0.0-py3-none-any.whl
Upload date: Oct 1, 2025
Size: 11.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pdf2mp3-1.0.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `d73e73f58cf939b6a24001a63380541bf243375d2537e47c8ff82b64c619d681` |
| MD5 | `bbb8b6ecf3a15e6ec6db0cf4c1b0e3fd` |
| BLAKE2b-256 | `36a44cad17087141cfdfcb862aaf52945e75fb8cf88b4ff7bc1b6a9e81f0b74c` |
