A local notebook implementation
Local-NotebookLM
A local AI-powered tool that converts PDF documents into engaging audio—such as podcasts or custom audio content—using local LLMs and TTS models.
Features
- PDF text extraction and processing
- Customizable audio generation (podcasts, summaries, interviews, and more) with different styles and lengths
- Support for various LLM providers (OpenAI, Groq, LMStudio, Ollama, Azure)
- Text-to-Speech conversion with voice selection
- Flexible pipeline with many options for content, style, and voices
- Programmatic API for integration in other projects
- FastAPI server for web-based access
- Example podcast included for demonstration
Here are two quick examples. Can you guess which paper they are discussing?
- Casual example
- Gen-Z example
Prerequisites
- Python 3.9+
- Local LLM server (optional, for local inference)
- Local TTS server (optional, for local audio generation)
- At least 8GB RAM (32GB+ recommended for local models)
- 10GB+ free disk space
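The Python version and disk space requirements above can be checked with a few lines of standard-library code. This is a minimal sketch, not part of Local-NotebookLM: the function name `check_prerequisites` is hypothetical, and RAM is not checked because the standard library has no portable call for it.

```python
import shutil
import sys

MIN_PYTHON = (3, 9)      # "Python 3.9+" from the list above
MIN_FREE_DISK_GB = 10    # "10GB+ free disk space" from the list above


def check_prerequisites(path="."):
    """Return simple pass/fail checks for the machine running this script.

    Hypothetical helper for illustration; it only covers the Python
    version and the free disk space on the volume that holds `path`.
    """
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "disk_ok": free_gb >= MIN_FREE_DISK_GB,
        "free_disk_gb": round(free_gb, 1),
    }


if __name__ == "__main__":
    for name, value in check_prerequisites().items():
        print(f"{name}: {value}")
```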
Installation
From PyPI
pip install local-notebooklm
From source
- Clone the repository:
git clone https://github.com/Goekdeniz-Guelmez/Local-NotebookLM.git
cd Local-NotebookLM
- Create and activate a virtual environment (conda works too):
python -m venv venv
source venv/bin/activate # On Windows, use: venv\Scripts\activate
- Install the required packages:
pip install -r requirements.txt
Running with Docker
You can run Local-NotebookLM using Docker for both the Web UI and API modes.
Prerequisites
- Docker installed on your system
Steps
- Build the Docker image:
docker build -t local-notebooklm-ui .
- Run the Gradio Web UI:
docker run -p 7860:7860 local-notebooklm-ui
The Web UI will be available at http://localhost:7860.
- Run the FastAPI API server:
docker run -e APP_MODE=api -p 8000:8000 local-notebooklm-ui
The API server will be available at http://localhost:8000.
Optional prerequisites
Local TTS server
- Follow one installation type (docker, docker-compose, uv) at https://github.com/remsky/Kokoro-FastAPI
- Check in your browser that http://localhost:8880/v1 returns the JSON: {"detail":"Not Found"}
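The same check can be scripted instead of done in a browser. The sketch below uses only the standard library; the function names are hypothetical, and it assumes the server replies with exactly the JSON body quoted above (whatever the HTTP status code is).

```python
import json
import urllib.error
import urllib.request


def is_expected_reply(body):
    """True if `body` is the JSON document {"detail": "Not Found"}."""
    try:
        return json.loads(body) == {"detail": "Not Found"}
    except (ValueError, TypeError):
        return False  # not JSON at all


def probe_tts_server(base_url="http://localhost:8880/v1", timeout=5):
    """Return True if a server at `base_url` answers with the expected JSON."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            body = resp.read()
    except urllib.error.HTTPError as err:
        body = err.read()  # error responses still carry a body
    except (urllib.error.URLError, OSError):
        return False       # nothing is listening, or the request timed out
    return is_expected_reply(body)
```

Calling `probe_tts_server()` before starting a long generation run gives an early failure if the TTS server is not up.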
Example Output
The repository includes an example podcast in examples/podcast.wav to demonstrate the quality and format of the output. The models used are: gpt4o and Mini with tts-hs on Azure. You can listen to this example to get a sense of what Local-NotebookLM can produce before running it on your own PDFs.
Usage
Command Line Interface
Run the script with the following command:
python -m local_notebooklm.make_audio --pdf PATH_TO_PDF [options]
Available Options
| Option | Description | Default |
|---|---|---|
| --pdf | Path to the PDF file (required) | - |
| --output_dir | Directory to store output files | ./output |
| --llm_model | Ollama LLM model name | qwen3:30b-a3b-instruct-2507-q4_K_M |
| --language | Language for the audio output | english |
| --format_type | Output format type (summary, podcast, article, interview, panel-discussion, debate, narration, storytelling, explainer, lecture, tutorial, q-and-a, news-report, executive-brief, meeting, analysis) | podcast |
| --style | Content style (normal, casual, formal, technical, academic, friendly, gen-z, funny) | normal |
| --length | Content length (short, medium, long, very-long) | medium |
| --is-vlm | Enable vision mode so extracted PDF images are also sent to the LLM | False |
| --num_speakers | Number of speakers in audio (1, 2, 3, 4, 5) | 2 (for podcast/interview) |
| --custom_preferences | Additional focus preferences or instructions | None |
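When the CLI is driven from another script, it can help to assemble the argument list from a dictionary of the options above. This is a sketch under stated assumptions: `build_command` is a hypothetical helper, not part of the package, and it treats `--is-vlm` as a value-less switch because that is how the flag is used on the command line in this README.

```python
import sys


def build_command(pdf, options=None):
    """Build an argv list for `python -m local_notebooklm.make_audio`.

    `options` maps flag names exactly as written in the table above
    (for example "--format_type" or "--is-vlm") to values. A value of
    True emits the bare flag; None or False omits the flag entirely.
    """
    cmd = [sys.executable, "-m", "local_notebooklm.make_audio", "--pdf", str(pdf)]
    for flag, value in (options or {}).items():
        if value is None or value is False:
            continue
        cmd.append(flag)
        if value is not True:
            cmd.append(str(value))
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string avoids quoting problems with values such as `--custom_preferences`.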
Format Types
Local-NotebookLM supports both single-speaker and multi-speaker formats:
Single-Speaker Formats:
- summary
- narration
- storytelling
- explainer
- lecture
- tutorial
- news-report
- executive-brief
- analysis
Two-Speaker Formats:
- podcast
- interview
- panel-discussion
- debate
- q-and-a
- meeting
Multi-Speaker Formats:
- panel-discussion (3, 4, or 5 speakers)
- debate (3, 4, or 5 speakers)
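The lists above imply which speaker counts make sense for each format. The sketch below encodes them as a lookup so a script can reject an unsupported combination before starting a run. It mirrors this README only; the package itself may validate differently, and `allowed_speakers` is a hypothetical name.

```python
SINGLE_SPEAKER = {
    "summary", "narration", "storytelling", "explainer", "lecture",
    "tutorial", "news-report", "executive-brief", "analysis",
}
TWO_SPEAKER = {
    "podcast", "interview", "panel-discussion", "debate", "q-and-a", "meeting",
}
MULTI_SPEAKER = {"panel-discussion", "debate"}  # 3, 4, or 5 speakers


def allowed_speakers(format_type):
    """Return the sorted speaker counts listed for `format_type`."""
    counts = set()
    if format_type in SINGLE_SPEAKER:
        counts.add(1)
    if format_type in TWO_SPEAKER:
        counts.add(2)
    if format_type in MULTI_SPEAKER:
        counts.update({3, 4, 5})
    if not counts:
        raise ValueError(f"format not covered by the lists above: {format_type!r}")
    return sorted(counts)
```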
Example Commands
Basic usage:
python -m local_notebooklm.make_audio --pdf documents/research_paper.pdf
Customized podcast:
python -m local_notebooklm.make_audio --pdf documents/research_paper.pdf --format_type podcast --length long --style casual
With custom preferences:
python -m local_notebooklm.make_audio --pdf documents/research_paper.pdf --custom_preferences "Focus on practical applications and real-world examples"
Specify number of speakers:
python -m local_notebooklm.make_audio --pdf documents/research_paper.pdf --format_type panel-discussion --num_speakers 3
Enable multimodal transcript generation (text + PDF images):
python -m local_notebooklm.make_audio --pdf documents/research_paper.pdf --is-vlm
Programmatic API
You can also use Local-NotebookLM programmatically in your Python code:
from local_notebooklm.processor import generate_audio

generate_audio(
    pdf_path="documents/research_paper.pdf",
    output_dir="./test_output",
    llm_model="qwen3:30b-a3b-instruct-2507-q4_K_M",
    language="english",
    format_type="interview",
    style="formal",  # one of the --style values listed under Available Options
    length="long",
    num_speakers=2,
    custom_preferences="Focus on the key technical aspects"
)
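Building on the call above, a folder of PDFs can be processed in a loop. This is a sketch, not part of the package: `batch_generate` is a hypothetical wrapper, and it assumes that `generate_audio` raises an exception when a document fails, which this README does not confirm. The generator is passed in as an argument so the wrapper does not depend on how the package is installed.

```python
from pathlib import Path


def batch_generate(pdf_dir, output_root, generate, **options):
    """Call `generate` once per PDF in `pdf_dir`.

    Each PDF gets its own sub-directory under `output_root`, named after
    the file stem. Returns a dict mapping file name to "ok" or an error note.
    """
    results = {}
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        out_dir = Path(output_root) / pdf.stem
        out_dir.mkdir(parents=True, exist_ok=True)
        try:
            generate(pdf_path=str(pdf), output_dir=str(out_dir), **options)
            results[pdf.name] = "ok"
        except Exception as exc:  # keep going so one bad PDF does not stop the batch
            results[pdf.name] = f"failed: {exc}"
    return results


# Example usage (requires the package to be installed):
#   from local_notebooklm.processor import generate_audio
#   batch_generate("documents", "./batch_output", generate_audio,
#                  format_type="summary", length="short")
```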
Gradio Web UI
Local-NotebookLM now includes a user-friendly Gradio web interface that makes it easy to use the tool without command line knowledge:
python -m local_notebooklm.web_ui
By default, the web UI runs locally on http://localhost:7860. You can access it from your browser.
Web UI Screenshots
The main interface of the Local-NotebookLM web UI
Web UI Options
| Option | Description | Default |
|---|---|---|
| --share | Make the UI accessible over the network | False |
| --port | Specify a custom port | 7860 |
Example Commands
Basic local usage:
python -m local_notebooklm.web_ui
Share with others on your network:
python -m local_notebooklm.web_ui --share
Use a custom port:
python -m local_notebooklm.web_ui --port 8080
The web interface provides all the same options as the command line tool in an intuitive UI, making it easier for non-technical users to generate audio content from PDFs.
FastAPI Server
Start the FastAPI server to access the functionality via a web API:
python -m local_notebooklm.server
By default, the server runs on http://localhost:8000. You can access the API documentation at http://localhost:8000/docs.
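This README does not list the server's endpoints, but they can be discovered from the OpenAPI schema that backs the documentation page. The sketch below assumes the schema is served at FastAPI's default location, `/openapi.json`; if the project overrides that URL the fetch will fail. Both function names are hypothetical.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_routes(schema):
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                routes.append((method.upper(), path))
    return sorted(routes)


def fetch_schema(base_url="http://localhost:8000", timeout=5):
    """Download the OpenAPI schema from a running server (default URL assumed)."""
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `list_routes(fetch_schema())` returns the available method and path pairs.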
Pipeline Steps
- PDF Extraction
  - Extracts and cleans text from the provided PDF.
- Transcript Generation
  - Generates a transcript or script based on the extracted content and user options.
- Audio Generation
  - Converts the optimized transcript to audio using the specified TTS model and outputs the final audio file.
Pipeline Diagram
flowchart TD
    subgraph "Main Controller"
        generate_audio["generate_audio()"]
    end
    subgraph "Pipeline Steps"
        extractPDF["PDF Extraction"]
        transcript["Transcript Generation"]
        ttsOpt["TTS Optimization"]
        audioGen["Audio Generation"]
    end
    generate_audio --> extractPDF
    extractPDF --> transcript
    transcript --> ttsOpt
    ttsOpt --> audioGen
    audioGen --> outputFile["Audio File"]
Multiple Language Support
Local-NotebookLM now supports multiple languages. You can specify the language when using the programmatic API or through the command line.
Important Note: When using a non-English language, ensure that both your selected LLM and TTS models support the desired language. Language support varies significantly between different models and providers. For optimal results, verify that your chosen models have strong capabilities in your target language before processing.
Output Files
The pipeline generates the following files:
- segments/podcast_segment_*.wav: Individual audio segments
- podcast.wav: Final concatenated podcast audio file
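If a run is interrupted after the segments are written, the final file can be rebuilt from them. The sketch below is not the project's own code: it uses the standard-library `wave` module rather than the `soundfile` and `numpy` packages the project depends on, and it assumes that each segment name ends in an integer index and that all segments share the same channel count, sample width, and sample rate.

```python
import re
import wave
from pathlib import Path


def segment_index(path):
    """Trailing integer of the file stem; assumed to be the segment order."""
    match = re.search(r"(\d+)$", Path(path).stem)
    return int(match.group(1)) if match else -1


def concatenate_wavs(segment_paths, output_path):
    """Append WAV segments in the given order; return the frames written."""
    segment_paths = list(segment_paths)
    if not segment_paths:
        raise ValueError("no segments to concatenate")
    fmt = None
    total = 0
    with wave.open(str(output_path), "wb") as out:
        for path in segment_paths:
            with wave.open(str(path), "rb") as seg:
                seg_fmt = (seg.getnchannels(), seg.getsampwidth(), seg.getframerate())
                if fmt is None:
                    # The first segment defines the output format.
                    fmt = seg_fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif seg_fmt != fmt:
                    raise ValueError(f"format mismatch in {path}: {seg_fmt} != {fmt}")
                out.writeframes(seg.readframes(seg.getnframes()))
                total += seg.getnframes()
    return total
```

Sorting with `key=segment_index` matters because a plain alphabetical sort would place segment 10 before segment 2 when the indices are not zero-padded.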
Troubleshooting
Common Issues
- PDF Extraction Fails
  - Try a different PDF file
  - Check if the PDF is password-protected
  - Ensure the PDF contains extractable text (not just images)
- API Connection Errors
  - Verify your API keys are correct
  - Check your internet connection
  - Ensure the API endpoints are accessible
- Out of Memory Errors
  - Reduce the chunk size in the configuration
  - Use a smaller model
  - Close other memory-intensive applications
- Audio Quality Issues
  - Try different TTS voices
  - Adjust the sample rate in the configuration
  - Check if the TTS server is running correctly
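For the PDF extraction checks above, a short script can tell a password-protected file apart from one that contains only images. This is a sketch using PyPDF2, which is listed under Requirements; it assumes a PyPDF2 release that provides `PdfReader` (2.x or 3.x) and does not claim to match how Local-NotebookLM extracts text internally. Both function names are hypothetical.

```python
def diagnose(is_encrypted, page_texts):
    """Classify a PDF from its encryption flag and per-page extracted text."""
    if is_encrypted:
        return "encrypted"
    if not any(text and text.strip() for text in page_texts):
        return "no extractable text"
    return "ok"


def inspect_pdf(path):
    """Open `path` with PyPDF2 and return one of the labels from diagnose()."""
    from PyPDF2 import PdfReader  # imported lazily so diagnose() works without it

    reader = PdfReader(str(path))
    if reader.is_encrypted:
        return diagnose(True, [])
    return diagnose(False, [page.extract_text() for page in reader.pages])
```

A result of "no extractable text" usually means the PDF is a scan and would need OCR before it can be processed.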
Getting Help
If you encounter issues not covered here, please:
- Check the logs for detailed error messages
- Open an issue on the GitHub repository with details about your problem
- Include the error message and steps to reproduce the issue
Requirements
- Python 3.9+
- PyPDF2
- tqdm
- numpy
- soundfile
- requests
- pathlib (part of the Python standard library)
- fastapi
- uvicorn
Full requirements are listed in requirements.txt.
Acknowledgments
- This project uses various open-source libraries and models
- Special thanks to the developers of LLaMA, OpenAI, and other AI models that make this possible
Best,
Gökdeniz Gülmez
Citing Local-NotebookLM
The Local-NotebookLM software suite was developed by Gökdeniz Gülmez. If you find Local-NotebookLM useful in your research and wish to cite it, please use the following BibTeX entry:
@software{Local-NotebookLM,
  author = {Gökdeniz Gülmez},
  title = {{Local-NotebookLM}: A Local-NotebookLM to convert PDFs into Audio.},
  url = {https://github.com/Goekdeniz-Guelmez/Local-NotebookLM},
  version = {0.1.5},
  year = {2025},
}
Download files
File details
Details for the file local_notebooklm-2.0.0.tar.gz.
File metadata
- Download URL: local_notebooklm-2.0.0.tar.gz
- Upload date:
- Size: 27.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0cf67a72dc6c093bf59cc1563cb2e2dd53e78942ef81618309fe2bfb75bc69ed |
| MD5 | 91ab24a679d67cd3480d8ac203ee501e |
| BLAKE2b-256 | ea43ccb815997095226afd002186bddf60c03987991db122c25842c331d30726 |
File details
Details for the file local_notebooklm-2.0.0-py3-none-any.whl.
File metadata
- Download URL: local_notebooklm-2.0.0-py3-none-any.whl
- Upload date:
- Size: 24.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e0c2e0653bed5a63b7d527b141aac7ec7a50f9bf50197664b4b361445f272f49 |
| MD5 | d48519075bb939f15516dabce2ff0d4d |
| BLAKE2b-256 | ba2d47087cb8c988c8c0c9e7fd38e99751d8e7a1dbbe84af94af2824aa86056f |
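A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a generic sketch; the helper names are hypothetical and the file is read in chunks so large archives do not need to fit in memory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """True if the file's SHA256 equals `expected_hex` (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest("local_notebooklm-2.0.0.tar.gz", "<SHA256 from the table>")` should return True for an intact download of the source distribution.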