notebooklm-chunker

Turn long PDFs into chunked NotebookLM workflows with Studio outputs.

notebooklm-chunker turns long documents into smaller, heading-aware NotebookLM sources so reports, slide decks, quizzes, flashcards, and audio outputs stay more focused and useful.

Desktop App

The repo ships with an Electron desktop client under desktop/.

This is the main visual workflow:

  • create a local project from a PDF
  • choose a target chunk count and chunking settings
  • review and refine chunk files
  • sync only changed chunks to NotebookLM
  • open a NotebookLM workspace view inside the app
  • queue reports, slides, quizzes, flashcards, or audio jobs from synced sources

Desktop quick start:

cd desktop
npm install
npm run dev

On first launch, the desktop app runs a setup-check screen. It verifies:

  • nblm is available on PATH
  • Playwright Chromium is installed
  • NotebookLM auth state is ready for live sync and Studio work
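The first of those checks can be approximated from a terminal too. A minimal sketch, assuming only POSIX `sh` (the `check_cmd` helper is hypothetical, and this only probes PATH resolution, not Playwright or NotebookLM auth state):

```shell
# Hypothetical preflight helper mirroring the desktop setup-check screen.
check_cmd() {
  # Report whether a command resolves on PATH.
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

check_cmd nblm      # the CLI the desktop app wraps
check_cmd python    # needed for 'python -m playwright install chromium'
```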

Desktop release binaries can be attached to GitHub releases, but the desktop app currently expects nblm to be installed on the host machine and available on PATH: the Electron app is a desktop shell around the real CLI, not a fully bundled Python runtime.

What the desktop app gives you:

  • recent projects dashboard
  • local project persistence
  • chunk refinement UI
  • changed-only sync
  • NotebookLM workspace dashboard
  • saved prompt library per Studio type
  • queued Studio generation from selected sources

The desktop app uses the real nblm CLI under the hood. It is not a separate backend.

Python CLI

The Python package is the automation core used by both the CLI and the desktop app.

Requirements

  • Python 3.12+
  • pip

This project automates NotebookLM through notebooklm-py, which is an unofficial community library.

For local development and contribution flow, see DEVELOPMENT.md.

Installation

From PyPI:

pip install notebooklm-chunker
python -m playwright install chromium
nblm login

With pipx:

pipx install notebooklm-chunker
python -m playwright install chromium
nblm login

From a local checkout:

python -m pip install /ABS/PATH/notebooklm-chunker
python -m playwright install chromium
nblm login

If you already have valid NotebookLM auth state, you can skip nblm login.

To clear local notebooklm-py auth state later:

nblm logout

Quick Start

Create a workflow file:

nblm init

Run the whole flow:

nblm run --config ./nblm.toml

Continue later from the saved run state:

nblm resume --config ./nblm.toml

Add new per-chunk Studio outputs later without re-uploading chunks:

nblm studios --config ./quiz.toml

Check auth, config, Playwright, and parser readiness:

nblm doctor --config ./nblm.toml

Show the installed CLI version:

nblm --version

source.path lives in the config file, so you do not need to pass the input document as a CLI argument.

Repo Demo

This repository includes a full example built around the freely downloadable InfoQ mini-book Domain-Driven Design Quickly.

Repo demo command:

nblm run --config ./examples/workflows/ddd-quickly-demo.toml

Run State And Resume

nblm run always starts a fresh run and writes a state file next to the chunk output:

./output/chunks/.nblm-run-state.json

That file tracks:

  • source upload status for each chunk
  • Studio status for each chunk
  • saved source_id, task_id, artifact_id, output path, and last error when available

This is why nblm resume can continue later without redoing finished work.

If you want to inspect progress manually, open .nblm-run-state.json.
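For a scriptable view, a small sketch like the following can summarize the file. The key names used in the demo (`sources`, `studios`) are assumptions for illustration; only the tracked fields listed above are documented, not the exact JSON schema:

```python
import json
from pathlib import Path

def summarize_run_state(path):
    """Return the top-level sections of a run-state file.

    The real schema of .nblm-run-state.json is not specified here, so this
    just reports whatever top-level keys the file contains.
    """
    state = json.loads(Path(path).read_text())
    return sorted(state)

# Demo against a stand-in file; after a real run the file lives at
# ./output/chunks/.nblm-run-state.json.
sample = Path("run-state-demo.json")
sample.write_text(json.dumps({"sources": {}, "studios": {}}))
print(summarize_run_state(sample))  # ['sources', 'studios']
```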

Source uploads and per-chunk Studio jobs run as separate queues.

Quota blocks are tracked per Studio type. If report is blocked, slide_deck or quiz can still continue until they hit their own limits.
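That independence can be pictured with a tiny gating sketch. This is illustrative only, not the project's actual scheduler:

```python
# Illustrative per-Studio-type quota gate: blocking one type
# leaves every other type runnable.
blocked_types = set()

def mark_blocked(studio_type):
    blocked_types.add(studio_type)

def can_run(studio_type):
    return studio_type not in blocked_types

mark_blocked("report")
print(can_run("report"))      # False
print(can_run("slide_deck"))  # True
print(can_run("quiz"))        # True
```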

Add More Studios Later

If a previous run already uploaded the chunk sources, nblm studios can reuse .nblm-run-state.json and add new per-chunk Studio outputs later without uploading the chunks again.

Example:

nblm studios --config ./quiz.toml

For per_chunk = true, this stays scoped to the saved per-chunk source IDs from the same run state. It does not silently widen to whole-notebook context.
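The scoping rule amounts to the following sketch: one Studio job per previously uploaded chunk source, nothing more. The run-state shape and the `studio_targets` helper are made up for illustration:

```python
# Illustrative scoping: per-chunk Studio jobs reuse the source IDs
# saved in the run state rather than the whole notebook.
run_state = {
    "chunk-001": {"source_id": "src_a"},
    "chunk-002": {"source_id": "src_b"},
}

def studio_targets(state):
    # One Studio job per previously uploaded chunk source.
    return [entry["source_id"] for entry in state.values()]

print(studio_targets(run_state))  # ['src_a', 'src_b']
```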

Output Files

One chunking.output_dir represents one workflow lineage.

That lineage owns:

  • chunk markdown files
  • manifest.json
  • .nblm-run-state.json

If you want another book, or the same book as a separate NotebookLM run, use a different chunking.output_dir.

Workflow Notes

  • paths inside workflow files are resolved relative to that file
  • output paths may use {source_stem}
  • runtime.download_outputs = false is supported
  • one chunking.output_dir maps to one NotebookLM workflow lineage
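The {source_stem} placeholder behaves like simple template substitution. A sketch of the likely expansion (the helper name is made up, and the real resolver also handles relative paths against the workflow file, which this skips):

```python
from pathlib import Path

def render_output_dir(template: str, source_path: str) -> str:
    # Assumption: {source_stem} expands to the source file name
    # without its extension.
    return template.format(source_stem=Path(source_path).stem)

print(render_output_dir("./output/{source_stem}/chunks", "./book.pdf"))
# → ./output/book/chunks
```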

Example Config

[source]
path = "./book.pdf"

[notebook]
title = "Book Notes"

[chunking]
output_dir = "./output/{source_stem}/chunks"
target_pages = 3.0
min_pages = 2.5
max_pages = 4.0

[runtime]
max_parallel_chunks = 3

[studios.report]
enabled = true
per_chunk = true
output_dir = "./output/{source_stem}/reports"
language = "en"
format = "study-guide"

Release And Development

For setup, testing, packaging, and GitHub release flow, see DEVELOPMENT.md.

