Orchestrator: PDFs → NotebookLM → synthesis → disk + Notion
Folio Notion
User story: configure once → add PDFs → stay logged into NotebookLM → run → local export + Notion.
This repo is a single product: an orchestrator that owns the sequence and glue between tools you already use (files, NotebookLM, disk, Notion). It does not replace those tools; it drives them in order and passes data between steps.
For technical readers: this is a research / ingestion pipeline with orchestration at the center.
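One way to picture "owning the sequence and glue" is a list of step functions that each receive a shared context and return it for the next step. This is a minimal sketch only: `Context`, `run_pipeline`, and the three step functions are hypothetical names, not the contents of pipeline.py or steps/.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Context:
    """Data handed from one step to the next (hypothetical shape)."""
    pdfs: List[str] = field(default_factory=list)
    synthesis: str = ""
    log: List[str] = field(default_factory=list)


Step = Callable[[Context], Context]


def collect_pdfs(ctx: Context) -> Context:
    ctx.pdfs = ["paper.pdf"]  # placeholder for scanning the inbox folder
    ctx.log.append("collect_pdfs")
    return ctx


def synthesize(ctx: Context) -> Context:
    # Placeholder for the NotebookLM step; no network call is made here.
    ctx.synthesis = f"summary of {len(ctx.pdfs)} file(s)"
    ctx.log.append("synthesize")
    return ctx


def export(ctx: Context) -> Context:
    ctx.log.append("export")  # placeholder for disk + Notion output
    return ctx


def run_pipeline(steps: List[Step], ctx: Context) -> Context:
    """Run each step in order, passing the context along."""
    for step in steps:
        ctx = step(ctx)
    return ctx


result = run_pipeline([collect_pdfs, synthesize, export], Context())
print(result.log)  # ['collect_pdfs', 'synthesize', 'export']
```

Keeping each stage as a plain function makes it easy to skip or reorder stages, which is roughly what a flag such as --skip-notion would need.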
Layout
See the tree below; each top-level folder has a short README.md where it helps.
folio-notion/
├── README.md
├── pyproject.toml
├── .env.example
├── .gitignore
├── config/ # checked-in templates; user config lives outside or via env
├── docs/
│ ├── user-workflow.md # journey + milestones (OSS-friendly)
│ └── roadmap.md
├── src/folio_notion/ # application package
│ ├── cli.py # “Run pipeline” entrypoint
│ ├── pipeline.py # sequence: configure → … → Notion
│ ├── steps/ # one module per pipeline stage
│ └── integrations/ # NotebookLM, Notion API, filesystem
├── tests/
├── scripts/ # optional one-off maintenance / dev helpers
└── var/ # default local scratch (gitignored); exports, caches
Quick start
Install from PyPI:
pip install folio-notion
# or: pip install folio_notion # same package (normalized name)
# NotebookLM + browser flow:
pip install "folio-notion[notebooklm]"
playwright install chromium
Install from a git clone (development):
cd folio-notion
python -m venv .venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e .
Run pipeline (PDFs → NotebookLM chat → var/exports + Notion page):
pip install -e ".[notebooklm]" # NotebookLM API + browser login
playwright install chromium
# Put PDFs in var/inbox (or set input_dir in config/config.yaml)
fn run # same as: fn run -C .
fn run -C /path/to/folio-notion
python -m folio_notion
Run modes
- fn run --dry-run: load config and verify NotebookLM storage (and Notion, unless you combine it with --skip-notion). Does not ingest PDFs or call NotebookLM.
- fn run --skip-notion: NotebookLM synthesis and local markdown under export_dir; skips Notion API checks and page creation (useful for end-to-end LM testing without publishing).
- PDFs: by default Folio looks in input_dir (often var/inbox). If it is empty and your terminal is interactive, it prompts for file or folder path(s) (comma-separated). Non-interactive / CI: fn run --pdf /path/to/file.pdf or --pdf /path/to/folder (repeat --pdf for several paths).
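The PDF lookup described above (files or folders, repeated paths, comma-separated prompt answers) could be sketched roughly as follows. `resolve_pdfs` and `split_prompt_answer` are hypothetical helpers, and the non-recursive folder scan is an assumption; the real implementation may behave differently.

```python
from pathlib import Path
from typing import Iterable, List


def resolve_pdfs(paths: Iterable[str]) -> List[Path]:
    """Expand files and folders into a sorted, de-duplicated list of PDFs."""
    found = set()
    for raw in paths:
        p = Path(raw).expanduser()
        if p.is_dir():
            # Assumption: only the top level of a folder is scanned.
            found.update(
                c for c in p.iterdir()
                if c.is_file() and c.suffix.lower() == ".pdf"
            )
        elif p.is_file() and p.suffix.lower() == ".pdf":
            found.add(p)
        # Missing paths and non-PDF files are silently ignored in this sketch.
    return sorted(found)


def split_prompt_answer(answer: str) -> List[str]:
    """Split a comma-separated prompt answer into trimmed, non-empty paths."""
    return [part.strip() for part in answer.split(",") if part.strip()]
```

Each repeated --pdf value, or each item from the interactive prompt, would be passed in as one entry of `paths`.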
Structured logs: set FOLIO_LOG_LEVEL to DEBUG or INFO (default).
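As an illustration of how an environment variable like FOLIO_LOG_LEVEL can map onto the standard logging module, here is a small sketch. `resolve_log_level` is a hypothetical helper, and falling back to INFO for unrecognized values is an assumption rather than documented behavior.

```python
import logging
import os
from typing import Mapping, Optional


def resolve_log_level(env: Optional[Mapping[str, str]] = None) -> int:
    """Return a numeric logging level from FOLIO_LOG_LEVEL (default INFO)."""
    env = os.environ if env is None else env
    name = env.get("FOLIO_LOG_LEVEL", "INFO").strip().upper()
    level = logging.getLevelName(name)
    # getLevelName returns an int for known names and a string otherwise.
    return level if isinstance(level, int) else logging.INFO


logging.basicConfig(level=resolve_log_level())
```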
Status — see whether Notion / NotebookLM look connected (uses a quick API check; needs network):
fn status
fn status -C /path/to/folio-notion
Requires .env with NOTION_TOKEN, NOTION_PARENT_PAGE_ID, and FOLIO_NOTEBOOKLM_STORAGE (from the connect commands). Config: config/config.yaml (merged with config/defaults.example.yaml).
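"Merged with" most plausibly means that values in config/config.yaml override the defaults key by key. The sketch below shows that idea on plain dictionaries. `deep_merge` is a hypothetical helper, the nested `notion` keys are invented for illustration, and recursive merging of nested sections is an assumption.

```python
from typing import Any, Dict


def deep_merge(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return defaults updated with overrides, merging nested dicts."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Stand-ins for the parsed YAML files; "notion" keys are hypothetical.
defaults = {
    "input_dir": "var/inbox",
    "export_dir": "var/exports",
    "notion": {"publish": True},
}
user = {"input_dir": "~/papers", "notion": {"title_prefix": "Folio"}}

config = deep_merge(defaults, user)
print(config["input_dir"])  # ~/papers
```

Neither input dictionary is modified, so the defaults can be reused across runs.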
Connect Notion — verify your internal integration secret with the API, optionally check a parent page id, optionally write .env:
fn connect notion
# alias:
fn notion connect
Flags: --token SECRET, --parent PAGE_ID, --save, --no-save, -C /path/to/project (where .env lives).
Interactive prompts use prompt-toolkit so arrow keys and normal line editing work (after pip install -e .). Parent may be a Notion page or database; database URLs are verified via the databases API.
Interactive parent step: bad input keeps prompting (a short guide appears every 3 failures). Enter skips parent; quit / exit / q leaves parent empty but continues to the save question.
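The retry behavior described above can be sketched as a small loop with injectable input and output, which also makes it testable without a terminal. `ask_parent`, its parameters, and the message texts are hypothetical; only the rules (guide every 3 failures, Enter or quit words leave the parent empty) come from the description.

```python
from typing import Callable, Optional

QUIT_WORDS = {"quit", "exit", "q"}


def ask_parent(
    read: Callable[[str], str],
    is_valid: Callable[[str], bool],
    show: Callable[[str], None] = print,
) -> Optional[str]:
    """Prompt until a valid parent is given; return None if skipped."""
    failures = 0
    while True:
        answer = read("Parent page or database (Enter to skip): ").strip()
        if answer == "" or answer.lower() in QUIT_WORDS:
            return None  # caller continues to the save question
        if is_valid(answer):
            return answer
        failures += 1
        if failures % 3 == 0:
            show("Guide: paste a Notion page or database URL, or its id.")
        else:
            show("Could not verify that parent; please try again.")
```

In a real session `read` could be `input` or a prompt-toolkit prompt, and `is_valid` would wrap the Notion API check.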
Do not paste lines from pyproject.toml (like [project.scripts]) into the shell; those belong only in the file.
Connect NotebookLM (browser → storage → httpx)
Architecture: interactive Playwright session once → save storage_state.json → httpx loads cookies and fetches NotebookLM page tokens (SNlM0e, FdrFJe) for future RPC-style calls. Cookie parsing patterns are aligned with notebooklm-py; paths default to ~/.folio-notion/notebooklm/ (override with NOTEBOOKLM_HOME / FOLIO_NOTEBOOKLM_HOME).
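To make the storage-to-HTTP handoff concrete, the sketch below reads cookies out of a Playwright storage-state document and pulls a named token out of page HTML. It uses only the standard library. `cookies_from_storage_state` and `extract_token` are hypothetical helpers, the google.com domain filter is an assumption, and so is the premise that the tokens appear in the page as JSON-style "key":"value" pairs.

```python
import json
import re
from typing import Dict, Optional


def cookies_from_storage_state(raw: str, domain: str = "google.com") -> Dict[str, str]:
    """Map cookie names to values for one domain and its subdomains."""
    state = json.loads(raw)
    cookies = {}
    for cookie in state.get("cookies", []):
        host = cookie.get("domain", "").lstrip(".")
        if host == domain or host.endswith("." + domain):
            cookies[cookie["name"]] = cookie["value"]
    return cookies


def extract_token(html: str, key: str) -> Optional[str]:
    """Find "key":"value" in the page source and return the value."""
    pattern = rf'"{re.escape(key)}"\s*:\s*"([^"]+)"'
    match = re.search(pattern, html)
    return match.group(1) if match else None
```

The resulting dictionary is the kind of value that can be passed as `cookies=` when constructing an `httpx.Client`; the page to request and any extra headers are left out here because they depend on the real integration.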
pip install -e ".[notebooklm]"
playwright install chromium # if not already installed
fn connect notebooklm
# alias:
fn notebooklm connect
- Chromium uses stealth-ish flags (AutomationControlled, no --enable-automation banner); persistent profile under browser_profile/.
- Session file is storage_state.json (mode 600 where supported). Optional: NOTEBOOKLM_AUTH_JSON for CI (same shape as Playwright storage).
- Set FOLIO_PLAYWRIGHT_AUTO_INSTALL=0 to skip automatic playwright install chromium.
- Flags: --storage PATH, --no-verify, --save / --no-save, -C project root (for .env).
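Writing a session file with mode 600 "where supported" might look like the sketch below. `write_private` is a hypothetical helper, not the project's actual function; on platforms without POSIX permissions the mode bits are largely ignored.

```python
import os
from pathlib import Path


def write_private(path: Path, data: str) -> None:
    """Write text to path, restricting access to the owner where possible."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with 0o600 so it is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(data)
    try:
        os.chmod(path, 0o600)  # also tightens a file that already existed
    except OSError:
        pass  # best effort on filesystems that reject chmod
```

Creating the file with restrictive permissions up front, rather than only calling chmod afterwards, avoids a short window in which cookies would be readable by other users.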
Configuration
Paths, prompts, Notion parent page, optional notebook id — configure once via config/ templates and environment variables (see .env.example). After fn connect notion, NOTION_TOKEN and optionally NOTION_PARENT_PAGE_ID can live in .env. After fn connect notebooklm, FOLIO_NOTEBOOKLM_STORAGE can point at your saved storage_state.json.