Export O'Reilly books as high-quality PDF via headless Chrome
📚 oreilly2pdf
Download any book from O'Reilly Learning as a single, high-quality PDF.
All images, cross-chapter links, table of contents, and index entries just work — exactly as you'd expect from a real book.
⚠️ Requires an active O'Reilly Learning subscription.
⚡ Quick Start
pip install oreilly2pdf
oreilly2pdf 9781098150952 --cookie-file cookies.json
That's it. You'll get a 9781098150952.pdf with all chapters merged into one file.
🔧 Installation
From PyPI
pip install oreilly2pdf
From source
git clone https://github.com/cruzlorite/oreilly2pdf.git
cd oreilly2pdf
pip install .
Requirements
- Python 3.10+
- Google Chrome (or Chromium)
- ChromeDriver — installed automatically by Selenium 4.20+
🍪 Getting Your Cookies
You need to provide your O'Reilly session cookies so the tool can access your account. There are three easy ways to get them:
Way 1 — DevTools Console (fastest)
- Log in to learning.oreilly.com in Chrome.
- Open DevTools (F12) → go to the Console tab.
- Paste this and press Enter:
copy(JSON.stringify(Object.fromEntries(document.cookie.split('; ').map(c => { const i = c.indexOf('='); return [c.slice(0, i), c.slice(i + 1)]; }))))
- Your cookies are now in the clipboard. Save them to a file:
pbpaste > cookies.json # macOS
xclip -selection clipboard -o > cookies.json # Linux
Way 2 — Cookie-Editor extension
- Install Cookie-Editor in your browser.
- Go to learning.oreilly.com and log in.
- Click the Cookie-Editor icon → Export → JSON.
- Paste into cookies.json and reformat as {"name": "value"} pairs.
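The reformatting step can be scripted. The sketch below assumes the Cookie-Editor JSON export is a list of objects that each carry "name" and "value" fields; the helper name flatten_cookie_export is hypothetical and is not part of oreilly2pdf.

```python
import json


def flatten_cookie_export(exported_json: str) -> dict[str, str]:
    """Turn a list of cookie objects into flat {"name": "value"} pairs.

    Assumption: each exported entry has "name" and "value" keys.
    Other keys (domain, path, expiry, ...) are ignored.
    """
    entries = json.loads(exported_json)
    return {entry["name"]: entry["value"] for entry in entries}


if __name__ == "__main__":
    sample = '[{"name": "orm-jwt", "value": "abc", "domain": ".oreilly.com"}]'
    # Write the result in the shape cookies.json is expected to have.
    print(json.dumps(flatten_cookie_export(sample), indent=2))
```

Saving the printed output as cookies.json should give a file in the same shape as the manual example in Way 3.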
Way 3 — Manual
- Open DevTools (F12) → Application tab → Cookies → https://learning.oreilly.com.
- Create a cookies.json with the relevant cookie values:
{
"BrowserCookie": "...",
"orm-jwt": "...",
"orm-rt": "...",
"groot_sessionid": "..."
}
Note: The most important cookies are typically orm-jwt and groot_sessionid. If export fails, try adding more cookies from your browser.
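Before running an export, it can help to confirm that a cookies file contains the cookies named in the note. This is a minimal sketch, not something oreilly2pdf does itself: the helper missing_cookies is hypothetical, and the list of required names is taken only from the note above.

```python
import json

# Assumption: these are the cookies the note calls "most important".
IMPORTANT_COOKIES = ("orm-jwt", "groot_sessionid")


def missing_cookies(cookies_json: str, required=IMPORTANT_COOKIES) -> list[str]:
    """Return the required cookie names that are absent or empty."""
    cookies = json.loads(cookies_json)
    return [name for name in required if not cookies.get(name)]


if __name__ == "__main__":
    text = '{"orm-jwt": "abc", "BrowserCookie": "123"}'
    print(missing_cookies(text))
```

An empty list means both cookies are present; any names returned are the ones to copy over from the browser.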
📖 Finding the Book ID
Open any book on O'Reilly and look at the URL — the book ID is the ISBN number:
https://learning.oreilly.com/library/view/book-title/9781098150952/
                                                     ^^^^^^^^^^^^^
                                                     book_id
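If you have many URLs, the ID can be pulled out programmatically. The sketch below assumes the /library/view/<title>/<book_id>/ path layout shown above; book_id_from_url is a hypothetical helper, not a function provided by oreilly2pdf.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def book_id_from_url(url: str) -> Optional[str]:
    """Return the path segment that follows /library/view/<title>/, if any."""
    path = urlparse(url).path
    match = re.search(r"/library/view/[^/]+/([^/]+)", path)
    return match.group(1) if match else None


if __name__ == "__main__":
    url = "https://learning.oreilly.com/library/view/book-title/9781098150952/"
    print(book_id_from_url(url))
```

URLs that do not follow this layout return None rather than a guess.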
🚀 Usage
# Basic usage
oreilly2pdf <book_id> --cookie-file cookies.json
# Custom output filename
oreilly2pdf 9781098150952 --cookie-file cookies.json -o my_book.pdf
# Inline cookies instead of a file
oreilly2pdf 9781098150952 --cookies "orm-jwt=eyJ...; groot_sessionid=xyz"
# Keep individual chapter PDFs alongside the merged output
oreilly2pdf 9781098150952 --cookie-file cookies.json --keep-chapters
All Options
| Option | Description |
|---|---|
| book_id | O'Reilly book identifier (ISBN), required |
| --cookie-file FILE | Path to a cookies file (JSON or plain text) |
| --cookies STRING | Inline cookies (key=value; key2=value2) |
| -o, --output FILE | Output path (default: <book_id>.pdf) |
| --keep-chapters | Save individual chapter PDFs too |
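The inline format accepted by --cookies is the familiar key=value; key2=value2 cookie string. As an illustration of how such a string maps to the JSON form, here is a small sketch; parse_cookie_string is a hypothetical helper and may differ from how oreilly2pdf parses the option internally.

```python
def parse_cookie_string(raw: str) -> dict[str, str]:
    """Split "k=v; k2=v2" into a dict, keeping any "=" inside values."""
    cookies: dict[str, str] = {}
    for part in raw.split(";"):
        part = part.strip()
        if "=" not in part:
            continue  # skip empty or malformed fragments
        # partition splits on the first "=" only, so values such as
        # base64 padding ("abc==") survive intact.
        name, _, value = part.partition("=")
        cookies[name.strip()] = value.strip()
    return cookies


if __name__ == "__main__":
    print(parse_cookie_string("orm-jwt=eyJ...; groot_sessionid=xyz"))
```

Splitting on the first "=" only matters because token values frequently contain "=" themselves.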
✨ Features
| 📄 Full book | Cover, TOC, all chapters, appendices, index — everything |
| 🖼️ Images | Lazy-loaded and dynamic images fully resolved |
| 🔗 Cross-chapter links | "See Section 4.3" actually jumps to Section 4.3 |
| 🧹 Clean output | No navigation bars, cookie banners, or popups |
| 🎨 Faithful rendering | Math, code blocks, tables, figures — pixel-perfect |
🔍 How It Works
- Fetches the book's table of contents from the O'Reilly API.
- Opens each chapter in headless Chrome with your session cookies.
- Waits for all images (including lazy-loaded ones) to fully render.
- Strips the O'Reilly UI — keeps only the book content.
- Prints each chapter to PDF via Chrome DevTools Protocol.
- Merges everything into a single PDF and rewrites cross-chapter links so they work as clickable in-document jumps.
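The print step in the list above can be sketched as follows. This is an illustration under assumptions, not the project's actual code: print_page_to_pdf is a hypothetical helper, and driver is assumed to be a Selenium Chrome WebDriver (or any object exposing execute_cdp_cmd) with the chapter already loaded. Page.printToPDF is a Chrome DevTools Protocol command that returns the PDF as a base64 string under the "data" key.

```python
import base64
from pathlib import Path


def print_page_to_pdf(driver, out_path: str) -> int:
    """Print the page loaded in `driver` to `out_path`; return bytes written."""
    # Ask Chrome to render the current page as a PDF. The options here
    # are illustrative choices, not necessarily what oreilly2pdf uses.
    result = driver.execute_cdp_cmd(
        "Page.printToPDF",
        {"printBackground": True, "preferCSSPageSize": True},
    )
    # The protocol returns the document base64-encoded.
    pdf_bytes = base64.b64decode(result["data"])
    return Path(out_path).write_bytes(pdf_bytes)
```

With a real Selenium session, this would be called once per chapter after the page has finished loading, producing the per-chapter files that are merged in the final step.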
🙏 Acknowledgements
Inspired by oreilly-epub-downloader by @tctibbs.
📄 License