Export O'Reilly books as high-quality PDF via headless Chrome

📚 oreilly2pdf

Download any book from O'Reilly Learning as a single, high-quality PDF — perfect for offline reading.

All images, cross-chapter links, table of contents, and index entries just work — exactly as you'd expect from a real book.

⚠️ Requires an active O'Reilly Learning subscription.


⚡ Quick Start

pip install oreilly2pdf
oreilly2pdf 9781098150952 --cookie-file cookies.json

That's it. You'll get a 9781098150952.pdf with all chapters merged into one file.


🔧 Installation

From PyPI

pip install oreilly2pdf

From source

git clone https://github.com/cruzlorite/oreilly2pdf.git
cd oreilly2pdf
pip install .

Requirements

  • Python 3.10+
  • Google Chrome (or Chromium)
  • ChromeDriver — installed automatically by Selenium 4.20+

🍪 Getting Your Cookies

You need to provide your O'Reilly session cookies so the tool can access your account. There are three easy ways to get them:

Way 1 — DevTools Console (fastest)

  1. Log in to learning.oreilly.com in Chrome.
  2. Open DevTools (F12) → go to the Console tab.
  3. Paste this and press Enter:
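copy(JSON.stringify(Object.fromEntries(document.cookie.split('; ').map(c => { const i = c.indexOf('='); return [c.slice(0, i), c.slice(i + 1)]; }))))
  4. Your cookies are now in the clipboard. Save them to a file:
pbpaste > cookies.json                        # macOS
xclip -selection clipboard -o > cookies.json  # Linux

Note: document.cookie does not expose cookies marked HttpOnly. If a cookie you need is missing from the result, use Way 2 or Way 3 instead.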

Way 2 — Cookie-Editor extension

  1. Install Cookie-Editor in your browser.
  2. Go to learning.oreilly.com and log in.
  3. Click the Cookie-Editor icon → Export → JSON.
  4. Paste into cookies.json and reformat as {"name": "value"} pairs.
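
The reformatting in step 4 can be scripted. This is a minimal sketch, assuming the Cookie-Editor export is a JSON array of objects that each carry `name` and `value` fields (check your own export first). The helper name `flatten_cookie_export` is hypothetical and not part of oreilly2pdf.

```python
import json


def flatten_cookie_export(exported: str) -> dict:
    """Turn a JSON array of cookie objects into {"name": "value"} pairs."""
    cookies = json.loads(exported)
    # Assumption: every entry is an object with "name" and "value" keys.
    # Other fields (domain, path, expiry, ...) are dropped.
    return {cookie["name"]: cookie["value"] for cookie in cookies}
```

Feed it the pasted export text and write the result to cookies.json with `json.dump`.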

Way 3 — Manual

  1. Open DevTools (F12) → Application tab → Cookies → https://learning.oreilly.com.
  2. Create a cookies.json with the relevant cookie values:
{
  "BrowserCookie": "...",
  "orm-jwt": "...",
  "orm-rt": "...",
  "groot_sessionid": "..."
}

Note: The most important cookies are typically orm-jwt and groot_sessionid. If export fails, try adding more cookies from your browser.
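
Before running an export, it can help to check that the file actually contains the cookies named in the note. The sketch below assumes cookies.json uses the {"name": "value"} layout shown above; `missing_cookies` is a hypothetical helper, and the list of expected names is only a starting point taken from the note.

```python
import json

# Names taken from the note above; extend the tuple if your export still fails.
EXPECTED_COOKIES = ("orm-jwt", "groot_sessionid")


def missing_cookies(cookie_json: str, expected=EXPECTED_COOKIES) -> list:
    """Return the expected cookie names that are absent or empty."""
    cookies = json.loads(cookie_json)
    return [name for name in expected if not cookies.get(name)]
```

An empty list means every expected cookie is present with a non-empty value.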


📖 Finding the Book ID

Open any book on O'Reilly and look at the URL — the book ID is the ISBN number:

https://learning.oreilly.com/library/view/book-title/9781098150952/
                                                     ^^^^^^^^^^^^^
                                                        book_id
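
If you collect book URLs in bulk, the ID can be pulled out with a regular expression. This sketch assumes the ID is always a 13-digit ISBN that forms its own path segment, as in the URL above; `book_id_from_url` is a hypothetical helper, not something oreilly2pdf provides.

```python
import re
from typing import Optional

# Assumption: the ID is a 13-digit run bounded by "/" on the left and by
# "/", "?", "#", or the end of the URL on the right.
_BOOK_ID = re.compile(r"/(\d{13})(?:[/?#]|$)")


def book_id_from_url(url: str) -> Optional[str]:
    """Return the 13-digit book ID found in the URL, or None."""
    match = _BOOK_ID.search(url)
    return match.group(1) if match else None
```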

🚀 Usage

# Basic usage
oreilly2pdf <book_id> --cookie-file cookies.json

# Custom output filename
oreilly2pdf 9781098150952 --cookie-file cookies.json -o my_book.pdf

# Inline cookies instead of a file
oreilly2pdf 9781098150952 --cookies "orm-jwt=eyJ...; groot_sessionid=xyz"

# Keep individual chapter PDFs alongside the merged output
oreilly2pdf 9781098150952 --cookie-file cookies.json --keep-chapters

All Options

Option               Description
book_id              O'Reilly book identifier (ISBN); required
--cookie-file FILE   Path to a cookies file (JSON or plain text)
--cookies STRING     Inline cookies (key=value; key2=value2)
-o, --output FILE    Output path (default: <book_id>.pdf)
--keep-chapters      Save individual chapter PDFs too
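
To illustrate the inline --cookies format (key=value; key2=value2), here is one way such a string could be parsed. It is a sketch of the format only, not the tool's own parser; `parse_cookie_string` is a hypothetical name.

```python
def parse_cookie_string(raw: str) -> dict:
    """Parse "key=value; key2=value2" into a dict of cookie names to values."""
    cookies = {}
    for pair in raw.split(";"):
        # partition splits on the first "=" only, so values that themselves
        # contain "=" (base64 padding, for example) stay intact.
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies
```

Splitting on the first "=" matters because token-like cookie values often contain "=" characters of their own.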

✨ Features

📄 Full book             Cover, TOC, all chapters, appendices, and index
🖼️ Images                Lazy-loaded and dynamic images fully resolved
🔗 Cross-chapter links   "See Section 4.3" actually jumps to Section 4.3
🧹 Clean output          No navigation bars, cookie banners, or popups
🎨 Faithful rendering    Math, code blocks, tables, and figures appear as Chrome renders them

🔍 How It Works

  1. Fetches the book's table of contents from the O'Reilly API.
  2. Opens each chapter in headless Chrome with your session cookies.
  3. Waits for all images (including lazy-loaded ones) to fully render.
  4. Strips the O'Reilly UI — keeps only the book content.
  5. Prints each chapter to PDF via Chrome DevTools Protocol.
  6. Merges everything into a single PDF and rewrites cross-chapter links so they work as clickable in-document jumps.
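
Step 6 comes down to bookkeeping: once the chapters are concatenated, a link that pointed at another chapter's page has to point at a page number in the merged file. The sketch below shows that mapping under simplifying assumptions. Chapters are identified by hypothetical file names such as ch02.html, and each link resolves to the first page of its target chapter rather than to the exact anchor. The tool's actual implementation may differ.

```python
from itertools import accumulate


def chapter_start_pages(page_counts: list) -> list:
    """Zero-based index of each chapter's first page in the merged PDF."""
    # Running totals give the page after each chapter; shifting them by one
    # position gives the page where each chapter begins.
    return [0, *accumulate(page_counts)][:-1]


def resolve_link(href: str, chapter_files: list, starts: list):
    """Map a chapter href to a merged-PDF page index, or None if unknown."""
    target = href.split("#", 1)[0]  # drop the in-page anchor, if any
    if target in chapter_files:
        return starts[chapter_files.index(target)]
    return None  # external link or unknown target: leave it untouched
```

For chapters of 3, 10, and 5 pages, a link to ch02.html#sec-4-3 resolves to page index 3, the first page of the second chapter.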

⚖️ Disclaimer

This tool is intended for personal, offline use only by users who hold a valid O'Reilly Learning subscription. It accesses content you are already entitled to read through your account.

  • Do not distribute, share, or upload exported PDFs.
  • Do not use this tool to circumvent access controls or pirate content.
  • You are solely responsible for complying with O'Reilly's Terms of Service and applicable copyright law.

The authors of this project are not affiliated with O'Reilly Media and assume no liability for misuse.


🙏 Acknowledgements

Inspired by oreilly-epub-downloader by @tctibbs.

📄 License

MIT

Download files

Source Distribution

oreilly2pdf-0.1.2.tar.gz (16.4 kB view details)

Uploaded Source

Built Distribution

oreilly2pdf-0.1.2-py3-none-any.whl (14.9 kB view details)

Uploaded Python 3

File details

Details for the file oreilly2pdf-0.1.2.tar.gz.

File metadata

  • Download URL: oreilly2pdf-0.1.2.tar.gz
  • Upload date:
  • Size: 16.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for oreilly2pdf-0.1.2.tar.gz
Algorithm     Hash digest
SHA256        c2e4f8bde2eeb2cf2515ce89c79d9da62ce8381eca977f44dd31d624a4574b93
MD5           572aa61aeac59bae6769a319d691be11
BLAKE2b-256   2c039f17b11c2331c3c78d086fb42d7343f58022eb60f3d9e02f997a87f34994
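
A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch; `verify_sha256` is a hypothetical helper, and the path you pass in is wherever you saved the archive.

```python
import hashlib


def verify_sha256(path: str, expected_hex: str, chunk_size: int = 1 << 16) -> bool:
    """Return True if the file's SHA256 digest matches expected_hex."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

Call it with the local file path and the SHA256 value listed for that file.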

Provenance

The following attestation bundles were made for oreilly2pdf-0.1.2.tar.gz:

Publisher: publish.yml on cruzlorite/oreilly2pdf

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oreilly2pdf-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: oreilly2pdf-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 14.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for oreilly2pdf-0.1.2-py3-none-any.whl
Algorithm     Hash digest
SHA256        44e93c4e89abea00eb69edef911762edc372a8f9e2fc267e281a44594fcfd04f
MD5           6e9b2d49b2edf81af8c6a49b93fad8d4
BLAKE2b-256   7f9d1b82e72a6838aec36d42fdf4087c924d139648a9081f56937e9f3d0239c8

Provenance

The following attestation bundles were made for oreilly2pdf-0.1.2-py3-none-any.whl:

Publisher: publish.yml on cruzlorite/oreilly2pdf
