Condensr

Summarize books chapter by chapter using AI.

Condensr takes a PDF book as input and produces a structured Markdown summary. It detects chapters automatically and summarizes each one individually with Mistral AI, keeping each chapter summary to roughly 500 words at most.

Installation

pip install condensr

Quick Start

Python API

import condensr

# Preview detected chapters (no API calls)
chapters = condensr.get_chapters("book.pdf")
print(chapters)
# ["Introduction", "Chapter 1: Origins", "Chapter 2: Growth"]

# Summarize chapter by chapter
for title, summary in condensr.summarize("book.pdf"):
    print(f"## {title}\n{summary}\n")

CLI

condensr book.pdf
# Writes book-summary.md with progress output

API Reference

condensr.get_chapters(pdf_path, *, toc_level=1)

Detect and return chapter titles from a PDF. Uses heuristic detection only (no API calls). Returns an empty list if no chapters are found.

Parameters:

  • pdf_path (str) — Path to the PDF file.
  • toc_level (int) — Maximum TOC depth to include. Default: 1 (main chapters only). Use 2 to include sub-sections.

Returns: list[str] — Chapter titles.
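
As an illustration of how the preview might be used before spending any API calls, here is a minimal sketch. `describe_chapters` is a hypothetical helper, not part of Condensr; it only formats the list that `get_chapters` is documented to return and handles the empty-list case.

```python
def describe_chapters(titles):
    """Format a chapter preview; `titles` is a list of chapter title strings."""
    if not titles:
        # An empty list means no chapters were detected.
        return "No chapters detected."
    # Number the titles so the preview is easy to scan.
    lines = [f"{i}. {title}" for i, title in enumerate(titles, start=1)]
    return "\n".join(lines)


# Stand-in data shaped like the Quick Start output.
preview = describe_chapters(["Introduction", "Chapter 1: Origins"])
print(preview)
```

With Condensr installed, the same helper could be fed directly, for example `describe_chapters(condensr.get_chapters("book.pdf", toc_level=2))` to preview sub-sections as well.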

condensr.summarize(pdf_path, *, model="mistral-small-latest", api_key=None, on_chapter=None, toc_level=1)

Summarize a PDF book chapter by chapter. Returns a generator yielding (title, summary_markdown) tuples.

If no chapters are detected, summarizes the entire book as one unit.

Parameters:

  • pdf_path (str) — Path to the PDF file.
  • model (str) — Mistral model name. Default: "mistral-small-latest".
  • api_key (str | None) — Mistral API key. Falls back to MISTRAL_API_KEY env var.
  • on_chapter (callable | None) — Optional callback(title, summary) fired before each yield.
  • toc_level (int) — Maximum TOC depth to include. Default: 1. Use 2 to include sub-sections.

Yields: tuple[str, str] of (chapter_title, summary_markdown).
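
Because the result is a generator of (title, summary) pairs, it can be consumed by any function that accepts an iterable. The sketch below assembles such pairs into a single Markdown string. `render_markdown` is a hypothetical helper, and the `##` heading level simply mirrors the Quick Start loop; it is not a statement about how the CLI lays out its output file.

```python
def render_markdown(pairs):
    """Join (title, summary) pairs into one Markdown document."""
    sections = []
    for title, summary in pairs:
        # One second-level heading per chapter, followed by its summary.
        sections.append(f"## {title}\n\n{summary.strip()}\n")
    return "\n".join(sections)


# Stand-in data; in real use, pass condensr.summarize("book.pdf") instead.
sample = [
    ("Introduction", "Sets the scene."),
    ("Chapter 1: Origins", "Traces the roots."),
]
document = render_markdown(sample)
print(document)
```

Passing the generator itself keeps the work lazy: each chapter is summarized only when the loop reaches it.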

Callback Example

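import condensr

# save_to_db and display are placeholders for your own functions.
def on_chapter(title, summary):
    save_to_db(title, summary)

# The callback fires before each (title, summary) pair is yielded.
for title, summary in condensr.summarize("book.pdf", on_chapter=on_chapter):
    display(title, summary)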
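
One way the `save_to_db` placeholder could be filled in is with the standard-library `sqlite3` module. This is only a sketch: the `summaries` table, its columns, and the `make_sqlite_callback` factory are assumptions made for illustration, not part of Condensr.

```python
import sqlite3


def make_sqlite_callback(conn):
    """Return an on_chapter-style callback that stores each summary in SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS summaries (title TEXT, summary TEXT)"
    )

    def on_chapter(title, summary):
        # Parameterized insert; committing per chapter keeps partial runs.
        conn.execute("INSERT INTO summaries VALUES (?, ?)", (title, summary))
        conn.commit()

    return on_chapter


conn = sqlite3.connect(":memory:")  # use a file path to persist between runs
on_chapter = make_sqlite_callback(conn)
on_chapter("Introduction", "Sets the scene.")  # simulates one callback call
rows = conn.execute("SELECT title, summary FROM summaries").fetchall()
print(rows)
```

The returned function has the documented `callback(title, summary)` shape, so it could be passed as `condensr.summarize("book.pdf", on_chapter=on_chapter)`.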

CLI Reference

Usage: condensr [OPTIONS] PDF_PATH

  Summarize a PDF book chapter by chapter.

Options:
  -o, --output PATH    Output file path. Default: <book>-summary.md
  -m, --model TEXT     Mistral model name.
  -t, --toc-level INT  Chapter detection depth: 1 for main chapters only,
                       2 to include sub-sections. [default: 1]
  --help               Show this message and exit.
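
For scripting, the documented options can be assembled into an argument list from Python. The sketch below uses only the flags listed above; `build_command` is a hypothetical helper, and options left as `None` are omitted so the CLI's own defaults apply.

```python
import shlex


def build_command(pdf_path, output=None, model=None, toc_level=None):
    """Build an argv list for the condensr CLI from the documented options."""
    argv = ["condensr"]
    if output is not None:
        argv += ["--output", str(output)]
    if model is not None:
        argv += ["--model", model]
    if toc_level is not None:
        argv += ["--toc-level", str(toc_level)]
    # PDF_PATH is the single positional argument.
    argv.append(str(pdf_path))
    return argv


argv = build_command("my book.pdf", output="notes.md", toc_level=2)
print(shlex.join(argv))  # shell-quoted form, handy for logging
```

With Condensr installed, the list could be handed to `subprocess.run(argv, check=True)`; passing a list rather than a shell string avoids quoting problems with file names that contain spaces.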

Configuration

Set your Mistral API key as an environment variable:

export MISTRAL_API_KEY="your-api-key"

Or pass it programmatically:

for title, summary in condensr.summarize("book.pdf", api_key="your-api-key"):
    ...
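
The precedence described above (an explicit `api_key` first, then the `MISTRAL_API_KEY` environment variable) can be sketched as follows. This is not Condensr's internal code: `resolve_api_key` is a hypothetical helper, and raising an error when no key is found is an assumption made here so that a missing key is caught before any PDF is processed.

```python
import os


def resolve_api_key(explicit=None, env=os.environ):
    """Return the explicit key if given, else MISTRAL_API_KEY from `env`."""
    key = explicit or env.get("MISTRAL_API_KEY")
    if not key:
        # Assumed behavior for this sketch: fail early with a clear message.
        raise RuntimeError(
            "No Mistral API key: pass api_key or set MISTRAL_API_KEY."
        )
    return key


# A plain dict stands in for the environment to keep the example self-contained.
print(resolve_api_key(env={"MISTRAL_API_KEY": "from-env"}))
print(resolve_api_key("explicit-key", env={"MISTRAL_API_KEY": "from-env"}))
```

The resolved value could then be passed on as `condensr.summarize("book.pdf", api_key=resolve_api_key())`.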

Privacy Notice

Condensr sends the text content of your PDF to Mistral AI's API servers for summarization. Do not use Condensr with confidential or sensitive documents unless you are comfortable with this data being transmitted to a third-party service. Review Mistral's privacy policy for details on how your data is handled.

License

AGPL-3.0-or-later

Release Process

This project uses Commitizen for automated versioning and twine for local PyPI publishing.

How to release a new version

  1. Ensure all your changes are committed on the main branch.
  2. Run the bump command to update CHANGELOG.md and create a new Git tag:
    cz bump
    
  3. Build and upload the package:
    rm -rf dist/ build/
    python -m build
    twine upload dist/*
    
  4. Push the changes and new tag to GitLab:
    git push origin main --tags
    

For detailed instructions, see docs/RELEASE_PROCESS.md.
