Summarize books chapter by chapter using AI
Condensr
Condensr takes a PDF book as input and produces a structured Markdown summary. It detects chapters automatically and summarizes each one individually (max ~500 words per chapter) using Mistral AI.
Installation
pip install condensr
Quick Start
Python API
import condensr
# Preview detected chapters (no API calls)
chapters = condensr.get_chapters("book.pdf")
print(chapters)
# ["Introduction", "Chapter 1: Origins", "Chapter 2: Growth"]
# Summarize chapter by chapter
for title, summary in condensr.summarize("book.pdf"):
    print(f"## {title}\n{summary}\n")
CLI
condensr book.pdf
# Writes book-summary.md with progress output
API Reference
condensr.get_chapters(pdf_path, *, toc_level=1)
Detect and return chapter titles from a PDF. Uses heuristic detection only (no API calls). Returns an empty list if no chapters are found.
Parameters:
- pdf_path (str): Path to the PDF file.
- toc_level (int): Maximum TOC depth to include. Default: 1 (main chapters only). Use 2 to include sub-sections.
Returns: list[str] — Chapter titles.
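Because get_chapters makes no API calls, it is a cheap way to check what a run would cover before summarizing. The following is a minimal sketch, assuming a hypothetical helper named format_chapter_preview (not part of Condensr), that renders the returned list as a numbered outline and handles the empty-list case described above:

```python
def format_chapter_preview(chapters: list[str]) -> str:
    """Render chapter titles as a numbered outline.

    format_chapter_preview is an illustrative helper, not part of Condensr.
    """
    if not chapters:
        # get_chapters returns an empty list when no chapters are found.
        return "No chapters detected."
    return "\n".join(f"{i}. {title}" for i, title in enumerate(chapters, start=1))


# Stand-in data; with Condensr installed this would be
# condensr.get_chapters("book.pdf", toc_level=2).
print(format_chapter_preview(["Introduction", "Chapter 1: Origins"]))
```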
condensr.summarize(pdf_path, *, model="mistral-small-latest", api_key=None, on_chapter=None, toc_level=1)
Summarize a PDF book chapter by chapter. Returns a generator yielding (title, summary_markdown) tuples.
If no chapters are detected, summarizes the entire book as one unit.
Parameters:
- pdf_path (str): Path to the PDF file.
- model (str): Mistral model name. Default: "mistral-small-latest".
- api_key (str | None): Mistral API key. Falls back to the MISTRAL_API_KEY environment variable.
- on_chapter (callable | None): Optional callback(title, summary) fired before each yield.
- toc_level (int): Maximum TOC depth to include. Default: 1. Use 2 to include sub-sections.
Yields: tuple[str, str] — (chapter_title, summary_markdown).
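Since summarize is a generator, each chapter can be written out as soon as it arrives instead of waiting for the whole book. The sketch below assumes a hypothetical helper, write_markdown (not part of Condensr), that accepts any iterable of (title, summary) pairs and a text stream, so it can be tried without an API key:

```python
import io
from typing import Iterable, TextIO, Tuple


def write_markdown(pairs: Iterable[Tuple[str, str]], out: TextIO) -> int:
    """Write (title, summary) pairs as Markdown sections; return the count.

    write_markdown is an illustrative helper, not part of Condensr.
    """
    count = 0
    for title, summary in pairs:
        out.write(f"## {title}\n\n{summary}\n\n")
        out.flush()  # finished chapters survive if a later one fails
        count += 1
    return count


# Stand-in pairs; with Condensr installed, pass condensr.summarize("book.pdf")
# and an open file instead of the in-memory buffer.
buffer = io.StringIO()
written = write_markdown([("Introduction", "Sets the scene.")], buffer)
print(written, "chapter(s) written")
```

Passing a stream rather than a path keeps the helper easy to test and lets the caller decide where the output goes.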
Callback Example
def on_chapter(title, summary):
    save_to_db(title, summary)

for title, summary in condensr.summarize("book.pdf", on_chapter=on_chapter):
    display(title, summary)
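The callback can be any callable that takes (title, summary). As a sketch, the hypothetical ChapterLog class below (not part of Condensr) stands in for save_to_db by recording each chapter's title and summary word count in memory:

```python
class ChapterLog:
    """Callable that records (title, word_count) for each chapter.

    ChapterLog is an illustrative stand-in for save_to_db, not part of Condensr.
    """

    def __init__(self) -> None:
        self.entries: list[tuple[str, int]] = []

    def __call__(self, title: str, summary: str) -> None:
        # Whitespace-separated tokens are used as a rough word count.
        self.entries.append((title, len(summary.split())))


log = ChapterLog()
# With Condensr installed: condensr.summarize("book.pdf", on_chapter=log)
log("Introduction", "Sets the scene for the book.")
print(log.entries)
```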
CLI Reference
Usage: condensr [OPTIONS] PDF_PATH
Summarize a PDF book chapter by chapter.
Options:
-o, --output PATH Output file path. Default: <book>-summary.md
-m, --model TEXT Mistral model name.
-t, --toc-level INT Chapter detection depth: 1 for main chapters only,
2 to include sub-sections. [default: 1]
--help Show this message and exit.
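When using the Python API instead of the CLI, the documented default name, <book>-summary.md, can be reproduced with pathlib. This is only a sketch: default_output_path is a hypothetical helper, and placing the file next to the input PDF is an assumption, since only the file name pattern is documented.

```python
from pathlib import Path


def default_output_path(pdf_path: str) -> Path:
    """Return <book>-summary.md alongside the input PDF.

    Illustrative helper, not part of Condensr; the directory choice is an
    assumption, since only the file name pattern is documented.
    """
    pdf = Path(pdf_path)
    # stem drops only the final suffix, so "my.book.pdf" becomes "my.book".
    return pdf.with_name(f"{pdf.stem}-summary.md")


print(default_output_path("book.pdf"))  # book-summary.md
```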
Configuration
Set your Mistral API key as an environment variable:
export MISTRAL_API_KEY="your-api-key"
Or pass it programmatically:
for title, summary in condensr.summarize("book.pdf", api_key="your-api-key"):
    ...
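To fail early with a clear message when no key is configured, the lookup order described above (explicit argument first, then MISTRAL_API_KEY) can be mirrored in a small helper. This is a sketch under that assumption; resolve_api_key is a hypothetical name and not part of Condensr.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Return the explicit key if given, else MISTRAL_API_KEY; raise if neither.

    Illustrative helper, not part of Condensr.
    """
    key = explicit or os.environ.get("MISTRAL_API_KEY")
    if not key:
        raise RuntimeError("Set MISTRAL_API_KEY or pass api_key explicitly.")
    return key


# With Condensr installed:
#   condensr.summarize("book.pdf", api_key=resolve_api_key())
```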
Privacy Notice
Condensr sends the text content of your PDF to Mistral AI's API servers for summarization. Do not use Condensr with confidential or sensitive documents unless you are comfortable with this data being transmitted to a third-party service. Review Mistral's privacy policy for details on how your data is handled.
License
AGPL-3.0-or-later
Release Process
This project uses Commitizen for automated versioning and twine for local PyPI publishing.
How to release a new version
- Ensure all your changes are committed on the main branch.
- Run the bump command to update CHANGELOG.md and create a new Git tag: cz bump
- Build and upload the package:
  rm -rf dist/ build/
  python -m build
  twine upload dist/*
- Push the changes and new tag to GitLab:
  git push origin main --tags
For detailed instructions, see docs/RELEASE_PROCESS.md.
File details
Details for the file condensr-0.3.0.tar.gz.
File metadata
- Download URL: condensr-0.3.0.tar.gz
- Upload date:
- Size: 237.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5ea8296c75384b5c8722c09649662d9c134954e7c5f88720ca38731574657f16 |
| MD5 | 5d4691886108ab297ee0bf85a8fffcd2 |
| BLAKE2b-256 | 5e788f08fdec3cd1b57691677b51661f684d7f5b5601b3098606279f3662073e |
File details
Details for the file condensr-0.3.0-py3-none-any.whl.
File metadata
- Download URL: condensr-0.3.0-py3-none-any.whl
- Upload date:
- Size: 22.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 72d1c02d3db97488decdcd37de4bcf9ec19ec4dbb215bbf1713318530599ca52 |
| MD5 | 14d3d0d3e9ed2fb6dcdc1f4d45408298 |
| BLAKE2b-256 | a91d5cc2de381e1854586b6f7c0df47013371950134204d4c2d110064eb81bd7 |