A smart PDF splitter that uses AI to extract chapters.

Project description

Folix ✂️

A smart, AI-powered PDF splitter.

Folix is a CLI tool designed to split large PDF textbooks and documents into clean, individual chapter files. Unlike standard splitters that blindly cut pages, Folix uses Mistral AI to parse the Table of Contents, automatically calculate page offsets, and handle complex layouts (like double‑column indices) with ease.

🚀 Features

📚 Smart Chapter Extraction: Automatically detects chapters using native PDF bookmarks (ToC).
🤖 AI‑Powered Fallback: If bookmarks are missing, Folix reads the visual Table of Contents page and uses Mistral AI to identify chapters.
🧠 Intelligent Offset Calculation: Automatically aligns printed page numbers with the physical PDF structure.
👁️ Physical Layout Analysis: Correctly parses multi‑column Tables of Contents that confuse standard PDF tools.
🔍 Interactive Inspection: Visualizes the document structure so you can choose exactly which hierarchy level (Part, Chapter, Section) to extract.
🛠️ Zero‑Config CLI: Simple commands for extracting, merging, and inspecting PDFs.

📦 Installation

Option A: Install via PyPI (Recommended)

pip install folix

Option B: Install from Source

git clone https://github.com/yourusername/folix.git
cd folix
pip install .

🔑 Setup (AI Features)

Folix works out‑of‑the‑box for PDFs that include standard bookmarks. For scanned books or files without metadata, you’ll need a free Mistral AI API key to enable automatic chapter detection.

1. Get an API Key

2. Set the Environment Variable

Mac / Linux

export MISTRAL_API_KEY="your_actual_key_here"

Windows (PowerShell)

$env:MISTRAL_API_KEY="your_actual_key_here"
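
To confirm that the key is actually visible to Python before relying on the AI fallback, a small check like the one below can help. This is a minimal sketch that assumes only the variable name MISTRAL_API_KEY shown above; the helper `read_mistral_key` is hypothetical and not part of Folix.

```python
import os


def read_mistral_key(env=None):
    """Return the Mistral API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of Folix's API.
    """
    env = os.environ if env is None else env
    key = env.get("MISTRAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "MISTRAL_API_KEY is not set; export it before using AI features."
        )
    return key


if __name__ == "__main__":
    try:
        read_mistral_key()
        print("MISTRAL_API_KEY is set.")
    except RuntimeError as err:
        print(err)
```

Note that a variable set with export or $env: only lasts for the current shell session.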

📖 Usage

1. Extract Chapters

The primary command. Folix first attempts bookmark‑based extraction; if none are found, it automatically falls back to AI detection.

folix extract <file_name>

Options:

--level 1 → Extract top‑level items (e.g. Parts)
--level 2 → Extract chapters
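
The bookmark-first, AI-second behaviour described above is an ordered fallback chain. The sketch below shows that pattern only; it is not Folix's code. The detector functions, their names, and the (title, start_page) shape are assumptions made for illustration, and the stand-in detectors return canned values instead of reading a PDF or calling Mistral AI.

```python
from typing import Callable, List, Optional, Tuple

# (title, start_page) pairs; this shape is an assumption for the sketch.
Chapters = List[Tuple[str, int]]
Detector = Callable[[str], Optional[Chapters]]


def detect_chapters(pdf_path: str, detectors: List[Tuple[str, Detector]]):
    """Try each detector in order and return the first non-empty result."""
    for name, detect in detectors:
        chapters = detect(pdf_path)
        if chapters:
            return name, chapters
    raise LookupError(f"No chapters found in {pdf_path}")


# Stand-in detectors; real ones would read bookmarks or call the AI service.
def from_bookmarks(pdf_path: str) -> Optional[Chapters]:
    return None  # pretend the PDF has no bookmarks


def from_ai(pdf_path: str) -> Optional[Chapters]:
    return [("1. Introduction", 18), ("2. The Basics", 41)]


source, chapters = detect_chapters(
    "book.pdf", [("bookmarks", from_bookmarks), ("ai", from_ai)]
)
print(source, chapters)
```

Keeping the detectors in an ordered list makes the cheap, offline method run first, so the AI call only happens when bookmarks yield nothing.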

2. Interactive Mode

If you’re unsure how the document is structured, run extraction normally and Folix will guide you.

folix extract <file_name>

Example Output:

📘  Analyzing structure of: complex_book.pdf
--------------------------------------------------------------------------------
Lvl  | Count  | Samples (First 3 items)
--------------------------------------------------------------------------------
1    | 5      | Part I, Part II, Part III...
2    | 32     | 1. Introduction, 2. The Basics, 3. Advanced Topics...
--------------------------------------------------------------------------------

Select a Level to extract (or 'q' to quit):
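
A table like the one above can be derived by grouping outline entries by depth. The following is a rough sketch of that idea, assuming the outline is already available as (level, title) pairs; the function names and the sample data are illustrative and are not taken from Folix.

```python
from collections import defaultdict


def summarize_levels(entries, sample_size=3):
    """Group (level, title) outline entries by level.

    Returns {level: (count, first_titles)} ordered by level.
    """
    by_level = defaultdict(list)
    for level, title in entries:
        by_level[level].append(title)
    return {
        level: (len(titles), titles[:sample_size])
        for level, titles in sorted(by_level.items())
    }


def format_table(summary):
    """Render the summary as a simple fixed-width text table."""
    lines = ["Lvl  | Count  | Samples"]
    for level, (count, samples) in summary.items():
        lines.append(f"{level:<4} | {count:<6} | {', '.join(samples)}...")
    return "\n".join(lines)


# Illustrative outline; a real one would come from the PDF's bookmarks.
outline = [
    (1, "Part I"),
    (2, "1. Introduction"),
    (2, "2. The Basics"),
    (1, "Part II"),
    (2, "3. Advanced Topics"),
]
summary = summarize_levels(outline)
print(format_table(summary))
```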

3. Merge PDFs

Combine multiple PDFs into a single file.

folix merge <pdf_names> --output <output_file_name>

4. Manual Split

Split a page range manually.

folix split input.pdf --start <start_page> --end <end_page> --output <output_file_name>
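
When scripting around a page-range split, it is worth validating the range before cutting. The sketch below assumes that start and end are 1-based and inclusive; this page does not state how Folix interprets them, so treat that as an assumption. The helper `page_indices` is hypothetical.

```python
def page_indices(start: int, end: int, total_pages: int) -> range:
    """Convert a 1-based, inclusive page range to 0-based page indices.

    Assumes 1-based inclusive input, which is an assumption of this sketch.
    """
    if total_pages < 1:
        raise ValueError("document has no pages")
    if not 1 <= start <= end <= total_pages:
        raise ValueError(
            f"invalid range {start}-{end} for a {total_pages}-page document"
        )
    return range(start - 1, end)


print(list(page_indices(3, 5, 10)))  # → [2, 3, 4]
```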

🛠️ How It Works

Folix uses a three‑stage fallback system to ensure accurate chapter extraction:

1. Metadata Scan: Detects native PDF bookmarks (Table of Contents).
2. AI Analysis: If metadata is missing, Folix locates the visual Contents page, cleans the extracted text to reduce token usage, and sends it to Mistral AI for chapter identification.
3. Visual Anchor & Offset Alignment:
- The AI may say: "Chapter 1 starts on page 1".
- Folix scans the PDF to find where "Chapter 1" physically appears (e.g. page 18).
- A global offset is calculated and applied to all chapters, ensuring precise cuts.
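
The offset step can be illustrated with a short sketch. It assumes the text of each page is already available as a list of strings and that the position of the Contents page is known; the function names and the plain substring match are illustrative choices, not Folix's actual implementation.

```python
def find_anchor_page(page_texts, title, start_at=0):
    """Return the 0-based index of the first page whose text contains title."""
    needle = title.casefold()
    for index in range(start_at, len(page_texts)):
        if needle in page_texts[index].casefold():
            return index
    return None


def apply_global_offset(chapters, page_texts, toc_page_index):
    """Shift printed page numbers to physical (1-based) PDF pages.

    chapters: list of (title, printed_page) as read from the Contents page.
    The search starts after the Contents page so the ToC entry itself
    is not mistaken for the chapter opening.
    """
    title, printed = chapters[0]
    anchor = find_anchor_page(page_texts, title, start_at=toc_page_index + 1)
    if anchor is None:
        raise LookupError(f"could not locate {title!r} in the document")
    offset = (anchor + 1) - printed  # physical page minus printed page
    return offset, [(t, p + offset) for t, p in chapters]


# Toy document: cover, Contents page, 15 pages of front matter, then Chapter 1.
page_texts = (
    ["Cover", "Contents: Chapter 1 ... 1, Chapter 2 ... 25"]
    + ["front matter"] * 15
    + ["Chapter 1: Introduction"]
    + ["body"] * 30
)
offset, physical = apply_global_offset(
    [("Chapter 1", 1), ("Chapter 2", 25)], page_texts, toc_page_index=1
)
print(offset, physical)
```

In this toy document "Chapter 1" is printed as page 1 but sits on physical page 18, so the offset is 17 and "Chapter 2" moves from printed page 25 to physical page 42. A plain substring match is naive ("Chapter 1" also matches "Chapter 10"), so a real tool would need stricter matching.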

🤝 Contributing

Contributions are welcome!

Fork the repository

Create your feature branch:

git checkout -b feature/amazing-feature

Commit your changes:

git commit -m "Add some amazing feature"

Push to the branch:

git push origin feature/amazing-feature

Open a Pull Request

📄 License

Distributed under the MIT License. See the LICENSE file for details.

Release history

1.0.1 (Jan 9, 2026)
1.0.0 (Jan 9, 2026), this version

Download files

Source Distribution

folix-1.0.0.tar.gz (9.7 kB)

Uploaded Jan 9, 2026 Source

Built Distribution

folix-1.0.0-py3-none-any.whl (9.1 kB)

Uploaded Jan 9, 2026 Python 3

File details

Details for the file folix-1.0.0.tar.gz.

File metadata

Download URL: folix-1.0.0.tar.gz
Upload date: Jan 9, 2026
Size: 9.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for folix-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`173527e6d09cbbd5b5d118a0d8f1393155b227c36d3ef6f6b899a3c360400647`
MD5	`213555c24f5790496152690036d5b044`
BLAKE2b-256	`f08a31a0346862e5beb6d151c4b8dc60dd78cc151a211e85131ad2805120fa53`

File details

Details for the file folix-1.0.0-py3-none-any.whl.

File metadata

Download URL: folix-1.0.0-py3-none-any.whl
Upload date: Jan 9, 2026
Size: 9.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for folix-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`830c739b77ea7b2e67e2d81616328ae7f4dbc9e3173b801e3f02e5a91476b687`
MD5	`95911724ea8d3cb8ea459ba20d1a0645`
BLAKE2b-256	`f6578e13afdd70a19aaf1a5b30ded73fedf0d4684b588a4f1e6e06b3ee7e4683`
