folio

Your consulting portfolio, searchable and AI-ready.

Python 3.10+ | License: Apache 2.0

What It Does

Turn consulting decks into structured, searchable markdown -- with version tracking and optional AI analysis.

Folio converts PPTX, PPT, and PDF presentations into Markdown with YAML frontmatter, slide images, and optional LLM-powered analysis. Every conversion preserves three layers: exact verbatim text, slide images at configurable DPI, and per-slide analysis with evidence grounding.

Folio tracks versions automatically -- re-converting an updated deck increments the version, detects per-slide changes, and preserves history. Open library/ as an Obsidian vault, and Obsidian indexes the frontmatter automatically.

Quick Start

Prerequisites

- Python 3.10+
- LibreOffice or Microsoft PowerPoint (for PPTX/PPT conversion)
- Poppler (for PDF image extraction)

# macOS
brew install --cask libreoffice
brew install poppler

# Ubuntu/Debian
sudo apt install libreoffice poppler-utils

If you're on a managed macOS laptop that blocks LibreOffice, Folio can use Microsoft PowerPoint as the PPTX/PPT renderer. The current PowerPoint path opens decks via Launch Services (open -a "Microsoft PowerPoint" ...) and then exports to PDF. In batch mode, Folio can also restart PowerPoint periodically during long PPTX runs when --dedicated-session is enabled (the default).

For managed-mac usage:

- Run batch jobs from Terminal.app
- Use a dedicated PowerPoint session with no unrelated presentations open
- See docs/guides/managed_mac_workflow.md for the full workflow and PDF fallback guidance

You can force a specific renderer by setting pptx_renderer: powerpoint in folio.yaml. If neither renderer can be used for automated conversion, export the deck to PDF manually from PowerPoint and run folio convert deck.pdf.
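
The renderer choice described above can be sketched in Python. This is a minimal illustration, not Folio's implementation: the function name `choose_renderer`, the LibreOffice-first preference order under `auto`, and the PowerPoint install path are all assumptions.

```python
import shutil
import sys
from pathlib import Path

# Hypothetical default install location for PowerPoint on macOS (assumption).
POWERPOINT_APP = Path("/Applications/Microsoft PowerPoint.app")


def choose_renderer(setting="auto", libreoffice_found=None, powerpoint_found=None):
    """Resolve a pptx_renderer setting to 'libreoffice', 'powerpoint', or None.

    None signals that no renderer is usable, i.e. the manual PDF fallback.
    The two *_found arguments exist so the logic can be exercised without
    either application installed.
    """
    if libreoffice_found is None:
        # LibreOffice ships its CLI as `soffice`; some distros add `libreoffice`.
        libreoffice_found = bool(shutil.which("soffice") or shutil.which("libreoffice"))
    if powerpoint_found is None:
        powerpoint_found = sys.platform == "darwin" and POWERPOINT_APP.exists()

    available = {"libreoffice": libreoffice_found, "powerpoint": powerpoint_found}
    if setting == "auto":
        # Assumed preference order: LibreOffice first, then PowerPoint.
        for name in ("libreoffice", "powerpoint"):
            if available[name]:
                return name
        return None
    if setting not in available:
        raise ValueError(f"unknown pptx_renderer: {setting!r}")
    return setting if available[setting] else None
```

A `None` result corresponds to the case where the deck has to be exported to PDF by hand before running `folio convert deck.pdf`.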

Install

pip install folio-love

The installed CLI command remains folio.

Or install from source:

git clone https://github.com/ohjonathan/folio.love.git
cd folio.love
pip install -e .

Anthropic support is included in the base install. If you want to use OpenAI or Google Gemini, install the optional provider SDKs too:

pip install "folio-love[llm]"

From source:

pip install -e ".[llm]"

First conversion

folio convert deck.pptx

✓ deck.pptx
  24 slides → library/deck/deck.md
  Version: 1 | ID: evidence_20260306_deck

Enable LLM analysis

export ANTHROPIC_API_KEY=sk-ant-...
folio convert deck.pptx --passes 2

Folio now supports Anthropic, OpenAI, and Google Gemini for slide analysis. Configure named profiles in folio.yaml, then either use the default convert route or override it per run with --llm-profile.

llm:
  profiles:
    high_quality_anthropic:
      provider: anthropic
      model: claude-sonnet-4-20250514
      api_key_env: ANTHROPIC_API_KEY

    fast_openai:
      provider: openai
      model: gpt-4o-mini
      api_key_env: OPENAI_API_KEY

    backup_google:
      provider: google
      model: gemini-2.5-pro
      api_key_env: GEMINI_API_KEY

  routing:
    default:
      primary: high_quality_anthropic
      fallbacks: []
    convert:
      primary: high_quality_anthropic
      fallbacks: [backup_google]

# Uses llm.routing.convert
folio convert deck.pptx --passes 2

# Force a specific profile for this run (disables route fallbacks)
folio convert deck.pptx --llm-profile fast_openai

Without a valid provider SDK or API key, analysis is skipped gracefully. The tool still completes conversion and writes provider-aware pending-analysis messages into the markdown output.
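
To make the routing rules concrete, here is a small Python sketch of how a route and an optional `--llm-profile` override could resolve to an ordered list of profiles. The function `resolve_profiles` is hypothetical, and the rule that a missing route falls back to `routing.default` is an assumption rather than documented behavior.

```python
def resolve_profiles(llm_config, route="convert", override=None):
    """Return the ordered list of profile names to try for one run."""
    profiles = llm_config.get("profiles", {})
    if override is not None:
        # --llm-profile: use exactly this profile, with no route fallbacks.
        chain = [override]
    else:
        routing = llm_config.get("routing", {})
        # Assumption: a task without its own route uses routing.default.
        entry = routing.get(route) or routing.get("default") or {}
        chain = [entry["primary"]] if "primary" in entry else []
        chain += entry.get("fallbacks", [])
    unknown = [name for name in chain if name not in profiles]
    if unknown:
        raise KeyError(f"undefined LLM profile(s): {unknown}")
    return chain


config = {
    "profiles": {
        "high_quality_anthropic": {"provider": "anthropic"},
        "fast_openai": {"provider": "openai"},
        "backup_google": {"provider": "google"},
    },
    "routing": {
        "default": {"primary": "high_quality_anthropic", "fallbacks": []},
        "convert": {"primary": "high_quality_anthropic", "fallbacks": ["backup_google"]},
    },
}

print(resolve_profiles(config))                          # ['high_quality_anthropic', 'backup_google']
print(resolve_profiles(config, override="fast_openai"))  # ['fast_openai']
```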

Commands

`folio convert`

Convert a single deck to Folio markdown.

# Basic
folio convert deck.pptx

# With client and engagement metadata
folio convert deck.pptx --client Acme --engagement "DD Q1 2026"

# Deep analysis (two-pass, selective re-analysis of dense slides)
folio convert deck.pptx --passes 2

# Force fresh analysis, ignore cache
folio convert deck.pptx --no-cache

# Full metadata
folio convert deck.pptx \
  --client Acme \
  --engagement "DD Q1 2026" \
  --subtype research \
  --industry "retail,ecommerce" \
  --tags "market-sizing,tam" \
  --note "Updated risk figures"

Flags

| Flag | Description |
| --- | --- |
| `--client` | Client name (used in output path and frontmatter) |
| `--engagement` | Engagement identifier |
| `--note`, `-n` | Version note (e.g. "Updated per client feedback") |
| `--target`, `-t` | Override output directory |
| `--passes`, `-p` | Analysis depth: `1` = standard, `2` = deep (selective second pass on dense slides) |
| `--no-cache` | Force re-analysis; fresh results replace cached entries |
| `--subtype` | Evidence subtype: `research`, `data_extract`, `external_report`, `benchmark` |
| `--industry` | Industry tags, comma-separated |
| `--tags` | Manual tags to merge with auto-generated, comma-separated |
| `--llm-profile` | Override the configured LLM profile for this command |

`folio batch`

Batch convert all matching files in a directory.

# Automated PPTX conversion
folio batch ./materials --client Acme

# PDF mitigation workflow (not Tier 1)
folio batch ./pdfs --pattern "*.pdf" --client Acme

# Skip restart automation if other presentations are open in PowerPoint
folio batch ./materials --no-dedicated-session

Converting 3 files...

✓ overview.pptx (18 slides, 4.1s)
✓ financials.pptx (32 slides, 7.8s)
✓ appendix.pptx (12 slides, 2.9s)

Automated PPTX: 3 succeeded, 0 failed

Accepts the same flags as convert (--client, --engagement, --passes, --llm-profile, etc.). Default pattern is *.pptx.

batch also supports --dedicated-session/--no-dedicated-session for the PowerPoint restart workflow on managed macOS. Operator-exported PDF batches are supported, but they are mitigation-only and do not count toward Tier 1 automated conversion goals.

`folio status`

Show library health -- which decks are current, stale, or missing their source file.

folio status
folio status Acme    # scope to a client

Library: 5 decks
  ✓ Current: 3
  ⚠ Stale: 1
  ✗ Missing source: 1

Stale:
  Acme/dd_q1_2026/financials/financials.md

Missing:
  Acme/dd_q1_2026/appendix/appendix.md (source: /materials/appendix.pptx)

Stale means the source file changed since the last conversion -- re-run folio convert on it. Missing means the source file can no longer be found at the original path.
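
The three states can be illustrated with a short Python sketch that compares a recorded hash against the source file on disk. The helper names are hypothetical, and SHA-256 truncated to 12 hex characters is an assumption: the frontmatter example only shows that `source_hash` is a short hex string.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path, length=12):
    """Hash a file's bytes and keep the first `length` hex characters.

    The algorithm and the truncation are assumptions for illustration.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()[:length]


def source_status(source_path, recorded_hash):
    """Classify a deck as 'current', 'stale', or 'missing'."""
    source = Path(source_path)
    if not source.is_file():
        return "missing"   # source no longer found at the original path
    if file_fingerprint(source) != recorded_hash:
        return "stale"     # source changed since the last conversion
    return "current"
```

Hashing file contents rather than comparing modification times means a deck that was merely copied or touched is not reported as stale.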

Global flags: --verbose / -v (debug logging), --config / -c (path to folio.yaml)

Output Structure

library/
└── Acme/
    └── dd_q1_2026/
        └── market_overview/
            ├── market_overview.md        # Full markdown with frontmatter
            ├── slides/
            │   ├── slide-001.png
            │   ├── slide-002.png
            │   └── ...
            ├── .analysis_cache.json      # LLM response cache
            ├── .texts_cache.json         # Text extraction cache
            └── version_history.json      # Full version log
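
As a rough sketch of how the layout above could be derived from the command-line metadata, the Python below builds the output path and an ID from a deck name, client, and engagement. The functions are hypothetical and the slug rule (lowercase, non-alphanumerics collapsed to underscores) is inferred from the examples in this README, not taken from Folio's source.

```python
import re
from pathlib import Path


def slugify(text):
    """Lowercase and collapse every run of non-alphanumerics to one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def deck_output_path(library_root, deck_name, client=None, engagement=None):
    """Build <root>/<client>/<engagement>/<deck>/<deck>.md, skipping unset parts."""
    parts = []
    if client:
        parts.append(client)  # the tree above keeps the client name as given
    if engagement:
        parts.append(slugify(engagement))
    deck = slugify(deck_name)
    return Path(library_root, *parts, deck, f"{deck}.md")


def deck_id(deck_name, date_stamp, client=None, engagement=None, doc_type="evidence"):
    """Join the slugged parts into an ID such as evidence_20260306_deck."""
    parts = [slugify(p) for p in (client, engagement) if p]
    return "_".join(parts + [doc_type, date_stamp, slugify(deck_name)])


print(deck_output_path("library", "market_overview", "Acme", "DD Q1 2026").as_posix())
# library/Acme/dd_q1_2026/market_overview/market_overview.md
print(deck_id("Market Overview", "20260306", "Acme", "DD Q1 2026"))
# acme_dd_q1_2026_evidence_20260306_market_overview
```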

Example output (condensed):

---
id: acme_dd_q1_2026_evidence_20260306_market_overview
title: Market Overview
type: evidence
subtype: research
status: active
source: /materials/market_overview.pptx
source_hash: a1b2c3d4e5f6
version: 2
created: 2026-03-01T10:00:00Z
modified: 2026-03-06T14:30:00Z
client: Acme
engagement: DD Q1 2026
industry:
- ecommerce
- retail
frameworks:
- TAM/SAM/SOM
tags:
- ecommerce
- market-sizing
- retail
_llm_metadata:
  convert:
    requested_profile: high_quality_anthropic
    profile: high_quality_anthropic
    provider: anthropic
    model: claude-sonnet-4-20250514
    fallback_used: false
    status: executed
    pass2:
      status: skipped
      reason: pass_disabled
---

# Market Overview

**Source:** `/materials/market_overview.pptx`
**Version:** 2 | **Converted:** 2026-03-06
**Status:** △ Current

---

## Slide 1

![Slide 1](slides/slide-001.png)

### Text (Verbatim)

> Total Addressable Market: $4.2B
> Source: Industry Report 2025

### Analysis

**Slide Type:** data_heavy
**Framework:** TAM/SAM/SOM
**Key Data:** TAM $4.2B, SAM $1.8B, SOM $340M
**Main Insight:** Market sizing shows serviceable segment at 43% of TAM

**Evidence:**
- **TAM figure of $4.2B (high):** "Total Addressable Market: $4.2B" *(title)*
- **SAM represents 43% of TAM (medium, pass 2):** "SAM $1.8B" *(body)* [unverified]

---
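
One plausible reading of the `[unverified]` marker in the example above is that the quoted evidence could not be found in the slide's verbatim text. The Python sketch below illustrates that idea under that assumption; the function names and the whitespace and case normalization are illustrative choices, not Folio's documented behavior.

```python
import re


def normalize(text):
    """Collapse whitespace and lowercase so line breaks do not block a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_grounded(quote, slide_text):
    """True when a non-empty quote occurs verbatim in the slide text."""
    needle = normalize(quote)
    return bool(needle) and needle in normalize(slide_text)


slide_text = "Total Addressable Market: $4.2B\nSource: Industry Report 2025"
for quote in ("Total Addressable Market: $4.2B", "SAM $1.8B"):
    suffix = "" if is_grounded(quote, slide_text) else " [unverified]"
    print(f'"{quote}"{suffix}')
```

With the sample slide text, the first quote matches and the second does not, which mirrors the two evidence lines in the example output.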

Configuration

Folio looks for folio.yaml by walking up from the current directory. All fields are optional, and the example below shows a multi-provider setup rather than the minimal default config.

# folio.yaml — example multi-provider configuration
library_root: ./library              # Where converted decks are written

sources:                             # Optional; organize source directories
  - name: materials
    path: /path/to/source/decks
    target_prefix: ""

llm:
  profiles:
    high_quality_anthropic:
      provider: anthropic
      model: claude-sonnet-4-20250514
      api_key_env: ANTHROPIC_API_KEY

    fast_openai:
      provider: openai
      model: gpt-4o-mini
      api_key_env: OPENAI_API_KEY

    backup_google:
      provider: google
      model: gemini-2.5-pro
      api_key_env: GEMINI_API_KEY

  routing:
    default:
      primary: high_quality_anthropic
      fallbacks: []
    convert:
      primary: high_quality_anthropic
      fallbacks: [backup_google]

conversion:
  image_dpi: 150                     # Slide image resolution (px/in)
  image_format: png
  libreoffice_timeout: 60            # Seconds before conversion times out
  default_passes: 1                  # 1 = standard, 2 = deep
  density_threshold: 2.0             # Pass 2 density trigger
  pptx_renderer: auto                # auto | libreoffice | powerpoint

Legacy shorthand is still supported for Anthropic-only setups:

llm:
  provider: anthropic
  model: claude-sonnet-4-20250514

With no folio.yaml, Folio uses these defaults: output goes to ./library, images render at 150 DPI, and analysis runs a single Anthropic-backed pass if ANTHROPIC_API_KEY is present.

| Environment Variable | Purpose |
| --- | --- |
| `ANTHROPIC_API_KEY` | Anthropic profile credentials |
| `OPENAI_API_KEY` | OpenAI profile credentials (`pip install "folio-love[llm]"` or source `pip install -e ".[llm]"`) |
| `GEMINI_API_KEY` | Google Gemini profile credentials (`pip install "folio-love[llm]"` or source `pip install -e ".[llm]"`) |
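
The upward search for folio.yaml mentioned at the start of this section can be sketched with `pathlib`. This is an illustrative version only; the function name `find_config` is hypothetical, and whether Folio stops at the first match or applies any other rule is an assumption.

```python
from pathlib import Path


def find_config(start=None, filename="folio.yaml"):
    """Return the nearest `filename` at or above `start`, or None if absent."""
    current = Path(start) if start is not None else Path.cwd()
    current = current.resolve()
    # Check the starting directory first, then each ancestor up to the root.
    for directory in (current, *current.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None
```

A `None` result corresponds to the no-config case, where the defaults described above apply.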

How It Works

Input (.pptx/.ppt/.pdf)
  │
  ├─ Normalize ──→ Convert to PDF
  │                 LibreOffice (headless) or PowerPoint on macOS
  │                 PowerPoint path: Launch Services open + AppleScript export
  │                 PDF input: direct copy + warning heuristics
  │
  ├─ Images ─────→ Extract slide images, detect blank slides
  │
  ├─ Text ───────→ Extract structured text per slide, reconcile count
  │
  ├─ Analysis ───→ Route-based LLM classification + evidence extraction (cached)
  │                 Optional transient fallback to backup profiles
  │                 Pass 2: selective re-analysis of dense slides
  │
  ├─ Tracking ───→ Version detection, per-slide change diffing
  │
  └─ Assembly ───→ YAML frontmatter + Markdown output (atomic write)
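
The Assembly stage above mentions an atomic write. A common way to achieve that in Python is to write to a temporary file in the target directory and then rename it over the destination; the sketch below shows that pattern. It is a generic illustration with a hypothetical function name, not necessarily how Folio implements it.

```python
import os
import tempfile


def atomic_write_text(path, text, encoding="utf-8"):
    """Write `text` to `path` so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the target directory: os.replace is only
    # atomic when source and destination are on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)     # swap the finished file into place
    except BaseException:
        # Leave no stray temp file behind if anything failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies mid-write, the previous version of the markdown file stays intact because the rename never happened.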

Each stage is independent and testable. LLM analysis results are cached per-slide -- re-conversion only re-analyzes changed slides. Blank slides are detected via image histogram analysis and excluded from deep analysis.
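
Per-slide caching can be pictured with the following Python sketch, in which only slides whose text has changed trigger a new analysis call. The cache key used here (a hash of the slide text) is an assumption; the real key may also cover the slide image, the prompt, or the model. `analyze` stands in for the LLM call.

```python
import hashlib


def slide_key(slide_text):
    """Cache key derived from slide text (an assumption for illustration)."""
    return hashlib.sha256(slide_text.encode("utf-8")).hexdigest()


def analyze_deck(slide_texts, cache, analyze, use_cache=True):
    """Return one analysis per slide, calling `analyze` only on cache misses."""
    results = []
    for text in slide_texts:
        key = slide_key(text)
        if use_cache and key in cache:
            results.append(cache[key])
            continue
        result = analyze(text)  # stands in for the LLM call
        cache[key] = result     # fresh results replace cached entries
        results.append(result)
    return results
```

Passing `use_cache=False` mirrors the effect described for `--no-cache`: every slide is re-analyzed and the stored entries are overwritten.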

Version Tracking

Re-converting an updated deck increments the version and records which slides were added, modified, or removed.

folio convert deck.pptx --note "Updated risk figures"

✓ deck.pptx
  24 slides → library/deck/deck.md
  Version: 2 | ID: evidence_20260306_deck
  Modified: slides 3, 7, 12
  Added: slides 24
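
A simple way to produce a change set like the one above is to compare the old and new slide texts position by position. The Python sketch below does that; it is a deliberately naive illustration with hypothetical function names, and positional comparison is an assumption (a slide inserted mid-deck would make every later slide look modified).

```python
def diff_slides(old_slides, new_slides):
    """Compare two versions slide by slide; slide numbers are 1-based."""
    shared = min(len(old_slides), len(new_slides))
    return {
        "modified": [i + 1 for i in range(shared) if old_slides[i] != new_slides[i]],
        "added": list(range(shared + 1, len(new_slides) + 1)),
        "removed": list(range(shared + 1, len(old_slides) + 1)),
    }


def summarize(changes):
    """Render a change set as text, e.g. '3 modified, 1 added'."""
    parts = [f"{len(nums)} {kind}" for kind, nums in changes.items() if nums]
    return ", ".join(parts) or "no changes"


old = [f"slide {n}" for n in range(1, 24)]   # 23 slides in version 1
new = list(old)
for n in (3, 7, 12):
    new[n - 1] += " (revised)"               # edit three existing slides
new.append("slide 24")                       # append one new slide

changes = diff_slides(old, new)
print(changes["modified"], changes["added"])  # [3, 7, 12] [24]
print(summarize(changes))                     # 3 modified, 1 added
```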

Use folio status to find stale decks -- where the source file has changed since the last conversion.

Version history is recorded in both the markdown output and version_history.json:

| Version | Date | Changes | Note |
| --- | --- | --- | --- |
| v2 | 2026-03-06 | 3 modified, 1 added | Updated risk figures |
| v1 | 2026-03-01 | Initial (23 slides) | -- |

Development

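# Install in editable mode with development dependencies
pip install -e ".[dev]"

# Build and check the distribution (paths assume a virtualenv at .venv)
.venv/bin/python -m build
.venv/bin/python -m twine check dist/*

# Run the test suite, then again with coverage
.venv/bin/python -m pytest
.venv/bin/python -m pytest --cov=folio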

folio/
├── cli.py              # Click CLI (convert, batch, status)
├── config.py           # FolioConfig + folio.yaml loading
├── converter.py        # Pipeline orchestrator
├── pipeline/
│   ├── normalize.py    # PPTX/PPT → PDF
│   ├── images.py       # PDF → slide images + blank detection
│   ├── text.py         # Structured text extraction + reconciliation
│   └── analysis.py     # LLM analysis + caching
├── output/
│   ├── frontmatter.py  # YAML frontmatter (v2 schema)
│   └── markdown.py     # Markdown assembly
└── tracking/
    ├── sources.py      # Source file tracking + staleness
    └── versions.py     # Version detection + change sets

Validation

Folio has been validated against a 50-deck corpus of real consulting presentations (Tier 1 gate: 50/50 automated PPTX conversion, zero silent failures). Validation reports, session logs, and chat logs are preserved on the docs-internal branch.

Roadmap

Search and retrieval (folio search) is planned but not yet implemented. Today, converted decks are searchable via Obsidian, grep, or any tool that reads Markdown + YAML frontmatter.
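
Until `folio search` exists, a few lines of Python are enough to filter the library by frontmatter fields. The sketch below is a stopgap illustration, not part of Folio: both functions are hypothetical, and the parser is deliberately minimal (it reads only top-level scalar fields and skips lists and nested mappings, so a real YAML parser would be needed for full fidelity).

```python
from pathlib import Path


def read_frontmatter(markdown_text):
    """Return top-level scalar fields from a leading '---' frontmatter block."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        if line.startswith((" ", "-", "#")) or ":" not in line:
            continue  # nested value, list item, or comment
        key, _, value = line.partition(":")
        if value.strip():
            fields[key.strip()] = value.strip()
    return fields


def find_decks(library_root, **wanted):
    """Yield markdown files whose frontmatter matches every key=value given."""
    for path in sorted(Path(library_root).rglob("*.md")):
        fields = read_frontmatter(path.read_text(encoding="utf-8"))
        if all(fields.get(key) == value for key, value in wanted.items()):
            yield path
```

For example, `list(find_decks("library", client="Acme", subtype="research"))` would return the matching deck files. Values are compared as strings, so a version filter would be written as `version="2"`.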

License

Apache 2.0
