CLI tool for organizing books and PDFs with AI-powered metadata

wst — Wan Shi Tong

"I am Wan Shi Tong, he who knows ten thousand things."

Character from Avatar: The Last Airbender. Avatar: The Last Airbender is a trademark of Viacom International Inc. Image used for illustrative purposes only.

CLI tool for organizing books and PDFs with AI-powered metadata generation.

Named after Wan Shi Tong, the ancient spirit who collected every piece of knowledge in the world and guarded the great library in the desert. This tool aspires to do the same for your PDFs — just with less hostility toward humans.

Features

AI-powered metadata: Automatically extracts and completes metadata (title, author, type, year, summary, tags, etc.) using Claude CLI with web search for missing fields (year, ISBN, publisher)
OCR support: Optionally OCR scanned PDFs before ingestion to extract text from image-based documents
Metadata enrichment: Fill in missing fields (ISBN, table of contents, publisher, year) on existing documents using AI + web search, individually or in batch
Organized library: Files sorted by type (books/, papers/, notes/, exercises/, guides/) with consistent naming (Author - Title (Year).pdf)
SQLite search index: Full-text search across title, author, tags, subject, and summary via FTS5
Coverage stats: See metadata completeness across your library, broken down by document type and field
Interactive browser: Fuzzy-search your library, view and edit metadata interactively
Cloud backup: Back up files to iCloud Drive or S3, with an extensible provider system
Extensible backends: Abstract layers for AI (Claude CLI, Codex CLI, future API/SDK) and storage (local filesystem, S3)
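To make the full-text search feature concrete, here is a minimal sketch of an FTS5 index over the five searchable fields listed above, using Python's standard `sqlite3` module. The table name `docs`, the helper functions, and the sample rows are assumptions for illustration; the actual schema used by wst may differ, and the sketch assumes your SQLite build has FTS5 enabled.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 table (hypothetical schema) and load rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE docs USING fts5(title, author, tags, subject, summary)"
    )
    conn.executemany("INSERT INTO docs VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def search(conn, query):
    """Return (title, author) pairs matching an FTS5 query, best match first."""
    cur = conn.execute(
        "SELECT title, author FROM docs WHERE docs MATCH ? ORDER BY rank",
        (query,),
    )
    return cur.fetchall()


# Illustrative sample rows, not data shipped with the tool.
rows = [
    ("The Art of Computer Programming", "Donald Knuth", "algorithms",
     "computer science", "Classic text on algorithms."),
    ("Pattern Recognition and Machine Learning", "Christopher Bishop",
     "ml statistics", "machine learning", "Probabilistic approach to learning."),
]
conn = build_index(rows)
phrase_hits = search(conn, '"machine learning"')  # phrase query across all columns
author_hits = search(conn, "author:knuth")        # query restricted to one column
```

The column filter in the second query is standard FTS5 syntax and roughly corresponds to what an option such as `--author` could translate to, though how wst builds its queries is not documented here.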

Installation

pipx (recommended, all platforms)

pipx install wst-library

pip

pip install wst-library

Desktop App (macOS)

Download Wan.Shi.Tong_*.dmg from the latest release, open it, and drag the app to /Applications.

Since the app is not yet notarized by Apple, macOS may show a "damaged" warning on first launch. Run this once in Terminal to clear the quarantine flag:

xattr -cr /Applications/Wan\ Shi\ Tong.app

Then open the app normally.

Homebrew (macOS/Linux)

brew tap cnexans/tap
brew install wst

Chocolatey (Windows)

choco install wst

From source

git clone https://github.com/cnexans/wst.git
cd wst
make install

Quick Start

# Ingest PDFs from a folder
wst ingest ~/Documents/papers/

# Ingest from default inbox (~/wst/inbox/)
wst ingest

# Ingest with OCR for scanned PDFs
wst ingest --ocr

# Ingest with manual confirmation for each file
wst ingest --confirm

# Re-ingest files with fresh AI metadata
wst ingest --reprocess

# Search
wst search "machine learning"
wst search --author "Knuth"
wst search --type textbook

# List and show
wst list
wst list --type paper --sort year
wst show 1

# Edit metadata
wst edit 1
wst edit "Player's Handbook"
wst edit 42 --enrich              # fill missing fields with AI + web search

# Enrich missing metadata in batch
wst fix --dry-run                 # preview what needs fixing
wst fix --type textbook           # fix all textbooks
wst fix --field isbn --field toc  # only fill ISBN and TOC
wst fix -y                        # auto-accept all changes

# Metadata coverage stats
wst stats
wst stats --type textbook

# Interactive browser
wst browse

# Backup
wst backup icloud
wst backup s3

Commands

Command	Description
`wst ingest [PATH]`	Ingest PDFs, generate metadata with AI. Options: `--ocr`, `--confirm`, `--reprocess`, `--verbose`
`wst search <query>`	Full-text search. Options: `--author`, `--type`, `--subject`
`wst list`	List all documents. Options: `--type`, `--sort`
`wst show <id-or-title>`	Show complete metadata for a document
`wst edit <id-or-title>`	Edit metadata interactively, or `--enrich` to fill missing fields with AI
`wst fix`	Batch enrich documents with missing metadata. Options: `--type`, `--field`, `--dry-run`, `-y`
`wst stats`	Show metadata coverage statistics. Options: `--type`
`wst browse`	Interactive TUI for browsing and editing documents
`wst ocr <id-or-path>`	Run OCR on scanned PDFs
`wst backup [provider]`	Backup files to iCloud or S3

How Ingestion Works

PDF file → [OCR (optional)] → Extract text + PDF metadata → AI generates metadata → Store + Index

OCR (optional, --ocr): Scanned PDFs are processed with ocrmypdf to extract text from images before metadata generation.
Text extraction: Reads existing PDF metadata and text from the first pages using PyMuPDF.
AI metadata generation: Sends the text sample to Claude CLI, which analyzes the content and uses web search to find ISBN, publisher, year, and other fields.
Storage: Files are moved to the library, organized by document type with consistent naming (Author - Title (Year).pdf).
Indexing: Metadata is stored in SQLite with full-text search (FTS5).

After ingestion, use wst fix to batch-enrich documents that are missing fields (ISBN, table of contents, etc.) — this is especially useful for scanned books where the initial AI pass may not have found all metadata.
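The idea behind batch enrichment and coverage statistics can be sketched in a few lines: find which fields are empty for each document, then aggregate by document type. The field list, the dictionary shape of a document, and both function names below are assumptions made for this example, not the tool's internal API.

```python
from collections import defaultdict

# Hypothetical set of fields to check; the real tool may track others.
FIELDS = ("title", "author", "year", "isbn", "publisher", "toc")


def missing_fields(doc, fields=FIELDS):
    """Return the fields that are absent or empty in a metadata dict."""
    return [f for f in fields if not doc.get(f)]


def coverage_by_type(docs, fields=FIELDS):
    """Map each document type to {field: fraction of documents with it filled}."""
    filled = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for doc in docs:
        doc_type = doc.get("type", "unknown")
        totals[doc_type] += 1
        for f in fields:
            if doc.get(f):
                filled[doc_type][f] += 1
    return {t: {f: filled[t][f] / totals[t] for f in fields} for t in totals}


# Illustrative records only.
docs = [
    {"type": "textbook", "title": "A", "author": "X", "year": 2001, "isbn": ""},
    {"type": "textbook", "title": "B", "author": "Y", "year": 1999, "isbn": "123"},
    {"type": "paper", "title": "C", "author": "Z"},
]
to_fix = {d["title"]: missing_fields(d) for d in docs}
coverage = coverage_by_type(docs)
```

A dry run would amount to printing `to_fix` without writing anything back; an actual fix would pass each list of missing fields to the AI backend.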

Library Structure

~/wst/
├── inbox/           # PDFs pending ingestion
└── library/
    ├── books/       # book, novel, textbook
    ├── papers/      # paper
    ├── notes/       # class-notes
    ├── exercises/   # exercises
    ├── guides/      # guide-theory, guide-practice
    └── wst.db       # SQLite index
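As a rough illustration of how a document's type and metadata could map onto this layout, the sketch below builds a destination path following the `Author - Title (Year).pdf` convention. The type-to-folder mapping is read off the tree above; the character-cleaning rule, the handling of a missing year, and the function names are assumptions, not the tool's documented behavior.

```python
from pathlib import Path

# Mapping taken from the directory tree above.
TYPE_TO_FOLDER = {
    "book": "books", "novel": "books", "textbook": "books",
    "paper": "papers",
    "class-notes": "notes",
    "exercises": "exercises",
    "guide-theory": "guides", "guide-practice": "guides",
}


def _clean(part):
    """Replace characters that are unsafe in file names (an assumed rule)."""
    return "".join("_" if c in '/\\:*?"<>|' else c for c in part).strip()


def library_path(root, doc_type, author, title, year=None):
    """Build the destination path for a document inside the library."""
    folder = TYPE_TO_FOLDER[doc_type]
    name = f"{_clean(author)} - {_clean(title)}"
    if year is not None:  # assumption: the year is omitted when unknown
        name += f" ({year})"
    return Path(root) / "library" / folder / f"{name}.pdf"


example = library_path("wst", "textbook", "Donald Knuth", "Concrete Mathematics", 1994)
```

Here `example` points into the `books/` folder because `textbook` is one of the types grouped there.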

Documentation

See docs/README.md for architecture details and diagrams.

Requirements

Python 3.11+
AI backend (at least one):
- claude CLI (authenticated) — default backend
- codex CLI (authenticated) — use with wst -b codex
macOS, Windows, or Linux
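Before running an ingest, it can help to confirm that at least one backend CLI is installed. The sketch below looks for executables on `PATH` with `shutil.which`. It assumes the executables are named `claude` and `codex`, and it cannot tell whether either one is authenticated; the function names are hypothetical.

```python
import shutil

# Assumed executable names for the two backends listed above.
BACKEND_EXECUTABLES = {"claude": "claude", "codex": "codex"}


def available_backends(which=shutil.which):
    """Return the backend names whose executable is found on PATH."""
    return [name for name, exe in BACKEND_EXECUTABLES.items() if which(exe)]


def pick_backend(preferred="claude", which=shutil.which):
    """Prefer the default backend, falling back to any other installed one."""
    found = available_backends(which)
    if not found:
        raise RuntimeError("No AI backend CLI found on PATH")
    return preferred if preferred in found else found[0]
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use, call `pick_backend()` with no arguments.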

Releasing

To publish a new version to PyPI:

# 1. Bump version in pyproject.toml
# 2. Trigger the release workflow from GitHub Actions:
gh workflow run "Create Tag and Release" \
  --field version="X.Y.Z" \
  --field release_notes="Release notes here"

This creates a git tag, a GitHub Release, and publishes to PyPI automatically.

License

MIT with Commons Clause — free to use, modify, and distribute. Commercial sale rights reserved to the author. See LICENSE.

Release history

Version	Release date
0.14.0	May 6, 2026
0.13.0	May 6, 2026
0.12.4	May 6, 2026
0.12.3	May 6, 2026
0.12.2	May 6, 2026
0.12.1	May 6, 2026
0.12.0	May 6, 2026
0.11.0	May 6, 2026
0.10.3 (this version)	May 5, 2026
0.10.2	May 4, 2026
0.10.1	May 4, 2026
0.10.0	May 4, 2026
0.9.1	May 4, 2026
0.9.0	Apr 26, 2026
0.8.2	Apr 20, 2026
0.8.0	Apr 20, 2026
0.5.0	Apr 11, 2026
0.4.1	Apr 10, 2026
0.4.0	Apr 10, 2026
0.3.1	Apr 10, 2026
0.3.0	Apr 10, 2026
0.2.3	Apr 9, 2026
0.2.2	Apr 9, 2026
0.1.2	Apr 9, 2026
0.1.1	Apr 8, 2026
0.1.0	Apr 8, 2026

Download files

Source Distribution

wst_library-0.10.3.tar.gz (65.1 kB)

Uploaded May 5, 2026 Source

Built Distribution

wst_library-0.10.3-py3-none-any.whl (58.8 kB)

Uploaded May 5, 2026 Python 3

File details

Details for the file wst_library-0.10.3.tar.gz.

File metadata

Download URL: wst_library-0.10.3.tar.gz
Upload date: May 5, 2026
Size: 65.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for wst_library-0.10.3.tar.gz
Algorithm	Hash digest
SHA256	`6f94d7b8d19769d76c46a54f26a1d1218832c3f10256576ceadc329c72e5ddbe`
MD5	`b6b744425926ce11c03b33d83c2cce02`
BLAKE2b-256	`47d173643a3f7be85c5c6db6cdda190651cf02f1b2020bebda16da8e5f58120d`
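A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch; the function names are illustrative, and the file path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 16):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True when the file's digest matches the published one."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the sdist was downloaded to the current directory:
# verify("wst_library-0.10.3.tar.gz",
#        "6f94d7b8d19769d76c46a54f26a1d1218832c3f10256576ceadc329c72e5ddbe")
```

Reading in chunks keeps memory use constant regardless of file size, which matters little for a 65 kB archive but makes the helper reusable for larger files.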

File details

Details for the file wst_library-0.10.3-py3-none-any.whl.

File metadata

Download URL: wst_library-0.10.3-py3-none-any.whl
Upload date: May 5, 2026
Size: 58.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for wst_library-0.10.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b32f3f9d181aa6c23093e4b3b0757c2bedc1b4dd6581c273d3acc270f703d20c`
MD5	`5edb478b87d9360f7286b037a035c3e7`
BLAKE2b-256	`9c9afff7634f1513ec046cf825b5a856db9c47ec308ee7878ffca813b56aade1`
