Compendium Scribe

A package for automating the creation of comprehensive and organized domain knowledge bases for AI applications.

Supports Python 3.12+.

Compendium Scribe is a Click-driven command line tool and library that builds sourced research compendiums through a bounded OpenAI Agents SDK workflow. It decomposes a topic into planning, web research, verification, and synthesis stages, then renders the final compendium as Markdown, XML, HTML, or PDF.


Features

  • Agents SDK research workflow - Runs planner, research manager, section researcher, verifier, and synthesis agents with structured Pydantic outputs.
  • Hosted web search where it belongs - Enables web search for research manager, section research, and verification agents; planner and synthesis stay source-controlled.
  • Stable renderer contract - Final agent output is validated and passed through the existing Compendium.from_payload() shape.
  • Citation ledger - Deduplicates URLs, assigns citation IDs, tracks section usage, and rejects final citations that are not ledger-backed.
  • Recoverable sidecars - Writes <base>.research.json after accepted artifacts and <base>.costs.json for usage/cost telemetry.
  • Local cost estimates - Uses a checked-in pricing catalog for GPT-5.5 and GPT-5.4 family token rates, long-context uplifts, and built-in tool call pricing when usage metadata is available.
  • Compendium Library publishing - Optionally publishes XML, Markdown, and metadata cards into a movable filesystem library with a root catalog.json.
  • Re-rendering - Ingest existing XML compendiums to generate new output formats without re-running research.
  • Offline tests - The workflow uses a runner adapter so tests can stub Agents SDK runs without live API calls.
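The citation ledger behavior described above can be sketched in a few lines. This is a simplified illustration of the dedup-and-validate idea, not the package's actual implementation; the class and method names are hypothetical:

```python
class CitationLedger:
    """Deduplicates URLs, assigns stable citation IDs, and tracks section usage."""

    def __init__(self):
        self._ids_by_url = {}       # url -> citation id
        self._sections_by_id = {}   # citation id -> set of section ids

    def record(self, url, section_id):
        # Reuse the existing ID for a known URL; otherwise mint a new one.
        cid = self._ids_by_url.get(url)
        if cid is None:
            cid = f"C{len(self._ids_by_url) + 1:02d}"
            self._ids_by_url[url] = cid
        self._sections_by_id.setdefault(cid, set()).add(section_id)
        return cid

    def validate(self, final_ids):
        # Reject any final citation that is not ledger-backed.
        unknown = set(final_ids) - set(self._sections_by_id)
        if unknown:
            raise ValueError(f"citations not in ledger: {sorted(unknown)}")
```

Recording the same URL from two sections returns the same ID, and validation fails fast on any citation the ledger never saw.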

Quick Start

1. Install

pdm install --dev

Ensure PDM_HOME points to a writable location when developing within a sandboxed environment.

2. Configure credentials

Create a .env file (untracked) with your OpenAI credentials and explicit research model settings:

OPENAI_API_KEY=sk-...
PLANNER_AGENT_MODEL=gpt-5.5
RESEARCH_AGENT_MODEL=gpt-5.5
VERIFIER_AGENT_MODEL=gpt-5.5
SYNTHESIS_AGENT_MODEL=gpt-5.5
MAX_AGENT_TURNS=12

All four model variables are required. If any is missing or blank, Compendium Scribe stops before client setup, cost-report initialization, or research begins, and names the missing setting.
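The fail-fast check described above amounts to scanning the required settings and reporting the first missing one. A minimal sketch (variable and function names here are illustrative, not the package's internals):

```python
import os

REQUIRED_SETTINGS = (
    "OPENAI_API_KEY",
    "PLANNER_AGENT_MODEL",
    "RESEARCH_AGENT_MODEL",
    "VERIFIER_AGENT_MODEL",
    "SYNTHESIS_AGENT_MODEL",
)

def check_settings(env=os.environ):
    # Fail fast before any client setup, naming the first missing or blank setting.
    for name in REQUIRED_SETTINGS:
        if not env.get(name, "").strip():
            raise RuntimeError(f"missing required setting: {name}")
```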

The research workflow uses the OpenAI Agents SDK with hosted web search enabled on the manager, section, and verifier agents.

Cost reports use the local catalog in src/compendiumscribe/research/data/pricing.standard.json. The catalog currently covers GPT-5.5, GPT-5.4 family token pricing, long-context rates above the documented threshold, web search calls, and Responses API file search calls. If a model is missing from the catalog, token usage is still recorded and USD estimates are left unavailable.
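In rough terms, a catalog-based estimate multiplies token counts by per-million-token rates and leaves the estimate unavailable for uncataloged models. The sketch below illustrates that shape only; the rates and schema are invented for the example and do not reflect the shipped pricing.standard.json:

```python
# Hypothetical per-million-token rates keyed by model; NOT the shipped catalog.
PRICING = {
    "gpt-5.5": {"input_per_million": 2.00, "output_per_million": 8.00},
}

def estimate_usd(model, input_tokens, output_tokens):
    rates = PRICING.get(model)
    if rates is None:
        # Unknown model: token usage is still recorded, but no USD estimate.
        return None
    return (input_tokens * rates["input_per_million"]
            + output_tokens * rates["output_per_million"]) / 1_000_000
```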

3. Generate a compendium

pdm run compendium create "Lithium-ion battery recycling"

Options:

  • --output PATH - Base path/filename for the output. The extension is ignored.
  • --format FORMAT - Output format, defaulting to md. Available: md, xml, html, pdf. Repeat for multiple outputs.
  • --library PATH - Also publish the finished compendium into a Compendium Library directory.

If you pass --output report.md, Compendium Scribe writes:

  • report.md or the requested render formats
  • report.research.json
  • report.costs.json

Without --output, the base name is the slugified topic plus a UTC timestamp.
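The default base name is a slug of the topic plus a UTC timestamp. A sketch of one way to derive it, assuming a simple lowercase-hyphen slug and a compact timestamp format (the package's exact slug rules and timestamp layout may differ):

```python
import re
from datetime import datetime, timezone

def default_base_name(topic, now=None):
    # Slugify: lowercase, keep alphanumerics, collapse everything else to hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    now = now or datetime.now(timezone.utc)
    return f"{slug}-{now.strftime('%Y%m%dT%H%M%SZ')}"
```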

4. Publish to a Compendium Library

A Compendium Library is a directory agents can scan progressively. The root catalog.json is the compact card catalog. Each entry points to canonical XML, readable Markdown, and a richer card for one compendium:

research-library/
├── catalog.json
└── compendiums/
    └── lithium-ion-battery-recycling/
        ├── compendium.xml
        ├── compendium.md
        └── card.json

Creation behaves as usual; when --library is provided, the requested outputs are still written normally and the final compendium is also upserted into the library:

pdm run compendium create "Lithium-ion battery recycling" \
  --output report.md \
  --format md \
  --format xml \
  --library research-library

Import an existing XML compendium:

pdm run compendium library import research-library report.xml

Library entries are idempotent by slugified title. Re-publishing the same title updates the existing compendium.xml, compendium.md, card.json, and catalog.json entry. If another title would use the same slug, the new entry gets a numeric suffix such as -2.
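The slug rules above (idempotent per title, numeric suffix on collisions between different titles) can be sketched as follows. The function name and catalog shape are hypothetical; only the collision behavior mirrors the description:

```python
import re

def resolve_slug(title, entries):
    # entries: mapping of existing slug -> title in the current catalog.
    base = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    slug, n = base, 2
    # Same title reuses its slug (an update); a different title gets -2, -3, ...
    while slug in entries and entries[slug] != title:
        slug = f"{base}-{n}"
        n += 1
    return slug
```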

5. Recover a research run

Recovery resumes from the next incomplete stage in the sidecar state file:

pdm run compendium recover --input report.research.json

The recover command writes outputs using the same base path as the sidecar. For example, report.research.json renders to report.md when the stored format is Markdown.
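Deriving the output base from the sidecar path is just a matter of stripping the .research.json suffix. A small sketch of that mapping (the helper name is illustrative):

```python
from pathlib import Path

def output_base_from_sidecar(sidecar):
    # report.research.json -> report (which then renders to report.md, etc.)
    path = Path(sidecar)
    suffix = ".research.json"
    if not path.name.endswith(suffix):
        raise ValueError(f"not a research sidecar: {sidecar}")
    return path.with_name(path.name[: -len(suffix)])
```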

6. Render formats from existing XML

pdm run compendium render my-topic.xml --format html

Options:

  • --format FORMAT - Output format(s) to generate: md, xml, html, pdf.
  • --output PATH - Base path/filename for the output.

Python API Usage

from compendiumscribe import build_compendium, ResearchConfig, DeepResearchError

try:
    compendium = build_compendium(
        "Emerging pathogen surveillance",
        config=ResearchConfig(
            planner_agent_model="gpt-5.5",
            research_agent_model="gpt-5.5",
            verifier_agent_model="gpt-5.5",
            synthesis_agent_model="gpt-5.5",
        ),
    )
except DeepResearchError:
    # The sidecar files written during the run allow resuming via `compendium recover`.
    raise

xml_payload = compendium.to_xml_string()
markdown_doc = compendium.to_markdown()
html_files = compendium.to_html_site()
pdf_bytes = compendium.to_pdf_bytes()

The returned Compendium object contains structured sections, insights, citations, and open questions.


Data Model Overview

Compendium Scribe produces XML shaped like:

<compendium topic="Lithium-ion Battery Recycling" generated_at="2026-04-23T14:32:33+00:00">
  <overview><![CDATA[Comprehensive synthesis of the state of lithium-ion recycling...]]></overview>
  <methodology>
    <step><![CDATA[Surveyed peer-reviewed literature and company disclosures.]]></step>
  </methodology>
  <sections>
    <section id="S01">
      <title><![CDATA[Technology Landscape]]></title>
      <summary><![CDATA[Dominant recycling modalities and throughput metrics...]]></summary>
      <insights>
        <insight>
          <title><![CDATA[Hydrometallurgy remains the throughput leader]]></title>
          <evidence><![CDATA[Commercial operators report high recovery rates for core battery metals.]]></evidence>
          <citations>
            <ref>C01</ref>
          </citations>
        </insight>
      </insights>
    </section>
  </sections>
  <citations>
    <citation id="C01">
      <title><![CDATA[Example Recycling Benchmark]]></title>
      <url><![CDATA[https://example.com/recycling-benchmark]]></url>
      <publisher><![CDATA[Example Publisher]]></publisher>
    </citation>
  </citations>
</compendium>
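To illustrate consuming this shape, here is a short reader-side ElementTree sketch (not part of the package) that pulls citation IDs and URLs out of a compendium document; ElementTree returns CDATA content as plain text:

```python
import xml.etree.ElementTree as ET

SAMPLE = """<compendium topic="Demo">
  <citations>
    <citation id="C01">
      <title><![CDATA[Example Recycling Benchmark]]></title>
      <url><![CDATA[https://example.com/recycling-benchmark]]></url>
    </citation>
  </citations>
</compendium>"""

def citation_urls(xml_text):
    root = ET.fromstring(xml_text)
    # Map each citation's id attribute to the text of its <url> child.
    return {c.get("id"): c.findtext("url") for c in root.iter("citation")}
```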

Testing & Quality

  • pdm run test - Executes the unit suite. Tests stub Agents SDK runs, so they run offline.
  • pdm run lint - Linting.
  • pdm run ruff check src tests - Direct lint command.
  • pdm build - Produce distributable artifacts.

Before marking implementation work complete, run:

pdm run pytest
pdm run ruff check src tests
pdm build

Contributing

  1. Fork and clone the repository.
  2. Run pdm install --group dev.
  3. Make changes following the style guide and update/add tests.
  4. Run pdm run pytest, pdm run ruff check src tests, and pdm build.
  5. Raise a pull request with a concise description, verification commands, and representative output samples when user-facing structure changes.
