A package for automating the creation of comprehensive and organized domain knowledge bases for AI applications.

Maintainers

btfranklin

Project description

Compendium Scribe

Compendium Scribe is a Click-driven command line tool and library that uses OpenAI's deep research models to assemble a comprehensive research compendium for any topic. The workflow combines optional prompt refinement, a "deep research" call with web search tooling, and deterministic post-processing. It produces human-readable Markdown by default, backed by a rich XML data model that can also be exported.

Features

🔍 Deep research pipeline — Orchestrates prompt planning, background execution, and tool-call capture with o3-deep-research.
🧱 Rich data model — Includes sections, insights, and citations for cross-format rendering.
🧾 Structured XML output — Produces a schema-friendly document ready for downstream conversion (HTML, Markdown, PDF pipelines, etc.).
🌐 HTML Site Export — Generates a static, multi-page HTML site with navigation and semantic structure.
🔄 Re-rendering — Ingest existing XML compendiums to generate new output formats without re-running costly research.
⚙️ Configurable CLI — Control background execution, tool call limits, and output paths via a unified command structure.
🧪 Testable architecture — Research orchestration is decoupled from the OpenAI client, making it simple to stub in tests.

Quick Start

1. Install

pdm install --group dev

Ensure PDM_HOME points to a writable location when developing within a sandboxed environment.

2. Configure credentials

Create a .env file (untracked) with your OpenAI credentials:

OPENAI_API_KEY=sk-...
PROMPT_REFINER_MODEL=gpt-5.2
DEEP_RESEARCH_MODEL=o3-deep-research
POLLING_INTERVAL_IN_SECONDS=10
MAX_POLL_TIME_IN_MINUTES=60

Deep research requires an OpenAI account with the browsing tooling enabled. Document any environment keys for additional tooling in the repo as you add them.
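
As an illustration of how these variables might be consumed, here is a minimal sketch that reads them from a mapping such as os.environ. The Settings class, the load_settings function, and the fallback values (copied from the sample above) are assumptions made for this example; they are not the package's actual loader or its real defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables shown in the sample .env."""
    api_key: str
    prompt_refiner_model: str
    deep_research_model: str
    polling_interval_seconds: int
    max_poll_time_minutes: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping; fallbacks mirror the sample values (assumed)."""
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return Settings(
        api_key=api_key,
        prompt_refiner_model=env.get("PROMPT_REFINER_MODEL", "gpt-5.2"),
        deep_research_model=env.get("DEEP_RESEARCH_MODEL", "o3-deep-research"),
        # Numeric values arrive as strings and are converted explicitly.
        polling_interval_seconds=int(env.get("POLLING_INTERVAL_IN_SECONDS", "10")),
        max_poll_time_minutes=int(env.get("MAX_POLL_TIME_IN_MINUTES", "60")),
    )
```

Passing a plain dict instead of os.environ keeps the sketch easy to exercise without touching the real environment.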

3. Generate a compendium

Use the create subcommand to run the research process for a topic:

pdm run compendium create "Lithium-ion battery recycling"

Options:

--output PATH — Base path/filename for the output (extension is ignored).
--no-background — Force synchronous execution (useful for short or restricted queries).
--max-tool-calls N — Cap the total number of tool calls for cost control.
--format FORMAT — Output format (defaults to md). Available: md, xml, html, pdf. Can be repeated for multiple outputs.

Example output file name: lithium-ion-battery-recycling.md.
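
The mapping from topic to file name, and the "extension is ignored" behaviour of --output, can be sketched as follows. The exact rules the tool applies are not documented here, so slugify and output_paths are hypothetical helpers that merely reproduce the example above; the real implementation may differ, particularly for the multi-page html format.

```python
import re
from pathlib import Path


def slugify(topic):
    """Lowercase the topic and collapse non-alphanumeric runs into hyphens (assumed rule)."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower())
    return slug.strip("-")


def output_paths(base, formats):
    """Drop any extension from the base path, then add one suffix per requested format."""
    stem = Path(base).with_suffix("")
    return [stem.with_suffix("." + fmt) for fmt in formats]


default_name = output_paths(slugify("Lithium-ion battery recycling"), ["md"])[0]
print(default_name)
```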

4. Render formats from existing XML

If you have an existing XML compendium (e.g., my-topic.xml), you can re-render it into other formats:

pdm run compendium render my-topic.xml --format html

Options:

--format FORMAT — Output format(s) to generate (md, xml, html, pdf).
--output PATH — Base path/filename for the output.

5. Recover from a timeout

If a research task times out (exceeding MAX_POLL_TIME_IN_MINUTES), recovery information is saved to timed_out_research.json. You can resume checking for its completion without starting over:

pdm run compendium recover

Options:

--input PATH — Path to the recovery JSON file (defaults to timed_out_research.json).
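
The timeout-and-recover behaviour can be pictured as a polling loop with a time budget. The sketch below is an assumption-laden illustration, not the tool's code: poll_until_done, PollTimeout, and the contents of the recovery dictionary are hypothetical, and the real schema of timed_out_research.json is not described here.

```python
import json
import time
from pathlib import Path
from typing import Callable, Optional


class PollTimeout(Exception):
    """Raised when the polling budget is exhausted (hypothetical name)."""


def poll_until_done(
    check: Callable[[], Optional[str]],
    interval_seconds: float,
    max_minutes: float,
    recovery_path: Path,
    recovery_info: dict,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call check() until it returns a result or the time budget runs out."""
    deadline = clock() + max_minutes * 60
    while True:
        result = check()
        if result is not None:
            return result
        if clock() >= deadline:
            # Persist enough information to resume checking later.
            recovery_path.write_text(json.dumps(recovery_info), encoding="utf-8")
            raise PollTimeout(f"gave up after {max_minutes} minutes")
        sleep(interval_seconds)
```

The sleep and clock parameters are injectable so the loop can be tested without waiting in real time.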

Library Usage

from compendiumscribe import build_compendium, ResearchConfig, DeepResearchError

try:
    compendium = build_compendium(
        "Emerging pathogen surveillance",
        config=ResearchConfig(
            background=False,
            max_tool_calls=30,
            max_poll_time_minutes=15,
        ),
    )
except DeepResearchError as exc:
    # Handle or log deep research failures
    raise

xml_payload = compendium.to_xml_string()

# Alternate exports
markdown_doc = compendium.to_markdown()
html_files = compendium.to_html_site()  # Returns dict of filename -> content
pdf_bytes = compendium.to_pdf_bytes()

The returned Compendium object contains structured sections, insights, citations, and open questions.
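
Since to_html_site() is described as returning a dict of filename to content, writing the site to disk might look like the sketch below. write_site is a hypothetical helper, and it assumes the content values are strings; it is demonstrated with a plain dict so it runs without the package or an API key.

```python
from pathlib import Path


def write_site(files, out_dir):
    """Write each filename -> content pair under out_dir and return the written paths."""
    out = Path(out_dir)
    written = []
    for name, content in files.items():
        target = out / name
        # Create parent folders in case a filename contains a subdirectory.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written
```

With a real compendium, the call would be along the lines of write_site(compendium.to_html_site(), "site").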

Data Model Overview

Compendium Scribe produces XML shaped like:

<compendium topic="Lithium-ion Battery Recycling" generated_at="2025-01-07T14:32:33+00:00">
  <overview><![CDATA[Comprehensive synthesis of the state of lithium-ion recycling...]]></overview>
  <methodology>
    <step><![CDATA[Surveyed peer-reviewed literature from 2022–2025]]></step>
    <step><![CDATA[Corroborated industrial capacity data with regulatory filings]]></step>
  </methodology>
  <sections>
    <section id="S01">
      <title><![CDATA[Technology Landscape]]></title>
      <summary><![CDATA[Dominant recycling modalities and throughput metrics...]]></summary>
      <key_terms>
        <term><![CDATA[hydrometallurgy]]></term>
        <term><![CDATA[direct recycling]]></term>
      </key_terms>
      <guiding_questions>
        <question><![CDATA[Which processes yield the highest cobalt recovery rates?]]></question>
      </guiding_questions>
      <insights>
        <insight>
          <title><![CDATA[Hydrometallurgy remains the throughput leader]]></title>
          <evidence><![CDATA[EPRI 2024 data shows >95% cobalt recovery in commercial plants.]]></evidence>
          <implications><![CDATA[Capital efficiency favors hydrometallurgy for near-term scaling.]]></implications>
          <citations>
            <ref>C1</ref>
          </citations>
        </insight>
      </insights>
    </section>
  </sections>
  <citations>
    <citation id="C1">
      <title><![CDATA[EPRI Lithium-ion Recycling Benchmarking 2024]]></title>
      <url><![CDATA[https://example.com/epri-li-benchmark]]></url>
      <publisher><![CDATA[EPRI]]></publisher>
      <published_at><![CDATA[2024-09-01]]></published_at>
      <summary><![CDATA[Performance metrics for recycling modalities across 12 facilities.]]></summary>
    </citation>
  </citations>
  <open_questions>
    <question><![CDATA[How will policy incentives shape regional plant siting post-2025?]]></question>
  </open_questions>
</compendium>

This format is intentionally verbose to support downstream transformation. Markdown links within text (e.g., [Label](URL)) are preserved in the XML to ensure they render correctly in final outputs.
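
To show what downstream transformation can look like, the following sketch reads a trimmed copy of the sample above with the standard library's xml.etree.ElementTree and resolves each insight's ref elements against the top-level citations. The function name insights_with_sources and the shape of the returned rows are choices made for this example, not part of the package.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<compendium topic="Lithium-ion Battery Recycling">
  <sections>
    <section id="S01">
      <title><![CDATA[Technology Landscape]]></title>
      <insights>
        <insight>
          <title><![CDATA[Hydrometallurgy remains the throughput leader]]></title>
          <citations><ref>C1</ref></citations>
        </insight>
      </insights>
    </section>
  </sections>
  <citations>
    <citation id="C1">
      <title><![CDATA[EPRI Lithium-ion Recycling Benchmarking 2024]]></title>
      <url><![CDATA[https://example.com/epri-li-benchmark]]></url>
    </citation>
  </citations>
</compendium>"""


def insights_with_sources(xml_text):
    """Return one row per insight, with its citation refs resolved to URLs."""
    root = ET.fromstring(xml_text)
    # Index the top-level citations by id; CDATA content is exposed as plain text.
    urls = {c.get("id"): c.findtext("url") for c in root.findall("./citations/citation")}
    rows = []
    for section in root.findall("./sections/section"):
        for insight in section.findall("./insights/insight"):
            refs = [r.text for r in insight.findall("./citations/ref")]
            rows.append({
                "section": section.get("id"),
                "insight": insight.findtext("title"),
                "sources": [urls.get(ref) for ref in refs],
            })
    return rows


rows = insights_with_sources(SAMPLE)
```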

Testing & Quality

pdm run pytest — Executes the unit suite. Tests stub the OpenAI client, so they run offline.
pdm run flake8 src tests — Linting.
pdm build — Produce distributable artifacts.

If pdm fails to write log files in restricted environments, set PDM_HOME to a writable directory (for example, export PDM_HOME=.pdm_home).
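
The general idea behind stubbing the client can be sketched with plain dependency injection. Every name below (ResearchClient, StubClient, research_topic, and the run method) is hypothetical and chosen only for illustration; the package's real interfaces and test fixtures may look quite different.

```python
from typing import List, Protocol


class ResearchClient(Protocol):
    """Hypothetical interface the orchestration code depends on."""

    def run(self, prompt: str) -> str: ...


class StubClient:
    """Offline stand-in that records prompts and returns a canned payload."""

    def __init__(self, canned: str) -> None:
        self.canned = canned
        self.prompts: List[str] = []

    def run(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.canned


def research_topic(topic: str, client: ResearchClient) -> str:
    """Toy orchestration step: build a prompt and delegate to the injected client."""
    return client.run(f"Research the topic: {topic}")
```

Because the orchestration function only sees the injected object, a test can assert on the recorded prompts without any network access.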

Contributing

1. Fork and clone the repository.
2. Run pdm install --group dev.
3. Make changes following the style guide and update or add tests.
4. Run pdm run pytest and pdm run flake8 src tests.
5. Open a pull request with:
   - A concise description of the change.
   - Verification commands executed locally.
   - Representative XML samples if the user-facing structure changes.

License

MIT © B.T. Franklin and contributors.

Release history

0.4.1 (May 7, 2026)
0.4.0 (May 7, 2026)
0.3.1 (Jan 9, 2026)
0.3.0 (Dec 31, 2025)
0.2.0 (Dec 29, 2025), this version
0.1.0 (Dec 7, 2024)

Download files

Source Distribution

compendiumscribe-0.2.0.tar.gz (31.2 kB)

Uploaded Dec 29, 2025 Source

Built Distribution

compendiumscribe-0.2.0-py3-none-any.whl (46.1 kB)

Uploaded Dec 29, 2025 Python 3

File details

Details for the file compendiumscribe-0.2.0.tar.gz.

File metadata

Download URL: compendiumscribe-0.2.0.tar.gz
Upload date: Dec 29, 2025
Size: 31.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for compendiumscribe-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`13aa74f186e606258ea8a78abc68ab66a59b26111b241879cebe0a33aaf08380`
MD5	`d3a35bf32afe8a62a33e626f7521b480`
BLAKE2b-256	`142f5ca78f0bf9b82c8bb6adf9761b18aa4dcb0d0abce82f4014fc66761c6a65`

Provenance

The following attestation bundles were made for compendiumscribe-0.2.0.tar.gz:

Publisher: python-publish.yml on btfranklin/compendiumscribe

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: compendiumscribe-0.2.0.tar.gz
- Subject digest: 13aa74f186e606258ea8a78abc68ab66a59b26111b241879cebe0a33aaf08380
- Sigstore transparency entry: 781067251
- Sigstore integration time: Dec 29, 2025
Source repository:
- Permalink: btfranklin/compendiumscribe@e00a322c0effaff2418d0af9ce585ed3584f9735
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/btfranklin
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@e00a322c0effaff2418d0af9ce585ed3584f9735
- Trigger Event: release

File details

Details for the file compendiumscribe-0.2.0-py3-none-any.whl.

File metadata

Download URL: compendiumscribe-0.2.0-py3-none-any.whl
Upload date: Dec 29, 2025
Size: 46.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for compendiumscribe-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`20031a9b7f3544d5478ad685263e45e944c69e9687d3b8fa6d4809e4d51c2269`
MD5	`3269e838b43d978614748108b93766dc`
BLAKE2b-256	`4cb374c5d6e49757231b1706dd7c485dc8f5b43aa29399308f04a29c91927744`

Provenance

The following attestation bundles were made for compendiumscribe-0.2.0-py3-none-any.whl:

Publisher: python-publish.yml on btfranklin/compendiumscribe

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: compendiumscribe-0.2.0-py3-none-any.whl
- Subject digest: 20031a9b7f3544d5478ad685263e45e944c69e9687d3b8fa6d4809e4d51c2269
- Sigstore transparency entry: 781067253
- Sigstore integration time: Dec 29, 2025
Source repository:
- Permalink: btfranklin/compendiumscribe@e00a322c0effaff2418d0af9ce585ed3584f9735
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/btfranklin
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@e00a322c0effaff2418d0af9ce585ed3584f9735
- Trigger Event: release
