epo-oa-cli

EPO patent prosecution history CLI — Download, parse, and analyze European Patent Office (EPO) prosecution documents for AI-assisted patent analysis.

pip install epo-oa-cli
epo-oa run EP21841218

Overview

epo-oa fetches the complete prosecution history of any EP patent from the EPO Register, extracts PDF text (with optional OCR), and generates a structured prosecution.md file ready for AI analysis (Claude, GPT-4, etc.).

epo-oa run EP21841218
  → Downloads 40 documents as ZIP
  → Extracts & parses toc.xml
  → Generates file/EP21841218/EP21841218_prosecution.md
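For scripts that need to locate the generated report, the path in the example above can be rebuilt from the EP number. The following is a minimal sketch: `prosecution_md_path` is a hypothetical helper (not part of epo-oa-cli), and it assumes the `file/<EP>/<EP>_prosecution.md` layout shown above applies to every application.

```python
from pathlib import PurePosixPath


def prosecution_md_path(ep_number: str, root: str = "file") -> PurePosixPath:
    """Rebuild the report path shown in the overview example.

    Assumption: the layout file/<EP>/<EP>_prosecution.md is inferred from
    the single example in this README, not from the tool's source code.
    """
    ep = ep_number.strip()
    return PurePosixPath(root) / ep / f"{ep}_prosecution.md"


if __name__ == "__main__":
    print(prosecution_md_path("EP21841218"))
```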

Installation

pip install epo-oa-cli

# With OCR support (for image-based PDFs)
pip install "epo-oa-cli[ocr]"

Requires Python 3.13+.


Quick Start

# 1. List all documents
epo-oa list EP21841218

# 2. Download as ZIP + extract
epo-oa download EP21841218

# 3. Parse PDFs → prosecution.md
epo-oa extract EP21841218

# 4. All-in-one
epo-oa run EP21841218

With OCR (for image-based PDFs)

EPO Register PDFs are typically full-page image scans with no text layer. Run OCR first so that the recognized text can be embedded in the analysis file:

# OCR key documents only
epo-oa ocr EP21841218 --codes 1703,1224,ABEX

# OCR all documents
epo-oa ocr EP21841218

# Extract with OCR text embedded
epo-oa extract EP21841218 --with-ocr

Commands

| Command | Description |
|---------|-------------|
| epo-oa list <EP> | List prosecution documents from EPO Register |
| epo-oa download <EP> | Download all documents as ZIP archive |
| epo-oa extract <EP> | Parse PDFs → prosecution.md / prosecution.json |
| epo-oa ocr <EP> | OCR image-based PDFs → searchable *_ocr.pdf |
| epo-oa run <EP> | Download + extract in one step |

Options

epo-oa list EP21841218 --format json          # JSON output
epo-oa download EP21841218 --force            # Re-download
epo-oa extract EP21841218 --format json       # JSON output
epo-oa extract EP21841218 --with-ocr          # Embed OCR text
epo-oa ocr EP21841218 --codes 1703,ABEX       # Selective OCR
epo-oa ocr EP21841218 --in-place              # Overwrite originals
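When driving the CLI from Python, it can help to build the argument list in one place. The sketch below is a hypothetical wrapper (`build_command` and `DOCUMENTED_FLAGS` are not part of epo-oa-cli). It only accepts the subcommand and flag pairings listed above. Whether the CLI accepts other pairings, or several flags at once, is not stated here, so the helper rejects the former and simply assumes the latter.

```python
import shlex

# Flags as they appear in the Options listing above, per subcommand.
DOCUMENTED_FLAGS = {
    "list": {"--format"},
    "download": {"--force"},
    "extract": {"--format", "--with-ocr"},
    "ocr": {"--codes", "--in-place"},
    "run": set(),
}


def build_command(action, ep_number, *, fmt=None, force=False,
                  with_ocr=False, codes=None, in_place=False):
    """Return an argv list for epo-oa, rejecting undocumented pairings."""
    if action not in DOCUMENTED_FLAGS:
        raise ValueError(f"unknown subcommand: {action}")

    # Collect the requested flags together with their values, if any.
    requested = []
    if fmt is not None:
        requested.append(("--format", [fmt]))
    if force:
        requested.append(("--force", []))
    if with_ocr:
        requested.append(("--with-ocr", []))
    if codes:
        requested.append(("--codes", [",".join(codes)]))
    if in_place:
        requested.append(("--in-place", []))

    cmd = ["epo-oa", action, ep_number]
    for flag, values in requested:
        if flag not in DOCUMENTED_FLAGS[action]:
            raise ValueError(f"{flag} is not documented for '{action}'")
        cmd += [flag, *values]
    return cmd


if __name__ == "__main__":
    cmd = build_command("ocr", "EP21841218", codes=["1703", "ABEX"])
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once epo-oa-cli is installed.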

Output: prosecution.md

The generated markdown file is structured for AI agents:

# EPO Prosecution Analysis — EP21841218

## Summary
| Item | Count |
|------|-------|
| Total documents | 40 |
| 🔴 Office Actions | 2 |
| 🔵 Amendments | 13 |
| ✅ Grant / Decision | 8 |

## Timeline
| Date | Cat | Document | File |
|------|-----|----------|------|
| 2023-10-30 | 🔍 | European Search Opinion (1703) 🖼️ | ... |
| 2024-02-15 | 🔵 | Amended Claims (CLMSABEX) 🖼️ | ... |
| 2026-02-05 | ✅ | Decision to Grant (2006A) 🖼️ | ... |

## 🔴 Office Action Documents
### European Search Opinion — 2023-10-30
**OCR Text:**
```text
D1 WO 2020/138918 A1 (SAMSUNG ELECTRONICS CO LTD)
1.1 D1 discloses an electronic device with the following features...
```
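Because the Timeline section is a plain pipe table, it can be read back into records with a few lines of Python. This is a rough sketch based only on the sample above: `parse_timeline` is a hypothetical helper, and it assumes the heading is exactly `## Timeline` and that no cell contains a literal `|`.

```python
def parse_timeline(markdown: str) -> list[dict[str, str]]:
    """Parse the pipe table under '## Timeline' into a list of dicts."""
    rows: list[dict[str, str]] = []
    in_timeline = False
    header = None
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # A new second-level heading starts or ends the section.
            in_timeline = stripped == "## Timeline"
            header = None
            continue
        if not in_timeline or not stripped.startswith("|"):
            continue
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        if header is None:
            header = cells  # first table row holds the column names
        elif all(set(cell) <= set("-: ") for cell in cells):
            continue  # separator row such as |------|-----|
        else:
            rows.append(dict(zip(header, cells)))
    return rows


if __name__ == "__main__":
    sample = "\n".join([
        "## Timeline",
        "| Date | Cat | Document | File |",
        "|------|-----|----------|------|",
        "| 2023-10-30 | 🔍 | European Search Opinion (1703) | ... |",
    ])
    for row in parse_timeline(sample):
        print(row["Date"], row["Document"])
```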

Politeness & Rate Limiting

This tool accesses a public EPO server. It enforces:

  • Random delays (1.5–3.0s) between requests
  • Browser-like headers
  • ZIP archive download (minimises HTTP requests)

Please do not run this tool in tight loops or CI pipelines without appropriate throttling.
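If you batch several applications from your own script, one way to add caller-side throttling is to pause between items. The sketch below is illustrative only and is not the tool's internal implementation. `throttled` is a hypothetical helper that reuses the 1.5–3.0s range quoted above as its default and takes injectable `sleep` and `rng` arguments so it can be tested without waiting.

```python
import random
import time
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def throttled(items: Iterable[T],
              min_delay: float = 1.5,
              max_delay: float = 3.0,
              sleep: Callable[[float], None] = time.sleep,
              rng: Optional[random.Random] = None) -> Iterator[T]:
    """Yield items, pausing a random interval before each one after the first."""
    rng = rng or random.Random()
    for index, item in enumerate(items):
        if index > 0:
            sleep(rng.uniform(min_delay, max_delay))
        yield item


if __name__ == "__main__":
    # Demo with a fake sleep so the example finishes immediately.
    def fake_sleep(seconds: float) -> None:
        print(f"would wait {seconds:.2f}s")

    for ep in throttled(["EP21841218", "EP21841218"], sleep=fake_sleep):
        print("processing", ep)
```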


Notes for AI Agents

  • Image-only PDFs are marked with 🖼️; pass the file at their path field directly to vision-capable models
  • Run epo-oa ocr, then epo-oa extract --with-ocr, to embed the OCR text for text-only language models
  • JSON output (--format json) includes full path and text fields for programmatic access
  • The prosecution.md is designed to fit within typical LLM context windows for smaller dockets
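As an illustration of routing documents by modality, the sketch below splits JSON records into those that already carry text and those that need a vision-capable model. Only the `path` and `text` field names come from the notes above. The top-level layout (a flat list of records) and the `split_by_text` helper are assumptions made for this example, so adjust them to the actual JSON structure.

```python
import json


def split_by_text(documents):
    """Separate records with usable text from image-only ones.

    Assumption: `documents` is a list of dicts with "path" and "text"
    keys. The real JSON layout may nest these records differently.
    """
    with_text, image_only_paths = [], []
    for doc in documents:
        if (doc.get("text") or "").strip():
            with_text.append(doc)
        else:
            image_only_paths.append(doc["path"])
    return with_text, image_only_paths


if __name__ == "__main__":
    # Hypothetical records for demonstration, not real tool output.
    sample = json.loads(
        '[{"path": "a.pdf", "text": "D1 discloses an electronic device"},'
        ' {"path": "b.pdf", "text": ""}]'
    )
    texts, images = split_by_text(sample)
    print(len(texts), "with text;", "image-only:", images)
```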

License

MIT
