EPO patent prosecution history CLI — download, parse, and analyze European patent documents for AI-assisted analysis

epo-oa-cli

EPO patent prosecution history CLI — Download, parse, and analyze European Patent Office (EPO) prosecution documents for AI-assisted patent analysis.

pip install epo-oa-cli
epo-oa run EP21841218

Overview

epo-oa fetches the complete prosecution history of any EP patent from the EPO Register, extracts PDF text (with optional OCR), and generates a structured prosecution.md file ready for AI analysis (Claude, GPT-4, etc.).

epo-oa run EP21841218
  → Downloads 40 documents as ZIP
  → Extracts & parses toc.xml
  → Generates file/EP21841218/EP21841218_prosecution.md

Installation

pip install epo-oa-cli

# With OCR support (for image-based PDFs)
pip install "epo-oa-cli[ocr]"

Requires Python 3.12+.


Quick Start

# 1. List all documents
epo-oa list EP21841218

# 2. Download as ZIP + extract
epo-oa download EP21841218

# 3. Parse PDFs → prosecution.md
epo-oa extract EP21841218

# 4. All-in-one
epo-oa run EP21841218

With OCR (for image-based PDFs)

EPO PDFs are often full-page image scans with no text layer. Run OCR first so that their text can be embedded in the analysis file:

# OCR key documents only
epo-oa ocr EP21841218 --codes 1703,1224,ABEX

# OCR all documents
epo-oa ocr EP21841218

# Extract with OCR text embedded
epo-oa extract EP21841218 --with-ocr

Commands

| Command | Description |
|---------|-------------|
| epo-oa list <EP> | List prosecution documents from EPO Register |
| epo-oa download <EP> | Download all documents as ZIP archive |
| epo-oa extract <EP> | Parse PDFs → prosecution.md / prosecution.json |
| epo-oa ocr <EP> | OCR image-based PDFs → searchable *_ocr.pdf |
| epo-oa run <EP> | Download + extract in one step |

Options

epo-oa list EP21841218 --format json          # JSON output
epo-oa download EP21841218 --force            # Re-download
epo-oa extract EP21841218 --format json       # JSON output
epo-oa extract EP21841218 --with-ocr          # Embed OCR text
epo-oa ocr EP21841218 --codes 1703,ABEX       # Selective OCR
epo-oa ocr EP21841218 --in-place              # Overwrite originals
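
When these commands are scripted from Python, it can help to build the argument lists in one place. The sketch below is not part of epo-oa: the helper names (ocr_command, extract_command, ocr_then_extract) are hypothetical, and it uses only the subcommands and flags listed above. Whether the flags can be freely combined is not stated here, so treat combinations as an assumption to verify.

```python
import subprocess
from collections.abc import Callable, Sequence


def ocr_command(ep_number: str, codes: Sequence[str] = (), in_place: bool = False) -> list[str]:
    """Build the argv for `epo-oa ocr` (hypothetical helper)."""
    argv = ["epo-oa", "ocr", ep_number]
    if codes:
        # --codes takes a comma-separated list, e.g. 1703,ABEX
        argv += ["--codes", ",".join(codes)]
    if in_place:
        argv.append("--in-place")
    return argv


def extract_command(ep_number: str, as_json: bool = False, with_ocr: bool = False) -> list[str]:
    """Build the argv for `epo-oa extract` (hypothetical helper)."""
    argv = ["epo-oa", "extract", ep_number]
    if as_json:
        argv += ["--format", "json"]
    if with_ocr:
        # Assumption: this flag may be combined with --format json.
        argv.append("--with-ocr")
    return argv


def ocr_then_extract(
    ep_number: str,
    codes: Sequence[str] = (),
    runner: Callable[..., object] = subprocess.run,
) -> None:
    """Run OCR first, then extract with the OCR text embedded."""
    for argv in (ocr_command(ep_number, codes), extract_command(ep_number, with_ocr=True)):
        # check=True makes subprocess.run raise if a step exits with an error
        runner(argv, check=True)


if __name__ == "__main__":
    # Prints the commands instead of running them.
    print(ocr_command("EP21841218", ["1703", "ABEX"]))
    print(extract_command("EP21841218", with_ocr=True))
```

Passing a custom runner makes the sequence easy to dry-run or test without contacting the EPO server.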

Output: prosecution.md

The generated markdown file is structured for AI agents:

# EPO Prosecution Analysis — EP21841218

## Summary
| Item | Count |
|------|-------|
| Total documents | 40 |
| 🔴 Office Actions | 2 |
| 🔵 Amendments | 13 |
| ✅ Grant / Decision | 8 |

## Timeline
| Date | Cat | Document | File |
|------|-----|----------|------|
| 2023-10-30 | 🔍 | European Search Opinion (1703) 🖼️ | ... |
| 2024-02-15 | 🔵 | Amended Claims (CLMSABEX) 🖼️ | ... |
| 2026-02-05 | ✅ | Decision to Grant (2006A) 🖼️ | ... |

## 🔴 Office Action Documents
### European Search Opinion — 2023-10-30
**OCR Text:**
```text
D1 WO 2020/138918 A1 (SAMSUNG ELECTRONICS CO LTD)
1.1 D1 discloses an electronic device with the following features...
```

Politeness & Rate Limiting

This tool accesses a public EPO server. It enforces:

  • Random delays (1.5–3.0s) between requests
  • Browser-like headers
  • ZIP archive download (minimises HTTP requests)

Please do not run this tool in tight loops or CI pipelines without appropriate throttling.
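
For batch jobs, one way to add throttling is to pause between applications on top of the tool's own per-request delays. The following is a minimal sketch, not a feature of epo-oa: run_batch is a hypothetical helper, and the 30–60 second pause range is an arbitrary assumption that should be adjusted to what is appropriate for your use.

```python
import random
import subprocess
import time
from collections.abc import Callable, Iterable


def run_batch(
    ep_numbers: Iterable[str],
    min_pause: float = 30.0,   # assumed value, not an EPO or epo-oa recommendation
    max_pause: float = 60.0,   # assumed value, not an EPO or epo-oa recommendation
    runner: Callable[..., object] = subprocess.run,
    sleeper: Callable[[float], None] = time.sleep,
) -> list[float]:
    """Run `epo-oa run` for each number, sleeping a random time between runs.

    Returns the pauses that were taken, which is handy for logging.
    """
    numbers = list(ep_numbers)
    pauses: list[float] = []
    for index, ep_number in enumerate(numbers):
        runner(["epo-oa", "run", ep_number], check=True)
        if index < len(numbers) - 1:
            # Randomised pause so that requests are not sent in a tight loop
            pause = random.uniform(min_pause, max_pause)
            sleeper(pause)
            pauses.append(pause)
    return pauses


if __name__ == "__main__":
    # Dry run: print the commands and skip the real sleeps.
    run_batch(
        ["EP21841218"],
        runner=lambda argv, check: print(" ".join(argv)),
        sleeper=lambda seconds: None,
    )
```

No pause is taken after the last application, so a single-number batch behaves like a plain epo-oa run.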


Notes for AI Agents

  • Image-only PDFs are marked 🖼️; pass the file at the path field directly to a vision-capable model
  • Run epo-oa ocr, then epo-oa extract --with-ocr, to embed text for text-only language models
  • JSON output (--format json) includes full path and text fields for programmatic access
  • The prosecution.md is designed to fit within typical LLM context windows for smaller dockets
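
As an illustration of the notes above, an agent could route documents by whether they carry text. This is only a sketch under stated assumptions: the path and text field names come from the note on JSON output, but how records are nested inside the real JSON file is not documented here, so the hypothetical split_by_text helper takes an already-extracted list of records. It also assumes that an image-only document has an empty or missing text field.

```python
from collections.abc import Iterable, Mapping
from typing import Any


def split_by_text(
    records: Iterable[Mapping[str, Any]],
) -> tuple[list[tuple[str, str]], list[str]]:
    """Separate records with usable text from image-only ones.

    Returns (with_text, image_only), where with_text holds (path, text)
    pairs and image_only holds paths to hand to a vision-capable model.
    """
    with_text: list[tuple[str, str]] = []
    image_only: list[str] = []
    for record in records:
        path = record.get("path", "")
        # Assumption: a missing or blank "text" means the PDF is image-only.
        text = record.get("text") or ""
        if text.strip():
            with_text.append((path, text))
        else:
            image_only.append(path)
    return with_text, image_only


if __name__ == "__main__":
    # Made-up records for illustration only; not real epo-oa output.
    sample = [
        {"path": "a.pdf", "text": "1.1 D1 discloses an electronic device"},
        {"path": "b.pdf", "text": ""},
    ]
    readable, scans = split_by_text(sample)
    print(len(readable), "with text;", len(scans), "image-only")
```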

License

MIT
