
EPO patent prosecution history CLI — download, parse, and analyze European patent documents for AI-assisted analysis


epo-oa-cli

EPO patent prosecution history CLI — Download, parse, and analyze European Patent Office (EPO) prosecution documents for AI-assisted patent analysis.

pip install epo-oa-cli
epo-oa run EP21841218

Overview

epo-oa fetches the complete prosecution history of any EP patent from the EPO Register, extracts PDF text (with optional OCR), and generates a structured prosecution.md file ready for AI analysis (Claude, GPT-4, etc.).

epo-oa run EP21841218
  → Downloads 40 documents as ZIP
  → Extracts & parses toc.xml
  → Generates file/EP21841218/EP21841218_prosecution.md

Installation

pip install epo-oa-cli

# With OCR support (for image-based PDFs)
pip install "epo-oa-cli[ocr]"

Requires Python 3.12+.


Quick Start

# 1. List all documents
epo-oa list EP21841218

# 2. Download as ZIP + extract
epo-oa download EP21841218

# 3. Parse PDFs → prosecution.md
epo-oa extract EP21841218

# 4. All-in-one
epo-oa run EP21841218

With OCR (for image-based PDFs)

Many EPO PDFs are full-page image scans with no text layer. Run OCR first so that the recognized text can be embedded in the analysis file:

# OCR key documents only
epo-oa ocr EP21841218 --codes 1703,1224,ABEX

# OCR all documents
epo-oa ocr EP21841218

# Extract with OCR text embedded
epo-oa extract EP21841218 --with-ocr

Commands

| Command | Description |
|---------|-------------|
| epo-oa list <EP> | List prosecution documents from EPO Register |
| epo-oa download <EP> | Download all documents as ZIP archive |
| epo-oa extract <EP> | Parse PDFs → prosecution.md / prosecution.json |
| epo-oa ocr <EP> | OCR image-based PDFs → searchable *_ocr.pdf |
| epo-oa run <EP> | Download + extract in one step |
| epo-oa configure | Set proxy / CA-cert options |

Options

epo-oa list EP21841218 --format json          # JSON output
epo-oa download EP21841218 --force            # Re-download
epo-oa extract EP21841218 --format json       # JSON output
epo-oa extract EP21841218 --with-ocr          # Embed OCR text
epo-oa ocr EP21841218 --codes 1703,ABEX       # Selective OCR
epo-oa ocr EP21841218 --in-place              # Overwrite originals

Proxy & SSL Configuration

For corporate networks or environments that require a proxy or custom CA certificate:

epo-oa configure

This interactively prompts for:

  • HTTPS proxy URL — e.g. http://proxy.corp.example.com:8080
  • HTTP proxy URL — e.g. http://proxy.corp.example.com:8080
  • CA bundle file path — path to a custom .pem / .crt file

Settings are saved to ~/.epo-oa.toml:

[proxy]
https = "http://proxy.corp.example.com:8080"
http  = "http://proxy.corp.example.com:8080"

[ssl]
ca_bundle = "/etc/ssl/certs/corp-ca.pem"

If no config file exists, the requests library falls back to the standard environment variables (HTTPS_PROXY, HTTP_PROXY, REQUESTS_CA_BUNDLE).


Output: prosecution.md

The generated markdown file is structured for AI agents:

# EPO Prosecution Analysis — EP21841218

## Summary
| Item | Count |
|------|-------|
| Total documents | 40 |
| 🔴 Office Actions | 2 |
| 🔵 Amendments | 13 |
| ✅ Grant / Decision | 8 |

## Timeline
| Date | Cat | Document | File |
|------|-----|----------|------|
| 2023-10-30 | 🔍 | European Search Opinion (1703) 🖼️ | ... |
| 2024-02-15 | 🔵 | Amended Claims (CLMSABEX) 🖼️ | ... |
| 2026-02-05 | ✅ | Decision to Grant (2006A) 🖼️ | ... |

## 🔴 Office Action Documents
### European Search Opinion — 2023-10-30
**OCR Text:**
```text
D1 WO 2020/138918 A1 (SAMSUNG ELECTRONICS CO LTD)
1.1 D1 discloses an electronic device with the following features...
```

Politeness & Rate Limiting

This tool accesses a public EPO server. It enforces:

  • Random delays (1.5–3.0s) between requests
  • Browser-like headers
  • ZIP archive download (minimises HTTP requests)

Please do not run this tool in tight loops or CI pipelines without appropriate throttling.
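
If you script several lookups yourself, a similar pause between calls keeps the load comparable. The sketch below shows one way to apply a random 1.5 to 3.0 second delay between consecutive requests; polite_delay and fetch_all are hypothetical names, and this is not the tool's internal implementation.

```python
import random
import time


def polite_delay(min_s=1.5, max_s=3.0, sleep=time.sleep, rng=random.uniform):
    """Sleep for a random interval in [min_s, max_s] and return its length."""
    delay = rng(min_s, max_s)
    sleep(delay)
    return delay


def fetch_all(items, fetch, delay=polite_delay):
    """Call fetch(item) for each item, pausing between consecutive calls."""
    results = []
    for index, item in enumerate(items):
        if index > 0:
            delay()  # pause only between requests, not before the first one
        results.append(fetch(item))
    return results
```

The sleep and rng parameters are injectable so that the delay logic can be exercised in tests without actually waiting.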


Document Categories

| Icon | Category | Description |
|------|----------|-------------|
| 🔴 | Office Action | Examination notices, search opinions (1224, 1703, 2003–2006, etc.) |
| 🔵 | Amendment | Amendments, observations, responses (CLMSABEX, DESCABEX, ABEX, etc.) |
| ✅ | Grant | Grant decisions, certificates (2006A, 2066, 2047, etc.) |
| 🔍 | Search | Search reports (1503, 1503SS, ISR, IPRP, etc.) |
| 💬 | Interview | Interview summaries (INTERV, EXIN) |
| | Other | Receipts, administrative notices, miscellaneous |
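
As a rough illustration of how document codes map to these categories, the lookup below uses only the example codes listed above. It is a sketch under stated assumptions: the tool's real rule set is larger (the "etc." entries are not reproduced), the range 2003–2006 is assumed to mean the four codes 2003, 2004, 2005 and 2006, and categorize is a hypothetical name.

```python
# Example codes copied from the category table; not the tool's full rule set.
CATEGORY_CODES = {
    "Office Action": {"1224", "1703", "2003", "2004", "2005", "2006"},
    "Amendment": {"CLMSABEX", "DESCABEX", "ABEX"},
    "Grant": {"2006A", "2066", "2047"},
    "Search": {"1503", "1503SS", "ISR", "IPRP"},
    "Interview": {"INTERV", "EXIN"},
}


def categorize(code):
    """Return the category for a document code, or "Other" if it is unknown."""
    normalized = code.strip().upper()
    for category, codes in CATEGORY_CODES.items():
        if normalized in codes:  # exact match, so 2006 and 2006A stay distinct
            return category
    return "Other"
```

Exact matching matters here because prefix matching would confuse 2006 (Office Action) with 2006A (Grant).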

Notes for AI Agents

  • Image-only PDFs show 🖼️ — provide the path field directly to vision-capable models
  • Run epo-oa ocr, then epo-oa extract --with-ocr, to embed text for text-only language models
  • JSON output (--format json) includes full path and text fields for programmatic access
  • The prosecution.md is designed to fit within typical LLM context windows for smaller dockets
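
An agent consuming the JSON output might route documents by whether they carry text. The sketch below assumes each document record is a JSON object with the path and text fields mentioned above. The overall layout of the JSON file is not documented here, so the function takes an already-extracted list of records. The name split_for_models and the sample records are hypothetical.

```python
import json


def split_for_models(documents):
    """Split records into (records with text, paths of image-only files)."""
    with_text, image_paths = [], []
    for doc in documents:
        text = (doc.get("text") or "").strip()
        if text:
            with_text.append(doc)  # usable by a text-only language model
        else:
            image_paths.append(doc["path"])  # hand to a vision-capable model
    return with_text, image_paths


# Hypothetical records, shaped after the fields named in this README.
sample = json.loads(
    '[{"path": "a.pdf", "text": "D1 discloses an electronic device"},'
    ' {"path": "b.pdf", "text": ""}]'
)
text_docs, image_only = split_for_models(sample)
```

With the sample above, a.pdf lands in text_docs and b.pdf in image_only.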

License

MIT

