EPO patent prosecution history CLI — download, parse, and analyze European patent documents for AI-assisted analysis
epo-oa-cli
EPO patent prosecution history CLI — Download, parse, and analyze European Patent Office (EPO) prosecution documents for AI-assisted patent analysis.
pip install epo-oa-cli
epo-oa run EP21841218
Overview
epo-oa fetches the complete prosecution history of any EP patent from the EPO Register, extracts PDF text (with optional OCR), and generates a structured prosecution.md file ready for AI analysis (Claude, GPT-4, etc.).
epo-oa run EP21841218
→ Downloads 40 documents as ZIP
→ Extracts & parses toc.xml
→ Generates file/EP21841218/EP21841218_prosecution.md
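The output location shown above can be computed from the application number. The following is a minimal sketch, assuming the `file/<EP>/<EP>_prosecution.md` layout generalises from this single example; `prosecution_path` is a hypothetical helper and not part of epo-oa-cli.

```python
from pathlib import Path


def prosecution_path(ep_number: str, base_dir: str = "file") -> Path:
    """Return the expected location of the generated prosecution.md.

    Assumption: the layout file/<EP>/<EP>_prosecution.md is inferred from
    the one example in the overview and may differ between versions.
    """
    ep = ep_number.strip().upper()
    return Path(base_dir) / ep / f"{ep}_prosecution.md"


if __name__ == "__main__":
    path = prosecution_path("EP21841218")
    status = "exists" if path.exists() else "not generated yet"
    print(f"{path}: {status}")
```

A script can use this to check whether `epo-oa run` has already produced a file before invoking it again.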
Installation
pip install epo-oa-cli
# With OCR support (for image-based PDFs)
pip install "epo-oa-cli[ocr]"
Requires Python 3.12+.
Quick Start
# 1. List all documents
epo-oa list EP21841218
# 2. Download as ZIP + extract
epo-oa download EP21841218
# 3. Parse PDFs → prosecution.md
epo-oa extract EP21841218
# 4. All-in-one
epo-oa run EP21841218
With OCR (for image-based PDFs)
Many EPO PDFs are full-page image scans with no text layer. Run OCR first so that their text can be embedded in the analysis file:
# OCR key documents only
epo-oa ocr EP21841218 --codes 1703,1224,ABEX
# OCR all documents
epo-oa ocr EP21841218
# Extract with OCR text embedded
epo-oa extract EP21841218 --with-ocr
Commands
| Command | Description |
|---|---|
| `epo-oa list <EP>` | List prosecution documents from EPO Register |
| `epo-oa download <EP>` | Download all documents as ZIP archive |
| `epo-oa extract <EP>` | Parse PDFs → prosecution.md / prosecution.json |
| `epo-oa ocr <EP>` | OCR image-based PDFs → searchable *_ocr.pdf |
| `epo-oa run <EP>` | Download + extract in one step |
| `epo-oa configure` | Set proxy / CA-cert options |
Options
epo-oa list EP21841218 --format json # JSON output
epo-oa download EP21841218 --force # Re-download
epo-oa extract EP21841218 --format json # JSON output
epo-oa extract EP21841218 --with-ocr # Embed OCR text
epo-oa ocr EP21841218 --codes 1703,ABEX # Selective OCR
epo-oa ocr EP21841218 --in-place # Overwrite originals
Proxy & SSL Configuration
For corporate networks or environments that require a proxy or custom CA certificate:
epo-oa configure
This interactively prompts for:
- HTTPS proxy URL, e.g. http://proxy.corp.example.com:8080
- HTTP proxy URL, e.g. http://proxy.corp.example.com:8080
- CA bundle file path: path to a custom .pem/.crt file
Settings are saved to ~/.epo-oa.toml:
[proxy]
https = "http://proxy.corp.example.com:8080"
http = "http://proxy.corp.example.com:8080"
[ssl]
ca_bundle = "/etc/ssl/certs/corp-ca.pem"
If no config file exists, the underlying requests library falls back to the standard environment variables (HTTPS_PROXY, HTTP_PROXY, REQUESTS_CA_BUNDLE).
Output: prosecution.md
The generated markdown file is structured for AI agents:
# EPO Prosecution Analysis — EP21841218
## Summary
| Item | Count |
|------|-------|
| Total documents | 40 |
| 🔴 Office Actions | 2 |
| 🔵 Amendments | 13 |
| ✅ Grant / Decision | 8 |
## Timeline
| Date | Cat | Document | File |
|------|-----|----------|------|
| 2023-10-30 | 🔍 | European Search Opinion (1703) 🖼️ | ... |
| 2024-02-15 | 🔵 | Amended Claims (CLMSABEX) 🖼️ | ... |
| 2026-02-05 | ✅ | Decision to Grant (2006A) 🖼️ | ... |
## 🔴 Office Action Documents
### European Search Opinion — 2023-10-30
**OCR Text:**
```text
D1 WO 2020/138918 A1 (SAMSUNG ELECTRONICS CO LTD)
1.1 D1 discloses an electronic device with the following features...
```
Politeness & Rate Limiting
This tool accesses a public EPO server. It enforces:
- Random delays (1.5–3.0s) between requests
- Browser-like headers
- ZIP archive download (minimises HTTP requests)
Please do not run this tool in tight loops or CI pipelines without appropriate throttling.
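If you process several applications in one script, similar throttling can be applied between invocations. The sketch below reuses the 1.5 to 3.0 second range quoted above; `polite_iter` is a hypothetical helper, and the `sleep` and `rng` parameters exist only so the pause can be replaced in tests.

```python
import random
import time
from typing import Callable, Iterable, Iterator


def polite_iter(items: Iterable[str],
                min_delay: float = 1.5,
                max_delay: float = 3.0,
                sleep: Callable[[float], None] = time.sleep,
                rng: Callable[[float, float], float] = random.uniform) -> Iterator[str]:
    """Yield items, pausing a random interval between consecutive ones."""
    for index, item in enumerate(items):
        if index > 0:
            # No pause before the first item, only between items.
            sleep(rng(min_delay, max_delay))
        yield item


# Usage: wrap a list of application numbers before calling the CLI for each.
# for ep in polite_iter(["EP21841218"]):
#     ...  # e.g. invoke `epo-oa run` for ep
```

For larger batches, a longer interval than the per-request default is probably the safer choice.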
Document Categories
| Icon | Category | Description |
|---|---|---|
| 🔴 | Office Action | Examination notices, search opinions (1224, 1703, 2003–2006, etc.) |
| 🔵 | Amendment | Amendments, observations, responses (CLMSABEX, DESCABEX, ABEX, etc.) |
| ✅ | Grant | Grant decisions, certificates (2006A, 2066, 2047, etc.) |
| 🔍 | Search | Search reports (1503, 1503SS, ISR, IPRP, etc.) |
| 💬 | Interview | Interview summaries (INTERV, EXIN) |
| ⚪ | Other | Receipts, administrative notices, miscellaneous |
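The table can be read as a lookup from document code to category. The sketch below covers only the codes named in the table; expanding the range 2003 to 2006 into individual codes is an assumption, and the tool's real mapping is larger (the table says "etc."). `classify` is a hypothetical name.

```python
# Codes listed in the table above; anything else falls through to "Other".
CODES_BY_CATEGORY = {
    "Office Action": ["1224", "1703", "2003", "2004", "2005", "2006"],
    "Amendment": ["CLMSABEX", "DESCABEX", "ABEX"],
    "Grant": ["2006A", "2066", "2047"],
    "Search": ["1503", "1503SS", "ISR", "IPRP"],
    "Interview": ["INTERV", "EXIN"],
}

ICONS = {
    "Office Action": "🔴",
    "Amendment": "🔵",
    "Grant": "✅",
    "Search": "🔍",
    "Interview": "💬",
    "Other": "⚪",
}

# Invert to code -> category for exact-match lookup.
CATEGORY_BY_CODE = {
    code: category
    for category, codes in CODES_BY_CATEGORY.items()
    for code in codes
}


def classify(code: str) -> tuple[str, str]:
    """Return (icon, category) for a document code, defaulting to Other."""
    category = CATEGORY_BY_CODE.get(code.strip().upper(), "Other")
    return ICONS[category], category


if __name__ == "__main__":
    for code in ["1703", "2006", "2006A", "CLMSABEX", "UNKNOWN"]:
        print(code, *classify(code))
```

Exact matching matters here: 2006 is listed as an Office Action code while 2006A is a Grant code, so prefix matching would misclassify one of them.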
Notes for AI Agents
- Image-only PDFs show 🖼️; provide the `path` field directly to vision-capable models
- Run `epo-oa ocr` + `--with-ocr` to embed text for language models
- JSON output (`--format json`) includes full `path` and `text` fields for programmatic access
- The prosecution.md is designed to fit within typical LLM context windows for smaller dockets
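An agent could use the `path` and `text` fields to decide which documents go to a language model and which need a vision-capable one. The sketch below is built on assumptions: it takes an already-parsed list of records, treats a missing or empty `text` as "image-only", and does not rely on any other detail of the JSON schema, which is not documented here. `route_documents` is a hypothetical name.

```python
from typing import Iterable


def route_documents(docs: Iterable[dict]) -> tuple[list[dict], list[str]]:
    """Split records into those with text and paths of image-only PDFs.

    Assumption: an empty or absent `text` field means no text was extracted.
    """
    with_text: list[dict] = []
    image_paths: list[str] = []
    for doc in docs:
        text = (doc.get("text") or "").strip()
        if text:
            with_text.append(doc)
        elif doc.get("path"):
            image_paths.append(doc["path"])
    return with_text, image_paths


if __name__ == "__main__":
    # Hypothetical records for illustration; not real tool output.
    sample = [
        {"path": "a.pdf", "text": "D1 discloses an electronic device"},
        {"path": "b.pdf", "text": ""},
    ]
    texts, images = route_documents(sample)
    print(len(texts), "with text;", "image-only:", images)
```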
License
MIT