USPTO patent prosecution analysis CLI: document download · XML parsing · MD generation
uspto-oa-cli
A CLI tool that downloads USPTO patent prosecution documents via the ODP (Open Data Portal) API, parses the XML, and converts them into structured Markdown.
Supports a workflow where the generated MD file is passed to AI agents (Claude Code, Gemini CLI, etc.) for prosecution strategy analysis.
Requirements
- Python 3.12+
- uv
- USPTO API key (issued via the ODP portal)
Installation
# pip
pip install uspto-oa-cli
# uv (global install)
uv tool install uspto-oa-cli
# uv (add as project dependency)
uv add uspto-oa-cli
# local development
uv sync
API Key Setup
# Interactive setup (recommended) — saved to ~/.oa-cli.toml
uspto-oa configure
# Show current configuration
uspto-oa configure --show
Or set via environment variable:
export USPTO_API_KEY=your_api_key_here
Usage
# 0. Check document list before downloading
uspto-oa list 16330077
# 1. Download documents (saved to file/{app_num}/)
uspto-oa download 16330077
# 2. Parse XML → generate prosecution.md
uspto-oa extract 16330077
# Output: file/16330077/16330077_prosecution.md
# Extract in JSON format
uspto-oa extract 16330077 --format json
# 3. (Optional) OCR image-based PDFs → searchable PDFs
uspto-oa ocr 16330077
# 4. (Optional) Embed OCR text into prosecution.md for AI analysis
# Run after step 3. Selectively include high-value doc codes to
# avoid filling up the AI context window.
uspto-oa extract 16330077 --with-ocr --ocr-codes CTNF,CTFR,REM,EXIN,CTAV
# Download specific document codes only
uspto-oa download 16330077 --doc-codes CTNF,CTFR,NOA
# Force re-download (overwrite existing files)
uspto-oa download 16330077 --force
# Verbose logging
uspto-oa -v download 16330077
# One-time API key override
uspto-oa download 16330077 --api-key YOUR_KEY
Command Options
uspto-oa list <application>
| Option | Description |
|---|---|
| `--all` | Show all documents, without the prosecution-related filter |
| `--format [table\|json]` | Output format (default: table) |
| `--api-key TEXT` | API key |
uspto-oa download <application>
| Option | Description |
|---|---|
| `--doc-codes CODES` | Comma-separated document codes (e.g. CTNF,CTFR,NOA). All prosecution docs if omitted |
| `--output-dir DIR` | Save path (default: `file/{app_num}/`) |
| `--force` | Re-download even if the file already exists |
| `--api-key TEXT` | API key (overrides config file and environment variable) |
uspto-oa extract <application>
| Option | Description |
|---|---|
| `--format [md\|json]` | Output format (default: md) |
| `--output-dir DIR` | File directory (default: `file/{app_num}/`) |
| `--with-ocr` | Embed OCR text from `*_ocr.pdf` files into prosecution.md (run `ocr` first) |
| `--ocr-codes CODES` | Comma-separated doc codes to embed (default: `CTNF,CTFR,NOA,NACT,EXIN,REM,CTAV` + `A*`) |
Doc code guide for --ocr-codes — choosing the right codes prevents AI context overflow:
| Code | Description | OCR value | Default included |
|---|---|---|---|
| CTNF | Non-Final Office Action | High: core rejection grounds | ✓ |
| CTFR | Final Office Action | High: core rejection grounds | ✓ |
| NOA / NACT | Notice of Allowance | Medium: allowance reasons | ✓ |
| EXIN | Examiner Interview Summary | High: often PDF-only in modern apps | ✓ |
| REM | Remarks (applicant arguments) | High: often PDF-only | ✓ |
| CTAV | Advisory Action | Medium: examiner's response to after-final amendment | ✓ |
| `A*` | All Amendment variants | High: when XML parsing fails | ✓ |
| ABN | Abandonment | Low | - |
| RCE / RCEX | Request for Continued Examination | Low | - |
| SRNT / SRFW | Search Report | Low: very long, little analysis value | - |
| 892 / 1449 / IDS | Prior Art / IDS | Low: very long, reference lists | - |
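To make the `A*` entry concrete, here is a hedged sketch of how a code list with one wildcard could be matched against document codes. How `uspto-oa` itself resolves `A*` is not documented here, so the glob semantics, the `INCLUDE` and `EXCLUDE` names, and the rule that explicit exclusions win are all assumptions. A naive `A*` glob also matches `ABN`, which the table rates as low value, so the sketch excludes it explicitly.

```python
from fnmatch import fnmatchcase

# Codes copied from the suggested --ocr-codes value in this README, plus A*.
INCLUDE = ["CTNF", "CTFR", "REM", "EXIN", "CTAV", "A*"]
# Hypothetical exclusion list: low-value codes that a bare A* would also match.
EXCLUDE = {"ABN"}


def is_selected(
    doc_code: str,
    include: list[str] = INCLUDE,
    exclude: set[str] = EXCLUDE,
) -> bool:
    """True if doc_code matches an include pattern and is not excluded."""
    code = doc_code.upper()
    if code in exclude:
        return False  # explicit exclusions win over wildcard matches
    return any(fnmatchcase(code, pattern) for pattern in include)


def ocr_codes_argument(include: list[str] = INCLUDE) -> str:
    """Join patterns into the comma-separated form that --ocr-codes expects."""
    return ",".join(include)
```

For example, `is_selected("A.NE")` is true because of the wildcard, while `is_selected("892")` and `is_selected("ABN")` are false.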
uspto-oa ocr <application> (requires pip install ocrmypdf)
USPTO PDF documents are full-page image scans, so standard text extraction fails on them. This command runs OCR on every PDF in the application directory and writes a searchable PDF alongside each original.
| Option | Description |
|---|---|
| `--force` | Re-OCR even if output already exists |
| `--in-place` | Overwrite original PDFs instead of creating `*_ocr.pdf` copies |
| `--no-deskew` | Skip deskew correction (faster) |
| `--output-dir DIR` | File directory (default: `file/{app_num}/`) |
# Install OCR dependency
pip install ocrmypdf
# or: uv pip install ocrmypdf
# or: pip install uspto-oa-cli[ocr]
# Run OCR (creates {original}_ocr.pdf next to each PDF)
uspto-oa ocr 16330077
# Overwrite originals in place
uspto-oa ocr 16330077 --in-place
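The naming and skip behaviour described above can be sketched with `pathlib` and the `ocrmypdf` Python API. This is an illustration under assumptions, not the command's implementation: `plan_ocr_jobs` and `run_ocr` are hypothetical names, and the rule that files already ending in `_ocr` are not re-processed is a guess at sensible behaviour.

```python
from pathlib import Path


def plan_ocr_jobs(app_dir: Path, force: bool = False) -> list[tuple[Path, Path]]:
    """Pair each source PDF with its *_ocr.pdf target, skipping finished work."""
    jobs = []
    for pdf in sorted(app_dir.glob("*.pdf")):
        if pdf.stem.endswith("_ocr"):
            continue  # already an OCR output, not a source scan (assumption)
        target = pdf.with_name(f"{pdf.stem}_ocr.pdf")
        if target.exists() and not force:
            continue  # mirrors the default behaviour without --force
        jobs.append((pdf, target))
    return jobs


def run_ocr(app_dir: Path, force: bool = False, deskew: bool = True) -> None:
    """Run OCR for every planned job; deskew=False corresponds to --no-deskew."""
    import ocrmypdf  # optional dependency: pip install ocrmypdf

    for source, target in plan_ocr_jobs(app_dir, force):
        ocrmypdf.ocr(source, target, deskew=deskew)
```

Keeping the planning step separate from the OCR call makes it possible to preview which files would be processed, for example `plan_ocr_jobs(Path("file/16330077"))`, before spending time on OCR.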
Workflow
uspto-oa list {app_num}            # Check document list before downloading
  └─ Browse prosecution document codes and formats
uspto-oa download {app_num}
  └─ Save XML / PDF to file/{app_num}/
uspto-oa extract {app_num}         # XML-only (fast, default)
  └─ Generate file/{app_num}/{app_num}_prosecution.md
    └─ AI agent (Claude Code / Gemini CLI)
      └─ Prosecution strategy analysis, summaries, Q&A
# ── Optional: include PDF-only documents in prosecution.md ──────────────────
uspto-oa ocr {app_num}             # Step A: OCR all PDFs → *_ocr.pdf
  └─ Generate {original}_ocr.pdf next to each PDF
# Step B: embed selected OCR text
uspto-oa extract {app_num} \
  --with-ocr \
  --ocr-codes CTNF,CTFR,REM,EXIN,CTAV
  └─ prosecution.md now includes full text of selected PDF documents
    └─ AI agent reads one file, gets complete prosecution history
Collected Document Codes
| Code | Description |
|---|---|
| CTNF | Non-Final Office Action |
| CTFR | Final Office Action |
| NOA / NACT | Notice of Allowance |
| REM | Remarks |
| ABN | Abandonment |
| SRNT / SRFW | Search Report |
| EXIN | Examiner Interview |
| RCE / RCEX | Request for Continued Examination |
| CTAV | Advisory Action |
| 892 / 1449 / IDS | Prior Art / IDS |
| `A*` | All Amendment variants |
Generated File Structure
file/{app_num}/{app_num}_prosecution.md:
| Section | Content |
|---|---|
| Timeline | All documents sorted by date (XML/PDF format shown) |
| Office Action Details | Full rejection grounds from CTNF/CTFR |
| Amendment Details | Amended claims (CLM) + Remarks (REM) |
| Examiner Interview Details | Full EXIN text |
| Notice of Allowance Details | Allowed claims + Examiner's Statement |
| PDF-only Documents | Image PDF list (for direct AI agent delivery) |
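If only some of these sections are needed for a given AI prompt, the file can be split before it is handed over. The sketch below assumes each section in the table starts with a level-2 Markdown heading (`## Timeline`, and so on); the actual heading level used in prosecution.md is not stated here, so adjust `level` after inspecting a generated file. `split_sections` is a hypothetical helper, not part of the package.

```python
import re

# Matches ATX headings such as "## Timeline"; group 1 is the run of # marks.
HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")


def split_sections(markdown: str, level: int = 2) -> dict[str, str]:
    """Map each heading of the given level to the text that follows it."""
    sections: dict[str, list[str]] = {}
    current: str | None = None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match and len(match.group(1)) == level:
            current = match.group(2)
            sections[current] = []
        elif current is not None:
            # Deeper headings and body text stay inside the current section.
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}
```

Checking `len(text)` for each returned section gives a rough idea of which parts dominate the file before deciding what to send to the agent.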
PyPI Release
uv build
uv run twine upload dist/*