Skip to main content

Python bindings for oxidize-pdf — generate, parse, split, merge, and manipulate PDF files

Project description

oxidize-pdf

PyPI version CI License: MIT Python Typed MCP

oxidize-python MCP server

Rust-powered PDF library for Python. Generate, parse, split, merge, and manipulate PDFs with native performance. Ships with a built-in MCP server so AI agents can work with PDFs out of the box.

No C dependencies. No Java. No subprocess calls.

Installation

pip install oxidize-pdf            # Core library
pip install "oxidize-pdf[mcp]"     # + MCP server for AI agents

Platforms: Linux (x86_64, aarch64) | macOS (x86_64, Apple Silicon) | Windows (x86_64) Requires: Python 3.10+

Why oxidize-pdf?

oxidize-pdf Pure-Python libs C/Java wrappers
Performance Native (compiled Rust) Interpreted Native but heavy
Dependencies Zero Varies Poppler, Java, Ghostscript
Memory safety Rust ownership model GC-dependent Manual / GC
Type stubs Full (mypy/pyright) Partial Rare
AI-ready (MCP) Built-in No No

MCP Server

Give your AI agent full PDF capabilities in one line:

oxidize-mcp

The built-in Model Context Protocol server exposes 12 tools, 6 resources, and 5 prompts — compatible with Claude, GPT, and any MCP client.

Claude Desktop integration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "oxidize-pdf": {
      "command": "oxidize-mcp",
      "env": {
        "OXIDIZE_WORKSPACE": "/path/to/your/pdfs"
      }
    }
  }
}

Available tools

Tool What it does
read_pdf Read metadata — page count, version, encryption status, title, author
extract_text Extract text from all pages or a specific page
convert_pdf Convert to markdown, chunks, or RAG-optimized format
create_pdf Create a new PDF with optional metadata
save_pdf Save a session to disk, with optional encryption
add_content Add pages, text, and graphics to a session
annotate_pdf Add text annotations and highlights
manipulate_pdf Split, merge, rotate, extract pages, reverse, overlay
manage_forms Create, fill, read, and validate form fields
secure_pdf Encrypt, check permissions, verify signatures
extract_entities Extract structured entities from pages
analyze_pdf Validate structure, detect corruption, check PDF/A compliance

The server also exposes resources (session data, capabilities, version info) and prompts (guided workflows for summarization, data extraction, form filling, and more).

Configuration

OXIDIZE_WORKSPACE=/path/to/pdfs oxidize-mcp

Or start programmatically:

from oxidize_pdf.mcp.server import run
run()

Python API

Create a PDF

from oxidize_pdf import Document, Page, Font, Color

doc = Document()
doc.set_title("My Document")
doc.set_author("Jane Doe")

page = Page.a4()
page.set_font(Font.HELVETICA, 24.0)
page.set_text_color(Color.black())
page.text_at(72.0, 750.0, "Hello from oxidize-pdf!")

page.set_font(Font.TIMES_ROMAN, 12.0)
page.text_at(72.0, 700.0, "Generated with Python + Rust.")

doc.add_page(page)
doc.save("output.pdf")

Parse an existing PDF

from oxidize_pdf import PdfReader

reader = PdfReader.open("document.pdf")
print(f"Pages: {reader.page_count}, Version: {reader.version}")

for i, text in enumerate(reader.extract_text()):
    print(f"--- Page {i + 1} ---")
    print(text)

Operations

from oxidize_pdf import split_pdf, merge_pdfs, rotate_pdf, extract_pages

split_pdf("input.pdf", "output_dir/")                       # Split into individual pages
merge_pdfs(["part1.pdf", "part2.pdf"], "merged.pdf")         # Merge multiple PDFs
rotate_pdf("input.pdf", "rotated.pdf", 90)                   # Rotate all pages
extract_pages("input.pdf", "subset.pdf", [0, 2, 4])          # Extract specific pages

Graphics

from oxidize_pdf import Document, Page, Color

doc = Document()
page = Page.a4()

page.set_fill_color(Color.hex("#3498db"))
page.draw_rect(72.0, 700.0, 200.0, 100.0)
page.fill()

page.set_stroke_color(Color.red())
page.set_line_width(2.0)
page.draw_circle(300.0, 500.0, 50.0)
page.stroke()

doc.add_page(page)
doc.save("graphics.pdf")

Types

from oxidize_pdf import Color, Point, Rectangle, Margins, Font

# Colors
Color.rgb(1.0, 0.0, 0.0)          # RGB
Color.hex("#ff6600")               # Hex
Color.cmyk(0.0, 1.0, 1.0, 0.0)   # CMYK

# Geometry
Point(72.0, 720.0)
Rectangle.from_xywh(72.0, 72.0, 468.0, 648.0)
Margins.uniform(72.0)

# Fonts — all 14 standard PDF fonts
Font.HELVETICA    # Font.HELVETICA_BOLD
Font.TIMES_ROMAN  # Font.TIMES_BOLD
Font.COURIER      # Font.COURIER_BOLD

Error handling

from oxidize_pdf import PdfReader, PdfError, PdfIoError, PdfParseError

try:
    reader = PdfReader.open("missing.pdf")
except PdfIoError as e:
    print(f"I/O error: {e}")
except PdfParseError as e:
    print(f"Parse error: {e}")
except PdfError as e:
    print(f"PDF error: {e}")

Exception hierarchy: PdfError > PdfIoError, PdfParseError, PdfEncryptionError, PdfPermissionError

MCP Server

oxidize-pdf includes an MCP server that exposes PDF capabilities to AI assistants like Claude. Install with the mcp extra:

pip install oxidize-pdf[mcp]

Claude Desktop

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "oxidize-pdf": {
      "command": "uvx",
      "args": ["--from", "oxidize-pdf[mcp]", "oxidize-mcp"]
    }
  }
}

Claude Code

claude mcp add oxidize-pdf -- uvx --from "oxidize-pdf[mcp]" oxidize-mcp

Available tools

Tool Description
read_pdf Open a PDF and get metadata (pages, version, encryption)
extract_text Extract text content from PDF pages
convert_pdf Convert between PDF versions
analyze_pdf Analyze structure, fonts, images, and compliance
extract_entities Extract images and digital signatures
manipulate_pdf Split, merge, rotate, extract, and reorder pages
annotate_pdf Add text annotations, highlights, and stamps
manage_forms Create, fill, and read PDF form fields
secure_pdf Encrypt, decrypt, and set document permissions
create_pdf Create a new PDF document with pages
add_pdf_content Add text, shapes, and images to pages
save_pdf Save the document to file or bytes

Resources

  • oxidize://fonts — Available built-in PDF fonts
  • oxidize://page-sizes — Standard page sizes with dimensions
  • oxidize://capabilities — Server capabilities and tool listing
  • oxidize://version — Version information
  • oxidize://workspace — PDF files in the workspace directory
  • oxidize://session/{id} — Session data by ID

Known limitations

  • Encryption write support: Document.encrypt() configures encryption parameters but the underlying Rust library does not yet serialize the encryption dictionary to the PDF output. Reading encrypted PDFs works correctly.
  • CPython only: PyPy and GraalPy are not supported.

License

MIT — see LICENSE for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

oxidize_pdf-0.12.0.tar.gz (633.7 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

oxidize_pdf-0.12.0-cp310-abi3-win_amd64.whl (4.9 MB view details)

Uploaded CPython 3.10+Windows x86-64

oxidize_pdf-0.12.0-cp310-abi3-manylinux_2_28_x86_64.whl (5.6 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.28+ x86-64

oxidize_pdf-0.12.0-cp310-abi3-manylinux_2_28_aarch64.whl (5.6 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.28+ ARM64

oxidize_pdf-0.12.0-cp310-abi3-macosx_11_0_arm64.whl (5.2 MB view details)

Uploaded CPython 3.10+macOS 11.0+ ARM64

oxidize_pdf-0.12.0-cp310-abi3-macosx_10_12_x86_64.whl (5.4 MB view details)

Uploaded CPython 3.10+macOS 10.12+ x86-64

File details

Details for the file oxidize_pdf-0.12.0.tar.gz.

File metadata

  • Download URL: oxidize_pdf-0.12.0.tar.gz
  • Upload date:
  • Size: 633.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for oxidize_pdf-0.12.0.tar.gz
Algorithm Hash digest
SHA256 f8f60bc86d0ecd0104c898a977acf56cc0ebe857f50cf110f6e6f7848d682f5e
MD5 5ff7305d6d3ad8e94489f7a0474e06eb
BLAKE2b-256 41221798857a07dee663ce56571c4a32ac8fa56075bc0db46215559f93c687e7

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.12.0.tar.gz:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxidize_pdf-0.12.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: oxidize_pdf-0.12.0-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 4.9 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for oxidize_pdf-0.12.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 eec67564322594b86a62a502f9e30d18bcb4af1dbbe6fa18d70062af458ee99d
MD5 62d8fec97ae0f5cf550401ac2b6e509d
BLAKE2b-256 f0a39a3372095ca334d303c0a86de7d5541c27a240b495869f52284714210456

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.12.0-cp310-abi3-win_amd64.whl:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxidize_pdf-0.12.0-cp310-abi3-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for oxidize_pdf-0.12.0-cp310-abi3-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 4bb9cdfa58cafb8c116cb236ec2fe6936cfa20f5c2e7e09b2c69ef42838fd396
MD5 f08f4849b8f5bd9284b5462a7c03cbbc
BLAKE2b-256 ebf76edc19207e87b773636b93ae96ce13fc82539672018621acf9b1b83c879d

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.12.0-cp310-abi3-manylinux_2_28_x86_64.whl:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxidize_pdf-0.12.0-cp310-abi3-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for oxidize_pdf-0.12.0-cp310-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 6bd073324a6b54d87005c9664283d6546287bb01d86a24c3821151fa751e080f
MD5 3b628353903f887b8074d8e2ad3e78c4
BLAKE2b-256 4a5fcebcc3230571c80d78af895f3ab329b41d2dc5cf04f5e65dbfba9043ba1e

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.12.0-cp310-abi3-manylinux_2_28_aarch64.whl:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxidize_pdf-0.12.0-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for oxidize_pdf-0.12.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 61fb6b95b9a407ae4760296ed5b24f06afb3ccdba8d634d26c6b1af184287df7
MD5 07f3f8ece47d3808716bd19ecfa33fd6
BLAKE2b-256 d284677b844515e75410f204393ff960f027b976d2526396981095e306418995

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.12.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxidize_pdf-0.12.0-cp310-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for oxidize_pdf-0.12.0-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 8a8f053ca965d50be4223955b17ceab6acdde7fe2b46fa6e31ed1a11e36cba1f
MD5 fccbd52d29018ba3524a954fa5534958
BLAKE2b-256 468959ca0db1ea8b6a103efda73f544a28eac51ae053b125e60f12b5b4ece6ec

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.12.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page