Skip to main content

Python bindings for oxidize-pdf — generate, parse, split, merge, and manipulate PDF files

Project description

oxidize-pdf

PyPI version CI License: MIT Python Typed MCP

Rust-powered PDF library for Python. Generate, parse, split, merge, and manipulate PDFs with native performance. Ships with a built-in MCP server so AI agents can work with PDFs out of the box.

No C dependencies. No Java. No subprocess calls.

Installation

pip install oxidize-pdf            # Core library
pip install "oxidize-pdf[mcp]"     # + MCP server for AI agents

Platforms: Linux (x86_64, aarch64) | macOS (x86_64, Apple Silicon) | Windows (x86_64) Requires: Python 3.10+

Why oxidize-pdf?

oxidize-pdf Pure-Python libs C/Java wrappers
Performance Native (compiled Rust) Interpreted Native but heavy
Dependencies Zero Varies Poppler, Java, Ghostscript
Memory safety Rust ownership model GC-dependent Manual / GC
Type stubs Full (mypy/pyright) Partial Rare
AI-ready (MCP) Built-in No No

MCP Server

Give your AI agent full PDF capabilities in one line:

oxidize-mcp

The built-in Model Context Protocol server exposes 12 tools, 6 resources, and 5 prompts — compatible with Claude, GPT, and any MCP client.

Claude Desktop integration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "oxidize-pdf": {
      "command": "oxidize-mcp",
      "env": {
        "OXIDIZE_WORKSPACE": "/path/to/your/pdfs"
      }
    }
  }
}

Available tools

Tool What it does
read_pdf Read metadata — page count, version, encryption status, title, author
extract_text Extract text from all pages or a specific page
convert_pdf Convert to markdown, chunks, or RAG-optimized format
create_pdf Create a new PDF with optional metadata
save_pdf Save a session to disk, with optional encryption
add_content Add pages, text, and graphics to a session
annotate_pdf Add text annotations and highlights
manipulate_pdf Split, merge, rotate, extract pages, reverse, overlay
manage_forms Create, fill, read, and validate form fields
secure_pdf Encrypt, check permissions, verify signatures
extract_entities Extract structured entities from pages
analyze_pdf Validate structure, detect corruption, check PDF/A compliance

The server also exposes resources (session data, capabilities, version info) and prompts (guided workflows for summarization, data extraction, form filling, and more).

Configuration

OXIDIZE_WORKSPACE=/path/to/pdfs oxidize-mcp

Or start programmatically:

from oxidize_pdf.mcp.server import run
run()

Python API

Create a PDF

from oxidize_pdf import Document, Page, Font, Color

doc = Document()
doc.set_title("My Document")
doc.set_author("Jane Doe")

page = Page.a4()
page.set_font(Font.HELVETICA, 24.0)
page.set_text_color(Color.black())
page.text_at(72.0, 750.0, "Hello from oxidize-pdf!")

page.set_font(Font.TIMES_ROMAN, 12.0)
page.text_at(72.0, 700.0, "Generated with Python + Rust.")

doc.add_page(page)
doc.save("output.pdf")

Parse an existing PDF

from oxidize_pdf import PdfReader

reader = PdfReader.open("document.pdf")
print(f"Pages: {reader.page_count}, Version: {reader.version}")

for i, text in enumerate(reader.extract_text()):
    print(f"--- Page {i + 1} ---")
    print(text)

Operations

from oxidize_pdf import split_pdf, merge_pdfs, rotate_pdf, extract_pages

split_pdf("input.pdf", "output_dir/")                       # Split into individual pages
merge_pdfs(["part1.pdf", "part2.pdf"], "merged.pdf")         # Merge multiple PDFs
rotate_pdf("input.pdf", "rotated.pdf", 90)                   # Rotate all pages
extract_pages("input.pdf", "subset.pdf", [0, 2, 4])          # Extract specific pages

Graphics

from oxidize_pdf import Document, Page, Color

doc = Document()
page = Page.a4()

page.set_fill_color(Color.hex("#3498db"))
page.draw_rect(72.0, 700.0, 200.0, 100.0)
page.fill()

page.set_stroke_color(Color.red())
page.set_line_width(2.0)
page.draw_circle(300.0, 500.0, 50.0)
page.stroke()

doc.add_page(page)
doc.save("graphics.pdf")

Types

from oxidize_pdf import Color, Point, Rectangle, Margins, Font

# Colors
Color.rgb(1.0, 0.0, 0.0)          # RGB
Color.hex("#ff6600")               # Hex
Color.cmyk(0.0, 1.0, 1.0, 0.0)   # CMYK

# Geometry
Point(72.0, 720.0)
Rectangle.from_xywh(72.0, 72.0, 468.0, 648.0)
Margins.uniform(72.0)

# Fonts — all 14 standard PDF fonts
Font.HELVETICA    # Font.HELVETICA_BOLD
Font.TIMES_ROMAN  # Font.TIMES_BOLD
Font.COURIER      # Font.COURIER_BOLD

Error handling

from oxidize_pdf import PdfReader, PdfError, PdfIoError, PdfParseError

try:
    reader = PdfReader.open("missing.pdf")
except PdfIoError as e:
    print(f"I/O error: {e}")
except PdfParseError as e:
    print(f"Parse error: {e}")
except PdfError as e:
    print(f"PDF error: {e}")

Exception hierarchy: PdfError > PdfIoError, PdfParseError, PdfEncryptionError, PdfPermissionError

MCP Server

oxidize-pdf includes an MCP server that exposes PDF capabilities to AI assistants like Claude. Install with the mcp extra:

pip install oxidize-pdf[mcp]

Claude Desktop

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "oxidize-pdf": {
      "command": "uvx",
      "args": ["--from", "oxidize-pdf[mcp]", "oxidize-mcp"]
    }
  }
}

Claude Code

claude mcp add oxidize-pdf -- uvx --from "oxidize-pdf[mcp]" oxidize-mcp

Available tools

Tool Description
read_pdf Open a PDF and get metadata (pages, version, encryption)
extract_text Extract text content from PDF pages
convert_pdf Convert between PDF versions
analyze_pdf Analyze structure, fonts, images, and compliance
extract_entities Extract images and digital signatures
manipulate_pdf Split, merge, rotate, extract, and reorder pages
annotate_pdf Add text annotations, highlights, and stamps
manage_forms Create, fill, and read PDF form fields
secure_pdf Encrypt, decrypt, and set document permissions
create_pdf Create a new PDF document with pages
add_pdf_content Add text, shapes, and images to pages
save_pdf Save the document to file or bytes

Resources

  • oxidize://fonts — Available built-in PDF fonts
  • oxidize://page-sizes — Standard page sizes with dimensions
  • oxidize://capabilities — Server capabilities and tool listing
  • oxidize://version — Version information
  • oxidize://workspace — PDF files in the workspace directory
  • oxidize://session/{id} — Session data by ID

Known limitations

  • Encryption write support: Document.encrypt() configures encryption parameters but the underlying Rust library does not yet serialize the encryption dictionary to the PDF output. Reading encrypted PDFs works correctly.
  • CPython only: PyPy and GraalPy are not supported.

License

MIT — see LICENSE for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

oxidize_pdf-0.11.0.tar.gz (627.8 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

oxidize_pdf-0.11.0-cp310-abi3-win_amd64.whl (4.9 MB view details)

Uploaded CPython 3.10+Windows x86-64

oxidize_pdf-0.11.0-cp310-abi3-manylinux_2_28_x86_64.whl (5.6 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.28+ x86-64

oxidize_pdf-0.11.0-cp310-abi3-manylinux_2_28_aarch64.whl (5.6 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.28+ ARM64

oxidize_pdf-0.11.0-cp310-abi3-macosx_11_0_arm64.whl (5.2 MB view details)

Uploaded CPython 3.10+macOS 11.0+ ARM64

oxidize_pdf-0.11.0-cp310-abi3-macosx_10_12_x86_64.whl (5.4 MB view details)

Uploaded CPython 3.10+macOS 10.12+ x86-64

File details

Details for the file oxidize_pdf-0.11.0.tar.gz.

File metadata

  • Download URL: oxidize_pdf-0.11.0.tar.gz
  • Upload date:
  • Size: 627.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for oxidize_pdf-0.11.0.tar.gz
Algorithm Hash digest
SHA256 b4ef5e991068cab885987409cf35e870b6743eae8e01372080a4e1cecca2e3e3
MD5 b4f8298a07cc48f06b0e35375469331f
BLAKE2b-256 702e262599d656bd12626c687e32eeac79c5640198f0c2bc03f74fe29e0f6e0e

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.11.0.tar.gz:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxidize_pdf-0.11.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: oxidize_pdf-0.11.0-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 4.9 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for oxidize_pdf-0.11.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 b182793b6e909be42210bd0ce38e1e65cdf9cd48222450c382e88c86da028347
MD5 50f578d619a9f47d664503b28daa2378
BLAKE2b-256 549875ab5caac391540641c5f991870af1c639624ed4951945163f74c3874518

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.11.0-cp310-abi3-win_amd64.whl:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxidize_pdf-0.11.0-cp310-abi3-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for oxidize_pdf-0.11.0-cp310-abi3-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 a55eb6bf10757a11e2116060c524317971aacfd063d36993ffb1099f8fea2161
MD5 7505d719b7dda5fa1313b5fc481a29c9
BLAKE2b-256 b484a5e74dd3620dfc1f6b9de9d44c21c744a6f5af4c7cf53c4cfc03c2f7841d

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.11.0-cp310-abi3-manylinux_2_28_x86_64.whl:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxidize_pdf-0.11.0-cp310-abi3-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for oxidize_pdf-0.11.0-cp310-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 360a9e91cde6394c7e74bf38ec776741272dfaf75cb113d91e70fdce3949e6ed
MD5 36dd28d871717cd405c6a5bb26db3184
BLAKE2b-256 f1563231dd2518110f8ab645c35d5c9db858f114002c665dd810f4b0ca9491fe

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.11.0-cp310-abi3-manylinux_2_28_aarch64.whl:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxidize_pdf-0.11.0-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for oxidize_pdf-0.11.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a114b8839fa18c433a81ec7cf7bc8faa42095d8a2833d2828e6798838e42e927
MD5 3f7c1133735d84dd4b1e0585aa1ac0fa
BLAKE2b-256 42fc566631deea124111989bb7654c408e805c40ac5d52fa629c635aacfb5a03

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.11.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxidize_pdf-0.11.0-cp310-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for oxidize_pdf-0.11.0-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 662cb133c20b1c715f7ffcf66ab25e728543073a9533bf89336bcbd0d30fba49
MD5 9d308e446cc5fd1a26f02d1e9172fab9
BLAKE2b-256 ff13fe93b33e956bb2ec93e00c006e3edf9f8478e2e88ed102e5c09982356b68

See more details on using hashes here.

Provenance

The following attestation bundles were made for oxidize_pdf-0.11.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: release.yml on bzsanti/oxidize-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page