Convert arbitrary PDF files to PDF/A-1b

pdf2pdfa

pdf2pdfa converts ordinary PDF documents into fully compliant PDF/A‑1b files. The library offers both a simple command line tool and a Python API so that archival conversion can easily be automated or integrated into larger systems.

Features

  • Converts any PDF into valid PDF/A‑1b
  • Embeds missing fonts automatically (using a fallback TrueType font)
  • Attaches an sRGB ICC profile for consistent colour management
  • Cleans and synchronises document metadata
  • Usable from the command line or as a library

Requirements

Python 3.7 or newer is required. The package's dependencies are installed automatically when you install it with pip.

To validate the generated files, you can optionally install verapdf.

Installation

Install the latest release from PyPI:

pip install pdf2pdfa

Command line usage

pdf2pdfa convert input.pdf output.pdf

Use the --icc PATH option to supply a custom ICC profile in place of the default sRGB one.
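
For scripted use, the command line tool can be driven from Python with subprocess. The following is only a sketch: the helper names build_convert_command and run_convert are hypothetical, and placing --icc after the two file arguments is an assumption that should be checked against the tool's own help output.

```python
import subprocess
from typing import List, Optional


def build_convert_command(src: str, dst: str, icc: Optional[str] = None) -> List[str]:
    """Build the argument list for `pdf2pdfa convert` (hypothetical helper)."""
    cmd = ["pdf2pdfa", "convert", src, dst]
    if icc is not None:
        # Assumption: the option is accepted after the positional arguments.
        cmd += ["--icc", icc]
    return cmd


def run_convert(src: str, dst: str, icc: Optional[str] = None) -> None:
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_convert_command(src, dst, icc), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.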

Library usage

from pdf2pdfa import Converter

conv = Converter()
conv.convert("input.pdf", "output.pdf")

If fonts are missing from the source PDF, the converter tries to embed a common system font (e.g. DejaVu Sans) as a fallback. You may also supply a specific font path.
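
Because the library call takes an input path and an output path, converting a whole folder is a short loop. The sketch below is one possible way to do it, not part of pdf2pdfa: convert_directory is a hypothetical helper, and it receives the conversion function as an argument so that it does not depend on anything beyond the two-argument convert call shown above.

```python
from pathlib import Path
from typing import Callable, Dict, Optional, Union


def convert_directory(
    src_dir: Union[str, Path],
    dst_dir: Union[str, Path],
    convert: Callable[[str, str], object],
) -> Dict[str, Optional[Exception]]:
    """Convert every *.pdf in src_dir, writing results to dst_dir.

    Returns a mapping of file name to None on success, or to the
    exception that was raised for that file.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    results: Dict[str, Optional[Exception]] = {}
    for pdf in sorted(src.glob("*.pdf")):
        target = dst / pdf.name
        try:
            convert(str(pdf), str(target))
            results[pdf.name] = None
        except Exception as exc:  # keep going so one bad file does not stop the batch
            results[pdf.name] = exc
    return results
```

With the package installed, this would be called as convert_directory("incoming", "archive", Converter().convert), where the folder names are placeholders.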

Development and testing

Clone the repository and install the development requirements:

pip install -e ".[test]"

Run the unit tests with pytest. When verapdf is available, the tests also check that the output conforms to PDF/A‑1b.

pytest
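
The same kind of conformance check can be run by hand on any converted file. The sketch below is an illustration, not the project's test code: the function names are hypothetical, and the verapdf options used (--flavour 1b, --format text) and the PASS prefix in its text report are assumptions to confirm against verapdf --help for your installed version.

```python
import shutil
import subprocess
from typing import List, Optional


def build_verapdf_command(pdf_path: str, executable: str = "verapdf") -> List[str]:
    """Build a verapdf call that validates one file against PDF/A-1b."""
    # Assumption: these option names match the installed verapdf CLI.
    return [executable, "--flavour", "1b", "--format", "text", pdf_path]


def is_pdfa1b(pdf_path: str) -> Optional[bool]:
    """Return True or False for pass or fail, or None if verapdf is not on PATH."""
    executable = shutil.which("verapdf")
    if executable is None:
        return None
    proc = subprocess.run(
        build_verapdf_command(pdf_path, executable),
        capture_output=True,
        text=True,
    )
    # Assumption: the text report begins with PASS for a conforming file.
    return proc.stdout.lstrip().startswith("PASS")
```

Returning None when the validator is missing lets a caller skip the check instead of failing, which mirrors how the tests treat verapdf as optional.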

License

This project is released under the MIT License. See LICENSE for details.

Contributing

Contributions are very welcome. Feel free to open issues or submit pull requests on GitHub.
