Convert arbitrary PDF files to PDF/A-1b
pdf2pdfa
pdf2pdfa converts ordinary PDF documents into fully compliant PDF/A‑1b files. The library offers both a simple command line tool and a Python API so that archival conversion can easily be automated or integrated into larger systems.
Features
- Converts any PDF into valid PDF/A‑1b
- Embeds missing fonts automatically (using a fallback TrueType font)
- Attaches an sRGB ICC profile for consistent colour management
- Cleans and synchronises document metadata
- Usable from the command line or as a library
Requirements
Python 3.7 or newer is needed. The required packages are installed automatically when using pip.
To validate the generated files, you can optionally install veraPDF (the `verapdf` command).
Installation
Install the latest release from PyPI:
pip install pdf2pdfa
Command line usage
pdf2pdfa convert input.pdf output.pdf
--icc PATH can be used to specify a custom ICC profile.
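For batch jobs, the command above can be driven from a script. The following is a minimal sketch, not part of pdf2pdfa itself: `build_command` and `convert_folder` are hypothetical helper names, placing `--icc` after the positional arguments is an assumption, and so is the idea that the tool exits with a non-zero status on failure.

```python
import subprocess
from pathlib import Path
from typing import List, Optional, Union

PathLike = Union[str, Path]


def build_command(src: PathLike, dst: PathLike, icc: Optional[PathLike] = None) -> List[str]:
    """Assemble the argument list for `pdf2pdfa convert`."""
    cmd = ["pdf2pdfa", "convert", str(src), str(dst)]
    if icc is not None:
        # --icc PATH is the documented option for a custom ICC profile;
        # appending it after the file names is an assumption.
        cmd += ["--icc", str(icc)]
    return cmd


def convert_folder(src_dir: PathLike, dst_dir: PathLike, icc: Optional[PathLike] = None) -> List[Path]:
    """Convert every *.pdf directly inside src_dir, writing results to dst_dir."""
    out_dir = Path(dst_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(src_dir).glob("*.pdf")):
        dst = out_dir / src.name
        # check=True raises CalledProcessError on a non-zero exit status
        # (assumes the CLI reports failure that way).
        subprocess.run(build_command(src, dst, icc), check=True)
        written.append(dst)
    return written
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.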
Library usage
from pdf2pdfa import Converter
conv = Converter()
conv.convert("input.pdf", "output.pdf")
If fonts are missing from the source PDF, the converter tries to embed a common system font (e.g. DejaVu Sans). You may also supply a specific font path.
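The `convert(input, output)` call shown above extends naturally to whole directory trees. The sketch below is one possible way to do that; `convert_tree` is a hypothetical helper, not part of the pdf2pdfa API. It accepts any callable with the same two-argument shape, so the library's exception types do not need to be known.

```python
from pathlib import Path
from typing import Callable, Dict, Optional, Union

PathLike = Union[str, Path]


def convert_tree(
    src_dir: PathLike,
    dst_dir: PathLike,
    convert: Callable[[str, str], object],
) -> Dict[Path, Optional[Exception]]:
    """Convert all PDFs below src_dir, mirroring the folder layout in dst_dir.

    Returns a mapping from each source file to None on success,
    or to the exception that was raised for that file.
    """
    src_root, dst_root = Path(src_dir), Path(dst_dir)
    results: Dict[Path, Optional[Exception]] = {}
    # sorted() collects the file list up front, before any output is written.
    for src in sorted(src_root.rglob("*.pdf")):
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        try:
            convert(str(src), str(dst))
            results[src] = None
        except Exception as exc:
            # Record the failure and continue with the remaining files.
            results[src] = exc
    return results
```

With the library installed, `convert_tree("incoming", "archive", Converter().convert)` would process every PDF under `incoming`; the folder names here are placeholders.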
Development and testing
Clone the repository and install the development requirements:
pip install -e ".[test]"
Run the unit tests using pytest. When veraPDF is available, the tests also check that the output conforms to PDF/A‑1b.
pytest
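A similar conformance check can be run by hand on any converted file. This is a rough sketch, not the project's own test code: `verapdf_command` and `is_pdfa_1b` are hypothetical names, and it assumes that the `verapdf` executable is on `PATH`, that `--flavour 1b` selects the PDF/A‑1b profile, and that a zero exit status means the file passed.

```python
import shutil
import subprocess
from typing import List, Optional


def verapdf_command(pdf_path: str, executable: str = "verapdf") -> List[str]:
    """Build a veraPDF command line that validates against PDF/A-1b."""
    return [executable, "--flavour", "1b", str(pdf_path)]


def is_pdfa_1b(pdf_path: str) -> Optional[bool]:
    """Return True or False for the validation result, or None without veraPDF."""
    executable = shutil.which("verapdf")
    if executable is None:
        # veraPDF is optional, so its absence is not treated as an error.
        return None
    proc = subprocess.run(
        verapdf_command(pdf_path, executable),
        capture_output=True,
        text=True,
    )
    # Assumption: exit status 0 means the file passed validation.
    return proc.returncode == 0
```

Returning `None` when veraPDF is missing keeps "not checked" distinct from "checked and failed".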
License
This project is released under the MIT License. See LICENSE for details.
Contributing
Contributions are very welcome. Feel free to open issues or submit pull requests on GitHub.