avalpdf - PDF Accessibility Validator

A command-line tool for validating PDF accessibility, analyzing document structure, and generating detailed reports.

Features

  • Full PDF accessibility validation
  • Document structure analysis
  • Support for both local and remote PDF files
  • Validation of:
    • Document tags and metadata:
      • Document tagging status
      • Title presence
      • Language declaration
    • Heading hierarchy:
      • H1 presence
      • Correct heading level sequence
    • Figure alt text:
      • Missing alternative text detection
    • Table structure:
      • Header presence and proper structure
      • Empty cells detection
      • Duplicate headers check
      • Multiple header rows warning
      • Empty tables detection
    • List structure:
      • Proper list tagging
      • Detection of untagged lists (consecutive paragraphs with bullets/numbers)
      • Misused list types (numbered items in unordered lists)
      • List hierarchy consistency
    • Formatting issues:
      • Excessive underscores (used for underlining)
      • Spaced capital letters (like "T E S T")
      • Extra spaces used for layout
    • Empty elements:
      • Empty paragraphs
      • Whitespace-only elements
      • Empty headings
      • Empty table cells
  • Multiple output formats (JSON, console reports)
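
Two of the checks listed above, the heading level sequence and spaced capital letters, can be illustrated with a small sketch. This is not avalpdf's actual implementation: the function names, the regular expression, and the rule that a heading may be at most one level deeper than the previous one are assumptions made for illustration.

```python
import re


def find_heading_jumps(levels):
    """Return (index, previous, current) for each heading that skips a level.

    `levels` lists heading levels in document order, e.g. [1, 2, 3].
    Assumed rule: a heading may be at most one level deeper than its predecessor.
    """
    jumps = []
    previous = 0  # document start counts as level 0, so the first heading should be H1
    for index, current in enumerate(levels):
        if current > previous + 1:
            jumps.append((index, previous, current))
        previous = current
    return jumps


# Three or more single capital letters separated by single spaces, e.g. "T E S T".
SPACED_CAPITALS = re.compile(r"\b(?:[A-Z] ){2,}[A-Z]\b")


def has_spaced_capitals(text):
    """Return True if `text` contains a run of spaced-out capital letters."""
    return SPACED_CAPITALS.search(text) is not None


print(find_heading_jumps([1, 2, 4]))   # [(2, 2, 4)]
print(has_spaced_capitals("T E S T"))  # True
```

Treating the document start as level 0 means a document that opens with an H2 is reported as a jump, which matches the spirit of the "H1 presence" check.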

Requirements

  • Python 3.6+
  • PDFix SDK

Installation

Warning: The installation procedure is a work in progress and may not be fully stable. It is being reworked to make it more robust.

  1. Install PDFix SDK:
pip install pdfix-sdk
  2. Download and install avalpdf:
sudo sh -c 'wget https://raw.githubusercontent.com/dennisangemi/avalpdf/main/avalpdf -O /usr/local/bin/avalpdf && chmod +x /usr/local/bin/avalpdf'
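
After both steps, one quick way to confirm that the `avalpdf` executable is reachable is to look it up on `PATH`. The sketch below uses only the Python standard library; the helper name `is_on_path` is hypothetical and not part of avalpdf.

```python
import shutil


def is_on_path(command):
    """Return True if `command` resolves to an executable on PATH."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    if is_on_path("avalpdf"):
        print("avalpdf found at", shutil.which("avalpdf"))
    else:
        print("avalpdf not found on PATH; re-check the installation steps")
```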

Usage

After installation, you can run avalpdf from any directory.

Quick start

Simply run

avalpdf thesis.pdf

or

avalpdf https://example.com/document.pdf

Either command prints an accessibility report and a preview of the document structure.

[Figure: accessibility report]

[Figure: PDF structure preview]

Details

# Basic validation with console output
avalpdf document.pdf

# Complete analysis with all outputs
avalpdf document.pdf --full --simple --report

# Save reports to specific directory
avalpdf document.pdf -o /path/to/output --report --simple

# Show document structure only
avalpdf document.pdf --show-structure

Command Line Options

  • --full: Save full JSON structure
  • --simple: Save simplified JSON structure
  • --report: Save validation report
  • --output-dir, -o: Specify output directory
  • --show-structure: Display document structure
  • --show-validation: Display validation results
  • --show-all: Show all information
  • --quiet, -q: Suppress console output
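
When avalpdf is called from a script, the options above can be assembled programmatically. The following is a minimal sketch that uses only flags listed in this section; the helper names `build_command` and `run_avalpdf` are hypothetical. The exit status avalpdf returns is not documented here, so the sketch uses `check=False` and leaves interpretation to the caller.

```python
import subprocess


def build_command(pdf, output_dir=None, full=False, simple=False,
                  report=False, quiet=False):
    """Assemble an avalpdf argument list from the documented options."""
    command = ["avalpdf", str(pdf)]
    if full:
        command.append("--full")
    if simple:
        command.append("--simple")
    if report:
        command.append("--report")
    if output_dir is not None:
        command += ["--output-dir", str(output_dir)]
    if quiet:
        command.append("--quiet")
    return command


def run_avalpdf(pdf, **options):
    """Run avalpdf and return the CompletedProcess (needs avalpdf on PATH)."""
    return subprocess.run(build_command(pdf, **options), check=False)


print(build_command("report.pdf", output_dir="./analysis", report=True))
# ['avalpdf', 'report.pdf', '--report', '--output-dir', './analysis']
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.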

Examples

  1. Quick accessibility check:
avalpdf thesis.pdf
  2. Generate all reports:
avalpdf report.pdf --full --simple --report -o ./analysis
  3. Silent operation with report generation:
avalpdf document.pdf --report -q
  4. Analyze multiple files:
for file in *.pdf; do avalpdf "$file" --report --quiet; done
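
The shell loop in the last example can also be written in Python, which is convenient when the return codes need to be collected afterwards. This is a sketch under assumptions: `batch_commands` and `run_batch` are hypothetical helper names, and the meaning of avalpdf's return code is not documented here, so it is only recorded, not interpreted.

```python
import subprocess
from pathlib import Path


def batch_commands(directory):
    """Return one avalpdf command per PDF in `directory`, sorted by name."""
    return [
        ["avalpdf", str(path), "--report", "--quiet"]
        for path in sorted(Path(directory).glob("*.pdf"))
    ]


def run_batch(directory):
    """Run each command in turn and map PDF path to its return code."""
    results = {}
    for command in batch_commands(directory):
        completed = subprocess.run(command, check=False)
        results[command[1]] = completed.returncode
    return results
```

Calling `run_batch(".")` mirrors the shell loop for the current directory and requires `avalpdf` to be on `PATH`.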

Validation Output

The tool provides three types of findings:

  • ✅ Successes: Correctly implemented accessibility features
  • ⚠️ Warnings: Potential issues that need attention
  • ❌ Issues: Problems that must be fixed

Report Format

{
  "validation_results": {
    "issues": ["..."],
    "warnings": ["..."],
    "successes": ["..."]
  }
}
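
A report with this shape is straightforward to post-process. The sketch below counts the findings in each category and flags reports that contain issues. It assumes only the three keys shown above; the helper names are hypothetical and the finding messages in the example are made-up placeholders, not real avalpdf output.

```python
import json


def summarize_report(report):
    """Count findings per category in a report shaped like the example above."""
    results = report.get("validation_results", {})
    return {
        key: len(results.get(key, []))
        for key in ("issues", "warnings", "successes")
    }


def has_blocking_issues(report):
    """Return True if the report lists at least one issue."""
    return summarize_report(report)["issues"] > 0


# Illustrative data only: these messages are placeholders.
example = json.loads("""
{
  "validation_results": {
    "issues": ["placeholder issue"],
    "warnings": [],
    "successes": ["placeholder success A", "placeholder success B"]
  }
}
""")

print(summarize_report(example))     # {'issues': 1, 'warnings': 0, 'successes': 2}
print(has_blocking_issues(example))  # True
```

To use it on a saved report, load the file with `json.load` and pass the resulting dictionary to `summarize_report`.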

License

MIT License

Support

For issues or suggestions:

  • Open an issue on GitHub
  • Provide the PDF file (if possible) and the complete error message
  • Include the command you used and your operating system information
