avalpdf - PDF Accessibility Validator

A command-line tool for validating PDF accessibility, analyzing document structure, and generating detailed reports.

Features

  • Document structure analysis
  • Support for both local and remote PDF files
  • Validation of:
    • Document tags and metadata:
      • Document tagging status
      • Title presence
      • Language declaration
    • Heading hierarchy:
      • H1 presence
      • Correct sequence of heading levels
    • Figure alt text:
      • Missing alternative text detection
    • Table structure:
      • Header presence and proper structure
      • Empty cells detection
      • Duplicate headers check
      • Multiple header rows warning
      • Empty tables detection
    • List structure:
      • Proper list tagging
      • Detection of untagged lists (consecutive paragraphs with bullets/numbers)
      • Misused list types (numbered items in unordered lists)
      • List hierarchy consistency
    • Formatting issues:
      • Excessive underscores (used for underlining)
      • Spaced capital letters (like "T E S T")
      • Extra spaces used for layout
    • Empty elements:
      • Empty paragraphs
      • Whitespace-only elements
      • Empty headings
      • Empty table cells
  • Multiple output formats (JSON, console reports)
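To make the heading hierarchy checks above more concrete, here is a minimal Python sketch. It is not avalpdf's actual implementation: the function name `check_heading_levels` is hypothetical, and the rule that a heading may go at most one level deeper than the previous one is an assumption about what a "correct sequence" means.

```python
def check_heading_levels(levels):
    """Return a list of problems found in a sequence of heading levels.

    `levels` is a list of integers, e.g. [1, 2, 3] for H1, H2, H3.
    Hypothetical helper for illustration only; not part of avalpdf.
    """
    problems = []
    if 1 not in levels:
        problems.append("no H1 found")
    previous = 0  # treat the start of the document as level 0
    for position, level in enumerate(levels, start=1):
        # Assumed rule: going deeper by more than one level skips a heading.
        if level > previous + 1:
            if previous:
                problems.append(f"heading {position}: H{level} follows H{previous}")
            else:
                problems.append(f"heading {position}: document starts with H{level}")
        previous = level
    return problems
```

Under these assumptions, `check_heading_levels([1, 2, 3, 2])` returns an empty list, while `check_heading_levels([1, 3])` reports that an H3 follows an H1.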

Installation

pip install avalpdf

Usage

After installation, you can run avalpdf from any directory.

Quick start

Simply run

avalpdf thesis.pdf

or

avalpdf https://example.com/document.pdf

to get an accessibility report and a preview of the PDF structure.

Details

# Basic validation with console output
avalpdf document.pdf

# Complete analysis with all outputs
avalpdf document.pdf --full --simple --report

# Save reports to specific directory
avalpdf document.pdf -o /path/to/output --report --simple

# Show document structure only
avalpdf document.pdf --show-structure

Command Line Options

  • --full: Save full JSON structure
  • --simple: Save simplified JSON structure
  • --report: Save validation report
  • --output-dir, -o: Specify output directory
  • --show-structure: Display document structure
  • --show-validation: Display validation results
  • --quiet, -q: Suppress console output

Examples

  1. Quick accessibility check:
avalpdf thesis.pdf
  2. Generate all reports:
avalpdf report.pdf --full --simple --report -o ./analysis
  3. Silent operation with report generation:
avalpdf document.pdf --report -q
  4. Analyze multiple files:
for file in *.pdf; do avalpdf "$file" --report --quiet; done
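The shell loop for multiple files can also be driven from Python, which may be handy on systems without a POSIX shell. The sketch below uses only the `--report`, `--quiet`, and `-o` options listed above. The helper names `build_command` and `run_batch` are hypothetical, and it assumes `avalpdf` is on the `PATH`. The README does not document exit codes, so the values collected here are informational only.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, output_dir=None):
    """Build an avalpdf command that saves a report without console output."""
    command = ["avalpdf", str(pdf_path), "--report", "--quiet"]
    if output_dir is not None:
        command += ["-o", str(output_dir)]
    return command


def run_batch(folder, output_dir=None):
    """Run avalpdf on every PDF in `folder`; return {file name: exit code}."""
    results = {}
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        # Assumes the avalpdf executable is installed and on PATH.
        completed = subprocess.run(build_command(pdf_path, output_dir))
        results[pdf_path.name] = completed.returncode
    return results
```

Calling `run_batch(".", "./analysis")` would process every PDF in the current directory and save the reports under `./analysis`.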

Validation Output

The tool provides three types of findings:

  • ✅ Successes: Correctly implemented accessibility features
  • ⚠️ Warnings: Potential issues that need attention
  • ❌ Issues: Problems that must be fixed

Report Format

{
  "validation_results": {
    "issues": ["..."],
    "warnings": ["..."],
    "successes": ["..."]
  }
}
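A report in this shape is easy to post-process. The following Python sketch counts the findings in each category. It assumes only the three keys shown above; the function name `summarize_report` is hypothetical, and the sample string stands in for the contents of a report file saved with `--report`.

```python
import json


def summarize_report(report_text):
    """Count findings per category in a report shaped like the example above."""
    results = json.loads(report_text).get("validation_results", {})
    return {
        category: len(results.get(category, []))
        for category in ("issues", "warnings", "successes")
    }


# Sample input mirroring the documented report format.
sample = """
{
  "validation_results": {
    "issues": ["..."],
    "warnings": ["..."],
    "successes": ["..."]
  }
}
"""
print(summarize_report(sample))
```

A script built on this could, for example, fail a build whenever the `issues` count is greater than zero.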

License

MIT License

Support

For issues or suggestions:

  • Open an issue on GitHub
  • Provide the PDF file (if possible) and the complete error message
  • Include the command you used and your operating system information

