avalpdf - PDF Accessibility Validator
A command-line tool for validating PDF accessibility, analyzing document structure, and generating detailed reports.
Features
- Document structure analysis
- Support for both local and remote PDF files
- Validation of:
  - Document tags and metadata:
    - Document tagging status
    - Title presence
    - Language declaration (Italian)
  - Heading hierarchy:
    - H1 presence
    - Correct heading level sequence
  - Figure alt text:
    - Missing alternative text detection
    - Complex or problematic alt text patterns
  - Table structure:
    - Header presence and proper structure
    - Empty cells detection
    - Duplicate headers check
    - Multiple header rows warning
    - Empty tables detection
  - List structure:
    - Proper list tagging
    - Detection of untagged lists (consecutive paragraphs with bullets/numbers)
    - Misused list types (numbered items in unordered lists)
    - List hierarchy consistency
  - Links:
    - Detection of non-descriptive links
    - Raw URL text warnings
    - Email and institutional domain exceptions
  - Formatting issues:
    - Excessive underscores (used for underlining)
    - Spaced capital letters (like "T E S T")
    - Extra spaces used for layout (3+ consecutive spaces)
  - Empty elements:
    - Empty paragraphs
    - Whitespace-only elements
    - Empty headings
    - Empty spans
    - Empty table cells
- Multiple output formats:
  - Detailed JSON structure
  - Simplified JSON
  - Accessibility validation report
- Console reports with color-coded structure visualization
- Weighted scoring system based on accessibility criteria
- Detailed issue categorization (issues, warnings, successes)
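To make the formatting checks listed above more concrete, here is a minimal Python sketch of how such heuristics could be expressed with regular expressions. It is not avalpdf's actual implementation: the function name `find_formatting_issues`, the pattern names, and the underscore threshold (4 or more) are assumptions for illustration. Only the "3+ consecutive spaces" rule and the "T E S T" example come from the feature list.

```python
import re

# Hypothetical patterns, not taken from avalpdf's source code.
UNDERSCORE_RUN = re.compile(r"_{4,}")  # assumed threshold for "excessive"
SPACED_CAPITALS = re.compile(r"\b(?:[A-Z] ){2,}[A-Z]\b")  # e.g. "T E S T"
LAYOUT_SPACES = re.compile(r" {3,}")  # 3+ consecutive spaces


def find_formatting_issues(text: str) -> list[str]:
    """Return a label for each formatting heuristic that matches the text."""
    issues = []
    if UNDERSCORE_RUN.search(text):
        issues.append("excessive underscores")
    if SPACED_CAPITALS.search(text):
        issues.append("spaced capital letters")
    if LAYOUT_SPACES.search(text):
        issues.append("extra spaces used for layout")
    return issues
```

For example, `find_formatting_issues("T E S T")` flags the spaced capitals, while an ordinary sentence returns an empty list.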
Installation
pip install avalpdf
Usage
After installation, you can run avalpdf from any directory.
Quick start
Simply run
avalpdf thesis.pdf
or
avalpdf https://example.com/document.pdf
to get a validation report and a preview of the document structure.
Details
# Basic validation with console output
avalpdf document.pdf
# Complete analysis with all outputs
avalpdf document.pdf --full --simple --report
# Save reports to specific directory
avalpdf document.pdf -o /path/to/output --report --simple
# Show document structure only
avalpdf document.pdf --show-structure
Command Line Options
- --full: Save full JSON structure
- --simple: Save simplified JSON structure
- --report: Save validation report
- --output-dir, -o: Specify output directory
- --show-structure: Display document structure
- --show-validation: Display validation results
- --quiet, -q: Suppress console output
Examples
- Quick accessibility check:
avalpdf thesis.pdf
- Generate all reports:
avalpdf report.pdf --full --simple --report -o ./analysis
- Silent operation with report generation:
avalpdf document.pdf --report -q
- Analyze multiple files:
for file in *.pdf; do avalpdf "$file" --report --quiet; done
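The same batch run can be scripted from Python, which may be convenient on systems without a POSIX shell. The sketch below assumes `avalpdf` is installed and on the PATH; the helper names `build_command` and `run_batch` are hypothetical, and only the `--report` and `--quiet` flags come from the options listed above. The meaning of avalpdf's exit codes is not documented here, so the script only records them.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path: Path) -> list[str]:
    """Build the same command line the shell loop runs for one file."""
    return ["avalpdf", str(pdf_path), "--report", "--quiet"]


def run_batch(folder: Path) -> dict[str, int]:
    """Run avalpdf on every PDF in a folder and collect the exit codes."""
    results = {}
    for pdf_path in sorted(folder.glob("*.pdf")):
        # check=False: keep going even if one file fails to process
        completed = subprocess.run(build_command(pdf_path), check=False)
        results[pdf_path.name] = completed.returncode
    return results
```

Calling `run_batch(Path("."))` processes the PDFs in the current directory, one at a time, in alphabetical order.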
Validation Output
The tool provides three types of findings:
- ✅ Successes: Correctly implemented accessibility features
- ⚠️ Warnings: Potential issues that need attention
- ❌ Issues: Problems that must be fixed
Report Format
{
"validation_results": {
"issues": ["..."],
"warnings": ["..."],
"successes": ["..."]
}
}
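A report with this shape is easy to post-process. The following Python sketch counts the findings in each category; the function name `summarize_report` and the sample messages are made up for illustration, and only the `validation_results`, `issues`, `warnings`, and `successes` keys come from the format shown above.

```python
import json


def summarize_report(report: dict) -> dict[str, int]:
    """Count the findings in each category of a validation report."""
    results = report.get("validation_results", {})
    return {
        key: len(results.get(key, []))
        for key in ("issues", "warnings", "successes")
    }


# Illustrative data only; real reports contain avalpdf's own messages.
sample = json.loads("""
{
  "validation_results": {
    "issues": ["example issue"],
    "warnings": ["example warning 1", "example warning 2"],
    "successes": []
  }
}
""")

summary = summarize_report(sample)
print(summary)
```

To use it on a saved report, load the file with `json.load` and pass the resulting dictionary to `summarize_report`.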
License
MIT License
Support
For issues or suggestions:
- Open an issue on GitHub
- Provide the PDF file (if possible) and the complete error message
- Include the command you used and your operating system information