A Python module to clean PDF files by disabling active content (JavaScript, launch, etc.), using the Ruby Origami PDF parser.

ORIGAPY module:

origapy includes a partial version of the Origami PDF parser.

origapy website: http://www.decalage.info/python/origapy

origami website: http://www.security-labs.org/origami

REQUIREMENTS:

  • Python v2.x

  • Ruby v1.8.x

INSTALLATION:

  • on Windows, run install.bat

  • on other systems, run: python setup.py install

HOW TO USE THIS MODULE:

    import origapy
    pc = origapy.PDF_Cleaner()
    pc.clean('file.pdf', 'cleaned.pdf')

See also the main code at the end of the module, and docstrings.
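
The same two calls can be applied to a whole folder. The following is a minimal sketch, not part of origapy: the helper name clean_directory and the folder names are hypothetical, and it assumes only the PDF_Cleaner().clean(input, output) call shown above. Because this README does not say which exceptions clean() raises, the sketch catches any exception and records the file as failed.

```python
import os


def clean_directory(cleaner, src_dir, dst_dir):
    """Clean every .pdf file in src_dir into dst_dir.

    `cleaner` is any object with a clean(input_path, output_path) method,
    for example origapy.PDF_Cleaner(). This helper is hypothetical and is
    not provided by origapy itself.
    Returns (cleaned, failed): a list of output paths, and a list of
    (input_path, error_message) pairs.
    """
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    cleaned, failed = [], []
    for name in sorted(os.listdir(src_dir)):
        # Skip anything that does not look like a PDF file.
        if not name.lower().endswith('.pdf'):
            continue
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        try:
            cleaner.clean(src, dst)
            cleaned.append(dst)
        except Exception as exc:
            # Assumption: the exception types are not documented here,
            # so every failure is recorded instead of stopping the batch.
            failed.append((src, str(exc)))
    return cleaned, failed


# Example use (requires origapy and Ruby to be installed):
#   import origapy
#   cleaned, failed = clean_directory(origapy.PDF_Cleaner(),
#                                     'incoming', 'cleaned')
```

Passing the cleaner in as an argument keeps the helper independent of how PDF_Cleaner is constructed, and lets it be tried out with a stand-in object when Ruby is not available.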

LICENSE:

GPL v3 - See COPYING.txt.

AUTHOR:

Philippe Lagadec - http://www.decalage.info
