A Python 3 tool to explore, analyse, and disassemble PDF files

Project description

peepdf-3 - peepdf for Python 3

peepdf-3 is a Python 3 tool for exploring PDF files to determine whether a file may be harmful. It aims to provide every component a security researcher needs for PDF analysis in a single tool, rather than spreading the tasks across 3 or 4 separate tools.

With peepdf it is possible to see all the objects in a document, with suspicious elements highlighted. It supports the most commonly used filters and encodings, and it can parse different versions of a file, object streams and encrypted files. With STPyV8 and Pylibemu installed, it also provides Javascript and shellcode analysis wrappers. Apart from this, it can create new PDF files and modify or obfuscate existing ones.

As of version 3.0.0, peepdf-3 no longer lists pylibemu as a requirement, so that peepdf-3 can be used on Windows systems. However, all functionality of libemu, pylibemu, and sctest still exists and will function on Linux systems.

Features

The main functionalities of peepdf are the following:

Analysis:

  • Decodings: hexadecimal, octal, name objects
  • Most commonly used filters
  • References in objects and where an object is referenced
  • Strings search (including streams)
  • Physical structure (offsets)
  • Logical tree structure
  • Metadata
  • Modifications between versions (changelog)
  • Compressed objects (object streams)
  • Analysis and modification of Javascript (STPyV8): unescape, replace, join
  • Shellcode analysis (Libemu python wrapper, pylibemu)
  • Variables (set command)
  • Extraction of old versions of the document
  • Easy extraction of objects, Javascript code, shellcodes (>, >>, $>, $>>)
  • Checking hashes on VirusTotal
  • Detection of common encryption methods
  • Output of XML and JSON data
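
To illustrate what the "Decodings" item refers to, here is a minimal Python sketch of the three encodings as the PDF format defines them: #xx hex escapes in name objects, \ddd octal escapes in literal strings, and <...> hexadecimal strings. The function names are hypothetical and this is not peepdf's own code; it also ignores other escape sequences such as an escaped backslash.

```python
import re


def decode_pdf_name(name: str) -> str:
    """Replace each #xx escape in a PDF name with the character it encodes."""
    return re.sub(
        r"#([0-9A-Fa-f]{2})",
        lambda match: chr(int(match.group(1), 16)),
        name,
    )


def decode_octal_escapes(text: str) -> str:
    """Replace backslash-octal escapes (1 to 3 digits) in a PDF literal string."""
    return re.sub(
        r"\\([0-7]{1,3})",
        # Values above 255 are truncated to a single byte (simplifying assumption).
        lambda match: chr(int(match.group(1), 8) & 0xFF),
        text,
    )


def decode_hex_string(hex_string: str) -> bytes:
    """Decode a PDF hexadecimal string such as <4A53>, ignoring whitespace."""
    digits = "".join(hex_string.strip("<>").split())
    if len(digits) % 2:
        # An odd number of digits is treated as if a trailing 0 were present.
        digits += "0"
    return bytes.fromhex(digits)
```

Obfuscated documents often hide keywords this way: a name written as /J#61v#61Script decodes to /JavaScript, which is why decoding comes before any search for suspicious elements.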

Creation/Modification:

  • Basic PDF creation
  • Creation of PDF with Javascript executed when the document is opened
  • Creation of object streams to compress objects
  • Embedded PDFs
  • Strings and names obfuscation
  • Malformed PDF output: without endobj, garbage in the header, bad header...
  • Filters modification
  • Objects modification
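
As a rough illustration of the "Strings and names obfuscation" item, the sketch below hex-escapes some characters of a PDF name using the #xx notation that the PDF format allows in name objects. The function name and the `every` parameter are hypothetical, the input is assumed to be ASCII, and peepdf's own obfuscation may work differently.

```python
def obfuscate_pdf_name(name: str, every: int = 2) -> str:
    """Hex-escape every `every`-th character of a PDF name (ASCII assumed)."""
    body = name[1:] if name.startswith("/") else name
    pieces = []
    for index, char in enumerate(body):
        if index % every == 0:
            # Write the character as '#' followed by two hex digits.
            pieces.append(f"#{ord(char):02x}")
        else:
            pieces.append(char)
    return "/" + "".join(pieces)
```

A PDF reader treats the escaped and plain forms as the same name, so an analysis tool has to normalise names before comparing them against known keywords.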

Ways to use peepdf:

  • Basic execution
  • Interactive console
  • Script mode
  • JSON Output
  • XML Output
  • VirusTotal analysis
  • OCR

TODO:

  • Embedded PDFs analysis
  • Improving automatic Javascript analysis
  • GUI

Installation

You can install / use peepdf-3 via these methods:

  • From PyPI via pip - python3 -m pip install peepdf-3
  • From GitHub via pip and git - python3 -m pip install git+https://github.com/digitalsleuth/peepdf-3.git
  • Clone the GitHub repo, cd into the peepdf-3 folder, chmod +x peepdf.py and ./peepdf.py
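
After a pip-based install, one way to confirm that the package is present is to query its metadata from Python. This is a minimal sketch using the standard library's importlib.metadata; the helper name is hypothetical, and the distribution name "peepdf-3" is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str = "peepdf-3"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if peepdf-3 is installed, otherwise None.
print(installed_version())
```

This check only applies to the two pip methods; running peepdf.py from a cloned repository does not register a distribution.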

Current Known Limitations

  • As of version 3.0.0, there are no known limitations on the installation and usage of peepdf-3, as the hard requirement for pylibemu has been lifted. The pylibemu functionality remains available on Linux systems.

Notes

  • The current maintainer of this project (Corey Forman - digitalsleuth) does not receive any funding for this work and is not currently seeking any monetary contributions. Assistance, programming contributions, and feedback are always welcome.
  • As this project originated with Jose Miguel Esparza, I will continue to leave his sentiments, and PayPal link, below to acknowledge his original product.

You are free to contribute with feedback, bugs, patches, etc. Any help is welcome. Also, if you really enjoy using peepdf, you think it is worth it and you feel really generous today you can donate some bucks to the project ;) Thanks!

Download files

Source Distribution

peepdf_3-5.3.0.tar.gz (138.0 kB view details)

Uploaded Source

Built Distribution

peepdf_3-5.3.0-py3-none-any.whl (143.6 kB view details)

Uploaded Python 3

File details

Details for the file peepdf_3-5.3.0.tar.gz.

File metadata

  • Download URL: peepdf_3-5.3.0.tar.gz
  • Upload date:
  • Size: 138.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for peepdf_3-5.3.0.tar.gz

  • SHA256: bf6670c42c9c79d3c265c0b8f1782b89b9884ce7d782d3c769a45077e04b62d2
  • MD5: 454afb5a2f355d1e5990f5acd03b90ec
  • BLAKE2b-256: 0f49e9b4c0872502b5c78eeb4bdb99bc2e334c25e4bfc86e0578ccb42ed4f666

See more details on using hashes here.
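
A downloaded file can be checked against the published SHA256 digest before it is installed. The following is a minimal sketch using Python's hashlib; the function names are hypothetical, the local path in the usage comment is an assumption, and the expected digest is the one listed above for peepdf_3-5.3.0.tar.gz.

```python
import hashlib

# SHA256 digest listed above for peepdf_3-5.3.0.tar.gz.
EXPECTED_SHA256 = "bf6670c42c9c79d3c265c0b8f1782b89b9884ce7d782d3c769a45077e04b62d2"


def sha256_of(path) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected one, ignoring letter case."""
    return sha256_of(path) == expected.lower()


# Usage, assuming the archive was saved to the current directory:
# matches_expected("peepdf_3-5.3.0.tar.gz")
```

A mismatch means the file differs from the one that was uploaded, so it should not be installed.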

File details

Details for the file peepdf_3-5.3.0-py3-none-any.whl.

File metadata

  • Download URL: peepdf_3-5.3.0-py3-none-any.whl
  • Upload date:
  • Size: 143.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for peepdf_3-5.3.0-py3-none-any.whl

  • SHA256: 09bfcf7436fe5e117b28ff7d24f4267515e336a5dce85ef1613c4a0fe9178a40
  • MD5: 0f0be86f2f02429fa12841e1461b95f6
  • BLAKE2b-256: 24873ea04b40df185407be0b2d373d8743bc18de6ca6811f443b228c50d18cee
