
Python un-freezing and bytecode extraction + analysis framework


############################################################
pydecipher: unfreeze and deobfuscate your frozen python code
############################################################

What is pydecipher?

pydecipher is a Python package that unpacks (unfreezes) and analyzes frozen Python artifacts, with the ultimate goal of recovering the artifact's underlying, high-level Python source code.

pydecipher can be used as a direct replacement for tools like unpy2exe and pyinstxtractor, and as an alternative to pyREtic for situations where you need to analyze opcode-obfuscation and have the compiled Python files on disk (as opposed to live Python objects in memory). Currently, pydecipher supports the analysis of PE files, PyInstaller artifacts, Py2Exe artifacts, individual bytecode files (.pyc), and zip files of Python bytecode files.

How do I use pydecipher?

pydecipher runs from the command line in Python 3.8 or newer environments on macOS and Linux. Windows should work in principle, but it has not been thoroughly tested yet.

.. code-block:: console

    $ pydecipher example.exe
    [*] Unpacking /home/user/example.exe
    [+] Dumped this PE's overlay data to pydecipher_output_example/overlay_data
    [*] Unpacking /home/user/pydecipher_output_example/overlay_data
    [!] Potential entrypoint found at script example_main.py
    [*] Unpacking /home/user/pydecipher_output_example/overlay_data_output/PYZ-00.pyz
    [+] Successfully extracted 133 files from this ZlibArchive.
    [+] Successfully extracted 7 files from this CArchive.
    [+] Successfully decompiled 6 .pyc files.

For more examples, see the documentation's User Guide. pydecipher can also be run from other Python code by importing the relevant parts of its API.

During execution, pydecipher recursively searches the input artifact for Python bytecode, dumps that bytecode using xdis, and attempts to convert any dumped bytecode to high-level Python source code using uncompyle6. For example, the output directory of the example above looks like this:

.. code-block:: console

    $ tree pydecipher_output_example/ -L 2
    pydecipher_output_example/
    ├── log_18_18_33_Dec_04_2019.txt
    ├── overlay_data
    └── overlay_data_output
        ├── PYZ-00.pyz
        ├── pyiboot01_bootstrap.py
        ├── pyiboot01_bootstrap.pyc
        ├── pyimod01_os_path.py
        ├── pyimod01_os_path.pyc
        ├── pyimod02_archive.py
        ├── pyimod02_archive.pyc
        ├── pyimod03_importers.py
        ├── pyimod03_importers.pyc
        ├── pyz-00_output
        ├── struct.py
        ├── struct.pyc
        ├── example_main.py
        └── example_main.pyc
    2 directories, 15 files
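
To make the ``.pyc`` stage of that pipeline more concrete, here is a minimal sketch of what a bytecode file contains, using only the standard library. It is not pydecipher's code: pydecipher relies on xdis so that bytecode from other Python versions can be read, whereas this sketch only handles bytecode from the interpreter running it, and it assumes the 16-byte header layout used by Python 3.7 and newer. The helper names ``build_pyc``, ``read_pyc``, and ``disassemble`` are hypothetical.

```python
import dis
import importlib.util
import io
import marshal
import struct


def build_pyc(source: str, filename: str = "example_main.py") -> bytes:
    """Compile source text and wrap it in a .pyc container (Python 3.7+ layout)."""
    code = compile(source, filename, "exec")
    header = importlib.util.MAGIC_NUMBER             # 4 bytes: version-specific magic
    header += struct.pack("<I", 0)                   # 4 bytes: flags (0 = timestamp-based)
    header += struct.pack("<II", 0, len(source.encode()))  # mtime and source size
    return header + marshal.dumps(code)


def read_pyc(data: bytes):
    """Split a .pyc into its magic number and the unmarshalled code object."""
    magic, body = data[:4], data[16:]
    if magic != importlib.util.MAGIC_NUMBER:
        # marshal is only safe to use on bytecode from the running interpreter.
        raise ValueError("bytecode was produced by a different Python version")
    return magic, marshal.loads(body)


def disassemble(code) -> str:
    """Return the dis listing of a code object as a string."""
    buffer = io.StringIO()
    dis.dis(code, file=buffer)
    return buffer.getvalue()


pyc = build_pyc("print('hello from a frozen script')\n")
magic, code = read_pyc(pyc)
listing = disassemble(code)
```

Printing ``listing`` shows the instructions stored in the file. Turning such instructions back into source text is the job of a decompiler such as uncompyle6.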

pydecipher also implements certain deobfuscation techniques on any recovered bytecode. Basic tampering with bytecode file headers can be automatically reversed in pydecipher's processing pipeline. Additionally, bytecode that has been produced with a custom interpreter that has remapped its opcodes can be studied using pydecipher's remap module.
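
The idea behind opcode remapping can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not a description of how pydecipher's remap module works: it assumes the CPython 3.6+ "wordcode" layout, it assumes the custom interpreter only renumbered opcodes and left arguments untouched, and the rotate-by-13 mapping is a made-up stand-in for a mapping that would normally have to be recovered through analysis. The function name ``remap_opcodes`` is hypothetical.

```python
from typing import Dict


def remap_opcodes(bytecode: bytes, opcode_map: Dict[int, int]) -> bytes:
    """Translate every opcode byte in CPython 3.6+ wordcode through opcode_map.

    Wordcode stores instructions as (opcode, argument) byte pairs, so the
    opcode bytes sit at the even offsets. Opcodes missing from the map are
    left unchanged.
    """
    remapped = bytearray(bytecode)
    for offset in range(0, len(remapped), 2):
        remapped[offset] = opcode_map.get(remapped[offset], remapped[offset])
    return bytes(remapped)


# Hypothetical remapping for an imaginary custom interpreter: every opcode
# value is rotated by 13. Real remappings are not this regular.
custom_to_standard = {(op + 13) % 256: op for op in range(256)}
standard_to_custom = {std: custom for custom, std in custom_to_standard.items()}

# Simulate the obfuscation on real bytecode, then undo it.
standard = compile("x = 1 + 2", "<demo>", "exec").co_code
obfuscated = remap_opcodes(standard, standard_to_custom)
recovered = remap_opcodes(obfuscated, custom_to_standard)
```

Once the opcodes are translated back to the standard numbering, ordinary disassemblers and decompilers can read the bytecode again.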

.. _what-is-python-freezing:

What is Python freezing?

To 'freeze' Python code is to take Python source code and package it with a Python interpreter, typically bundled into a single executable binary (PE, ELF, Mach-O, etc.).
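
As a rough illustration of the container formats named above, the following sketch guesses the executable format from the leading magic bytes of a file. It is a simplified example rather than pydecipher's detection logic, and ``identify_container`` and ``MAGIC_SIGNATURES`` are hypothetical names. A thorough check would parse the headers instead of trusting the first few bytes.

```python
from typing import Optional

# Leading byte signatures of common executable formats.
MAGIC_SIGNATURES = [
    (b"MZ", "PE"),                     # DOS header that PE files start with
    (b"\x7fELF", "ELF"),
    (b"\xfe\xed\xfa\xce", "Mach-O"),   # 32-bit, big-endian
    (b"\xce\xfa\xed\xfe", "Mach-O"),   # 32-bit, little-endian
    (b"\xfe\xed\xfa\xcf", "Mach-O"),   # 64-bit, big-endian
    (b"\xcf\xfa\xed\xfe", "Mach-O"),   # 64-bit, little-endian
    # Universal (fat) binaries; Java class files share this signature.
    (b"\xca\xfe\xba\xbe", "Mach-O (universal)"),
]


def identify_container(data: bytes) -> Optional[str]:
    """Return the name of the executable format that data appears to be, if any."""
    for signature, name in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return name
    return None
```

For example, ``identify_container(open("example.exe", "rb").read(8))`` would return ``"PE"`` for a Windows executable.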

There are several different tools that can be used to freeze Python code. As of pydecipher's initial writing (2019), PyInstaller is the most popular and best-maintained. It is also cross-platform, working on Windows, macOS, and Linux. Some other commonly used freezers are Py2Exe (Windows), py2app (macOS), cx_Freeze (cross-platform) and bbFreeze (cross-platform). The primary reason Python code is frozen is so developers do not have to rely on end-users' systems to have the right version of Python installed (or any version at all) in order to run Python code. Python-freezing tools have also lowered the bar for malware development.

For a full overview of Python freezing, see python-guide.org's primer on freezing: https://docs.python-guide.org/shipping/freezing/

Why was pydecipher created?

Python's increasing popularity, combined with the advent of freezing tools, has led to an increase in Python-based malware. Existing open-source tools handle the different stages of analyzing these frozen Python binaries (extraction, disassembly, deobfuscation, and decompilation); however, many of those tools are no longer maintained, have cumbersome set-up processes, only work within a narrow range of Python versions, or otherwise leave something to be desired. pydecipher aims to be the quickest possible way for a reverse engineer to recover Python source code, by handling and automating as many of those analysis stages as possible.

For more information, see the docs/ directory.

RELEASE STATEMENT

Approved for Public Release; Distribution Unlimited. Public Release Case Number 20-2370
