pyc-zipper is a complete toolchain for compressing, obfuscating, and unpacking pyc files based on Python's underlying bytecode.
This repository implements a complete toolchain for compressing, packing, obfuscating and unpacking pyc files based on Python’s underlying bytecode.
0. Installation and Dependencies
Open the terminal and enter the command:
pip install pyc-zipper
1. Usage and Command Line
pyc-zipper [options] [file1 file2 ...]
The available options are:
pyc-zipper [-h] [--obfuscate] [--obfuscate-global]
           [--obfuscate-lineno] [--obfuscate-filename]
           [--obfuscate-code-name] [--obfuscate-bytecode]
           [--obfuscate-argname] [--unpack] [--version]
           [--compress-module COMPRESS_MODULE] [--no-obfuscation]
           file1 [file2 ...]
Decompression and unpacking (--unpack): decompresses previously compressed .pyc files. pyc-zipper detects the compression module automatically; the module can also be given explicitly with --compress-module. Note that --unpack can only be combined with --compress-module, not with any other switch.
Additionally, if the terminal prompts that the pyc-zipper command cannot be found, you can use python -m pyc_zipper as an alternative.
For PyInstaller
from pyc_zipper import hook_pyinstaller
hook_pyinstaller()
Alternatively, you can customize your own parameters, such as:
hook_pyinstaller(comp_module="lzma", no_obfuscation=False,
obfuscate_global=True, obfuscate_lineno=True,
obfuscate_filename=True, obfuscate_code_name=True,
obfuscate_bytecode=True, obfuscate_argname=False)
Then build with the spec file as usual:
pyinstaller file.spec
3926 INFO: checking PKG
3927 INFO: Building PKG because PKG-00.toc is non existent
3927 INFO: Building PKG (CArchive) PKG-00.pkg
pyc-zipper: processing ('pyiboot01_bootstrap', 'D:\\Users\\Administrator\\AppData\\Local\\Programs\\Python\\Python37-32\\lib\\site-packages\\PyInstaller\\loader\\pyiboot01_bootstrap.py') in _load_code
Obfuscating code '<module>'
Obfuscating code 'NullWriter'
Obfuscating code 'write'
Obfuscating code 'flush'
Obfuscating code 'isatty'
Obfuscating code '_frozen_name'
Obfuscating code 'PyInstallerImportError'
Obfuscating code '__init__'
...
If the build log contains pyc-zipper output like the lines above, the obfuscation succeeded.
2. Compression Packing
pyc_zipper/compress.py is responsible for adding a compression pack to .pyc files. The packed .pyc files will call Python’s built-in bz2, lzma, or zlib modules to decompress the bytecode during execution.
Self-Extracting Program
In the packed .pyc file, there is a “compression pack” that first decompresses and restores the original bytecode before execution.
For example, using zlib, the self-extraction program is as follows:
import zlib, marshal
exec(marshal.loads(zlib.decompress(b'x\xda...'))) # b'x\xda...' is the compressed bytecode data
For bz2 and lzma:
import bz2, marshal
exec(marshal.loads(bz2.decompress(b'BZh9...')))
import lzma, marshal
exec(marshal.loads(lzma.decompress(b'\xfd7zXZ...')))
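A minimal sketch of how such a self-extracting stub could be produced, assuming the packer marshals the original code object, compresses it, and embeds the bytes as a literal. The helper name pack_source is hypothetical and is not pyc-zipper's API; zlib is used here, but bz2 or lzma would follow the same pattern.

```python
import marshal
import zlib


def pack_source(source: str, name: str = "<packed>") -> str:
    """Hypothetical helper: return stub source that decompresses and runs `source`."""
    code = compile(source, name, "exec")          # original bytecode
    payload = zlib.compress(marshal.dumps(code))  # marshalled, then compressed
    # repr() of a bytes object is a valid bytes literal, so it can be embedded directly.
    return "import zlib, marshal\nexec(marshal.loads(zlib.decompress({!r})))".format(payload)


stub = pack_source("result = sum(range(10))")
namespace = {}
exec(stub, namespace)       # the stub restores and executes the original code
print(namespace["result"])  # 45
```

Because the stub calls exec without explicit globals, the restored code runs in the same namespace as the stub itself.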
Compression Efficiency Comparison
In my tests, lzma produced the smallest .pyc files, followed by bz2; zlib gave the weakest compression.
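The comparison can be repeated on any code object with a sketch like the following. The sample source is made up for illustration, and the sizes it prints depend on the input and the Python version, so no particular ranking is guaranteed.

```python
import bz2
import lzma
import marshal
import zlib

# Illustrative input: 200 small generated functions (not taken from pyc-zipper).
source = "\n".join(
    f"def func_{i}(a, b):\n    return a * {i} + b" for i in range(200)
)
raw = marshal.dumps(compile(source, "<sample>", "exec"))

sizes = {"raw": len(raw)}
for module in (zlib, bz2, lzma):
    packed = module.compress(raw)
    assert module.decompress(packed) == raw  # round trip must be lossless
    sizes[module.__name__] = len(packed)

for name, size in sizes.items():
    print(f"{name:>4}: {size} bytes")
```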
Compatibility
These compression tools are compatible with all versions of Python 3, as they do not rely on specific bytecode versions.
3. Obfuscation and Anti-Decompilation Packing
The previous compression tools cannot prevent .pyc files from being decompiled by libraries like uncompyle6. To prevent decompilation, an obfuscation tool in pyc_zipper/obfuscate.py is used to obfuscate the bytecode instructions and variable names.
A Brief Introduction to the Obfuscation Principles
1. Obfuscating Code Metadata and Anti-Debugging
if obfuscate_lineno:
    co.co_lnotab = b''
    co.co_firstlineno = 1
if obfuscate_filename: co.co_filename = ''
if obfuscate_code_name: co.co_name = ''
Set co_lnotab to an empty byte string to clear the line number mapping table. (For Python 3.10+, the pyobject library automatically converts co_lnotab to co_linetable, so compatibility is not an issue.)
Set co_firstlineno to 1, as line numbers are calculated by adding co_firstlineno and the results from co_lnotab.
Set co_filename to an empty string to hide the file path of the code source.
Set co_name to an empty string to hide the name of the code object (e.g., function name).
This completely hides the filename, line number, and function name information in Traceback error outputs, increasing the difficulty of reverse engineering.
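The same steps can be sketched on a plain code object with code.replace() (Python 3.8+), without the pyobject wrapper that the snippet above relies on. The helper strip_metadata is hypothetical, and it handles only one code object; nested code objects would need the same treatment.

```python
import types


def strip_metadata(code: types.CodeType) -> types.CodeType:
    """Hypothetical helper: clear file name, code name and line information."""
    changes = {"co_filename": "", "co_name": "", "co_firstlineno": 1}
    # Python 3.10+ keeps the line table in co_linetable; older versions use co_lnotab.
    if hasattr(code, "co_linetable"):
        changes["co_linetable"] = b""
    else:
        changes["co_lnotab"] = b""
    return code.replace(**changes)


def greet(name):
    return "hello " + name


greet.__code__ = strip_metadata(greet.__code__)
print(greet("world"))                    # behaviour is unchanged
print(repr(greet.__code__.co_filename))  # ''
```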
2. Obfuscating Binary Bytecode
if obfuscate_bytecode and co.co_code[-len(RET_INSTRUCTION)*2:] != RET_INSTRUCTION*2:
    co.co_code += RET_INSTRUCTION
Check whether the binary bytecode (co_code) already ends with two consecutive return instructions (RET_INSTRUCTION). If not, append a redundant return instruction. The extra instruction is never executed, but it disrupts the parsing done by decompilation tools.
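A minimal sketch of this step on a plain code object. It assumes RET_INSTRUCTION is the RETURN_VALUE opcode followed by a zero argument byte (the 2-byte instruction format used since Python 3.6); the helper append_dead_return is hypothetical.

```python
import dis
import types

# Assumption: one return instruction = RETURN_VALUE opcode + zero argument byte.
RET_INSTRUCTION = bytes([dis.opmap["RETURN_VALUE"], 0])


def append_dead_return(code: types.CodeType) -> types.CodeType:
    """Hypothetical helper: add one unreachable return unless two are already present."""
    if code.co_code.endswith(RET_INSTRUCTION * 2):
        return code
    return code.replace(co_code=code.co_code + RET_INSTRUCTION)


def add(a, b):
    return a + b


original_len = len(add.__code__.co_code)
add.__code__ = append_dead_return(add.__code__)
print(add(2, 3))                                 # 5
print(len(add.__code__.co_code) - original_len)  # 2
```

The check on the last two instructions makes the operation idempotent: processing the same code object twice does not keep growing it.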
3. Obfuscating Local Variable Names
For example:
def f():
    x, y = 1, 2; z = 3
    def g():
        print(x, y)
    g()
f.__code__.co_cellvars will include the exported variable names ("x", "y") but not "z", which is only used within f.
f.__code__.co_varnames will be ("z", "g"): the ordinary locals of f, which include the nested function g itself.
g.__code__.co_freevars will include the imported variable names ("x", "y").
The code replaces local variable names with sequential numbers in the following order:
1. Free variables inherited from the outer scope, stored in the closure_vars dictionary.
2. Newly defined co_cellvars within the function.
3. Ordinary variables defined in co_varnames.
Additionally, since obfuscating parameter names can prevent proper keyword argument passing, this feature is optional.
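The three tuples and the numbering order can be checked with a short sketch. The helper local_name_map and the l0, l1, ... naming scheme are assumptions made for illustration; the actual implementation in pyc_zipper/obfuscate.py may differ.

```python
import itertools
import types


def f():
    x, y = 1, 2; z = 3
    def g():
        print(x, y)
    g()


def local_name_map(code, closure_vars=None):
    """Hypothetical helper: map old local names to new ones in the order above."""
    closure_vars = closure_vars or {}
    # 1. Free variables keep the name already chosen in the enclosing scope.
    mapping = {name: closure_vars[name] for name in code.co_freevars}
    used = set(mapping.values())
    fresh = (n for n in ("l%d" % i for i in itertools.count()) if n not in used)
    # 2. Cell variables, then 3. ordinary locals, receive the next unused number.
    for name in code.co_cellvars + code.co_varnames:
        if name not in mapping:
            mapping[name] = next(fresh)
    return mapping


g_code = next(c for c in f.__code__.co_consts if isinstance(c, types.CodeType))
f_map = local_name_map(f.__code__)
g_map = local_name_map(g_code, closure_vars=f_map)

print(f.__code__.co_cellvars, f.__code__.co_varnames, g_code.co_freevars)
print(f_map)  # {'x': 'l0', 'y': 'l1', 'z': 'l2', 'g': 'l3'}
print(g_map)  # {'x': 'l0', 'y': 'l1'}
```

Passing the outer mapping down as closure_vars is what keeps x and y consistent between f and g; renaming them independently would break the closure.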
4. Obfuscating Global Variable Names
The code:
- Uses the dis.get_instructions function to retrieve all bytecode instructions.
- Identifies the operands of STORE_NAME instructions (global variable names).
- Analyzes operands of instructions like IMPORT_NAME, IMPORT_FROM, and LOAD_ATTR that also reference co_names to avoid obfuscating them and causing naming conflicts.
- Ensures that names imported via from ... import * (handled by the IMPORT_STAR instruction) are not obfuscated, as they introduce many names.
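The selection logic can be sketched as follows. The helper renameable_globals is hypothetical, and two details are assumptions of this sketch rather than documented behavior: LOAD_METHOD is added to the protected set (Python 3.11 uses it for method calls), and a module containing a star import is simply left untouched. On Python 3.12+ a star import compiles to CALL_INTRINSIC_1 rather than IMPORT_STAR, so both forms are checked.

```python
import dis

# Instructions whose operand must keep its original name (LOAD_METHOD is an assumption).
PROTECTED_OPS = {"IMPORT_NAME", "IMPORT_FROM", "LOAD_ATTR", "LOAD_METHOD"}


def renameable_globals(code):
    """Hypothetical helper: names stored with STORE_NAME that look safe to rename."""
    stored, protected = set(), set()
    for ins in dis.get_instructions(code):
        is_star_import = ins.opname == "IMPORT_STAR" or (
            ins.opname == "CALL_INTRINSIC_1"
            and ins.argrepr == "INTRINSIC_IMPORT_STAR"
        )
        if is_star_import:
            return set()  # simplification: rename nothing in such a module
        if ins.opname == "STORE_NAME":
            stored.add(ins.argval)
        elif ins.opname in PROTECTED_OPS:
            protected.add(ins.argval)
    return stored - protected


module_source = (
    "import os\n"
    "from os import path\n"
    "counter = 0\n"
    "sep = os.sep\n"
)
candidates = renameable_globals(compile(module_source, "<demo>", "exec"))
print(candidates)  # {'counter'}
```

Here os and path are excluded because they are import operands, and sep is excluded because the same string is also used as an attribute name on os.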
5. Recursively Processing Nested Bytecode
The code:
- Iterates through co_consts to find nested bytecode objects (e.g., nested functions, classes).
- Recursively calls process_code on the nested bytecode objects.
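The traversal itself is short. The generator walk_code below is a hypothetical stand-in that only visits the code objects; the real process_code would also transform each one.

```python
import types


def walk_code(code):
    """Yield `code` and every code object nested in its constants, depth first."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from walk_code(const)


source = '''
def outer():
    def inner():
        pass
class C:
    def m(self):
        pass
'''
names = [c.co_name for c in walk_code(compile(source, "<demo>", "exec"))]
print(names)
```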
6. Effectiveness on Formatted Strings (f-strings)
Python does not store a formatted string as a single constant that contains the variable names. The compiler splits it into literal substrings and separately evaluated expressions, like this:
>>> from dis import dis
>>> dis("f'start{x!r}end'")
0 RESUME 0
1 LOAD_CONST 0 ('start')
LOAD_NAME 0 (x)
CONVERT_VALUE 2 (repr)
FORMAT_SIMPLE
LOAD_CONST 1 ('end')
BUILD_STRING 3
RETURN_VALUE
Since the variable name x is stored as the operand of the LOAD_NAME instruction in the co_names array, it can still be obfuscated.
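This can be verified directly on the code object. The replacement name g0 below is only an illustration of a renamed global.

```python
code = compile("f'start{x!r}end'", "<fstring>", "eval")
print(code.co_names)  # ('x',)
print([c for c in code.co_consts if isinstance(c, str)])

# Renaming the entry in co_names is enough; the string pieces are untouched.
renamed = code.replace(co_names=("g0",))
print(eval(renamed, {"g0": 42}))  # start42end
```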
Example of Obfuscation Results
-- Stacks of completed symbols:
START ::= |- stmts .
and ::= expr . JUMP_IF_FALSE_OR_POP expr \e_come_from_opt
and ::= expr . JUMP_IF_FALSE_OR_POP expr come_from_opt
and ::= expr . jifop_come_from expr
and ::= expr . jmp_false expr
and ::= expr . jmp_false expr COME_FROM
and ::= expr . jmp_false expr jmp_false
...
Instruction context:
60 STORE_FAST 'l3'
62 LOAD_GLOBAL g18
64 LOAD_FAST 'l3'
66 CALL_FUNCTION_1 1 '1 positional argument'
68 RETURN_VALUE
import functools
try:
    from timer_tool import timer
except ImportError:
    def (func):
        return func
g4 = False
def (l0, l1, l2=[], l3=False):
    for l4 in dir(l0):
        if (l3 or l4.startswith)("_"):
            pass
        elif l4 in l2:
            pass
        else:
            l1[l4] = getattr(l0, l4)
g9 = {}
for g13 in range(len(g8.priority)):
    for g14 in g8.priority[g13]:
        g9[g14] = g13
g5(g8, globals(), ["priority"])
def (l0, l1):
    l2 = g9[l1]
    l3 = g9[getattr(l0, "_DynObj__last_symbol", HIGHEST)]
    l4 = "({!r})" if l2 > l3 else "{!r}"
    return l4.format(l0)
class :
    _cache = {}
    if g4:
        def (l0, l1, l2=HIGHEST):
            if l1 in l0._cache:
                return l0._cache[l1]
            l3 = super().__new__(l0)
            l0._cache[l1] = l3
            return l3
    def (l0, l1, l2=HIGHEST):
        l0._DynObj__code = l1
        l0._DynObj__last_symbol = l2
    def Parse error at or near `LOAD_FAST' instruction at offset 16
    def (l0, l1):
        l2 = "{}.{}".format(l0, l1)
        return g18(l2)
    def (l0, l1):
        return g18(f"{g16(l0, ADD)} + {g16(l1, ADD)}", ADD)
    ...
# Deparsing stopped due to parse error
Compatibility
This obfuscation tool is also compatible with all versions of Python 3, as it does not depend on specific bytecode versions.
4. Unpacking Tool