A library for parsing and unparsing smali files for programmatic modification
PySmali
PySmali is a Python library for parsing and unparsing smali files for programmatic modification.
It is a line-based (not token-based) parser. Its primary goal is for parsed files to maintain 100% equality with their original forms when reconstructed.
PySmali is mainly intended for smali file patching. You can parse, search, extract, replace, and unparse blocks of a smali file.
Parsing is based on the ANTLR files found in the JesusFreke/smali repository.
Since this is a line-based, not token-based, parser, there are likely to be edge cases where PySmali fails to properly parse or unparse a file. There are currently 6,846 smali files used by the tests, stored in the tests folder (tests/tests.tar.xz).
If you run into a smali file that does not parse or unparse properly, please submit a new issue with the complete smali file(s) attached as a zip or gz archive.
Requirements
- Python 3.8 or newer
Installation
pip install smali
Simple Example
import time
from smali import SmaliFile
from smali.statements import Statement
smali_file = SmaliFile.parse_file('/path/to/file.smali')
new_lines = Statement.parse_lines(f'''
# This file was modified by PySmali
# Modified: {time.ctime()}
''')
smali_file.root.extend(new_lines)
with open('/path/to/file.smali', 'w') as f:
    f.write(str(smali_file))
Status
- [UPCOMING] v0.4.0
  - Complete parsing of body statements
- [UPCOMING] v0.3.0
  - Statement and Block searching by method and field names
  - Simplified Statement and Block extraction and insertion
- v0.2.5
  - Removed all dependencies and reorganized utility code
- v0.2.4
  - Complete parsing and unparsing of non-body statements validated by current test suite
Methodology
- The smali file is ingested on a line-by-line basis
- Each line is parsed into one or more Statement instances
  - .super Ljava/lang/Object; would become a single Statement instance
  - value = { LFormat31c; } would become 4 Statement instances: "value", "{", "LFormat31c;", and "}"
- Each Statement instance is subclassed based on its type
  - E.g. FieldStatement or MethodStatement
- A Statement can have zero or more StatementAttributes that indicate its intent and format
  - E.g. BLOCK_START, ASSIGNMENT_LHS, or NO_BREAK
- Multiple Statement instances can be joined into a Block and nested where appropriate
  - A Block example would be a smali method, comprised of beginning, body, and end Statement instances
- A Statement parses its source line, split by whitespace
- Parsing is done in two passes. This is due to the fact that the same line can be the start of a block or a solo line, depending on the existence of a matching EndStatement.
  - The first pass builds a flat list of Statement instances from the input lines.
  - A Statement that can be either a Block start or a solo line is marked with the MAYBE_BLOCK_START attribute.
  - If an EndStatement is generated and matches a previously marked Statement, the marked Statement is switched from MAYBE_BLOCK_START to BLOCK_START.
  - After the first pass, any remaining Statement instances that are still marked with MAYBE_BLOCK_START are switched to SINGLE_LINE.
  - The second pass iterates over the flat list of Statement instances and groups them into Block instances, nesting when appropriate based on the SINGLE_LINE and BLOCK_START attributes.
- Unparsing is done in a single pass
  - Each Statement stringifies itself using its own local information
  - The SmaliFile instance uses the attributes of each Statement to stitch lines together and indent blocks where necessary
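The two-pass approach described above can be illustrated with a small standalone sketch. This is not PySmali's implementation or API: the names Attr, Stmt, OPENERS, first_pass, and second_pass are hypothetical, BLOCK_END is an assumed attribute added only for this toy, and the set of directives that may open a block is a deliberately tiny subset. It only shows how a MAYBE_BLOCK_START marker can be resolved once a matching ".end" line is seen.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Attr(Enum):
    MAYBE_BLOCK_START = auto()
    BLOCK_START = auto()
    BLOCK_END = auto()  # hypothetical attribute, used only in this sketch
    SINGLE_LINE = auto()


@dataclass
class Stmt:
    text: str
    attr: Attr


# Toy subset of directives that may or may not open a block (an assumption).
OPENERS = {".method", ".field", ".annotation"}


def first_pass(lines):
    """Build a flat list of statements and resolve MAYBE_BLOCK_START markers."""
    flat, pending = [], []
    for line in lines:
        words = line.split()
        if not words:
            continue
        stmt = Stmt(line.strip(), Attr.SINGLE_LINE)
        flat.append(stmt)
        if words[0] == ".end" and len(words) > 1:
            keyword = "." + words[1]
            # Search backwards for the most recent unresolved matching opener.
            for i in range(len(pending) - 1, -1, -1):
                if pending[i].text.split()[0] == keyword:
                    pending[i].attr = Attr.BLOCK_START
                    stmt.attr = Attr.BLOCK_END
                    del pending[i:]  # openers above the match stay solo lines
                    break
        elif words[0] in OPENERS:
            stmt.attr = Attr.MAYBE_BLOCK_START
            pending.append(stmt)
    # Anything still undecided never saw a matching end, so it is a solo line.
    for stmt in flat:
        if stmt.attr is Attr.MAYBE_BLOCK_START:
            stmt.attr = Attr.SINGLE_LINE
    return flat


def second_pass(flat):
    """Group the flat list into nested lists, one list per block."""
    root = []
    stack = [root]
    for stmt in flat:
        if stmt.attr is Attr.BLOCK_START:
            block = [stmt.text]
            stack[-1].append(block)
            stack.append(block)
        elif stmt.attr is Attr.BLOCK_END:
            stack[-1].append(stmt.text)
            stack.pop()
        else:
            stack[-1].append(stmt.text)
    return root


example = [
    ".class public LFoo;",
    ".field private a:I",
    ".field private b:I",
    "    .annotation runtime LBar;",
    "    .end annotation",
    ".end field",
    ".method public run()V",
    "    return-void",
    ".end method",
]
tree = second_pass(first_pass(example))
```

In this sketch the first .field line stays a solo line because no ".end field" is ever paired with it, while the second .field becomes a block that contains a nested annotation block. Nested Python lists stand in for Block instances here purely for brevity.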
License
OSS Attribution
JesusFreke/smali by Ben Gruver
Licensed Under: Various Licenses
Tests
Smali files used as tests in the tests/tests.tar.xz archive have been obtained from the following projects:
- Android
- AndroidX
- FasterXML
- Java
- JavaX
- OkHttp
- Smali