A library for parsing and unparsing smali files for programmatic modification
PySmali
PySmali is a Python library for parsing and unparsing smali files for programmatic modification.
It is a line (not token) based parser. Its primary goal is for parsed files to maintain 100% equality with their original forms when reconstructed.
PySmali's main usage is for smali file patching. You are able to parse, search, extract, replace, and unparse blocks of a smali file.
Parsing is based on the ANTLR files found in the JesusFreke/smali repository.
Since this is a line-based and not a token-based parser, there are likely to be edge cases where PySmali fails to properly parse or unparse a file. The tests folder (tests/tests.tar.xz) currently contains 6,846 smali files that are used by the test suite. If you run into a smali file that does not parse or unparse properly, please submit a new issue with the complete smali file.
Requirements
- Python 3.8 or newer
- Up to date pip
Installation
pip install smali
Simple Example
import time
from smali import SmaliFile
from smali.statements import Statement
smali_file = SmaliFile.parse_file('/path/to/file.smali')
insert_str = f'''
# This file was modified by PySmali
# Modified: {time.ctime()}
'''
smali_file.root.extend(Statement.parse_lines(insert_str.splitlines()))
with open('/path/to/file.smali', 'w') as f:
    f.write(str(smali_file))
Status
- v0.2.4
  - Complete parsing and unparsing of non-body statements validated by current test suite
- [UPCOMING] v0.3.0
  - Statement and Block searching by method and field names
  - Simplified Statement and Block extraction and insertion
- [UPCOMING] v0.4.0
  - Complete parsing of body statements
Methodology
- The smali file is ingested on a line by line basis
- Each line is parsed into one or more Statement instances
  - .super Ljava/lang/Object; would become a single Statement instance
  - value = { LFormat31c; } would become 4 Statement instances: value, {, LFormat31c;, }
- Each Statement instance is subclassed based on the type
  - E.g. FieldStatement or MethodStatement
- A Statement can have zero or more StatementAttributes that indicate its intent and format
  - E.g. BLOCK_START, ASSIGNMENT_LHS, or NO_BREAK
- Multiple Statement instances can be joined into a Block and nested where appropriate
  - A Block example would be a smali method, comprised of the beginning, body, and end Statement instances
- A Statement parses its source line split by whitespace
- Parsing is done in two passes. This is because the same line can either start a block or stand by itself, depending on the existence of a matching EndStatement.
  - The first pass builds a flat list of Statement instances from the input lines.
  - A Statement that could be either a block start or a single line is marked with MAYBE_BLOCK_START.
  - If an EndStatement is generated and matches a marked Statement, the marked Statement is switched from the MAYBE_BLOCK_START to the BLOCK_START attribute.
  - After the first pass, any remaining Statement instances that are still marked with MAYBE_BLOCK_START are switched to SINGLE_LINE.
  - The second pass iterates over the flat list of Statement instances and groups them into Block instances, nesting when appropriate.
- Unparsing is done in a single pass
  - Each Statement stringifies itself using its own local information
  - The SmaliFile instance uses StatementAttributes of each Statement to stitch lines together and indent blocks where necessary
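The two-pass approach above can be illustrated with a small, self-contained sketch. This is not PySmali's implementation and does not use its API: the names Attr, Stmt, first_pass, second_pass, and unparse are hypothetical, each line becomes exactly one statement, blank lines are dropped, and the sets of directives that always or sometimes open a block are an assumed subset chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Attr(Enum):
    # Hypothetical stand-ins for the attributes named in the list above.
    MAYBE_BLOCK_START = auto()
    BLOCK_START = auto()
    BLOCK_END = auto()
    SINGLE_LINE = auto()


# Assumed subsets, for illustration only.
ALWAYS_BLOCK = {".method", ".annotation"}
MAYBE_BLOCK = {".field", ".param"}


@dataclass
class Stmt:
    line: str
    attr: Attr = Attr.SINGLE_LINE


def first_pass(lines):
    """Build a flat list of statements and resolve MAYBE_BLOCK_START marks."""
    stmts = []
    for raw in lines:
        words = raw.split()
        if not words:
            continue  # this sketch drops blank lines
        stmt = Stmt(raw.strip())
        head = words[0]
        if head == ".end":
            stmt.attr = Attr.BLOCK_END
            target = "." + words[1] if len(words) > 1 else ""
            # Promote the most recent marked statement of the same kind.
            for prev in reversed(stmts):
                if (prev.attr is Attr.MAYBE_BLOCK_START
                        and prev.line.split()[0] == target):
                    prev.attr = Attr.BLOCK_START
                    break
        elif head in ALWAYS_BLOCK:
            stmt.attr = Attr.BLOCK_START
        elif head in MAYBE_BLOCK:
            stmt.attr = Attr.MAYBE_BLOCK_START
        stmts.append(stmt)
    # Anything still marked never saw a matching end statement.
    for stmt in stmts:
        if stmt.attr is Attr.MAYBE_BLOCK_START:
            stmt.attr = Attr.SINGLE_LINE
    return stmts


def second_pass(stmts):
    """Group the flat list into nested lists, one list per block."""
    root = []
    stack = [root]
    for stmt in stmts:
        if stmt.attr is Attr.BLOCK_START:
            block = [stmt]
            stack[-1].append(block)
            stack.append(block)
        else:
            stack[-1].append(stmt)
            if stmt.attr is Attr.BLOCK_END and len(stack) > 1:
                stack.pop()
    return root


def unparse(nodes, depth=0):
    """Single pass: indent block bodies using only each statement's attribute."""
    out = []
    for node in nodes:
        if isinstance(node, list):
            out.extend(unparse(node, depth + 1))
        else:
            edge = node.attr in (Attr.BLOCK_START, Attr.BLOCK_END)
            out.append("    " * (depth - 1 if edge else depth) + node.line)
    return out


SAMPLE = """\
.class public LExample;
.super Ljava/lang/Object;
.field public a:I
.field public b:I
    .annotation runtime Ljava/lang/Deprecated;
    .end annotation
.end field
.method public run()V
    return-void
.end method
"""

stmts = first_pass(SAMPLE.splitlines())
tree = second_pass(stmts)
rebuilt = "\n".join(unparse(tree)) + "\n"
print(rebuilt == SAMPLE)
```

In this sample, the first .field line never meets a matching .end field and ends up as SINGLE_LINE, while the second is promoted to BLOCK_START once .end field is seen. The final comparison mirrors the library's stated goal that a reconstructed file equals its original form.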
License
OSS Attribution
pyca/cryptography by Multiple contributors
Licensed Under: Apache-2.0 License
tqdm/tqdm by Multiple contributors
Licensed Under: Various Licenses
JesusFreke/smali by Ben Gruver
Licensed Under: Various Licenses
Tests in the tests/tests.tar.xz file have been obtained from the following projects:
- Android
- AndroidX
- FasterXML
- Java
- JavaX
- OkHttp