DEX/APK analyzer
DexTrace
DexTrace is a lightweight core for Android APK / DEX parsing and call-tracing.
It does not decide whether an APK is malicious.
Instead, DexTrace focuses on producing a clean, standardized, and reproducible representation of:
- APK metadata
- AndroidManifest structure
- DEX internal tables
- API call evidence and method-level cross-references (XREF)
These results are designed to be consumed by higher-level engines, such as
👉 Quark Engine or other static / hybrid analysis frameworks.
Goals
DexTrace is intended to provide:
- lightweight APK and DEX parsing without depending on a full Android analysis framework
- deterministic Dalvik bytecode disassembly and inspection
- structured API extraction from DEX bytecode
- manifest and APK metadata parsing
- a Python API and CLI for inspection, debugging, and integration
- reproducible outputs suitable for downstream rule engines
✨ Current Features
APK Support
- File hashes (MD5 / SHA1 / SHA256)
- File size and ZIP entries
- APK archive reading for downstream parsing workflows
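The APK-level facts listed above can be reproduced with the Python standard library alone. The following is a minimal sketch, not DexTrace's implementation; the function name `apk_basic_metadata` and the shape of the returned dictionary are assumptions made for illustration.

```python
import hashlib
import io
import zipfile


def apk_basic_metadata(data: bytes) -> dict:
    """Compute hashes, size, and ZIP entry names for raw APK bytes.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    hashes = {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            entries = zf.namelist()
    except zipfile.BadZipFile:
        # Not a readable ZIP archive: still report hashes and size.
        entries = []
    return {"size": len(data), "hashes": hashes, "zip_entries": entries}
```

Hashing the raw bytes before opening the archive means a corrupted APK still yields a stable identifier for later comparison.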
AndroidManifest Parsing
- Supports binary AXML and plain XML
- Extracts:
- package name
- permissions
- activities
- services
- receivers
- providers
- Safe fallback for malformed or missing manifests
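For the plain XML case, the extraction described above can be sketched with `xml.etree.ElementTree`. This is an illustrative sketch only: it does not handle binary AXML, and `parse_plain_manifest` and its result keys are hypothetical names rather than DexTrace's API.

```python
import xml.etree.ElementTree as ET

# Namespace used by android:* attributes in AndroidManifest.xml.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_KEYS = {
    "activity": "activities",
    "service": "services",
    "receiver": "receivers",
    "provider": "providers",
}


def parse_plain_manifest(xml_text: str) -> dict:
    """Extract package, permissions, and components from plain-XML manifest text."""
    result = {"package": None, "permissions": []}
    result.update({key: [] for key in COMPONENT_KEYS.values()})
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Safe fallback: malformed or empty input yields an empty result.
        return result
    result["package"] = root.get("package")
    for node in root.iter("uses-permission"):
        name = node.get(ANDROID_NS + "name")
        if name:
            result["permissions"].append(name)
    app = root.find("application")
    if app is not None:
        for tag, key in COMPONENT_KEYS.items():
            for node in app.findall(tag):
                name = node.get(ANDROID_NS + "name")
                if name:
                    result[key].append(name)
    return result
```

Returning the same empty structure on parse failure keeps downstream consumers free of special cases for missing manifests.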
DEX Header Parsing
- Strict DEX magic validation (dex\n035, cdex)
- Full header field decoding
- Defensive handling of truncated / invalid DEX files
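As a rough illustration of what strict magic validation and header decoding involve, the sketch below unpacks the 112-byte standard DEX `header_item` with `struct`. It is a sketch under stated assumptions: it covers only the standard `dex\n` magic (not `cdex`), and `parse_dex_header` and `DexHeaderError` are hypothetical names, not DexTrace's API.

```python
import struct

# Standard DEX header_item: 8-byte magic, u4 checksum, 20-byte SHA-1
# signature, then twenty little-endian u4 fields (0x70 bytes in total).
DEX_HEADER = struct.Struct("<8sI20s20I")
FIELD_NAMES = (
    "file_size", "header_size", "endian_tag", "link_size", "link_off",
    "map_off", "string_ids_size", "string_ids_off", "type_ids_size",
    "type_ids_off", "proto_ids_size", "proto_ids_off", "field_ids_size",
    "field_ids_off", "method_ids_size", "method_ids_off",
    "class_defs_size", "class_defs_off", "data_size", "data_off",
)


class DexHeaderError(ValueError):
    """Raised for truncated or invalid headers (hypothetical exception)."""


def parse_dex_header(data: bytes) -> dict:
    if len(data) < DEX_HEADER.size:
        raise DexHeaderError("truncated DEX header")
    magic, checksum, signature, *fields = DEX_HEADER.unpack_from(data)
    # Expect b"dex\n" + three ASCII digits + NUL, e.g. b"dex\n035\x00".
    if magic[:4] != b"dex\n" or not magic[4:7].isdigit() or magic[7:8] != b"\x00":
        raise DexHeaderError("bad DEX magic")
    header = {
        "magic": magic,
        "version": magic[4:7].decode("ascii"),
        "checksum": checksum,
        "signature": signature.hex(),
    }
    header.update(zip(FIELD_NAMES, fields))
    return header
```

Checking the length before unpacking is what keeps truncated files from surfacing as a low-level `struct.error`.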
DEX Bytecode Parsing (Core)
- code_item parsing
- instruction iteration
- offset-aware bytecode handling
- method/code mapping support
- designed to scale toward richer control-flow or data-flow analysis later
🔍 API Call Tracing (Quark-aligned)
DexTrace implements progressive API tracing stages aligned with
Quark Engine’s 5-stage detection model.
Stage 2 – API Calls
- extracts all invoke-* instructions
- resolves:
- caller class / method / prototype
- callee class / method / prototype
- opcode type and bytecode offset
- produces structured XREF output
- safe against malformed indices and corrupted tables
Stage 3 – API Sets (Per Method)
- groups APIs per caller method
- represents which APIs are used together
- order-independent
- designed for combination-based rule matching
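One way to picture per-method API sets is to group Stage 2 style records by caller. The sketch below assumes records shaped like the `api_calls` entries in the Example Output section; `method_key` and `api_sets` are hypothetical helpers, and the actual `--api-sets` output format may differ.

```python
from collections import defaultdict


def method_key(ref: dict) -> str:
    """Render a class/method/proto record as a smali-style signature."""
    return f"{ref['class']}->{ref['method']}{ref['proto']}"


def api_sets(api_calls: list) -> dict:
    """Group callee signatures per caller, ignoring order and duplicates."""
    grouped = defaultdict(set)
    for call in api_calls:
        grouped[method_key(call["caller"])].add(method_key(call["callee"]))
    return {caller: frozenset(callees) for caller, callees in grouped.items()}


def uses_all(api_set: frozenset, required: set) -> bool:
    """Combination-style check: every required API appears in the set."""
    return required <= api_set
```

Using sets makes the check order-independent: a rule that needs two APIs matches regardless of which one the method calls first.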
Stage 4 – API Call Sequences
- preserves static call order within each method
- offset-aware ordering of invoke-* instructions
- method-local (no CFG explosion)
- designed for sequence-based rule matching
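Offset-aware ordering can be sketched by sorting each caller's invokes by bytecode offset and then testing whether a rule's APIs occur in that order. This assumes records shaped like the `api_calls` entries in the Example Output section; `api_sequences` and `contains_in_order` are hypothetical names, and the real `--api-seq` output may be structured differently.

```python
def method_key(ref: dict) -> str:
    """Render a class/method/proto record as a smali-style signature."""
    return f"{ref['class']}->{ref['method']}{ref['proto']}"


def api_sequences(api_calls: list) -> dict:
    """Order each caller's callees by the bytecode offset of the invoke."""
    by_caller = {}
    for call in api_calls:
        entry = (call["invoke"]["offset"], method_key(call["callee"]))
        by_caller.setdefault(method_key(call["caller"]), []).append(entry)
    return {
        caller: [callee for _, callee in sorted(entries, key=lambda e: e[0])]
        for caller, entries in by_caller.items()
    }


def contains_in_order(sequence: list, pattern: list) -> bool:
    """True if pattern is a (not necessarily contiguous) subsequence."""
    remaining = iter(sequence)
    # Each `in` consumes the iterator up to the match, enforcing order.
    return all(api in remaining for api in pattern)
```

Because ordering is by static offset within one method, this reflects instruction layout rather than every runtime path, which is the trade-off the method-local design accepts.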
⚙️ Dynamic Execution (Dalvik VM)
DexTrace includes an iterative Dalvik bytecode interpreter (src/dextrace/vm/)
that can actually execute a method instead of only statically inspecting it.
- executes a single entry method by signature, with caller-supplied arguments
- supports integer/long/float/double arithmetic, branches, comparisons, array and field access, type checks/conversions, throw, and try/catch exception flow
- resolves virtual calls through a constructed class hierarchy / vtable
- simulates common Android/Java framework calls via Android API stubs (vm/android_stubs/) so malware-style flows can run without a device
- records a per-instruction execution trace and a call tree of internal calls and stubbed API calls
Exposed through the dextrace run command (see below).
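To make the virtual-call bullet concrete, the sketch below resolves a method by walking a superclass chain. It is a simplified illustration and not DexTrace's engine: a real vtable would typically be precomputed per class, and the `classes` dictionary layout and `resolve_virtual` name are assumptions.

```python
from typing import Optional


def resolve_virtual(classes: dict, receiver_class: str, method_sig: str) -> Optional[str]:
    """Find the most-derived implementation of method_sig for receiver_class.

    `classes` maps a class descriptor to {"super": str | None, "methods": set}.
    Returns "Lcls;->sig" or None when the lookup leaves the known classes
    (for example, a framework class that would be handled by a stub).
    """
    current = receiver_class
    seen = set()
    while current is not None and current not in seen:
        seen.add(current)  # guard against a corrupted, cyclic hierarchy
        info = classes.get(current)
        if info is None:
            return None
        if method_sig in info["methods"]:
            return f"{current}->{method_sig}"
        current = info["super"]
    return None


# Hypothetical two-class hierarchy for illustration.
CLASSES = {
    "LBase;": {"super": "Ljava/lang/Object;", "methods": {"run()V", "name()I"}},
    "LChild;": {"super": "LBase;", "methods": {"run()V"}},
}
```

With this table, `resolve_virtual(CLASSES, "LChild;", "run()V")` picks the override in `LChild;`, while `name()I` falls through to `LBase;`.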
Repository Structure
src/dextrace/
    api.py                    # public Python API entry point
    cli/                      # CLI entry points
    core/                     # APK/DEX parsing and API extraction core
    dalvik/                   # Dalvik disassembly and opcode utilities
    vm/                       # Dalvik bytecode execution engine, handlers, Android stubs
    manifest/                 # binary AndroidManifest parsing
    errors.py                 # shared project exceptions
    version.py                # package version
tests/
    fixtures/                 # synthetic fixtures used by tests
    test_*.py                 # pytest-based test suite
docs/
    modules-overview.md       # module-by-module handoff guide
    development-workflow.md   # contributor workflow and validation notes
    current-status.md         # current state, known gaps, handoff notes
Key areas
src/dextrace/cli/
Command-line entry points.
- main.py: top-level CLI dispatcher
- cmd_meta.py: metadata-oriented inspection
- cmd_disasm.py: disassembly-oriented inspection
- cmd_dex.py: DEX/API-oriented inspection
src/dextrace/core/
Core APK / DEX parsing and API extraction logic.
Includes:
- APK reading
- APK metadata extraction
- manifest parsing bridge
- DEX structure parsing
- method/code mapping
- API extraction
- method/API resolution
src/dextrace/dalvik/
Dalvik bytecode internals.
Includes:
- opcode format metadata
- operand decoding
- instruction size handling
- payload decoding
- disassembly support
- smali-oriented helpers
src/dextrace/vm/
Dalvik bytecode execution (dynamic analysis), distinct from dalvik/ disassembly.
Includes:
- engine.py: the iterative DalvikVM execution engine
- opcode handlers under handlers/ (arithmetic, array, branch, compare, field, move, throw, type-check, type-conversion)
- simulated Android/Java framework methods under android_stubs/ (content, filesystem, intent, network, runtime, sms, telephony, text)
- execution state, register file, object heap, call frames, class hierarchy / vtable resolution, and execution tracing
This subsystem powers the dextrace run command.
src/dextrace/manifest/
Low-level binary AXML parsing used by manifest-related workflows.
📦 Installation
Development install (editable mode)
git clone https://github.com/ev-flow/DexTrace.git
cd DexTrace
pip install -e .
Optional Pipenv workflow
pipenv install --dev
pipenv shell
CLI Usage
DexTrace exposes a single CLI entry point:
dextrace --help
APK Metadata
Show hashes, manifest summary, and DEX presence:
dextrace meta sample.apk
DEX Header
Parse and display full DEX header fields:
dextrace dex --header sample.apk
DEX Summary
Show a concise overview of DEX structure:
dextrace dex --summary sample.apk
🔗 API Tracing Commands
Stage 2 – API Calls
dextrace dex --apis sample.apk
Stage 3 – API Sets
dextrace dex --api-sets sample.apk
Stage 4 – API Sequences
dextrace dex --api-seq sample.apk
JSON Output
All commands support structured JSON output:
dextrace dex --api-seq --json sample.apk
⚙️ VM Execution (dextrace run)
Execute a single method with the Dalvik VM and print its return value.
The input may be a .dex file or a .apk (the embedded DEX is loaded automatically).
dextrace run --help
Run an entry method by signature:
dextrace run sample.dex --entry 'Lp1;->main()I'
Pass arguments (--arg/-a, repeatable; ints are auto-detected from decimal or 0x hex,
everything else is a string). Use --args for an explicit JSON list of mixed int/string:
dextrace run sample.dex --entry 'Lp2/Fib;->fib(I)I' --arg 10
dextrace run sample.dex --entry 'Lp1;->main()I' --args '["+15555550100","hi"]'
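The argument rules described above can be approximated as follows. This is a sketch of the stated behaviour, not the CLI's actual parser: `parse_arg` and `parse_args_json` are hypothetical names, and accepting a leading minus sign while treating a leading plus sign as a string are assumptions of this sketch.

```python
import json
import re

# Decimal or 0x-prefixed hex, optionally negative (the sign rule is assumed).
_INT_RE = re.compile(r"-?(0[xX][0-9a-fA-F]+|[0-9]+)")


def parse_arg(text: str):
    """Auto-detect an int from decimal or 0x hex; otherwise keep the string."""
    match = _INT_RE.fullmatch(text)
    if match is None:
        return text
    digits = match.group(1)
    value = int(digits, 16) if digits[:2].lower() == "0x" else int(digits, 10)
    return -value if text.startswith("-") else value


def parse_args_json(text: str) -> list:
    """Parse an explicit JSON list that may mix ints and strings."""
    values = json.loads(text)
    if not isinstance(values, list):
        raise ValueError("--args expects a JSON list")
    for value in values:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, str)):
            raise ValueError(f"unsupported argument: {value!r}")
    return values
```

An explicit JSON list is the unambiguous route when a string looks numeric, such as a phone number, since the JSON quoting fixes its type.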
Useful flags:
- --json: emit the result (and, with --trace, api_calls) as JSON
- --trace: print the call tree of internal calls and stubbed API calls
- --strict-stubs: treat every unstubbed external call as an error (default: void misses are silent no-ops)
- --dump-regs: print non-zero registers after execution
- --verbose / -v: print [INFO] progress to stderr
Exit codes: 0 success, 1 user error (bad args / method not found), 2 VM runtime error, 3 parse error.
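When driving the command from a script, the exit codes above can be mapped back to their meaning. The wrapper below is a sketch that assumes `dextrace` is on `PATH`; `describe_exit_code` and `run_entry` are hypothetical helper names, and only the `run` subcommand and `--entry` flag documented above are used.

```python
import subprocess

# Meanings taken from the exit-code list in this README.
EXIT_MEANINGS = {
    0: "success",
    1: "user error (bad args / method not found)",
    2: "VM runtime error",
    3: "parse error",
}


def describe_exit_code(code: int) -> str:
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_entry(target: str, entry: str, extra_args=()):
    """Run `dextrace run` and return (exit code, meaning, captured stdout)."""
    cmd = ["dextrace", "run", target, "--entry", entry, *extra_args]
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, describe_exit_code(proc.returncode), proc.stdout
```

Separating code 2 from code 3 lets a batch job tell a sample that failed to parse apart from one that parsed but raised during execution.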
Example Output
Stage 2
{
"dex": {
"summary": {
"magic": "dex\n035\u0000",
"version": "035",
"file_size": 717940,
"string_ids_size": 6285,
"method_ids_size": 5455,
"class_defs_size": 534
},
"api_calls": [
{
"caller": {
"class": "Landroid/support/v4/accessibilityservice/AccessibilityServiceInfoCompat;",
"method": "<clinit>",
"proto": "()V"
},
"invoke": {
"opcode": "invoke-direct",
"offset": 16
},
"callee": {
"class": "Landroid/support/v4/accessibilityservice/AccessibilityServiceInfoCompat$AccessibilityServiceInfoJellyBeanMr2;",
"method": "<init>",
"proto": "()V"
}
}
],
"api_calls_count": 1
}
}
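A downstream consumer might flatten this JSON into one line per cross-reference. The sketch below relies only on the keys visible in the example above (`dex`, `api_calls`, `caller`, `invoke`, `callee`); `format_xrefs` is a hypothetical helper, and other outputs (for instance multi-DEX APKs) may be shaped differently.

```python
import json


def _sig(ref: dict) -> str:
    """Render a class/method/proto record as a smali-style signature."""
    return f"{ref['class']}->{ref['method']}{ref['proto']}"


def format_xrefs(report_json: str) -> list:
    """Turn a Stage 2 JSON report into readable caller/callee lines."""
    report = json.loads(report_json)
    lines = []
    for call in report["dex"]["api_calls"]:
        invoke = call["invoke"]
        lines.append(
            f"{_sig(call['caller'])} "
            f"[{invoke['opcode']} @ {invoke['offset']}] "
            f"{_sig(call['callee'])}"
        )
    return lines
```

Each line keeps the opcode and offset next to the pair of signatures, so a mismatch against another analysis core can be traced back to a specific instruction.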
Running Tests
Run the full test suite:
pytest
If you use the Pipenv workflow, run it through Pipenv instead:
pipenv run pytest
Run a targeted test file:
pytest tests/test_dex_parser.py
Run tests by keyword:
pytest -k api_extractor
Suggested subsystem-oriented test runs
- CLI changes: pytest tests/test_cli_meta.py tests/test_smoke.py
- APK / metadata changes: pytest tests/test_apk_reader.py tests/test_apk_metadata.py
- manifest changes: pytest tests/test_manifest_parser.py
- DEX parser changes: pytest tests/test_dex_parser.py tests/test_dex_header.py
- API extraction changes: pytest tests/test_dex_api_extractor.py
- Dalvik / disassembly changes: pytest -k disassembler
- VM execution / dextrace run changes: pytest -k vm
Development Notes
DexTrace is organized by subsystem, so contributors should usually:
- identify the affected subsystem first
- make the smallest targeted change possible
- run the closest subsystem tests first
- broaden validation only if the change touches shared logic
- update documentation when contributor-facing behavior changes
When DexTrace is used under Quark Engine, Quark-facing mismatches should be investigated conservatively and evidence-first. Preserve:
- APK identifier or sample path
- rule IDs
- exact commands used
- DexTrace output
- comparison-core output such as Androguard
- diff excerpts
- current hypothesis
Prefer wording such as:
- inconsistent API matching
- resolution difference
- invoke extraction gap
until the exact root cause is verified in code and tests.
Documentation
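Additional contributor documentation is kept under docs/:
- docs/modules-overview.md: module-by-module handoff guide
- docs/development-workflow.md: contributor workflow and validation notes
- docs/current-status.md: current state, known gaps, handoff notes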
Samples and Build Artifacts
The repository may include:
- extracted sample APK directories for validation or reproduction
- generated build artifacts under dist/
These are useful for testing and packaging, but they are not the core implementation surface.
Relationship with Quark Engine
DexTrace can be used as an analysis core under Quark Engine. In that setup:
- DexTrace is responsible for parsing APK / DEX input and extracting evidence
- Quark Engine is responsible for higher-level rule matching and scoring
When validating Quark-facing behavior, comparisons should keep the APK, rule set, and Quark version fixed while only changing the analysis core.
License
See LICENSE.