Write signatures to automatically match Java classes and methods between versions

sigmatcher

Sigmatcher automates the matching of Java classes and methods across different versions of an application. It applies signatures to the smali (disassembled Java code) to identify and correlate code elements, which makes it well suited to long-running reverse engineering projects.

Installation

Before installing sigmatcher, ensure you have the following prerequisites installed:

  • ripgrep: A command-line search tool that recursively searches your current directory for a regex pattern. Install ripgrep by following the instructions on its GitHub page.
  • apktool: A tool for reverse engineering and disassembling Android apk files. Install apktool by following the instructions on its official website.

Then clone the repository and install it with pip:

git clone https://github.com/oriori1703/sigmatcher.git
pip install ./sigmatcher

Quick Usage

To get started with sigmatcher, follow these steps:

  1. Create a Signature File: Signature files (.yaml) define the patterns and signatures that Sigmatcher will use to analyze the APK files. These files should specify the classes, methods, and fields you're interested in, along with any version-specific information. See the example in the Creating Signature Files section for the format.

  2. Analyze your app package: With your signature file ready, you can analyze different Android package inputs:

    • single APK (.apk)
    • bundle archive (.apkm / .xapk)
    • directory containing split APK parts (*.apk, searched recursively)

    Use the sigmatcher analyze command, specifying the input path and the signature file(s):

    sigmatcher analyze path/to/your/app.apk --signatures path/to/your/signature_file.yaml
    
    sigmatcher analyze path/to/your/app.apkm --signatures path/to/your/signature_file.yaml
    
    sigmatcher analyze path/to/your/split-apks/ --signatures path/to/your/signature_file.yaml
    

    This command will decode the package parts, apply the signatures, and output the analysis results, highlighting matched classes, methods and fields.

Creating Signature Files

Signature files are YAML formatted documents that sigmatcher uses to identify and match Java classes, methods, and fields in APK files. These files allow you to specify the elements you're interested in tracking across different versions of an application.

Signature File JSON Schema

To help you create a signature file, sigmatcher provides a JSON schema that you can use to validate your signatures and to get autocompletion and IntelliSense in your IDE. You can generate it by running the following command:

sigmatcher schema --output definitions.schema.json

You can add one of the following comments to the top of your signature file, depending on your IDE:

IntelliJ IDEs:

# $schema: ./definitions.schema.json

yaml-language-server IDEs (VS Code, Neovim, etc.):

# yaml-language-server: $schema=./definitions.schema.json

You can also combine them to support both:

# $schema: ./definitions.schema.json
# yaml-language-server: $schema=./definitions.schema.json

Structure of a Signature File

A signature file consists of a list of definitions, where each definition represents a class, method, or field you want to match. Each definition can include one or more signatures, which are patterns sigmatcher will use to find matches in the smali code.

Here's a basic example of what a signature file looks like:

# $schema: ./definitions.schema.json
# yaml-language-server: $schema=./definitions.schema.json

- name: "ConnectionManager"
  package: "com.example.package.network"
  signatures:
    - signature: 'ConnectionManager/openConnection: could not open connection due to a DNS error'
      type: regex
      count: 1
  methods:
    - name: "read"
      signatures:
        - signature: 'const-string v\d+, "Failed to read data from the server"'
          type: regex
          count: 1
          version_range: ">=1.0.0, <1.3.7"
        - signature: 'const-string v\d+, "Failed to read data because of a network error"'
          type: regex
          count: 1
          version_range: ">=1.3.7"
  fields:
    - name: "socket"
      signatures:
        - signature: '^\.field private final (?P<match>.+:Ljava/net/Socket;)'
          type: regex
          count: 1

Key Components

  • name: The name of the class, method, or field.
  • methods: A list of method definitions within a class. Follows a similar structure to the class definition.
  • fields: A list of field definitions within a class. Follows a similar structure to the class definition.
  • exports: A list of export definitions within a class. Exports can be any string in the code. They are mainly used in combination with macros to create more complex signatures.
  • signatures: A list of signatures for the class, method, or field. Each signature includes:
    • type: The type of signature (currently either regex or glob).
    • signature: The pattern to match, depending on the signature type. For classes and methods, the pattern only needs to match anywhere within the class/method. For fields and exports, it needs to match the full field expression/export string, i.e. by using the match capture group for regex signatures.
    • count: The number of times the signature should appear to be considered a match. Can be either an integer or a string of the form "min-max". Defaults to 1.
    • version_range: Optional. Specifies the application versions this signature applies to, using version specifiers like those used by pip and described in PEP-440. This can also contain a list of specifiers, which are combined with a logical "or".

Most of these fields are optional, and you can use them as needed.
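As a rough illustration of how count and version_range could be interpreted, here is a minimal Python sketch. It is not sigmatcher's implementation: the helper names (parse_count, count_matches, version_applies) are hypothetical, and the version comparison is deliberately simplified to dotted numeric versions, whereas full PEP 440 specifiers also cover pre-releases, wildcards, and more. Within one specifier string, comma-separated clauses must all hold; across a list of specifier strings, any one may hold.

```python
import operator
import re

# Hypothetical helpers for illustration only; not part of sigmatcher's API.
_OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
        "!=": operator.ne, ">": operator.gt, "<": operator.lt}


def parse_count(count):
    """Turn a count (int or "min-max" string) into an inclusive (min, max) pair."""
    if isinstance(count, int):
        return count, count
    low, high = count.split("-", 1)
    return int(low), int(high)


def count_matches(pattern, text, count=1):
    """True if the regex occurs in text a number of times allowed by count."""
    low, high = parse_count(count)
    occurrences = len(re.findall(pattern, text, flags=re.MULTILINE))
    return low <= occurrences <= high


def _parse_version(text):
    # Simplification: dotted integers only, trailing zeros ignored (1.0 == 1.0.0).
    parts = [int(part) for part in text.strip().split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def _clause_holds(version, clause):
    clause = clause.strip()
    # Two-character operators are tried before their one-character prefixes.
    for symbol in (">=", "<=", "==", "!=", ">", "<"):
        if clause.startswith(symbol):
            wanted = _parse_version(clause[len(symbol):])
            return _OPS[symbol](_parse_version(version), wanted)
    raise ValueError(f"unsupported clause in this sketch: {clause!r}")


def version_applies(version, version_range=None):
    """Clauses in one string are ANDed; entries of a list are ORed."""
    if version_range is None:
        return True  # no range given: the signature applies to every version
    ranges = [version_range] if isinstance(version_range, str) else version_range
    return any(
        all(_clause_holds(version, clause) for clause in spec.split(","))
        for spec in ranges
    )
```

With the ranges from the example above, version_applies("1.2.0", ">=1.0.0, <1.3.7") holds, while version 1.3.7 falls through to the second signature with ">=1.3.7".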

Using Macros in Signatures

Macros allow you to reference properties from other matched results within your signatures, enabling dynamic and context-aware pattern matching. Macros are particularly useful when you need to create signatures that depend on information from previously matched classes, methods, fields, or exports.

Macro Syntax

Macros use the format ${<result_name>.<property>}, where:

  • result_name is the name of another definition in your signature file
  • property is a property of the matched result object

Available Properties

Depending on the type of result, different properties are available:

For Classes:

  • name: The class name (e.g., "ConnectionManager")
  • package: The package name (e.g., "com.example.package.network")
  • full_name: The complete class name with package (e.g., "com.example.package.network.ConnectionManager")
  • java: The Java representation (e.g., "Lcom/example/package/network/ConnectionManager;")
  • fields.FieldName: Access to specific field results (e.g., fields.socket returns the matched field object)
  • methods.MethodName: Access to specific method results (e.g., methods.read returns the matched method object)
  • exports.ExportName: Access to specific export results (e.g., exports.someExport returns the matched export object)

For Methods:

  • name: The method name (e.g., "read")
  • argument_types: The method argument types (e.g., "Ljava/lang/String;")
  • return_type: The method return type (e.g., "V")
  • java: The complete Java representation (e.g., "read(Ljava/lang/String;)V")

For Fields:

  • name: The field name (e.g., "socket")
  • type: The field type (e.g., "Ljava/net/Socket;")
  • java: The complete Java representation (e.g., "socket:Ljava/net/Socket;")

For Exports:

  • value: The exported string value
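To make the relationship between these properties concrete, the following Python sketch derives the composite values (full_name and java) from the basic ones, using the example values listed above. The class names ClassResult, MethodResult, and FieldResult are hypothetical stand-ins and may not correspond to sigmatcher's internal types.

```python
from dataclasses import dataclass


# Hypothetical result types for illustration; not sigmatcher's own classes.
@dataclass(frozen=True)
class ClassResult:
    name: str
    package: str

    @property
    def full_name(self):
        # Package and class name joined with a dot.
        return f"{self.package}.{self.name}"

    @property
    def java(self):
        # Smali-style type descriptor: L<slash-separated path>;
        return "L" + self.full_name.replace(".", "/") + ";"


@dataclass(frozen=True)
class MethodResult:
    name: str
    argument_types: str
    return_type: str

    @property
    def java(self):
        # name(<argument descriptors>)<return descriptor>
        return f"{self.name}({self.argument_types}){self.return_type}"


@dataclass(frozen=True)
class FieldResult:
    name: str
    type: str

    @property
    def java(self):
        # name:<type descriptor>
        return f"{self.name}:{self.type}"


manager = ClassResult("ConnectionManager", "com.example.package.network")
read = MethodResult("read", "Ljava/lang/String;", "V")
socket_field = FieldResult("socket", "Ljava/net/Socket;")
```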

Macro Example

Here's an example showing how macros can be used to create interdependent signatures:

# $schema: ./definitions.schema.json
# yaml-language-server: $schema=./definitions.schema.json

- name: "ConnectionManager"
  package: "com.example.package.network"
  signatures:
    - signature: 'ConnectionManager/openConnection: could not open connection due to a DNS error'
      type: regex
      count: 1
  fields:
    - name: "socket"
      signatures:
        - signature: '^\.field private final (?P<match>.+:Ljava/net/Socket;)'
          type: regex
          count: 1

- name: "NetworkHandler"
  package: "com.example.package.network"
  signatures:
    - signature: 'new-instance v\d+, ${ConnectionManager.java}'
      type: regex
      count: 1
  methods:
    - name: "handleConnection"
      signatures:
        - signature: 'iget-object v\d+, v\d+, ${ConnectionManager.fields.socket.java}'
          type: regex
          count: 1

In this example:

  • The NetworkHandler class uses a macro to reference the Java representation of the ConnectionManager class
  • The handleConnection method uses a macro to reference the socket field from the ConnectionManager class
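The substitution step can be pictured with the short Python sketch below, which expands ${...} macros against a nested dictionary of already-matched results. This is an assumption-laden illustration rather than sigmatcher's code: the function expand_macros and the layout of the results dictionary are hypothetical, and the sketch inserts values verbatim, since this document does not say whether sigmatcher escapes them for regex use.

```python
import re

# Matches ${Name.prop.subprop}; the dotted path is captured in group 1.
MACRO = re.compile(r"\$\{([A-Za-z_][\w.]*)\}")


def expand_macros(signature, results):
    """Replace each ${path} with the value found by walking the path in results."""
    def lookup(match):
        value = results
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if the result was never matched
        return str(value)

    return MACRO.sub(lookup, signature)


# Hypothetical matched results, using the example values from this document.
results = {
    "ConnectionManager": {
        "java": "Lcom/example/package/network/ConnectionManager;",
        "fields": {"socket": {"java": "socket:Ljava/net/Socket;"}},
    }
}

class_signature = expand_macros(
    r"new-instance v\d+, ${ConnectionManager.java}", results
)
method_signature = expand_macros(
    r"iget-object v\d+, v\d+, ${ConnectionManager.fields.socket.java}", results
)
```

In this sketch a macro that points at a missing result raises a KeyError, which loosely mirrors the rule that a signature cannot match when one of its macros cannot be resolved.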

Important Notes

  • Definition Order Doesn't Matter: Sigmatcher automatically sorts the dependency graph, so macros can reference results that are defined later in the YAML file
  • Macros are resolved at analysis time after the dependency graph is sorted
  • If a macro references a result that cannot be matched, the signature will fail to match
  • Use the java property when you need the complete Java/Smali representation of a class, method, or field
  • Macros work with both regex and glob signature types
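One way to picture the dependency sorting is the Python sketch below, which collects the definition names referenced by macros and orders them with the standard library's graphlib.TopologicalSorter. How sigmatcher actually builds and sorts its graph is not described here, so treat the function names (iter_signatures, analysis_order) and the approach as assumptions.

```python
import re
from graphlib import TopologicalSorter

# Captures the definition name at the start of a macro such as ${Name.java}.
MACRO_TARGET = re.compile(r"\$\{([A-Za-z_]\w*)[.}]")


def iter_signatures(definition):
    """Yield every signature pattern of a definition and of its nested members."""
    for entry in definition.get("signatures", []):
        yield entry["signature"]
    for kind in ("methods", "fields", "exports"):
        for child in definition.get(kind, []):
            yield from iter_signatures(child)


def analysis_order(definitions):
    """Return definition names ordered so that macro targets come first."""
    graph = {}
    for definition in definitions:
        dependencies = set()
        for signature in iter_signatures(definition):
            dependencies.update(MACRO_TARGET.findall(signature))
        # Assumption of this sketch: a reference to the definition itself is ignored.
        dependencies.discard(definition["name"])
        graph[definition["name"]] = dependencies
    # static_order() raises graphlib.CycleError on circular references.
    return list(TopologicalSorter(graph).static_order())


# NetworkHandler is listed first but depends on ConnectionManager.
definitions = [
    {
        "name": "NetworkHandler",
        "signatures": [{"signature": r"new-instance v\d+, ${ConnectionManager.java}"}],
        "methods": [
            {
                "name": "handleConnection",
                "signatures": [
                    {"signature": r"iget-object v\d+, v\d+, ${ConnectionManager.fields.socket.java}"}
                ],
            }
        ],
    },
    {
        "name": "ConnectionManager",
        "signatures": [{"signature": "could not open connection due to a DNS error"}],
    },
]

order = analysis_order(definitions)
```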

License

sigmatcher is distributed under the terms of the MIT license.
