dnfile

Parse .NET executable files.

  • Free software: MIT license

Features

  • Parse as much as we can, even if the file is partially malformed.

  • Easy to use. Developed with IDE autocompletion in mind.

Quick Start

pip install dnfile

Then create a simple program that loads a .NET binary, parses it, and displays information about the streams and Metadata Tables.

import sys
import dnfile

filepath = sys.argv[1]

pe = dnfile.dnPE(filepath)
pe.print_info()

Everything is an object, and raw structure values are stored in an object’s “struct” attribute. The CLR directory entry object is accessible from the “net” attribute of a dnPE object.

import hashlib
import sys

import dnfile

# path to the .NET binary, taken from the command line
FILEPATH = sys.argv[1]

pe = dnfile.dnPE(FILEPATH)

# access the directory entry raw structure values
pe.net.struct

# access the metadata raw structure values
pe.net.metadata.struct

# access the streams
for s in pe.net.metadata.streams_list:
    if isinstance(s, dnfile.stream.MetaDataTables):
        # how many Metadata tables are defined in the binary?
        num_of_tables = len(s.tables_list)

# the last Metadata tables stream can also be accessed by a shortcut
num_of_tables = len(pe.net.mdtables.tables_list)

# create a set to hold the hashes of all resources
res_hash = set()
# access the resources
for r in pe.net.resources:
    # if resource data is a simple byte stream
    if isinstance(r.data, bytes):
        # hash it and add the hash to the set
        res_hash.add(hashlib.sha256(r.data).hexdigest())
    # if resource data is a ResourceSet, a dotnet-specific datatype
    elif isinstance(r.data, dnfile.resource.ResourceSet):
        # if there are no entries
        if not r.data.entries:
            # skip it
            continue
        # for each entry in the ResourceSet
        for entry in r.data.entries:
            # if it has data
            if entry.data:
                # hash it and add the hash to the set
                res_hash.add(hashlib.sha256(entry.data).hexdigest())
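
Because parsing is best-effort, parts of the object tree may be missing for a given file. The following is a minimal sketch of defensive access, assuming that `pe.net` is None when the file has no CLR directory and that `mdtables` is None when no tables stream was parsed. The helper name `summarize_clr` and the stand-in objects are hypothetical and exist only so the example runs without a sample binary.

```python
from types import SimpleNamespace


def summarize_clr(pe):
    """Return a small summary dict for a dnPE-like object (hypothetical helper)."""
    net = getattr(pe, "net", None)
    if net is None:
        # assumption: no CLR directory entry, so probably not a .NET binary
        return {"is_dotnet": False, "tables": 0, "resources": 0}
    mdtables = getattr(net, "mdtables", None)
    resources = getattr(net, "resources", None)
    return {
        "is_dotnet": True,
        # the tables stream may be absent in a malformed file
        "tables": len(mdtables.tables_list) if mdtables is not None else 0,
        "resources": len(resources) if resources else 0,
    }


# Stand-in objects that mimic the attribute layout used above;
# with a real file you would pass dnfile.dnPE(FILEPATH) instead.
native_stub = SimpleNamespace(net=None)
managed_stub = SimpleNamespace(
    net=SimpleNamespace(
        mdtables=SimpleNamespace(tables_list=["Module", "TypeRef", "TypeDef"]),
        resources=[],
    )
)

print(summarize_clr(native_stub))
print(summarize_clr(managed_stub))
```

Checking for None at each level keeps a script usable across a mixed corpus of native, managed, and partially malformed files.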

TODO

  • more tests

  • Documentation on readthedocs

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.15.0 (2024)

  • BREAKING CHANGE: each heap stream’s .get() returns a HeapItem instead of bytes

  • FEATURE: All HeapItem objects include the RVA of where they were retrieved

  • FEATURE: HeapItemBinary objects allow easy access to interpreted item size (CompressedInt)

  • FEATURE: HeapItemString and UserString allow easy access to raw bytes and interpreted value

  • improvements to pypi publishing and tox testing
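
The item size mentioned for HeapItemBinary is a compressed unsigned integer, the variable-length encoding that ECMA-335 (II.23.2) uses to prefix #Blob and #US heap items. As a rough standalone illustration of that encoding (this is not dnfile's own code, and `decode_compressed_uint` is a hypothetical helper name):

```python
from typing import Tuple


def decode_compressed_uint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode an ECMA-335 compressed unsigned integer.

    Returns (value, number_of_bytes_consumed).
    """
    first = data[offset]
    if (first & 0x80) == 0x00:
        # 0xxxxxxx: one byte, 7-bit value
        return first, 1
    if (first & 0xC0) == 0x80:
        # 10xxxxxx: two bytes, 14-bit value, big-endian
        chunk = data[offset:offset + 2]
        if len(chunk) < 2:
            raise ValueError("truncated compressed integer")
        return ((chunk[0] & 0x3F) << 8) | chunk[1], 2
    if (first & 0xE0) == 0xC0:
        # 110xxxxx: four bytes, 29-bit value, big-endian
        chunk = data[offset:offset + 4]
        if len(chunk) < 4:
            raise ValueError("truncated compressed integer")
        value = ((chunk[0] & 0x1F) << 24) | (chunk[1] << 16) | (chunk[2] << 8) | chunk[3]
        return value, 4
    raise ValueError("invalid compressed integer prefix")


# A blob heap item: a length prefix followed by that many payload bytes.
item = bytes([0x03, 0x41, 0x42, 0x43])
size, consumed = decode_compressed_uint(item)
payload = item[consumed:consumed + size]
print(size, payload)
```

The number of prefix bytes is determined by the top bits of the first byte, which is why the decoder returns both the value and the bytes consumed.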

0.14.1 (2023)

  • fix github workflow

0.14.0 (2023)

  • BREAKING CHANGE: Minimum required Python version is now 3.8

  • BUGFIX: ValueError fired before UnicodeDecodeError when parsing assembly resources

  • BUGFIX: mdtable row run-lists of size one were being ignored

  • BUGFIX: some struct file offsets were RVA values

  • FEATURE: Add clr_lazy_load option for lazy loading Metadata tables and assembly resources

  • move from legacy setup.py to pyproject.toml and tox

  • bump dev dependencies: mypy and isort

  • update tests and examples

  • update README badge to use download statistics from pypistats

0.13.0 (2022)

  • BREAKING CHANGE: rename GenericMethod mdtable to MethodSpec per ECMA 335

  • parse more resources, even if there are exceptions

0.12.0 (2022)

  • FEATURE: parse #Schema stream as MetaDataTables

  • BUGFIX: MDTableRow off-by-one for end of run

  • BUGFIX: MethodSemanticsRow typo in the list of tables for the Method index

  • more test data

0.11.0 (2022)

  • FEATURE: access .NET resources (not same as PE resources!) by a shortcut

  • BUGFIX: dnstrings example

  • more attributes default to None

  • update dev dependencies

  • remove some warnings

0.10.0 (2022)

  • BREAKING CHANGE: structure attributes no longer exist by default

  • BREAKING CHANGE: objects’ attributes always exist, but may be None

  • BUGFIX: use last stream if multiple of same name

  • CI: added mypy type checking

  • when stream names are duplicated, behave like the runtime and use the last one for shortcuts

  • add user_strings shortcut

  • able to access MetaDataTables like a 0-based list, with square brackets

  • added use of logging module for warnings

  • better type hints for IDEs

  • improved source comments

  • more tests

0.9.0 (2021)

  • bugfix: row indices parsed in structures are one-based, not zero-based

  • bugfix: TypeDefRow was not parsing Extends coded index

  • bugfix: correct BLOBS_MASK, and skip EXTRA_DATA if the flag is set

  • added CI using github workflow

  • added tests and submodule dnfile-testfiles

  • added style consistency using pycodestyle and isort

  • added more examples

  • parse MetaData tables’ list-type indexes into lists of MDTableRow objects
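
Several of the fixes above concern how table references are stored: row indices are one-based (zero means "no row"), and a coded index such as TypeDef's Extends column packs a table tag into its low bits. A minimal sketch of that layout for the TypeDefOrRef coded index, following ECMA-335 (II.24.2.6); the function names are hypothetical and this is not dnfile's implementation:

```python
from typing import Optional, Tuple

# Tag values for the TypeDefOrRef coded index, in tag order 0, 1, 2.
TYPEDEF_OR_REF = ("TypeDef", "TypeRef", "TypeSpec")
TAG_BITS = 2  # enough bits to encode three possible tables


def decode_typedef_or_ref(value: int) -> Optional[Tuple[str, int]]:
    """Split a coded index into (table name, one-based row index).

    Returns None for a null reference (row index 0).
    """
    tag = value & ((1 << TAG_BITS) - 1)
    row_index = value >> TAG_BITS
    if tag >= len(TYPEDEF_OR_REF):
        raise ValueError("invalid TypeDefOrRef tag: %d" % tag)
    if row_index == 0:
        return None
    return TYPEDEF_OR_REF[tag], row_index


def to_list_position(row_index: int) -> int:
    """Convert a one-based row index to a zero-based list position."""
    return row_index - 1


# 0x0D is 0b1101: tag 0b01 (TypeRef), row index 0b11 (3).
print(decode_typedef_or_ref(0x0D))
print(to_list_position(3))
```

Treating the stored index as zero-based shifts every reference by one row, which is the kind of off-by-one error the entries above describe.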

0.8.0 (2021)

  • bugfix: Metadata Table indexes (i.e. indexes into other tables) were off by one

0.7.1 (2021)

  • bugfix: coded index always None

0.7.0 (2021)

  • bugfix: improper data length check

0.6.0 (2021)

  • bugfix: referenced wrong object

  • parse utf-16 strings in #US stream

0.5.0 (2021-01-29)

  • First release.
