This package contains only the Pydantic model for Maco.

Maco - Malware config extractor framework

Maco is a framework for malware config extractors.

It aims to solve two problems:

  • Define a standardized ontology (or model) for extractor output. This makes it much easier to store extracted values in a database.
  • Provide a standard way of identifying which parsers to run and how to execute them.

Maco components

  • model.py
    • A data model for the common output of an extractor
  • extractor.py
    • Base class for extractors to implement
  • collector.py
    • Utilities for loading and running extractors
  • cli.py
    • A CLI tool, maco, to assist with running your extractors locally
  • base_test.py
    • Helpers for writing unit tests for your extractors

Note: If you only need the model in your project, you can pip install maco-model, a smaller package containing only the model definition.

Project Integrations 🛠️

This framework is actively being used by:

  • A malware analysis platform that uses the MACO model to export malware configuration extractions into a parseable, machine-friendly format.
  • configextractor-py: a tool designed to run extractors from multiple frameworks that uses the MACO model for output harmonization.
  • A robust, multiprocessing-capable, multi-family RAT config parser/extractor that is compatible with MACO.
  • A parser/extractor repository containing MACO extractors that's authored by the CAPE community but is integrated in CAPE deployments.
    • Note: These MACO extractors wrap and parse the original CAPE extractors.
  • A parser/extractor repository containing MACO extractors that's authored by the SEKOIA community.
  • A malware knowledge base designed for malware archiving, analytics and clustering that supports MACO extractors.

Model Example

See the model definition for all the supported fields. You can use the model independently of the rest of the framework. This is still useful for compatibility between systems!

from maco import model
# 'family' is the only required property on the model
output = model.ExtractorModel(family="wanabee")
output.version = "2019"  # variant first found in 2019
output.category.extend([model.CategoryEnum.cryptominer, model.CategoryEnum.clickfraud])
output.http.append(
    model.ExtractorModel.Http(
        protocol="https",
        uri="https://bad-domain.com/c2_payload",
        usage="c2",
    )
)
output.tcp.append(model.ExtractorModel.Connection(server_ip="127.0.0.1", usage="ransom"))
output.campaign_id.append("859186-3224-9284")
output.inject_exe.append("explorer.exe")
output.binaries.append(
    output.Binary(
        data=b"sam I am",
        datatype=output.Binary.TypeEnum.config,
        encryption=output.Binary.Encryption(
            algorithm="rot26",
            mode="block",
        ),
    )
)
# data about the malware that doesn't fit the model
output.other["author_lunch"] = "green eggs and ham"
output.other["author_lunch_time"] = "3pm"
print(output.model_dump(exclude_defaults=True))

# Generated model
{
    'family': 'wanabee',
    'version': '2019',
    'category': ['cryptominer', 'clickfraud'],
    'campaign_id': ['859186-3224-9284'],
    'inject_exe': ['explorer.exe'],
    'other': {'author_lunch': 'green eggs and ham', 'author_lunch_time': '3pm'},
    'http': [{'uri': 'https://bad-domain.com/c2_payload', 'usage': 'c2', 'protocol': 'https'}],
    'tcp': [{'server_ip': '127.0.0.1', 'usage': 'ransom'}],
    'binaries': [{
        'datatype': 'config', 'data': b'sam I am',
        'encryption': {'algorithm': 'rot26', 'mode': 'block'}
    }]
}

And you can create model instances from dictionaries:

from maco import model
output = {
    "family": "wanabee2",
    "version": "2022",
    "ssh": [
        {
            "username": "wanna",
            "password": "bee2",
            "hostname": "10.1.10.100",
        }
    ],
}
print(model.ExtractorModel(**output))

# Generated model
family='wanabee2' version='2022' category=[] attack=[] capability_enabled=[]
capability_disabled=[] campaign_id=[] identifier=[] decoded_strings=[]
password=[] mutex=[] pipe=[] sleep_delay=None inject_exe=[] other={}
binaries=[] ftp=[] smtp=[] http=[]
ssh=[SSH(username='wanna', password='bee2', hostname='10.1.10.100', port=None, usage=None)]
proxy=[] dns=[] tcp=[] udp=[] encryption=[] service=[] cryptocurrency=[]
paths=[] registry=[]

Extractor Example

The following extractor will trigger on any file with more than 50 ELF sections, and set some properties in the model.

Your extractors will do a better job of finding useful information than this one!

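from io import BytesIO
from typing import List, Optional

# Assumption: recent maco releases expose a yara wrapper as maco.yara;
# with older releases, use `import yara` instead.
from maco import extractor, model, yara


class Elfy(extractor.Extractor):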
    """Check basic elf property."""

    family = "elfy"
    author = "blue"
    last_modified = "2022-06-14"
    yara_rule = """
        import "elf"

        rule Elfy
        {
            condition:
                elf.number_of_sections > 50
        }
        """

    def run(
        self, stream: BytesIO, matches: List[yara.Match]
    ) -> Optional[model.ExtractorModel]:
        # return config model formatted results
        ret = model.ExtractorModel(family=self.family)
        # the list for campaign_id already exists and is empty, so we just add an item
        ret.campaign_id.append(str(len(stream.read())))
        return ret
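
To make the rule's condition concrete, the following is a minimal standard-library sketch of roughly what elf.number_of_sections reports: the e_shnum field of the ELF header. The function name elf_section_count is hypothetical and not part of Maco or YARA, and the sketch ignores the extended numbering used by files with a very large number of sections.

```python
import struct
from io import BytesIO


def elf_section_count(stream: BytesIO) -> int:
    """Return e_shnum from an ELF header, or -1 if the data is not ELF.

    Hypothetical helper for illustration only; not part of Maco or YARA.
    """
    start = stream.tell()
    header = stream.read(64)  # an ELF64 header is 64 bytes, ELF32 is 52
    stream.seek(start)  # restore the position for whoever reads next
    if len(header) < 52 or header[:4] != b"\x7fELF":
        return -1
    is_64bit = header[4] == 2  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little, 2 = big
    offset = 0x3C if is_64bit else 0x30  # position of e_shnum
    if len(header) < offset + 2:
        return -1
    (count,) = struct.unpack_from(endian + "H", header, offset)
    return count
```

A file would satisfy the Elfy rule when this count is greater than 50.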

Writing Extractors

There are several examples that use Maco in the 'demo_extractors' folder.

Some things to keep in mind:

  • The Yara rule names must be prefixed with the extractor class name.
    • e.g. Class 'MyScript' has Yara rules named 'MyScriptDetect1' and 'MyScriptDetect2', not 'Detect1'
  • You can load other scripts contained within the same folder via a Python relative import
    • See complex.py for details
  • You can standardise your usage of the 'other' dict
    • This is optional, see limit_other.py for details
    • Consider instead making a PR with the properties you are frequently using
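
The naming rule in the first point can be checked before an extractor is loaded. The sketch below is a standard-library approximation, not Maco's own validation: the helper badly_named_rules is hypothetical, the regular expression does not account for YARA comments, and it assumes the prefix comparison is case-sensitive.

```python
import re
from typing import List

# Matches "rule Name" at the start of a line, optionally preceded by
# the "private" or "global" modifiers.
RULE_NAME = re.compile(r"^\s*(?:(?:private|global)\s+)*rule\s+(\w+)", re.MULTILINE)


def badly_named_rules(class_name: str, yara_source: str) -> List[str]:
    """Return the rule names that are not prefixed with class_name.

    Hypothetical helper for illustration only; not part of Maco.
    """
    return [
        name
        for name in RULE_NAME.findall(yara_source)
        if not name.startswith(class_name)
    ]


source = """
rule MyScriptDetect1 { condition: true }
rule Detect2 { condition: true }
"""
print(badly_named_rules("MyScript", source))
```

Here only Detect2 is reported, because it lacks the MyScript prefix.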

Requirements

Python 3.8+.

Install this package with pip install maco.

All required Python packages are in the requirements.txt.

CLI Usage

> maco --help
usage: maco [-h] [-v] [--pretty] [--base64] [--logfile LOGFILE] [--include INCLUDE] [--exclude EXCLUDE] [-f] [--create_venv] extractors samples

Run extractors over samples.

positional arguments:
  extractors         path to extractors
  samples            path to samples

optional arguments:
  -h, --help         show this help message and exit
  -v, --verbose      print debug logging. -v extractor info, -vv extractor debug, -vvv cli debug
  --pretty           pretty print json output
  --base64           Include base64 encoded binary data in output (can be large, consider printing to file rather than console)
  --logfile LOGFILE  file to log output
  --include INCLUDE  comma separated extractors to run
  --exclude EXCLUDE  comma separated extractors to not run
  -f, --force        ignore yara rules and execute all extractors
  --create_venv      Creates venvs for every requirements.txt found (only applies when extractor path is a directory)
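
When driving the CLI from a script, it can help to assemble the argument list in one place. This is a small sketch that uses only flags shown in the help text above; the function build_maco_command is hypothetical and not part of Maco.

```python
from typing import List, Optional, Sequence


def build_maco_command(
    extractors: str,
    samples: str,
    include: Optional[Sequence[str]] = None,
    exclude: Optional[Sequence[str]] = None,
    pretty: bool = False,
    force: bool = False,
) -> List[str]:
    """Build an argument list for the maco CLI (hypothetical helper)."""
    cmd = ["maco"]
    if pretty:
        cmd.append("--pretty")
    if force:
        cmd.append("--force")
    if include:
        # --include takes comma separated extractor names
        cmd += ["--include", ",".join(include)]
    if exclude:
        # --exclude takes comma separated extractor names
        cmd += ["--exclude", ",".join(exclude)]
    # positional arguments: path to extractors, then path to samples
    cmd += [extractors, samples]
    return cmd


print(build_maco_command("demo_extractors/", "/usr/lib", include=["Complex"]))
```

The resulting list can be passed to subprocess.run on a system where maco is installed.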

CLI output example

The CLI is helpful for using your extractors in a standalone system, such as in a reverse engineering environment.

> maco demo_extractors/ /usr/lib --include Complex
extractors loaded: ['Complex']

complex by blue 2022-06-14 TLP:WHITE
This script has multiple yara rules and coverage of the data model.

path: /usr/lib/udev/hwdb.bin
run Complex extractor from rules ['ComplexAlt']
{"family": "complex", "version": "5", "decoded_strings": ["Paradise"],
"binaries": [{"datatype": "payload", "size": 9, "hex_sample": "736F6D652064617461", "sha256": "1307990e6ba5ca145eb35e99182a9bec46531bc54ddf656a602c780fa0240dee",
"encryption": {"algorithm": "something"}}],
"http": [{"protocol": "https", "hostname": "blarg5.com", "path": "/malz/9956330", "usage": "c2"}],
"encryption": [{"algorithm": "sha256"}]}

path: /usr/lib/udev/hwdb.d/20-OUI.hwdb
run Complex extractor from rules ['ComplexAlt']
{"family": "complex", "version": "5", "decoded_strings": ["Paradise"],
"binaries": [{"datatype": "payload", "size": 9, "hex_sample": "736F6D652064617461", "sha256": "1307990e6ba5ca145eb35e99182a9bec46531bc54ddf656a602c780fa0240dee",
"encryption": {"algorithm": "something"}}],
"http": [{"protocol": "https", "hostname": "blarg5.com", "path": "/malz/1986908", "usage": "c2"}],
"encryption": [{"algorithm": "sha256"}]}

path: /usr/lib/udev/hwdb.d/20-usb-vendor-model.hwdb
run Complex extractor from rules ['ComplexAlt']
{"family": "complex", "version": "5", "decoded_strings": ["Paradise"],
"binaries": [{"datatype": "payload", "size": 9, "hex_sample": "736F6D652064617461", "sha256": "1307990e6ba5ca145eb35e99182a9bec46531bc54ddf656a602c780fa0240dee",
"encryption": {"algorithm": "something"}}],
"http": [{"protocol": "https", "hostname": "blarg5.com", "path": "/malz/1257481", "usage": "c2"}],
"encryption": [{"algorithm": "sha256"}]}


15884 analysed, 3 hits, 3 extracted

The demo extractors are designed to trigger when run over the 'demo_extractors' folder.

e.g. maco demo_extractors demo_extractors
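
Each result in the CLI output example is a JSON object, so it can be post-processed with the standard library. The sketch below parses one object copied from that output, rebuilds the C2 URL and decodes the binary sample. It assumes hex_sample holds the complete data, which matches the reported size of 9 here but may not hold for larger binaries.

```python
import json

# One result object copied from the CLI output example.
raw = (
    '{"family": "complex", "version": "5", "decoded_strings": ["Paradise"], '
    '"binaries": [{"datatype": "payload", "size": 9, '
    '"hex_sample": "736F6D652064617461", '
    '"sha256": "1307990e6ba5ca145eb35e99182a9bec46531bc54ddf656a602c780fa0240dee", '
    '"encryption": {"algorithm": "something"}}], '
    '"http": [{"protocol": "https", "hostname": "blarg5.com", '
    '"path": "/malz/9956330", "usage": "c2"}], '
    '"encryption": [{"algorithm": "sha256"}]}'
)

config = json.loads(raw)

# Rebuild URLs for the http entries that are marked as C2.
c2_urls = [
    f'{h["protocol"]}://{h["hostname"]}{h["path"]}'
    for h in config.get("http", [])
    if h.get("usage") == "c2"
]

# Decode the hex-encoded sample of each extracted binary.
sample_bytes = [bytes.fromhex(b["hex_sample"]) for b in config.get("binaries", [])]

print(c2_urls)       # ['https://blarg5.com/malz/9956330']
print(sample_bytes)  # [b'some data']
```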

Contributions

Please use ruff to format and lint PRs. Unformatted or unlinted code may be the cause of PR test failures.

Ruff will attempt to fix most issues, but some may require manual resolution.

pip install ruff
ruff format
ruff check --fix
