Static Features Extraction Engine

This project extracts static features from Windows PE files. These features have been shown to be effective for malware family classification.

Specifically, the list of chosen features and the extraction process itself follow the paper: Decoding the Secrets of Machine Learning in Malware Classification: A Deep Dive into Datasets, Feature Extraction, and Model Performance.

The project was carried out as part of my Master's thesis: Clustering Windows Malware using Static Features and Concept Drift Detection.

Prerequisites

Make sure Docker is installed and the Docker daemon is running.

Usage

  • Configure the Docker Compose file by providing the following information:
    • MALWARE_DIR_PATH: the path of the directory where all the PE files are stored. The directory should group malware samples by family, so it should contain $n$ subdirectories, one per family, where $n$ is the number of families;
    • VT_REPORTS_PATH: the path of the VirusTotal reports file. Each line of this file should be a separate JSON object containing the report for a single PE file;
    • MERGE_DATASET_PATH: the path of the dataset to be produced from the VT reports file, containing [SHA256, family, submission-date] for each file;
    • FINAL_DATASET_DIR: the path of the directory where the final dataset with the extracted features will be stored.
  • Deploy the engine to start the extraction process:
    docker compose up -d
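
To make the expected inputs concrete, here is a minimal Python sketch of how a family-grouped sample directory and a one-JSON-per-line reports file could be combined into [SHA256, family, submission-date] rows. It is not the engine's actual implementation. The function names are hypothetical, and the sketch assumes that each sample file is named after its SHA256, that reports follow a VirusTotal API v3 style layout (`data.attributes.sha256` and `data.attributes.first_submission_date` as a Unix timestamp), and that the merged dataset is written as CSV. Adjust these to match your own data.

```python
import csv
import json
from datetime import datetime, timezone
from pathlib import Path


def index_families(malware_dir):
    """Map each sample file name (assumed to be its SHA256) to its family directory."""
    families = {}
    for family_dir in sorted(Path(malware_dir).iterdir()):
        if not family_dir.is_dir():
            continue  # only the per-family subdirectories are relevant
        for sample in family_dir.iterdir():
            if sample.is_file():
                families[sample.name.lower()] = family_dir.name
    return families


def merge_reports(vt_reports_path, families):
    """Yield (sha256, family, submission_date) for every report with a known sample."""
    with open(vt_reports_path, encoding="utf-8") as reports:
        for line in reports:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between reports
            report = json.loads(line)  # one JSON object per line
            # Assumed field layout (VirusTotal API v3 style); adapt as needed.
            attrs = report.get("data", {}).get("attributes", {})
            sha256 = attrs.get("sha256", "").lower()
            first_seen = attrs.get("first_submission_date")
            if sha256 not in families or first_seen is None:
                continue  # skip reports without a matching sample or date
            date = datetime.fromtimestamp(first_seen, tz=timezone.utc).date()
            yield sha256, families[sha256], date.isoformat()


def write_merge_dataset(rows, merge_dataset_path):
    """Write the merged rows as CSV (the output format is an assumption)."""
    with open(merge_dataset_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["SHA256", "family", "submission-date"])
        writer.writerows(rows)
```

Running `index_families` on the directory you plan to pass as MALWARE_DIR_PATH is also a quick way to check the layout before starting the container: the number of distinct values in the returned mapping should equal the number of families you expect.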
    

Authors

  • Luca Fabri

Download files

Source Distribution

dts_cdd_wdis-1.3.0.tar.gz (27.5 kB view details)

Uploaded Source

Built Distribution

dts_cdd_wdis-1.3.0-py3-none-any.whl (36.9 kB view details)

Uploaded Python 3

File details

Details for the file dts_cdd_wdis-1.3.0.tar.gz.

File metadata

  • Download URL: dts_cdd_wdis-1.3.0.tar.gz
  • Size: 27.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.12.3 Linux/6.11.0-1018-azure

File hashes

Hashes for dts_cdd_wdis-1.3.0.tar.gz
Algorithm Hash digest
SHA256 0a1b5e37f7fd13fe6d455b3357d4e70224d7a19a57044130d1f61a66e1b6d73e
MD5 4c841ecf42a1e0e1dab2370c6246f327
BLAKE2b-256 b46d5d66ab1368c54c15763ad1811c3c07d8f935e3299c8829cd6ef65021a975

File details

Details for the file dts_cdd_wdis-1.3.0-py3-none-any.whl.

File metadata

  • Download URL: dts_cdd_wdis-1.3.0-py3-none-any.whl
  • Size: 36.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.12.3 Linux/6.11.0-1018-azure

File hashes

Hashes for dts_cdd_wdis-1.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 772b6c1ea1955d37e5284b14144ea9d334c83be79de35adec902c5819ca7daaf
MD5 7900d606cca4a0497c0964e5a77c3b97
BLAKE2b-256 1fd4b9b3ac9bd8fdcb9c13088bfbf63a40975e8cbe2f0cf9b31084446fb2797e
