Elastic Malware Benchmark for Empowering Researchers
EMBER Feature Extraction
This repository allows the user to easily create a dataset using EMBERv3 features, starting from a collection of PE files.
If you want to work with the EMBER2017 dataset (containing features from 1.1 million PE files scanned in or before 2017), the EMBER2018 dataset (containing features from 1 million PE files scanned in or before 2018), or EMBER2024, please refer to the official repository.
Details of the selected features are available here: https://arxiv.org/pdf/2506.05074
Prerequisites
- Make sure Docker is installed and running.
Usage:
- Clone the repository and change directory:
  git clone git@github.com:w-disaster/ember.git && cd ember
- Set up the directory containing the PE files. The directory should have the following structure:
  <YOUR_PE_MALWARE_DIR>/
  ├── <FAMILY_0>/
  │   ├── SHA_0_0
  │   ├── SHA_0_1
  │   ├── ...
  │   └──
  ├── <FAMILY_1>/
  │   ├── SHA_1_0
  │   ├── ...
  │   └──
  ├── ...
  └──
  where FAMILY_0, FAMILY_1, ... are directories named after the malware family and SHA_0_0, SHA_0_1, ... are the PE files, each named by its SHA256 hash. The directory structure doesn't change if you want to do malware detection: simply create two directories, benign and malicious, in place of the malware families.
- Configure the environment variables and run the static feature extraction:
  MALWARE_DIR_PATH=<YOUR_MALWARE_DIR>
  PE_DATASET_NAME=<YOUR_PE_DATASET_NAME>
  EMBER_DATA_DIR=<YOUR_EMBER_OUTPUT_DIR>
  docker run \
    --name ember-feature-extraction \
    -e MALWARE_DIR_PATH=/usr/input_data/malware/ \
    -e FINAL_DATASET_FILENAME=/usr/app/dataset/$PE_DATASET_NAME.pkl \
    -e N_PROCESSES=64 \
    -v $MALWARE_DIR_PATH:/usr/input_data/malware/ \
    -v $EMBER_DATA_DIR:/usr/app/dataset/ \
    ghcr.io/malware-concept-drift-detection/ember-features-extraction:master
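If your PE files are not yet arranged in the family/SHA256 layout described in the second step, a small script can build it. The following is a minimal sketch and not part of this repository: the helper names `sha256_of`, `add_sample`, and `samples_per_family` are hypothetical, and it assumes you already know the family (or the benign/malicious label) of each file.

```python
import hashlib
import shutil
from collections import Counter
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def add_sample(dataset_dir: Path, family: str, src: Path) -> Path:
    """Copy `src` to <dataset_dir>/<family>/<sha256 of src> and return the new path.

    Hypothetical helper: `family` is a malware family name, or
    "benign" / "malicious" for a detection dataset.
    """
    family_dir = dataset_dir / family
    family_dir.mkdir(parents=True, exist_ok=True)
    dest = family_dir / sha256_of(src)
    shutil.copyfile(src, dest)
    return dest


def samples_per_family(dataset_dir: Path) -> Counter:
    """Count the regular files found directly inside each family directory."""
    counts = Counter()
    for family_dir in dataset_dir.iterdir():
        if family_dir.is_dir():
            counts[family_dir.name] = sum(
                1 for p in family_dir.iterdir() if p.is_file()
            )
    return counts
```

The directory passed as `dataset_dir` is the one you would then assign to MALWARE_DIR_PATH before running the container. Checking `samples_per_family` first is a quick way to spot empty or misnamed family directories.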