Skip to main content

Extract logs based off events from sysmon. Comes as a package, cli and ui.

Project description

Sysmon Extract

Sysmon Extract is a library to extract events from the sysmon log type based off the event id. They can be extracted as a file (any big data format) with support for HDFS or in memory as a Spark or Pandas DataFrame. As a note, this library works best with Spark as it leverages it for the ETL process.

The library comes with a library, cli and UI.

Table of Contents

Usage

Command Line

Usage: sysxtract [OPTIONS]

Options:

  -i, --input-file PATH
  -h, --header
  -e, --event TEXT
  -lc, --log-column TEXT
  -ec, --event-column TEXT       [default: ]
  -a, --additional-columns TEXT
  -o, --output-file TEXT         [default: /home/sidhu/sysmon-extract/sysmon-output.csv]
  -s, --single-file
  -m, --master TEXT              [default: local]
  -ui, --start-ui
  --help                         Show this message and exit.

sysxtract -i /media/sidhu/Seagate/empire_apt3_2019-05-14223117.json -e 1 -e 2 -lc log_name -ec event_data -s -a host.name -o /home/sidhu/output.json

Let's break it down.

Input file: -i /media/sidhu/Seagate/empire_apt3_2019-05-14223117.json

Sysmon Events to extract: -e 1 -e 2

Column in the dataset that describes the log source (Sysmon, Microsoft Security, Microsoft Audit, etc.): -lc log_name

Column in the dataset that contains the nested sysmon data (often event_data): -ec event_data

Output as a single file: -s

Additional columns to extract: -a host.name

Output file name: /home/sidhu/output.json

UI

sysextract -ui

Alt Text

Package

Using the example above:

from sysxtract import extract

# Extract to a file
extract(
    "/media/sidhu/Seagate/empire_apt3_2019-05-14223117.json",
    [1, 2],
    log_column="log_name",
    event_column="event_data",
    additional_columns="host.name",
    single_file=True,
    output_file="/home/sidhu/output.json"
)

# Extract to a file using an existing Spark cluster
extract(
    "/media/sidhu/Seagate/empire_apt3_2019-05-14223117.json",
    [1, 2],
    log_column="log_name",
    event_column="event_data",
    additional_columns="host.name",
    single_file=True,
    output_file="/home/sidhu/output.json",
    master="spark://HOST:PORT" # mesos://HOST:PORT for yarn/mesos cluster
)

# Extract to a file using an existing spark session
extract(
    "/media/sidhu/Seagate/empire_apt3_2019-05-14223117.json",
    [1, 2],
    log_column="log_name",
    event_column="event_data",
    additional_columns="host.name",
    single_file=True,
    output_file="/home/sidhu/output.json",
    spark_sess=spark, # spark session variable, usually named spark
)

# Extract to a Spark DataFrame
# NOTE: Must provide an existing Spark Session
extract(
    "/media/sidhu/Seagate/empire_apt3_2019-05-14223117.json",
    [1, 2],
    log_column="log_name",
    event_column="event_data",
    additional_columns="host.name",
    single_file=True,
    spark_sess=spark, # spark session variable, usually named spark
    as_spark_frame=True
)

# Extract to a Pandas DataFrame
df = extract(
    "/media/sidhu/Seagate/empire_apt3_2019-05-14223117.json",
    [1, 2],
    log_column="log_name",
    event_column="event_data",
    additional_columns="host.name",
    single_file=True,
    as_pandas_frame=True
)

# Extract using SparkDf as input
# NOTE: Must provide an existing Spark Session
df = extract(
    spark_df,
    [1, 2],
    log_column="log_name",
    event_column="event_data",
    additional_columns="host.name",
    single_file=True,
    as_pandas_frame=True
)

# Extract using PandasDf as input
# NOTE: To use a Pandas DataFrame as input and a Spark DataFrame as output, a Spark Session must be provided.
df = extract(
    pandas_df,
    [1, 2],
    log_column="log_name",
    event_column="event_data",
    additional_columns="host.name",
    single_file=True,
    as_pandas_frame=True
)

Installation

pip install sysxtract

Since this library leverages Spark, specifically PySpark, you need to install it manually. This allows for version compatability when connecting to existing clusters.

pip install pyspark==$VERSION.

If you're going to use spark locally:

pip install pyspark

Feedback

I appreciate any feedback so if you have any feature requests or issues make an issue with the appropriate tag or futhermore, send me an email at sidhuashton@gmail.com

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sysxtract-1.0.0.tar.gz (9.4 kB view details)

Uploaded Source

Built Distribution

sysxtract-1.0.0-py3-none-any.whl (9.6 kB view details)

Uploaded Python 3

File details

Details for the file sysxtract-1.0.0.tar.gz.

File metadata

  • Download URL: sysxtract-1.0.0.tar.gz
  • Upload date:
  • Size: 9.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.8.3

File hashes

Hashes for sysxtract-1.0.0.tar.gz
Algorithm Hash digest
SHA256 c4ee9baae292e66075a34abc6ccd34f6a7c24f50c01d1014297c51b9cb20153a
MD5 ddfa84484d970ee0e2d098ea78ac613a
BLAKE2b-256 fcbb63054759cf82d0fd044ffff133e58aa5a0769ffed2c695f75a7d0672d755

See more details on using hashes here.

File details

Details for the file sysxtract-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: sysxtract-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 9.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.8.3

File hashes

Hashes for sysxtract-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 66a3e33ee300172ad7a7e55b2715c6dd74f5aebc282c8cc397180de127f28153
MD5 839fb8ac00be9773aef55d83bd512c0d
BLAKE2b-256 9f109765067b0cc0a24407ddf3614e976ac8d573c7dfadb3c54b9c2c645288b5

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page