Fully automated traffic analysis with nPrint

nprintML

nprintML bridges the gap between nPrint, which generates standard fingerprints for packets, and AutoML, which allows for optimized model training and traffic analysis. nprintML enables users with network traffic and labels to perform optimized packet-level traffic analysis without writing any code.

Getting It

Dependencies

Python versions 3.6 through 3.8 are supported.

You might check what versions of Python are installed on your system, e.g.:

ls -1 /usr/bin/python*

As needed, consult your package manager or python.org.

Depending on your situation, consider pyenv for easy installation and management of arbitrary versions of Python.
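
The supported range above can also be checked from within the interpreter you intend to use. The following is a minimal sketch; the helper name is_supported is illustrative and not part of nprintML.

```python
import sys

# Supported range taken from the text above: Python 3.6 through 3.8.
MIN_SUPPORTED = (3, 6)
MAX_SUPPORTED = (3, 8)


def is_supported(version=None):
    """Return True if the (major, minor) version is within the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    status = "supported" if is_supported() else "outside the supported range"
    print("Python", current, "is", status)
```

Run it with each candidate interpreter (for instance, one listed by the ls command above) to see whether that interpreter qualifies.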

nprintML further requires nPrint (see below).

Installation

nprintML itself is available for download from the Python Package Index (PyPI) and via pip:

python -m pip install nprintml

This downloads, builds and installs the nprintml console command. If you're happy to manage your Python (virtual) environment, you're all set with the above.

That said, installation of this command via a tool such as pipx is strongly encouraged. pipx will ensure that nprintML is installed into its own virtual environment, such that its third-party libraries do not conflict with any others installed on your system.

(Note that nPrint and nprintML are unrelated to the PyPI distribution named "nprint.")

Post-installation

nprintML depends on the nPrint command, which may be installed separately (with reference to the nPrint documentation).

For quick-and-easy satisfaction of this requirement, nprintML supplies the bootstrapping command nprint-install, which is made available to your environment with nprintML installed. This command will inspect its execution environment and attempt to retrieve, compile and install nPrint with the most appropriate defaults:

nprint-install

nPrint may thereby be installed system-globally, to the user environment, to the (virtual) environment to which nprintML was installed, or to a specified path prefix. Consult the command's --help for more information.
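
To confirm afterwards that nPrint is reachable from the current environment, a quick look-up on the command path may help. This sketch assumes that the nPrint executable is named nprint; the helper find_nprint is hypothetical and not part of nprintML.

```python
import shutil


def find_nprint(command="nprint"):
    """Return the full path of the executable, or None if it is not on PATH.

    The default name "nprint" is an assumption about how nPrint is installed.
    """
    return shutil.which(command)


if __name__ == "__main__":
    path = find_nprint()
    if path is None:
        print("nPrint not found on PATH; consider running nprint-install")
    else:
        print("nPrint found at", path)
```

If nPrint was installed into a virtual environment, run the check with that environment activated, since the look-up only considers the current PATH.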

nprint-install is identically available through its Python module (no different from pip above):

python -m nprintml.net.install

Further set-up

nprintML leverages AutoGluon to manage AutoML. However, it does not by default install additional libraries required for all models supported by AutoGluon. If you wish to test these models, you will need to install their requirements manually.

During operation, AutoGluon will itself note which models it is unable to generate, and how to satisfy their requirements.

For more information, consult the AutoGluon documentation.

Using It

nprintML supplies the top-level shell command nprintml:

nprintml ...

as well as its terse alias nml:

nml ...

In case of command path ambiguity and in support of debugging, the nprintml command is also available through its Python module:

python -m nprintml ...

The nprintML traffic analysis pipeline is customizable. Traffic ingestion leverages nPrint, and as such supports its inputs. Beyond a single PCAP file, nprintML may also ingest multiple PCAP files and recursive directories of files, as outlined in the wiki.

A simple example involves per-packet machine learning given a single PCAP and IP address labels:

nprintml --ipv4 --pcap-file test.pcap --label-file labels.txt --aggregator index

The above instructs nprintML to execute a traffic analysis pipeline considering each packet in the file test.pcap as a sample, and to attach labels to each source IP address (nPrint's default index) as specified in labels.txt.
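
When the pipeline is launched from a script rather than typed at a shell, the same invocation can be assembled as an argument list. The sketch below uses only the flags shown in the example above and the python -m nprintml entry point; the function names build_command and run_pipeline are illustrative.

```python
import subprocess
import sys


def build_command(pcap_file, label_file, aggregator="index"):
    """Assemble the argument list for the per-packet example above."""
    return [
        sys.executable, "-m", "nprintml",
        "--ipv4",
        "--pcap-file", str(pcap_file),
        "--label-file", str(label_file),
        "--aggregator", aggregator,
    ]


def run_pipeline(pcap_file, label_file, aggregator="index"):
    """Run the pipeline; requires nprintML, nPrint and the input files."""
    return subprocess.run(
        build_command(pcap_file, label_file, aggregator),
        check=True,  # raise CalledProcessError on a non-zero exit status
    )


if __name__ == "__main__":
    # Only print the command here; call run_pipeline(...) to execute it.
    print(" ".join(build_command("test.pcap", "labels.txt")))
```

Using sys.executable with -m ensures that the nprintML installed for the running interpreter is the one invoked, which matches the module form described earlier for cases of command path ambiguity.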

The label file should be formatted as follows:

Item,Label  # (optional header line)
IP1,label1
IP2,label2
IP3,label3
...

Through this labeling scheme we can attach labels to ports, IP addresses, and entire flows with nprintML. For more information and advanced usage, see the wiki.
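
A label file in the format above can be produced from an in-memory mapping. The following is a minimal sketch; the function name render_label_file, the example addresses (taken from the documentation address range) and the label values are all made up for illustration.

```python
import csv
import io


def render_label_file(labels, header=True):
    """Render item/label pairs as comma-separated lines, one pair per line."""
    buffer = io.StringIO()
    # Use plain "\n" line endings rather than the csv module's default "\r\n".
    writer = csv.writer(buffer, lineterminator="\n")
    if header:
        writer.writerow(["Item", "Label"])  # the optional header line
    for item, label in labels.items():
        writer.writerow([item, label])
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder addresses and labels, for illustration only.
    example = {"192.0.2.10": "benign", "192.0.2.11": "scanner"}
    print(render_label_file(example), end="")
```

The rendered text can then be saved, for example as labels.txt, and passed to --label-file.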

In another example, every PCAP is considered to contain a single sample. The following command, this time using terse aliases, creates a machine learning pipeline from every PCAP file in the directory pcaps/ and the labels in labels.txt, using IPv4 nPrints:

nml -4 --pcap-dir pcaps/ -L labels.txt -a pcap

The label file for the above follows the same format as in single PCAP usage, with only the Item column changing to specify file names as opposed to IP addresses:

item,label  # (optional header line)
path/name1.pcap,label1
path/name2.pcap,label2
path/name3.pcap,label3
...

Note that the path/ in the above example is relative to the directory specified by --pcap-dir, that is, relative to the directory pcaps/.
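
Because the Item column must hold paths relative to the PCAP directory, it can be convenient to generate the rows by walking that directory. The sketch below assumes that PCAP files carry the .pcap extension and, purely for illustration, that the label is the name of the first sub-directory; label_rows and that labeling rule are hypothetical.

```python
from pathlib import Path


def label_rows(pcap_dir, label_for):
    """Return sorted (relative path, label) pairs for PCAP files under pcap_dir.

    label_for is a caller-supplied function mapping a relative Path to a label.
    """
    root = Path(pcap_dir)
    rows = []
    for pcap in sorted(root.rglob("*.pcap")):
        relative = pcap.relative_to(root)  # relative to --pcap-dir
        rows.append((relative.as_posix(), label_for(relative)))
    return rows


if __name__ == "__main__":
    directory = Path("pcaps")
    if directory.is_dir():
        # Hypothetical rule: the first sub-directory name is the label.
        for item, label in label_rows(directory, lambda rel: rel.parts[0]):
            print("{},{}".format(item, label))
```

With this rule, a file directly inside pcaps/ would be labeled with its own file name, so a real labeling function should handle that case explicitly.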

Development

Development requirements may be installed via the dev extra (the command below assumes a source checkout; the quotes keep shells such as zsh from interpreting the square brackets):

pip install --editable ".[dev]"

(Note: the installation flag --editable is also used above to instruct pip to place the source checkout directory itself onto the Python path, to ensure that any changes to the source are reflected in Python imports.)

Development tasks are then managed via argcmdr sub-commands of manage … (as defined by the repository module manage.py), e.g.:

manage version patch -m "initial release of nprintml" \
       --build                                        \
       --release
