Network anomaly detection via machine learning
netml
netml is a network anomaly detection tool & library written in Python.
The library contains two primary submodules:
- pparser: pcap parser. Parses pcaps to produce flow features using Scapy.
- ndm: novelty detection modeling. Detects novelties / anomalies via different models, such as OCSVM.
The tool's command-line interface is documented by its built-in help flags -h and --help:
netml --help
Installation
netml is available on PyPI:
pip install netml
Or, from a repository clone:
pip install .
CLI
The CLI tool is available as a distribution "extra":
pip install netml[cli]
Or:
pip install .[cli]
Tab-completion
Shell tab-completion is provided by argcomplete (through argcmdr). Completion code appropriate to your shell may be generated by register-python-argcomplete, e.g.:
register-python-argcomplete --shell=bash netml
The results of the above should be evaluated, e.g.:
eval "$(register-python-argcomplete --shell=bash netml)"
Or, to ensure the above is evaluated for every session, e.g.:
register-python-argcomplete --shell=bash netml > ~/.bash_completion
For more information, refer to argcmdr: Shell completion.
Use
All of the below may be wrapped up into a single command via the CLI:
netml --pcap=data/demo.pcap \
--label=data/demo.csv \
--output=out/OCSVM-results.dat
PCAP to features
To only extract features via the CLI:
netml extract \
--pcap=data/demo.pcap \
--label=data/demo.csv \
--feature=out/IAT-features.dat
Or in Python:
import os
from netml.pparser.parser import PCAP
from netml.utils.tool import dump_data
RANDOM_STATE = 42
pcap_file = 'data/demo.pcap'
pp = PCAP(pcap_file, flow_ptks_thres=2, verbose=10, random_state=RANDOM_STATE)
# extract flows from pcap
pp.pcap2flows(q_interval=0.9)
# label each flow with a label
label_file = 'data/demo.csv'
pp.label_flows(label_file=label_file)
# extract features from each flow given feat_type
feat_type = 'IAT'
pp.flow2features(feat_type, fft=False, header=False)
# dump data to disk
X, y = pp.features, pp.labels
out_dir = os.path.join('out', os.path.dirname(pcap_file))
dump_data((X, y), out_file=f'{out_dir}/demo_{feat_type}.dat')
print(pp.features.shape, pp.pcap2flows.tot_time, pp.flow2features.tot_time)
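To make the steps above more concrete, the following standalone sketch illustrates the general idea of turning packets into flows and flows into features. It is not netml's implementation: it assumes that 'IAT' refers to packet inter-arrival times, that a flow is identified by its five-tuple, and that flows with fewer than two packets are dropped (mirroring the threshold of 2 passed to PCAP above). The packet records and the helper names group_into_flows and inter_arrival_times are hypothetical.

```python
from collections import defaultdict

# Hypothetical packet records, made up for illustration (not read from data/demo.pcap):
# (timestamp_seconds, src_ip, dst_ip, src_port, dst_port, protocol)
packets = [
    (0.00, "10.0.0.1", "10.0.0.2", 5000, 80, "TCP"),
    (0.10, "10.0.0.1", "10.0.0.2", 5000, 80, "TCP"),
    (0.15, "10.0.0.3", "10.0.0.2", 6000, 53, "UDP"),
    (0.35, "10.0.0.1", "10.0.0.2", 5000, 80, "TCP"),
]


def group_into_flows(packets, min_packets=2):
    """Group packet timestamps by five-tuple; keep flows with >= min_packets packets."""
    flows = defaultdict(list)
    for timestamp, *five_tuple in packets:
        flows[tuple(five_tuple)].append(timestamp)
    return {
        key: sorted(times)
        for key, times in flows.items()
        if len(times) >= min_packets
    }


def inter_arrival_times(timestamps):
    """Return the gaps between consecutive packet timestamps of one flow."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


# Assumed pipeline: packets -> flows -> one list of inter-arrival times per flow.
flows = group_into_flows(packets)
iat_features = {key: inter_arrival_times(times) for key, times in flows.items()}

for key, iats in iat_features.items():
    print(key, iats)
```

In this toy input, the single-packet UDP flow is filtered out and only the three-packet TCP flow yields features. Real flows differ in length, so a fixed-size feature matrix needs some truncation or padding step, which this sketch leaves out.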
Novelty detection
To analyze already-extracted features via the CLI:
netml analyze \
--feature=out/IAT-features.dat \
--output=out/OCSVM-results.dat
Or in Python:
import os
from sklearn.model_selection import train_test_split
from netml.ndm.model import MODEL
from netml.ndm.ocsvm import OCSVM
from netml.utils.tool import dump_data, load_data
RANDOM_STATE = 42
# load data
data_file = 'out/data/demo_IAT.dat'
X, y = load_data(data_file)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=RANDOM_STATE)
# create detection model
model = OCSVM(kernel='rbf', nu=0.5, random_state=RANDOM_STATE)
model.name = 'OCSVM'
ndm = MODEL(model, score_metric='auc', verbose=10, random_state=RANDOM_STATE)
# learn the model from the training set
ndm.train(X_train, y_train)
# evaluate the learned model
ndm.test(X_test, y_test)
# dump data to disk
out_dir = os.path.dirname(data_file)
dump_data((model, ndm.history), out_file=f'{out_dir}/{ndm.model_name}-results.dat')
print(ndm.train.tot_time, ndm.test.tot_time, ndm.score)
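The train / test / score workflow above can be illustrated without netml or scikit-learn. The sketch below deliberately swaps OCSVM for a much simpler centroid-distance scorer and computes AUC by hand, to show what a score metric such as 'auc' measures. Everything in it is an assumption for illustration: the synthetic data, the choice to train on normal samples only, and the helper names fit_centroid, anomaly_score and auc.

```python
import math
import random


def fit_centroid(rows):
    """'Train' by computing the per-feature mean of the training rows."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def anomaly_score(row, centroid):
    """Euclidean distance from the centroid; larger means more anomalous."""
    return math.dist(row, centroid)


def auc(scores, labels):
    """Probability that a random anomaly (label 1) outscores a random normal (label 0).

    Ties count as half a win, which matches the usual ROC AUC definition.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Synthetic two-feature data: normal points near (0, 0), anomalies near (6, 6).
rng = random.Random(42)
normal = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
anomalies = [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(20)]

# Train on normal samples only; test on held-out normals plus the anomalies.
X_train = normal[:150]
X_test = normal[150:] + anomalies
y_test = [0] * 50 + [1] * 20

centroid = fit_centroid(X_train)
scores = [anomaly_score(row, centroid) for row in X_test]
test_auc = auc(scores, y_test)
print(f"AUC: {test_auc:.3f}")
```

An AUC near 1 means anomalies almost always receive higher scores than normal samples, while 0.5 corresponds to random ranking. A kernel method such as OCSVM follows the same fit-then-score pattern but can model normal regions that are not centered on a single point.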
For more examples, see the examples/ directory in the source repository.
Architecture
- docs/: includes all documents (such as APIs)
- examples/: includes toy examples and datasets to experiment with
- ndm/: includes different detection models (such as OCSVM)
- pparser/: includes pcap processing (feature extraction from pcap)
- scripts/: miscellaneous scripts (such as xxx.sh, make)
- tests/: includes test cases
- utils/: includes common functions (such as load data and dump data)
- visul/: includes visualization functions
- LICENSE.txt
- readme.md
- requirements.txt
- setup.py
To Do
The current version implements only basic functionality, which still needs further evaluation and optimization.
- Evaluate 'pparser' performance on different pcaps
- Add 'test' cases
- Add license
- Add more examples
- Generate docs from docstrings automatically
Comments and suggestions that make netml more robust and easier to use are welcome!
Development
Development dependencies may be installed via the dev extras (below assuming a source checkout):
pip install --editable .[dev]
(Note: the installation flag --editable is also used above to instruct pip to place the source checkout directory itself onto the Python path, to ensure that any changes to the source are reflected in Python imports.)
Development tasks are then managed via argcmdr sub-commands of manage … (as defined by the repository module manage.py), e.g.:
manage version patch -m "initial release of netml" \
--build \
--release
Thanks
netml is based on the initial work of the "Outlier Detection" library odet 🙌