Network anomaly detection via machine learning
netml is a network anomaly detection library written in Python.
This library contains two primary submodules:
- pcap parser: pparser parses pcaps into flow features, using Scapy.
- novelty detection modeling: ndm detects novelties / anomalies with different models, such as OCSVM.
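To make "parsing pcaps to flow features" more concrete, here is a minimal sketch of the first step: grouping packets into flows by their five-tuple. It is not netml's implementation. The `Packet` type, `group_into_flows`, and the `min_packets` parameter are hypothetical names used only for illustration, and dropping short flows is an assumption about what a minimum-packet threshold would do.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical, simplified view of a captured packet."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    timestamp: float


FlowKey = Tuple[str, str, int, int, str]


def group_into_flows(packets: Iterable[Packet],
                     min_packets: int = 2) -> Dict[FlowKey, List[float]]:
    """Group packet timestamps by five-tuple and drop flows that are too short."""
    flows: Dict[FlowKey, List[float]] = defaultdict(list)
    for pkt in packets:
        key = (pkt.src_ip, pkt.dst_ip, pkt.src_port, pkt.dst_port, pkt.protocol)
        flows[key].append(pkt.timestamp)
    # Keep only flows with at least `min_packets` packets, sorted by arrival time.
    return {key: sorted(ts) for key, ts in flows.items() if len(ts) >= min_packets}


# Made-up packets: two belong to the same TCP flow, one is a lone UDP packet.
packets = [
    Packet('10.0.0.1', '10.0.0.2', 1234, 80, 'TCP', 0.00),
    Packet('10.0.0.1', '10.0.0.2', 1234, 80, 'TCP', 0.25),
    Packet('10.0.0.3', '10.0.0.2', 5555, 53, 'UDP', 0.10),
]
flows = group_into_flows(packets, min_packets=2)
print(len(flows))
```

With these made-up packets, only the TCP flow survives the threshold; the single UDP packet is discarded.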
Installation
From a repository clone:
pip install .
Use
PCAP to features
import os
from netml.pparser.parser import PCAP
from netml.utils.tool import dump_data
RANDOM_STATE = 42
pcap_file = 'data/demo.pcap'
pp = PCAP(pcap_file, flow_ptks_thres=2, verbose=10, random_state=RANDOM_STATE)
# extract flows from pcap
pp.pcap2flows(q_interval=0.9)
# assign a label to each flow
label_file = 'data/demo.csv'
pp.label_flows(label_file=label_file)
# extract features from each flow given feat_type
feat_type = 'IAT'
pp.flow2features(feat_type, fft=False, header=False)
# dump data to disk
X, y = pp.features, pp.labels
out_dir = os.path.join('out', os.path.dirname(pcap_file))
dump_data((X, y), out_file=f'{out_dir}/demo_{feat_type}.dat')
print(pp.features.shape, pp.pcap2flows.tot_time, pp.flow2features.tot_time)
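The 'IAT' feature type used above refers to packet inter-arrival times. As a rough illustration of the idea, and not netml's actual code, the sketch below computes the gaps between consecutive packet timestamps of one flow and pads or truncates them to a fixed dimension so that every flow yields a vector of the same length. The function names `inter_arrival_times` and `fixed_length`, and the zero-padding choice, are assumptions made for this example.

```python
from typing import List, Sequence


def inter_arrival_times(timestamps: Sequence[float]) -> List[float]:
    """Return the gaps between consecutive packet timestamps of one flow."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def fixed_length(values: Sequence[float], dim: int, pad: float = 0.0) -> List[float]:
    """Truncate or pad a feature vector so that it has exactly `dim` entries."""
    clipped = list(values[:dim])
    return clipped + [pad] * (dim - len(clipped))


# Hypothetical packet arrival times (in seconds) for a single flow.
flow_timestamps = [0.0, 0.5, 1.5, 3.0]
iat = inter_arrival_times(flow_timestamps)
features = fixed_length(iat, dim=5)
print(features)  # [0.5, 1.0, 1.5, 0.0, 0.0]
```

A fixed dimension matters because most detection models expect every sample to have the same number of features, while flows naturally differ in packet count.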
Novelty detection
import os
from sklearn.model_selection import train_test_split
from netml.ndm.model import MODEL
from netml.ndm.ocsvm import OCSVM
from netml.utils.tool import dump_data, load_data
RANDOM_STATE = 42
# load data
data_file = 'out/data/demo_IAT.dat'
X, y = load_data(data_file)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=RANDOM_STATE)
# create detection model
model = OCSVM(kernel='rbf', nu=0.5, random_state=RANDOM_STATE)
model.name = 'OCSVM'
ndm = MODEL(model, score_metric='auc', verbose=10, random_state=RANDOM_STATE)
# learn the model from the training set
ndm.train(X_train, y_train)
# evaluate the learned model
ndm.test(X_test, y_test)
# dump the model and its history to disk
out_dir = os.path.dirname(data_file)
dump_data((model, ndm.history), out_file=f'{out_dir}/{ndm.model_name}-results.dat')
print(ndm.train.tot_time, ndm.test.tot_time, ndm.score)
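The example above scores the model with score_metric='auc'. For readers unfamiliar with the metric, the following is a small, library-independent sketch of what AUC measures: the probability that a randomly chosen anomalous sample receives a higher anomaly score than a randomly chosen normal one, with ties counted as half. It is not how netml computes the score. The function name `auc_score` and the convention that label 1 means "anomaly" are assumptions for this illustration.

```python
from typing import Sequence


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC: fraction of (anomaly, normal) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError('AUC needs at least one sample of each class')
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # anomaly scored higher: correct ordering
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Made-up labels (1 = anomaly, 0 = normal) and anomaly scores.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(auc_score(y_true, y_score))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of anomalous and normal samples. The pairwise loop is quadratic, which is fine for illustration but slow on large test sets.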
For more examples, please check the 'examples' directory.
Architecture
- docs/: includes all documents (such as APIs)
- examples/: includes toy examples and datasets for you to experiment with
- ndm/: includes different detection models (such as OCSVM)
- pparser/: includes pcap processing (feature extraction from pcap)
- scripts/: includes miscellaneous scripts (such as xxx.sh, make)
- tests/: includes test cases
- utils/: includes common functions (such as load data and dump data)
- visul/: includes visualization functions
- LICENSE.txt
- readme.md
- requirements.txt
- setup.py
To Do
The current version implements only basic functions, which still need further evaluation and optimization.
- Evaluate 'pparser' performance on different pcaps
- Add setup.py for 'install'
- Add 'test' cases
- Add license
- Add more examples
- Generate docs from docstrings automatically
Comments and suggestions that make the library more robust and easier to use are welcome!
Thanks
netml is based on the initial work of the "Outlier Detection" library odet 🙌