A Flexible Network Data Analysis Framework

Project description

nfstream: a flexible network data analysis framework

nfstream is a Python package providing fast, flexible, and expressive data structures designed to make working with online or offline network data both easy and intuitive. It aims to be the fundamental high-level building block for practical, real-world network data analysis in Python. It also has the broader goal of becoming a common network data processing framework for researchers, providing data reproducibility across experiments.

Main Features

  • Performance: nfstream is designed to be fast (10x faster with PyPy support) with a small CPU and memory footprint.
  • Layer-7 visibility: nfstream's deep packet inspection engine is based on nDPI. It allows nfstream to reliably identify encrypted applications and extract metadata (e.g. TLS, QUIC, TOR, HTTP, SSH, DNS).
  • Flexibility: add a flow feature in 2 lines as an NFPlugin.
  • Machine Learning oriented: add your trained model as an NFPlugin.

How to use it?

  • Dealing with a big pcap file and just want to aggregate it into network flows? nfstream makes this easy in a few lines:
from nfstream import NFStreamer
my_awesome_streamer = NFStreamer(source="facebook.pcap", # or network interface (source="eth0")
                                 snaplen=65535,
                                 idle_timeout=30,
                                 active_timeout=300,
                                 plugins=(),
                                 dissect=True,
                                 max_tcp_dissections=10,
                                 max_udp_dissections=16,
                                 statistics=False,
                                 account_ip_padding_size=False,
                                 enable_guess=True,
                                 decode_tunnels=True,
                                 bpf_filter=None,
                                 promisc=True
)

for flow in my_awesome_streamer:
    print(flow)  # print it.
    print(flow.to_namedtuple()) # convert it to a named tuple.
    print(flow.to_json()) # convert it to json.
    print(flow.keys()) # get flow keys.
    print(flow.values()) # get flow values.
NFEntry(id=0,
        bidirectional_first_seen_ms=1472393122365,
        bidirectional_last_seen_ms=1472393123665,
        src2dst_first_seen_ms=1472393122365,
        src2dst_last_seen_ms=1472393123408,
        dst2src_first_seen_ms=1472393122668,
        dst2src_last_seen_ms=1472393123665,
        src_ip='192.168.43.18',
        dst_ip='66.220.156.68',
        version=4,
        src_port=52066,
        dst_port=443,
        protocol=6,
        vlan_id=4,
        bidirectional_packets=19,
        bidirectional_raw_bytes=5745,
        bidirectional_ip_bytes=5479,
        bidirectional_duration_ms=1300,
        src2dst_packets=9,
        src2dst_raw_bytes=1345,
        src2dst_ip_bytes=1219,
        src2dst_duration_ms=1300,
        dst2src_packets=10,
        dst2src_raw_bytes=4400,
        dst2src_ip_bytes=4260,
        dst2src_duration_ms=997,
        expiration_id=0,
        master_protocol=91,
        app_protocol=119,
        application_name='TLS.Facebook',
        category_name='SocialNetwork',
        client_info='facebook.com',
        server_info='*.facebook.com,*.facebook.net,*.fb.com,\
                     *.fbcdn.net,*.fbsbx.com,*.m.facebook.com,\
                     *.messenger.com,*.xx.fbcdn.net,*.xy.fbcdn.net,\
                     *.xz.fbcdn.net,facebook.com,fb.com,messenger.com',
        j3a_client='bfcc1a3891601edb4f137ab7ab25b840',
        j3a_server='2d1eb5817ece335c24904f516ad5da12')
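
The counters in the entry printed above are related to each other: the bidirectional values are the sum of the two directions, and the bidirectional duration is the gap between the first and last timestamps. The following is a minimal sketch of that arithmetic. It uses a hypothetical `FlowCounters` namedtuple as a stand-in for a few NFEntry fields (values copied from the output above) rather than a real nfstream object.

```python
from collections import namedtuple

# Hypothetical stand-in for a subset of NFEntry fields (not an nfstream class).
FlowCounters = namedtuple(
    "FlowCounters",
    "bidirectional_first_seen_ms bidirectional_last_seen_ms "
    "bidirectional_duration_ms "
    "bidirectional_packets src2dst_packets dst2src_packets "
    "bidirectional_raw_bytes src2dst_raw_bytes dst2src_raw_bytes",
)

# Values copied from the entry printed above.
flow = FlowCounters(
    bidirectional_first_seen_ms=1472393122365,
    bidirectional_last_seen_ms=1472393123665,
    bidirectional_duration_ms=1300,
    bidirectional_packets=19,
    src2dst_packets=9,
    dst2src_packets=10,
    bidirectional_raw_bytes=5745,
    src2dst_raw_bytes=1345,
    dst2src_raw_bytes=4400,
)


def check_counters(f):
    """Return a dict of consistency checks between related counters."""
    return {
        "duration": f.bidirectional_last_seen_ms - f.bidirectional_first_seen_ms
        == f.bidirectional_duration_ms,
        "packets": f.src2dst_packets + f.dst2src_packets
        == f.bidirectional_packets,
        "raw_bytes": f.src2dst_raw_bytes + f.dst2src_raw_bytes
        == f.bidirectional_raw_bytes,
    }


checks = check_counters(flow)
print(checks)
```

Checks like these can be a quick sanity test when post-processing exported flows.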
  • Want per-flow statistical features too? Pass statistics=True:
from nfstream import NFStreamer
my_awesome_streamer = NFStreamer(source="facebook.pcap", statistics=True)
for flow in my_awesome_streamer:
    print(flow)
NFEntry(id=0,      
        bidirectional_first_seen_ms=1472393122365,
        bidirectional_last_seen_ms=1472393123665,
        src2dst_first_seen_ms=1472393122365,
        src2dst_last_seen_ms=1472393123408,
        dst2src_first_seen_ms=1472393122668,
        dst2src_last_seen_ms=1472393123665,
        src_ip='192.168.43.18',
        dst_ip='66.220.156.68',
        version=4,
        src_port=52066,
        dst_port=443,
        protocol=6,
        vlan_id=4,
        bidirectional_packets=19,
        bidirectional_raw_bytes=5745,
        bidirectional_ip_bytes=5479,
        bidirectional_duration_ms=1300,
        src2dst_packets=9,
        src2dst_raw_bytes=1345,
        src2dst_ip_bytes=1219,
        src2dst_duration_ms=1300,
        dst2src_packets=10,
        dst2src_raw_bytes=4400,
        dst2src_ip_bytes=4260,
        dst2src_duration_ms=997,
        expiration_id=0,
        bidirectional_min_raw_ps=66,
        bidirectional_mean_raw_ps=302.36842105263156,
        bidirectional_stdev_raw_ps=425.53315715259754,
        bidirectional_max_raw_ps=1454,
        src2dst_min_raw_ps=66,
        src2dst_mean_raw_ps=149.44444444444446,
        src2dst_stdev_raw_ps=132.20354676701294,
        src2dst_max_raw_ps=449,
        dst2src_min_raw_ps=66,
        dst2src_mean_raw_ps=440.0,
        dst2src_stdev_raw_ps=549.7164925870628,
        dst2src_max_raw_ps=1454,
        bidirectional_min_ip_ps=52,
        bidirectional_mean_ip_ps=288.36842105263156,
        bidirectional_stdev_ip_ps=425.53315715259754,
        bidirectional_max_ip_ps=1440,
        src2dst_min_ip_ps=52,
        src2dst_mean_ip_ps=135.44444444444446,
        src2dst_stdev_ip_ps=132.20354676701294,
        src2dst_max_ip_ps=435,
        dst2src_min_ip_ps=52,
        dst2src_mean_ip_ps=426.0,
        dst2src_stdev_ip_ps=549.7164925870628,
        dst2src_max_ip_ps=1440,
        bidirectional_min_piat_ms=0,
        bidirectional_mean_piat_ms=72.22222222222223,
        bidirectional_stdev_piat_ms=137.34994188549086,
        bidirectional_max_piat_ms=398,
        src2dst_min_piat_ms=0,
        src2dst_mean_piat_ms=130.375,
        src2dst_stdev_piat_ms=179.72036811192467,
        src2dst_max_piat_ms=415,
        dst2src_min_piat_ms=0,
        dst2src_mean_piat_ms=110.77777777777777,
        dst2src_stdev_piat_ms=169.51458475436397,
        dst2src_max_piat_ms=1,
        bidirectional_syn_packets=2,
        bidirectional_cwr_packets=0,
        bidirectional_ece_packets=0,
        bidirectional_urg_packets=0,
        bidirectional_ack_packets=18,
        bidirectional_psh_packets=9,
        bidirectional_rst_packets=0,
        bidirectional_fin_packets=0,
        src2dst_syn_packets=1,
        src2dst_cwr_packets=0,
        src2dst_ece_packets=0,
        src2dst_urg_packets=0,
        src2dst_ack_packets=8,
        src2dst_psh_packets=4,
        src2dst_rst_packets=0,
        src2dst_fin_packets=0,
        dst2src_syn_packets=1,
        dst2src_cwr_packets=0,
        dst2src_ece_packets=0,
        dst2src_urg_packets=0,
        dst2src_ack_packets=10,
        dst2src_psh_packets=5,
        dst2src_rst_packets=0,
        dst2src_fin_packets=0,
        master_protocol=91,
        app_protocol=119,
        application_name='TLS.Facebook',
        category_name='SocialNetwork',
        client_info='facebook.com',
        server_info='*.facebook.com,*.facebook.net,*.fb.com,\
                     *.fbcdn.net,*.fbsbx.com,*.m.facebook.com,\
                     *.messenger.com,*.xx.fbcdn.net,*.xy.fbcdn.net,\
                     *.xz.fbcdn.net,facebook.com,fb.com,messenger.com',
        j3a_client='bfcc1a3891601edb4f137ab7ab25b840',
        j3a_server='2d1eb5817ece335c24904f516ad5da12')
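
The min/mean/stdev/max fields above summarize packet sizes and inter-arrival times per direction. As an illustration of how such values can be maintained one packet at a time, here is a sketch using Welford's online algorithm. This is not taken from nfstream's source: the `RunningStats` class, the sample packet sizes, and the choice of the sample (n - 1) standard deviation are all assumptions made for the example.

```python
import math
import statistics


class RunningStats:
    """Online min/mean/stdev/max over a stream of values (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean
        self.min = None
        self.max = None

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)

    @property
    def stdev(self):
        # Sample standard deviation (n - 1 denominator); an assumption here.
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))


# Hypothetical packet sizes, not taken from facebook.pcap.
sizes = [66, 66, 449, 1454, 66]
stats = RunningStats()
for size in sizes:
    stats.update(size)

# Reference values from the standard library, for comparison.
reference_mean = statistics.mean(sizes)
reference_stdev = statistics.stdev(sizes)

# In the entry above, the mean packet size equals total bytes / packet count.
mean_from_totals = 5745 / 19  # bidirectional_raw_bytes / bidirectional_packets
print(stats.min, stats.mean, stats.stdev, stats.max, mean_from_totals)
```

An online update like this needs only a few numbers per flow, so the individual packet sizes do not have to be stored.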
  • From pcap to Pandas DataFrame?
from nfstream import NFStreamer
my_dataframe = NFStreamer(source='devil.pcap').to_pandas()
my_dataframe.head(5)
  • From pcap to csv file?
flows_rows_count = NFStreamer(source='devil.pcap').to_csv(path="devil.pcap.csv", sep=";")
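
Once flows are exported, the file can be processed with the standard library alone. The sketch below sums bytes per application from a `;`-separated file. It assumes the CSV has a header row whose column names match the flow keys shown earlier; the sample rows are invented for the example, and `bytes_per_application` is a hypothetical helper, not part of nfstream.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt of an exported file; the header is assumed to match flow keys.
sample_csv = (
    "id;src_ip;dst_ip;bidirectional_raw_bytes;application_name\n"
    "0;192.168.43.18;66.220.156.68;5745;TLS.Facebook\n"
    "1;192.168.43.18;8.8.8.8;180;DNS\n"
    "2;192.168.43.18;66.220.156.68;1200;TLS.Facebook\n"
)


def bytes_per_application(handle, sep=";"):
    """Sum bidirectional_raw_bytes per application_name from a CSV file handle."""
    totals = Counter()
    for row in csv.DictReader(handle, delimiter=sep):
        totals[row["application_name"]] += int(row["bidirectional_raw_bytes"])
    return totals


totals = bytes_per_application(io.StringIO(sample_csv))
print(totals.most_common())
```

For a real export, the same function could be called on `open("devil.pcap.csv", newline="")` instead of the in-memory sample.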
  • Didn't find a specific flow feature? Add a plugin to nfstream in a few lines:
from nfstream import NFStreamer, NFPlugin

class packet_with_666_size(NFPlugin):
    def on_init(self, pkt): # flow creation with the first packet
        if pkt.raw_size == 666:
            return 1
        else:
            return 0

    def on_update(self, pkt, flow): # flow update with each packet belonging to the flow
        if pkt.raw_size == 666:
            flow.packet_with_666_size += 1

streamer_awesome = NFStreamer(source='devil.pcap', plugins=[packet_with_666_size()])
for flow in streamer_awesome:
    print(flow.packet_with_666_size) # see your dynamically created metric in generated flows
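
To make the plugin lifecycle concrete, the sketch below replays the same counting logic over a list of packets without nfstream. It assumes, based on the comments in the plugin above, that `on_init` handles the first packet of a flow and `on_update` handles every later packet. `Packet`, `PacketWith666Size`, and `run_plugin` are hypothetical stand-ins written for this example.

```python
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical stand-in for a packet; only raw_size is modeled.
Packet = namedtuple("Packet", "raw_size")


class PacketWith666Size:
    """Same logic as the plugin above, without the NFPlugin base class."""

    def on_init(self, pkt):
        # Initial value of the feature, computed from the first packet.
        return 1 if pkt.raw_size == 666 else 0

    def on_update(self, pkt, flow):
        # Called for each later packet of the flow.
        if pkt.raw_size == 666:
            flow.packet_with_666_size += 1


def run_plugin(plugin, packets, name):
    """Hypothetical driver: on_init for the first packet, on_update afterwards."""
    flow = SimpleNamespace()
    for index, pkt in enumerate(packets):
        if index == 0:
            setattr(flow, name, plugin.on_init(pkt))
        else:
            plugin.on_update(pkt, flow)
    return flow


packets = [Packet(666), Packet(60), Packet(666), Packet(1500)]
flow = run_plugin(PacketWith666Size(), packets, "packet_with_666_size")
print(flow.packet_with_666_size)  # prints 2
```

Testing plugin logic this way, on synthetic packets, can catch mistakes before running it against a capture.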

Run your Machine Learning models

In the following, we want to run an early classification of flows based on a trained machine learning model that takes as features the sizes of the first 3 packets of a flow.

Computing required features

from nfstream import NFPlugin

class feat_1(NFPlugin):
    def on_init(self, obs):
        return obs.raw_size  # initial value: size of the first packet

class feat_2(NFPlugin):
    def on_update(self, obs, entry):
        if entry.bidirectional_packets == 2:
            entry.feat_2 = obs.raw_size  # size of the second packet

class feat_3(NFPlugin):
    def on_update(self, obs, entry):
        if entry.bidirectional_packets == 3:
            entry.feat_3 = obs.raw_size  # size of the third packet

Trained model prediction

class model_prediction(NFPlugin):
    def on_update(self, obs, entry):
        if entry.bidirectional_packets == 3:
            entry.model_prediction = self.user_data.predict_proba([entry.feat_1,
                                                                   entry.feat_2,
                                                                   entry.feat_3])
            # optionally, we can force NFStreamer to expire the flow immediately
            # entry.expiration_id = -1

Start your ML powered streamer

from nfstream import NFStreamer

my_model = function_to_load_your_model() # or whatever
ml_streamer = NFStreamer(source='devil.pcap',
                         plugins=[feat_1(volatile=True),
                                  feat_2(volatile=True),
                                  feat_3(volatile=True),
                                  model_prediction(user_data=my_model)
                                  ])
for flow in ml_streamer:
    print(flow.model_prediction) # now you will see your trained model prediction.
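
The only thing the prediction plugin asks of the model is a `predict_proba` method. The sketch below shows a toy object with that method and the feature vector it would receive. `ToyModel`, its threshold and scale, and `first_three_sizes` are hypothetical and chosen only for illustration; they do not represent a trained model or any nfstream API.

```python
import math


class ToyModel:
    """Hypothetical stand-in for a trained model exposing predict_proba."""

    def __init__(self, threshold=500.0, scale=100.0):
        self.threshold = threshold
        self.scale = scale

    def predict_proba(self, features):
        # Logistic score on the mean packet size: [P(small flow), P(large flow)].
        mean_size = sum(features) / len(features)
        p_large = 1.0 / (1.0 + math.exp(-(mean_size - self.threshold) / self.scale))
        return [1.0 - p_large, p_large]


def first_three_sizes(packet_sizes):
    """Return [feat_1, feat_2, feat_3], or None if the flow has fewer than 3 packets."""
    if len(packet_sizes) < 3:
        return None
    return list(packet_sizes[:3])


model = ToyModel()
features = first_three_sizes([74, 74, 66, 583, 1454])  # invented packet sizes
probabilities = model.predict_proba(features)
print(features, probabilities)
```

Note that scikit-learn estimators expect a 2-D input for `predict_proba`, so with such a model the feature list would need to be wrapped in an outer list.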

Installation

Using pip

Binary installers for the latest released version are available:

python3 -m pip install nfstream
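
After installing, one way to confirm which version is present is to query the package metadata from Python. This is a small sketch using the standard library's `importlib.metadata` (Python 3.8+); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package="nfstream"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version else "nfstream is not installed")
```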

Build from sources

If you want to build nfstream from sources on your local machine:

Linux

sudo apt-get install autoconf automake libtool pkg-config libpcap-dev
git clone https://github.com/aouinizied/nfstream.git
cd nfstream
python3 -m pip install -r requirements.txt
python3 setup.py bdist_wheel

macOS

brew install autoconf automake libtool pkg-config
git clone https://github.com/aouinizied/nfstream.git
cd nfstream
python3 -m pip install -r requirements.txt
python3 setup.py bdist_wheel

Contributing

Please read Contributing for details on our code of conduct, and the process for submitting pull requests to us.

Authors

Zied Aouini created nfstream and these fine people have contributed.

Ethics

nfstream is intended for network data research and forensics. Researchers and network data scientists can use this framework to build reliable datasets and to train and evaluate machine learning models applied to network data. As with any packet monitoring tool, nfstream could potentially be misused. Do not run it on any network of which you are not the owner or the administrator.

License

This project is licensed under the GPLv3 License; see the License file for details.


