A Flexible Network Data Analysis Framework

Project description

nfstream: a flexible network data analysis framework

nfstream is a Python package providing fast, flexible, and expressive data structures designed to make working with online or offline network data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world network data analysis in Python. Additionally, it has the broader goal of becoming a common network data processing framework for researchers, providing data reproducibility across experiments.

Main Features

  • Performance: nfstream is designed to be fast (10x faster with PyPy support) with a small CPU and memory footprint.
  • Layer-7 visibility: nfstream's deep packet inspection engine is based on nDPI, which allows nfstream to reliably identify encrypted applications and extract metadata (e.g. TLS, QUIC, TOR, HTTP, SSH, DNS).
  • Flexibility: add a flow feature in 2 lines as an NFPlugin.
  • Machine Learning oriented: add your trained model as an NFPlugin.

How to use it?

  • Dealing with a big pcap file and just want to aggregate it as network flows? nfstream makes this easy in a few lines:
from nfstream import NFStreamer
my_awesome_streamer = NFStreamer(source="facebook.pcap", # or network interface (source="eth0")
                                 snaplen=65535,
                                 idle_timeout=30,
                                 active_timeout=300,
                                 plugins=(),
                                 dissect=True,
                                 max_tcp_dissections=10,
                                 max_udp_dissections=16,
                                 statistics=False,
                                 account_ip_padding_size=False,
                                 enable_guess=True,
                                 decode_tunnels=True,
                                 bpf_filter=None,
                                 promisc=True
)

for flow in my_awesome_streamer:
    print(flow)  # print it.
    print(flow.to_namedtuple()) # convert it to a namedtuple.
    print(flow.to_json()) # convert it to json.
    print(flow.keys()) # get flow keys.
    print(flow.values()) # get flow values.
NFEntry(id=0,
        bidirectional_first_seen_ms=1472393122365,
        bidirectional_last_seen_ms=1472393123665,
        src2dst_first_seen_ms=1472393122365,
        src2dst_last_seen_ms=1472393123408,
        dst2src_first_seen_ms=1472393122668,
        dst2src_last_seen_ms=1472393123665,
        src_ip='192.168.43.18',
        dst_ip='66.220.156.68',
        version=4,
        src_port=52066,
        dst_port=443,
        protocol=6,
        vlan_id=4,
        bidirectional_packets=19,
        bidirectional_raw_bytes=5745,
        bidirectional_ip_bytes=5479,
        bidirectional_duration_ms=1300,
        src2dst_packets=9,
        src2dst_raw_bytes=1345,
        src2dst_ip_bytes=1219,
        src2dst_duration_ms=1300,
        dst2src_packets=10,
        dst2src_raw_bytes=4400,
        dst2src_ip_bytes=4260,
        dst2src_duration_ms=997,
        expiration_id=0,
        master_protocol=91,
        app_protocol=119,
        application_name='TLS.Facebook',
        category_name='SocialNetwork',
        client_info='facebook.com',
        server_info='*.facebook.com,*.facebook.net,*.fb.com,\
                     *.fbcdn.net,*.fbsbx.com,*.m.facebook.com,\
                     *.messenger.com,*.xx.fbcdn.net,*.xy.fbcdn.net,\
                     *.xz.fbcdn.net,facebook.com,fb.com,messenger.com',
        j3a_client='bfcc1a3891601edb4f137ab7ab25b840',
        j3a_server='2d1eb5817ece335c24904f516ad5da12')
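
The idle_timeout and active_timeout parameters above bound how long a flow is tracked before it is emitted. The sketch below illustrates one plausible reading of those two rules. It assumes both values are in seconds and that timestamps are in milliseconds, as in the *_ms fields; expiration_reason is a hypothetical helper and not part of nfstream:

```python
IDLE_TIMEOUT_S = 30      # same value as idle_timeout in the example above
ACTIVE_TIMEOUT_S = 300   # same value as active_timeout in the example above


def expiration_reason(first_seen_ms, last_seen_ms, now_ms,
                      idle_timeout_s=IDLE_TIMEOUT_S,
                      active_timeout_s=ACTIVE_TIMEOUT_S):
    """Return 'idle', 'active', or None (hypothetical helper, not nfstream API)."""
    # Assumed rule 1: no packet seen for at least idle_timeout seconds.
    if now_ms - last_seen_ms >= idle_timeout_s * 1000:
        return "idle"
    # Assumed rule 2: the flow has been tracked for at least active_timeout seconds.
    if now_ms - first_seen_ms >= active_timeout_s * 1000:
        return "active"
    return None


# Timestamps copied from the sample flow above.
first_seen = 1472393122365
last_seen = 1472393123665

print(expiration_reason(first_seen, last_seen, last_seen + 1_000))   # None
print(expiration_reason(first_seen, last_seen, last_seen + 30_000))  # idle
```

A flow that keeps receiving packets never hits the idle rule, which is why a second, active limit is useful for long-lived connections.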
  • Need statistical flow features as well? Set statistics=True:
from nfstream import NFStreamer
my_awesome_streamer = NFStreamer(source="facebook.pcap", statistics=True)
for flow in my_awesome_streamer:
    print(flow)
NFEntry(id=0,
        bidirectional_first_seen_ms=1472393122365,
        bidirectional_last_seen_ms=1472393123665,
        src2dst_first_seen_ms=1472393122365,
        src2dst_last_seen_ms=1472393123408,
        dst2src_first_seen_ms=1472393122668,
        dst2src_last_seen_ms=1472393123665,
        src_ip='192.168.43.18',
        dst_ip='66.220.156.68',
        version=4,
        src_port=52066,
        dst_port=443,
        protocol=6,
        vlan_id=4,
        bidirectional_packets=19,
        bidirectional_raw_bytes=5745,
        bidirectional_ip_bytes=5479,
        bidirectional_duration_ms=1300,
        src2dst_packets=9,
        src2dst_raw_bytes=1345,
        src2dst_ip_bytes=1219,
        src2dst_duration_ms=1300,
        dst2src_packets=10,
        dst2src_raw_bytes=4400,
        dst2src_ip_bytes=4260,
        dst2src_duration_ms=997,
        expiration_id=0,
        bidirectional_min_raw_ps=66,
        bidirectional_mean_raw_ps=302.36842105263156,
        bidirectional_stdev_raw_ps=425.53315715259754,
        bidirectional_max_raw_ps=1454,
        src2dst_min_raw_ps=66,
        src2dst_mean_raw_ps=149.44444444444446,
        src2dst_stdev_raw_ps=132.20354676701294,
        src2dst_max_raw_ps=449,
        dst2src_min_raw_ps=66,
        dst2src_mean_raw_ps=440.0,
        dst2src_stdev_raw_ps=549.7164925870628,
        dst2src_max_raw_ps=1454,
        bidirectional_min_ip_ps=52,
        bidirectional_mean_ip_ps=288.36842105263156,
        bidirectional_stdev_ip_ps=425.53315715259754,
        bidirectional_max_ip_ps=1440,
        src2dst_min_ip_ps=52,
        src2dst_mean_ip_ps=135.44444444444446,
        src2dst_stdev_ip_ps=132.20354676701294,
        src2dst_max_ip_ps=435,
        dst2src_min_ip_ps=52,
        dst2src_mean_ip_ps=426.0,
        dst2src_stdev_ip_ps=549.7164925870628,
        dst2src_max_ip_ps=1440,
        bidirectional_min_piat_ms=0,
        bidirectional_mean_piat_ms=72.22222222222223,
        bidirectional_stdev_piat_ms=137.34994188549086,
        bidirectional_max_piat_ms=398,
        src2dst_min_piat_ms=0,
        src2dst_mean_piat_ms=130.375,
        src2dst_stdev_piat_ms=179.72036811192467,
        src2dst_max_piat_ms=415,
        dst2src_min_piat_ms=0,
        dst2src_mean_piat_ms=110.77777777777777,
        dst2src_stdev_piat_ms=169.51458475436397,
        dst2src_max_piat_ms=1,
        bidirectional_syn_packets=2,
        bidirectional_cwr_packets=0,
        bidirectional_ece_packets=0,
        bidirectional_urg_packets=0,
        bidirectional_ack_packets=18,
        bidirectional_psh_packets=9,
        bidirectional_rst_packets=0,
        bidirectional_fin_packets=0,
        src2dst_syn_packets=1,
        src2dst_cwr_packets=0,
        src2dst_ece_packets=0,
        src2dst_urg_packets=0,
        src2dst_ack_packets=8,
        src2dst_psh_packets=4,
        src2dst_rst_packets=0,
        src2dst_fin_packets=0,
        dst2src_syn_packets=1,
        dst2src_cwr_packets=0,
        dst2src_ece_packets=0,
        dst2src_urg_packets=0,
        dst2src_ack_packets=10,
        dst2src_psh_packets=5,
        dst2src_rst_packets=0,
        dst2src_fin_packets=0,
        master_protocol=91,
        app_protocol=119,
        application_name='TLS.Facebook',
        category_name='SocialNetwork',
        client_info='facebook.com',
        server_info='*.facebook.com,*.facebook.net,*.fb.com,\
                     *.fbcdn.net,*.fbsbx.com,*.m.facebook.com,\
                     *.messenger.com,*.xx.fbcdn.net,*.xy.fbcdn.net,\
                     *.xz.fbcdn.net,facebook.com,fb.com,messenger.com',
        j3a_client='bfcc1a3891601edb4f137ab7ab25b840',
        j3a_server='2d1eb5817ece335c24904f516ad5da12')
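
In the sample output above, the directional counters add up to the bidirectional ones, and the mean packet sizes equal bytes divided by packets. A minimal sketch that checks this with values copied from the output; the dictionary is a hand-made stand-in, not an actual NFEntry object, and mean_packet_size is a helper introduced here for illustration:

```python
# Values copied by hand from the sample output; a plain dict stands in for the flow.
flow = {
    "bidirectional_first_seen_ms": 1472393122365,
    "bidirectional_last_seen_ms": 1472393123665,
    "bidirectional_packets": 19,
    "bidirectional_raw_bytes": 5745,
    "src2dst_packets": 9,
    "src2dst_raw_bytes": 1345,
    "dst2src_packets": 10,
    "dst2src_raw_bytes": 4400,
}


def mean_packet_size(flow, direction):
    """Mean raw packet size for one direction: raw bytes / packets."""
    return flow[direction + "_raw_bytes"] / flow[direction + "_packets"]


duration_ms = flow["bidirectional_last_seen_ms"] - flow["bidirectional_first_seen_ms"]
total_packets = flow["src2dst_packets"] + flow["dst2src_packets"]
total_bytes = flow["src2dst_raw_bytes"] + flow["dst2src_raw_bytes"]

print(duration_ms)                        # 1300, as in bidirectional_duration_ms
print(total_packets)                      # 19, as in bidirectional_packets
print(total_bytes)                        # 5745, as in bidirectional_raw_bytes
print(mean_packet_size(flow, "dst2src"))  # 440.0, as in dst2src_mean_raw_ps
```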
  • From pcap to Pandas DataFrame?
from nfstream import NFStreamer
my_dataframe = NFStreamer(source='devil.pcap').to_pandas()
my_dataframe.head(5)
  • From pcap to csv file?
flows_rows_count = NFStreamer(source='devil.pcap').to_csv(path="devil.pcap.csv", sep=";")
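
Once flows are exported, the file can be post-processed without nfstream. A minimal sketch using only the standard library, assuming the export has a header row named after the flow fields and uses the ";" separator chosen above. The two rows are made-up sample data rather than real output, and bytes_per_application is a helper introduced for illustration:

```python
import csv
import io
from collections import Counter

# Made-up stand-in for the contents of devil.pcap.csv
# (assumed layout: one header row, then one row per flow).
sample_csv = io.StringIO(
    "id;src_ip;dst_ip;bidirectional_raw_bytes;application_name\n"
    "0;192.168.43.18;66.220.156.68;5745;TLS.Facebook\n"
    "1;192.168.43.18;8.8.8.8;180;DNS\n"
)


def bytes_per_application(handle, sep=";"):
    """Sum bidirectional_raw_bytes per application_name."""
    totals = Counter()
    for row in csv.DictReader(handle, delimiter=sep):
        totals[row["application_name"]] += int(row["bidirectional_raw_bytes"])
    return dict(totals)


totals = bytes_per_application(sample_csv)
print(totals)
```

With a real export, the StringIO object would be replaced by open("devil.pcap.csv", newline="").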
  • Didn't find a specific flow feature? Add a plugin to nfstream in a few lines:
from nfstream import NFStreamer, NFPlugin

class packet_with_666_size(NFPlugin):
    def on_init(self, pkt): # flow creation with the first packet
        if pkt.raw_size == 666:
            return 1
        else:
            return 0

    def on_update(self, pkt, flow): # flow update with each packet belonging to the flow
        if pkt.raw_size == 666:
            flow.packet_with_666_size += 1

streamer_awesome = NFStreamer(source='devil.pcap', plugins=[packet_with_666_size()])
for flow in streamer_awesome:
    print(flow.packet_with_666_size) # see your dynamically created metric in generated flows
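
To see how the counter evolves without a capture file, the plugin logic can be exercised in plain Python. This is only a sketch: run_plugin is a hypothetical driver, not how nfstream is implemented, and it assumes, as the example above suggests, that on_init sees the first packet and on_update sees each later one:

```python
from types import SimpleNamespace


class PacketWith666Size:
    """Same logic as the plugin above, without the NFPlugin base class."""
    name = "packet_with_666_size"

    def on_init(self, pkt):
        return 1 if pkt.raw_size == 666 else 0

    def on_update(self, pkt, flow):
        if pkt.raw_size == 666:
            flow.packet_with_666_size += 1


def run_plugin(plugin, raw_sizes):
    """Hypothetical driver: on_init for the first packet, on_update for the rest."""
    packets = [SimpleNamespace(raw_size=size) for size in raw_sizes]
    flow = SimpleNamespace()
    # The value returned by on_init becomes the initial flow attribute.
    setattr(flow, plugin.name, plugin.on_init(packets[0]))
    for pkt in packets[1:]:
        plugin.on_update(pkt, flow)
    return flow


flow = run_plugin(PacketWith666Size(), [666, 1500, 666, 60])
print(flow.packet_with_666_size)  # 2
```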

Run your Machine Learning models

In the following, we want to run an early classification of flows based on a trained machine learning model that takes as features the sizes of the first 3 packets of a flow.

Computing required features

from nfstream import NFPlugin

class feat_1(NFPlugin):
    def on_init(self, obs):  # first packet of the flow
        return obs.raw_size  # initial value of entry.feat_1

class feat_2(NFPlugin):
    def on_update(self, obs, entry):
        if entry.bidirectional_packets == 2:  # second packet
            entry.feat_2 = obs.raw_size

class feat_3(NFPlugin):
    def on_update(self, obs, entry):
        if entry.bidirectional_packets == 3:  # third packet
            entry.feat_3 = obs.raw_size

Trained model prediction

class model_prediction(NFPlugin):
    def on_update(self, obs, entry):
        if entry.bidirectional_packets == 3:
            entry.model_prediction = self.user_data.predict_proba([entry.feat_1,
                                                                   entry.feat_2,
                                                                   entry.feat_3])
            # optionally, we can force NFStreamer to expire the flow immediately
            # entry.expiration_id = -1

Start your ML powered streamer

from nfstream import NFStreamer

my_model = function_to_load_your_model() # placeholder: load your trained model here
ml_streamer = NFStreamer(source='devil.pcap',
                         plugins=[feat_1(volatile=True),
                                  feat_2(volatile=True),
                                  feat_3(volatile=True),
                                  model_prediction(user_data=my_model)
                                  ])
for flow in ml_streamer:
    print(flow.model_prediction) # now you will see your trained model prediction.
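
The early-classification idea can be tried without nfstream or a real model. In this sketch, SizeThresholdModel is a made-up stand-in that only mimics the predict_proba name, and early_prediction is a hypothetical loop that plays the role of the three feature plugins and the prediction plugin. Note that a real scikit-learn estimator expects a 2D array of samples and returns an array of class probabilities:

```python
from types import SimpleNamespace


class SizeThresholdModel:
    """Made-up stand-in for a trained model exposing predict_proba."""

    def predict_proba(self, features):
        # Score is the share of the given packets larger than 1000 bytes.
        large = sum(1 for size in features if size > 1000)
        return large / len(features)


def early_prediction(model, raw_sizes):
    """Collect the first three packet sizes, then query the model once."""
    entry = SimpleNamespace(bidirectional_packets=0, model_prediction=None)
    features = []
    for size in raw_sizes:
        entry.bidirectional_packets += 1
        features.append(size)
        if entry.bidirectional_packets == 3:
            entry.model_prediction = model.predict_proba(features)
            break  # plays the role of forcing the flow to expire early
    return entry


entry = early_prediction(SizeThresholdModel(), [74, 1454, 1454, 66, 66])
print(entry.model_prediction)
```

A flow with fewer than three packets never reaches the model, so its prediction stays None in this sketch.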

Installation

Using pip

Binary installers for the latest released version are available:

python3 -m pip install nfstream

Build from sources

If you want to build nfstream from sources on your local machine:

Linux

sudo apt-get install autoconf automake libtool pkg-config libpcap-dev
sudo apt-get install libusb-1.0-0-dev libdbus-glib-1-dev libbluetooth-dev libnl-genl-3-dev flex bison
git clone https://github.com/aouinizied/nfstream.git
cd nfstream
python3 -m pip install -r requirements.txt
python3 setup.py bdist_wheel

macOS

brew install autoconf automake libtool pkg-config
git clone https://github.com/aouinizied/nfstream.git
cd nfstream
python3 -m pip install -r requirements.txt
python3 setup.py bdist_wheel

Contributing

Please read Contributing for details on our code of conduct, and the process for submitting pull requests to us.

Authors

Zied Aouini created nfstream and these fine people have contributed.

Ethics

nfstream is intended for network data research and forensics. Researchers and network data scientists can use this framework to build reliable datasets and to train and evaluate machine learning models applied to network data. As with any packet monitoring tool, nfstream could potentially be misused. Do not run it on any network that you do not own or administer.

License

This project is licensed under the GPLv3 License. See the License file for details.

