nfstream: a flexible network data analysis framework

nfstream is a Python package providing fast, flexible, and expressive data structures designed to make working with online or offline network data both easy and intuitive. It aims to be the fundamental high-level building block for practical, real-world network data analysis in Python. It also has the broader goal of becoming a common network data processing framework for researchers, providing data reproducibility across experiments.

Main Features

  • Performance: nfstream is designed to be fast (10x faster with PyPy support) with a small CPU and memory footprint.
  • Layer-7 visibility: nfstream's deep packet inspection engine is based on nDPI. It allows nfstream to reliably identify encrypted applications and extract metadata (e.g. TLS, QUIC, TOR, HTTP, SSH, DNS).
  • Flexibility: add a flow feature in 2 lines as an NFPlugin.
  • Machine Learning oriented: add your trained model as an NFPlugin.

How to use it?

  • Dealing with a big pcap file and just want to aggregate it into network flows? nfstream makes this easy in a few lines:
from nfstream import NFStreamer
my_awesome_streamer = NFStreamer(source="facebook.pcap", # or network interface (source="eth0")
                                 snaplen=65535,
                                 idle_timeout=30,
                                 active_timeout=300,
                                 plugins=(),
                                 dissect=True,
                                 max_tcp_dissections=10,
                                 max_udp_dissections=16,
                                 statistics=False,
                                 account_ip_padding_size=False,
                                 enable_guess=True,
                                 decode_tunnels=True,
                                 bpf_filter=None,
                                 promisc=True
)

for flow in my_awesome_streamer:
    print(flow)  # print it.
    print(flow.to_namedtuple()) # convert it to a named tuple.
    print(flow.to_json()) # convert it to json.
NFEntry(id=0,
        bidirectional_first_seen_ms=1472393122365,
        bidirectional_last_seen_ms=1472393123665,
        src2dst_first_seen_ms=1472393122365,
        src2dst_last_seen_ms=1472393123408,
        dst2src_first_seen_ms=1472393122668,
        dst2src_last_seen_ms=1472393123665,
        src_ip='192.168.43.18',
        dst_ip='66.220.156.68',
        version=4,
        src_port=52066,
        dst_port=443,
        protocol=6,
        vlan_id=4,
        bidirectional_packets=19,
        bidirectional_raw_bytes=5745,
        bidirectional_ip_bytes=5479,
        bidirectional_duration_ms=1300,
        src2dst_packets=9,
        src2dst_raw_bytes=1345,
        src2dst_ip_bytes=1219,
        src2dst_duration_ms=1300,
        dst2src_packets=10,
        dst2src_raw_bytes=4400,
        dst2src_ip_bytes=4260,
        dst2src_duration_ms=997,
        expiration_id=0,
        master_protocol=91,
        app_protocol=119,
        application_name='TLS.Facebook',
        category_name='SocialNetwork',
        client_info='facebook.com',
        server_info='*.facebook.com,*.facebook.net,*.fb.com,\
                     *.fbcdn.net,*.fbsbx.com,*.m.facebook.com,\
                     *.messenger.com,*.xx.fbcdn.net,*.xy.fbcdn.net,\
                     *.xz.fbcdn.net,facebook.com,fb.com,messenger.com',
        j3a_client='bfcc1a3891601edb4f137ab7ab25b840',
        j3a_server='2d1eb5817ece335c24904f516ad5da12')
  • Need per-flow statistical features (packet sizes, inter-arrival times, TCP flags)? Enable statistics:
from nfstream import NFStreamer
my_awesome_streamer = NFStreamer(source="facebook.pcap", statistics=True)
for flow in my_awesome_streamer:
    print(flow)
NFEntry(id=0,
        bidirectional_first_seen_ms=1472393122365,
        bidirectional_last_seen_ms=1472393123665,
        src2dst_first_seen_ms=1472393122365,
        src2dst_last_seen_ms=1472393123408,
        dst2src_first_seen_ms=1472393122668,
        dst2src_last_seen_ms=1472393123665,
        src_ip='192.168.43.18',
        dst_ip='66.220.156.68',
        version=4,
        src_port=52066,
        dst_port=443,
        protocol=6,
        vlan_id=4,
        bidirectional_packets=19,
        bidirectional_raw_bytes=5745,
        bidirectional_ip_bytes=5479,
        bidirectional_duration_ms=1300,
        src2dst_packets=9,
        src2dst_raw_bytes=1345,
        src2dst_ip_bytes=1219,
        src2dst_duration_ms=1300,
        dst2src_packets=10,
        dst2src_raw_bytes=4400,
        dst2src_ip_bytes=4260,
        dst2src_duration_ms=997,
        expiration_id=0,
        bidirectional_min_raw_ps=66,
        bidirectional_mean_raw_ps=302.36842105263156,
        bidirectional_stdev_raw_ps=425.53315715259754,
        bidirectional_max_raw_ps=1454,
        src2dst_min_raw_ps=66,
        src2dst_mean_raw_ps=149.44444444444446,
        src2dst_stdev_raw_ps=132.20354676701294,
        src2dst_max_raw_ps=449,
        dst2src_min_raw_ps=66,
        dst2src_mean_raw_ps=440.0,
        dst2src_stdev_raw_ps=549.7164925870628,
        dst2src_max_raw_ps=1454,
        bidirectional_min_ip_ps=52,
        bidirectional_mean_ip_ps=288.36842105263156,
        bidirectional_stdev_ip_ps=425.53315715259754,
        bidirectional_max_ip_ps=1440,
        src2dst_min_ip_ps=52,
        src2dst_mean_ip_ps=135.44444444444446,
        src2dst_stdev_ip_ps=132.20354676701294,
        src2dst_max_ip_ps=435,
        dst2src_min_ip_ps=52,
        dst2src_mean_ip_ps=426.0,
        dst2src_stdev_ip_ps=549.7164925870628,
        dst2src_max_ip_ps=1440,
        bidirectional_min_piat_ms=0,
        bidirectional_mean_piat_ms=72.22222222222223,
        bidirectional_stdev_piat_ms=137.34994188549086,
        bidirectional_max_piat_ms=398,
        src2dst_min_piat_ms=0,
        src2dst_mean_piat_ms=130.375,
        src2dst_stdev_piat_ms=179.72036811192467,
        src2dst_max_piat_ms=415,
        dst2src_min_piat_ms=0,
        dst2src_mean_piat_ms=110.77777777777777,
        dst2src_stdev_piat_ms=169.51458475436397,
        dst2src_max_piat_ms=1,
        bidirectional_syn_packets=2,
        bidirectional_cwr_packets=0,
        bidirectional_ece_packets=0,
        bidirectional_urg_packets=0,
        bidirectional_ack_packets=18,
        bidirectional_psh_packets=9,
        bidirectional_rst_packets=0,
        bidirectional_fin_packets=0,
        src2dst_syn_packets=1,
        src2dst_cwr_packets=0,
        src2dst_ece_packets=0,
        src2dst_urg_packets=0,
        src2dst_ack_packets=8,
        src2dst_psh_packets=4,
        src2dst_rst_packets=0,
        src2dst_fin_packets=0,
        dst2src_syn_packets=1,
        dst2src_cwr_packets=0,
        dst2src_ece_packets=0,
        dst2src_urg_packets=0,
        dst2src_ack_packets=10,
        dst2src_psh_packets=5,
        dst2src_rst_packets=0,
        dst2src_fin_packets=0,
        master_protocol=91,
        app_protocol=119,
        application_name='TLS.Facebook',
        category_name='SocialNetwork',
        client_info='facebook.com',
        server_info='*.facebook.com,*.facebook.net,*.fb.com,\
                     *.fbcdn.net,*.fbsbx.com,*.m.facebook.com,\
                     *.messenger.com,*.xx.fbcdn.net,*.xy.fbcdn.net,\
                     *.xz.fbcdn.net,facebook.com,fb.com,messenger.com',
        j3a_client='bfcc1a3891601edb4f137ab7ab25b840',
        j3a_server='2d1eb5817ece335c24904f516ad5da12')
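
The record printed above can be cross-checked with a little arithmetic: per-direction counters should add up to the bidirectional ones, and the mean packet sizes should equal bytes divided by packets. The sketch below copies a few values from that output into a plain dict, which is only a stand-in for a flow object so that it runs without nfstream or a pcap file. The helper name check_flow is hypothetical.

```python
import math

# Values copied from the NFEntry output above. The dict is a stand-in for a
# real flow object; it is not produced by nfstream here.
flow = {
    "bidirectional_first_seen_ms": 1472393122365,
    "bidirectional_last_seen_ms": 1472393123665,
    "bidirectional_duration_ms": 1300,
    "bidirectional_packets": 19,
    "bidirectional_raw_bytes": 5745,
    "bidirectional_ip_bytes": 5479,
    "src2dst_packets": 9,
    "src2dst_raw_bytes": 1345,
    "dst2src_packets": 10,
    "dst2src_raw_bytes": 4400,
    "bidirectional_mean_raw_ps": 302.36842105263156,
    "bidirectional_mean_ip_ps": 288.36842105263156,
}


def check_flow(f):
    """Return named consistency checks; True means the fields agree."""
    return {
        # duration is the gap between the first and last packet seen
        "duration": f["bidirectional_last_seen_ms"]
        - f["bidirectional_first_seen_ms"]
        == f["bidirectional_duration_ms"],
        # both directions together account for every packet and byte
        "packets": f["src2dst_packets"] + f["dst2src_packets"]
        == f["bidirectional_packets"],
        "raw_bytes": f["src2dst_raw_bytes"] + f["dst2src_raw_bytes"]
        == f["bidirectional_raw_bytes"],
        # mean packet size is total bytes divided by packet count
        "mean_raw_ps": math.isclose(
            f["bidirectional_raw_bytes"] / f["bidirectional_packets"],
            f["bidirectional_mean_raw_ps"],
            rel_tol=1e-9,
        ),
        "mean_ip_ps": math.isclose(
            f["bidirectional_ip_bytes"] / f["bidirectional_packets"],
            f["bidirectional_mean_ip_ps"],
            rel_tol=1e-9,
        ),
    }


checks = check_flow(flow)
print(checks)
```

Checks of this kind are a cheap way to validate a dataset before training on it.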
  • From pcap to Pandas DataFrame?
my_dataframe = NFStreamer(source='devil.pcap').to_pandas()
my_dataframe.head(5)
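
A common next step once flows are in tabular form is a per-application summary. As a dependency-free sketch of that idea, the example below aggregates bytes per application from JSON strings. It assumes that flow.to_json() yields a JSON object whose keys match the field names printed earlier; the sample strings are invented for illustration (only the first reuses values from the output above), and bytes_per_application is a hypothetical helper.

```python
import json
from collections import defaultdict

# Hypothetical JSON strings shaped like flow.to_json() output (assumption).
# Only a few fields are kept; records 1 and 2 are made up for illustration.
sample_json_flows = [
    '{"id": 0, "application_name": "TLS.Facebook", "bidirectional_raw_bytes": 5745}',
    '{"id": 1, "application_name": "DNS", "bidirectional_raw_bytes": 180}',
    '{"id": 2, "application_name": "TLS.Facebook", "bidirectional_raw_bytes": 1200}',
]


def bytes_per_application(json_flows):
    """Sum bidirectional_raw_bytes for each application_name."""
    totals = defaultdict(int)
    for raw in json_flows:
        record = json.loads(raw)
        totals[record["application_name"]] += record["bidirectional_raw_bytes"]
    return dict(totals)


totals = bytes_per_application(sample_json_flows)
print(totals)
```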
  • Didn't find a specific flow feature? Add a plugin to nfstream in a few lines:
from nfstream import NFPlugin, NFStreamer

class packet_with_666_size(NFPlugin):
    def on_init(self, pkt): # flow creation with the first packet
        if pkt.raw_size == 666:
            return 1
        else:
            return 0

    def on_update(self, pkt, flow): # flow update with each packet belonging to the flow
        if pkt.raw_size == 666:
            flow.packet_with_666_size += 1

streamer_awesome = NFStreamer(source='devil.pcap', plugins=[packet_with_666_size()])
for flow in streamer_awesome:
    print(flow.packet_with_666_size) # see your dynamically created metric in generated flows
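
To make the two hooks concrete, here is a simplified simulation of the plugin lifecycle as the comments above describe it: on_init runs on the first packet of a flow and its return value seeds an attribute named after the plugin class, then on_update runs for each later packet. PluginBase and run_plugins are hypothetical stand-ins written for this sketch; they are not nfstream's real classes, and the real dispatch may differ in its details.

```python
from types import SimpleNamespace


class PluginBase:
    """Simplified stand-in for NFPlugin (not the real nfstream class)."""

    def on_init(self, pkt):
        return 0

    def on_update(self, pkt, flow):
        pass


class packet_with_666_size(PluginBase):
    def on_init(self, pkt):
        # initial value of the metric, computed from the first packet
        return 1 if pkt.raw_size == 666 else 0

    def on_update(self, pkt, flow):
        # called for every later packet of the same flow
        if pkt.raw_size == 666:
            flow.packet_with_666_size += 1


def run_plugins(packet_sizes, plugins):
    """Feed the packets of one flow through the plugins, in order."""
    flow = SimpleNamespace()
    packets = [SimpleNamespace(raw_size=size) for size in packet_sizes]
    if not packets:
        return flow
    first, rest = packets[0], packets[1:]
    for plugin in plugins:
        # the attribute carries the plugin class name, seeded by on_init
        setattr(flow, type(plugin).__name__, plugin.on_init(first))
    for pkt in rest:
        for plugin in plugins:
            plugin.on_update(pkt, flow)
    return flow


flow = run_plugins([666, 120, 666, 1500], [packet_with_666_size()])
print(flow.packet_with_666_size)
```

Running a plugin against a hand-written packet sequence like this is a quick way to test its logic before pointing it at a large capture.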

Run your Machine Learning models

In the following, we want to run an early classification of flows based on a trained machine learning model that takes as features the sizes of the first 3 packets of a flow.

Computing required features

from nfstream import NFPlugin

class feat_1(NFPlugin):
    def on_init(self, obs):
        # size of the first packet; the returned value initializes entry.feat_1
        return obs.raw_size

class feat_2(NFPlugin):
    def on_update(self, obs, entry):
        if entry.bidirectional_packets == 2:
            entry.feat_2 = obs.raw_size  # size of the second packet

class feat_3(NFPlugin):
    def on_update(self, obs, entry):
        if entry.bidirectional_packets == 3:
            entry.feat_3 = obs.raw_size  # size of the third packet

Trained model prediction

class model_prediction(NFPlugin):
    def on_update(self, obs, entry):
        if entry.bidirectional_packets == 3:
            # scikit-learn-style models expect a 2D input: one row per sample
            entry.model_prediction = self.user_data.predict_proba([[entry.feat_1,
                                                                    entry.feat_2,
                                                                    entry.feat_3]])
            # optionally, force NFStreamer to expire the flow immediately
            # entry.expiration_id = -1

Start your ML powered streamer

from nfstream import NFStreamer

my_model = function_to_load_your_model() # or whatever
ml_streamer = NFStreamer(source='devil.pcap',
                         plugins=[feat_1(volatile=True),
                                  feat_2(volatile=True),
                                  feat_3(volatile=True),
                                  model_prediction(user_data=my_model)
                                  ])
for flow in ml_streamer:
    print(flow.model_prediction) # now you will see your trained model prediction.
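
The only thing the prediction plugin needs from the object passed as user_data is a predict_proba method. The sketch below shows one hypothetical way to satisfy that contract without any ML library: TinyModel, its threshold and scale parameters, and the JSON parameter string are all invented for illustration, and predict_proba is assumed to take one row of three packet sizes per flow, in the scikit-learn style.

```python
import json
import math


class TinyModel:
    """Hypothetical stand-in classifier exposing predict_proba."""

    def __init__(self, threshold, scale):
        self.threshold = threshold
        self.scale = scale

    def predict_proba(self, rows):
        # rows: one row per flow, each holding the first three packet sizes
        result = []
        for row in rows:
            mean_size = sum(row) / len(row)
            # logistic squashing of the distance to the threshold
            z = (mean_size - self.threshold) / self.scale
            p_large = 1.0 / (1.0 + math.exp(-z))
            result.append([1.0 - p_large, p_large])
        return result


def function_to_load_your_model(params_json='{"threshold": 500.0, "scale": 100.0}'):
    """Build the model from stored parameters (here an inline JSON string)."""
    params = json.loads(params_json)
    return TinyModel(params["threshold"], params["scale"])


my_model = function_to_load_your_model()
probabilities = my_model.predict_proba([[66, 66, 449]])
print(probabilities)
```

Any object with a compatible predict_proba could be passed in the same way, so a real trained model can be swapped in without touching the plugins.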

Installation

Using pip

Binary installers for the latest released version are available:

python3 -m pip install nfstream

Build from sources

If you want to build nfstream from sources on your local machine:

Linux

sudo apt-get install autoconf automake libtool pkg-config libpcap-dev
git clone https://github.com/aouinizied/nfstream.git
cd nfstream
python3 -m pip install -r requirements.txt
python3 setup.py bdist_wheel

macOS

brew install autoconf automake libtool pkg-config
git clone https://github.com/aouinizied/nfstream.git
cd nfstream
python3 -m pip install -r requirements.txt
python3 setup.py bdist_wheel

Contributing

Please read Contributing for details on our code of conduct, and the process for submitting pull requests to us.

Authors

Zied Aouini created nfstream and these fine people have contributed.

Ethics

nfstream is intended for network data research and forensics. Researchers and network data scientists can use this framework to build reliable datasets and to train and evaluate machine learning models applied to network data. As with any packet monitoring tool, nfstream could potentially be misused. Do not run it on any network of which you are not the owner or the administrator.

License

This project is licensed under the GPLv3 License. See the License file for details.
