flow-models: A framework for analysis and modeling of IP network flows

Packages like flow-tools or nfdump provide tools for filtering flow records and calculating simple summary/top-N statistics from them. However, they offer no means of analyzing or modeling the distributions of flow features (length, size, duration, rate, etc.). The goal of this framework is to fill this gap.

flow-models is a software framework for creating precise and reproducible statistical flow models from NetFlow/IPFIX flow records. It can be used to merge split records, calculate histograms of flow features and fit General Mixture Models to them. The resulting models can be used both as input to analytical calculations and to generate realistic traffic in simulations.
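To illustrate how a mixture model could feed a simulation, the following minimal sketch draws flow sizes from a two-component lognormal mixture using only the Python standard library. The component weights and parameters are made-up values, not taken from any fitted model, and `sample_mixture` is a hypothetical helper rather than part of the flow-models API.

```python
import random

# Hypothetical mixture: (weight, mu, sigma) of lognormal components.
# These numbers are illustrative only; they are not fitted to any dataset.
MIXTURE = [
    (0.8, 6.0, 1.0),   # many small flows
    (0.2, 12.0, 2.0),  # few large flows
]

def sample_mixture(mixture, n, seed=None):
    """Draw n samples from a lognormal mixture."""
    rng = random.Random(seed)
    weights = [w for w, _, _ in mixture]
    samples = []
    for _ in range(n):
        # Pick a component according to its weight, then sample from it.
        _, mu, sigma = rng.choices(mixture, weights=weights, k=1)[0]
        samples.append(rng.lognormvariate(mu, sigma))
    return samples

sizes = sample_mixture(MIXTURE, 1000, seed=1)
```

Passing a fixed `seed` makes the generated traffic reproducible between simulation runs.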

The framework can be installed from the Python Package Index (PyPI) using the following command:

pip install flow-models

Detailed documentation, including usage examples, is available at: https://flow-models.readthedocs.io

Apart from the framework, the Git repository also contains a library of flow models created with it, including histograms and fitted mixture models.

Provided tools

The framework currently includes the following tools:

  • merge -- merges flows which were split across multiple records due to active timeout
  • sort -- sorts flow records according to specified fields (requires numpy)
  • hist -- calculates histograms of flow length, size, duration or rate
  • hist_np -- calculates histograms using multiple threads (requires numpy, much faster, but uses more memory)
  • fit -- creates General Mixture Models (GMM) fitted to flow records (requires scipy)
  • plot -- generates plots from flow records and fitted models (requires pandas and scipy)
  • generate -- generates flow records from histograms or mixture models
  • summary -- produces TeX tables containing summary statistics of a flow dataset (requires scipy)
  • convert -- converts flow records between supported formats
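The idea behind merging records split by the active timeout can be sketched as follows. This is a simplified illustration, not the implementation used by the merge tool: the `Record` layout, the `merge_split` function and the 15-second inactive timeout are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical, simplified record layout; real NetFlow/IPFIX records
# carry more fields (addresses, ports, protocol, and so on).
Record = namedtuple("Record", "key first last packets octets")

def merge_split(records, inactive_timeout=15):
    """Merge records sharing a flow key when the gap between them
    is shorter than the inactive timeout (in seconds)."""
    merged = []
    open_flows = {}  # key -> index in `merged` of that flow's latest record
    for rec in sorted(records, key=lambda r: r.first):
        idx = open_flows.get(rec.key)
        if idx is not None and rec.first - merged[idx].last < inactive_timeout:
            # Continuation of an existing flow: extend it and add counters.
            prev = merged[idx]
            merged[idx] = Record(prev.key, prev.first,
                                 max(prev.last, rec.last),
                                 prev.packets + rec.packets,
                                 prev.octets + rec.octets)
        else:
            # Gap too long (or first record for this key): start a new flow.
            open_flows[rec.key] = len(merged)
            merged.append(rec)
    return merged

records = [
    Record("A", 0, 300, 100, 50000),
    Record("A", 300, 450, 40, 20000),   # continues flow A after a split
    Record("B", 10, 12, 3, 180),
    Record("A", 1000, 1001, 2, 120),    # same key, but a separate flow
]
flows = merge_split(records)
```

In this example the first two records of key `A` are joined into one flow, while the record starting at second 1000 stays separate because the gap exceeds the assumed timeout.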

Following the Unix philosophy, each tool is a separate Python program with a single purpose. The features provided by the tools are orthogonal, and the tools are designed to be chained sequentially in data-processing pipelines.
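The single-purpose, chainable design can be mimicked in plain Python with generator stages. The sketch below is only an analogy: the stage names, the `packets,octets` input format and the power-of-two binning are assumptions for illustration and do not reproduce the command-line interface or file formats of the actual tools.

```python
from collections import Counter

def parse(lines):
    """Stage 1: turn 'packets,octets' text lines into integer tuples."""
    for line in lines:
        packets, octets = line.strip().split(",")
        yield int(packets), int(octets)

def log2_bin(values):
    """Stage 2: map each flow size to the lower edge of its power-of-two bin.
    Assumes every flow has at least one octet."""
    for _, octets in values:
        yield 1 << (octets.bit_length() - 1)

def histogram(bins):
    """Stage 3: count flows per bin, ordered by bin edge."""
    return dict(sorted(Counter(bins).items()))

lines = ["3,180", "100,50000", "2,120", "40,20000", "1,64"]
hist = histogram(log2_bin(parse(lines)))
print(hist)
```

Each stage consumes the output of the previous one and knows nothing else about it, so a stage can be replaced or reordered without touching the others.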

Models library

The repository of flow models, containing histogram CSV files, fitted mixture models and plots, is available at: https://github.com/piotrjurkiewicz/flow-models
