An ML/DL framework for Traffic Classification

Project description

An ML/DL framework for Traffic Classification (TC)

The design of tcbench centers on the following objectives:

  • Making ML/DL model training and testing results easier to replicate.
  • Tight integration with public TC datasets, with easy data installation and curation.
  • Model tracking via AIM.
  • A rich command line for executing modeling campaigns and collecting performance reports.

...wait, what is Traffic Classification?

A computer network is formed by hosts that exchange information, namely packets, according to standardized protocols (e.g., HTTP is the protocol used for the web). To operate and manage a network properly, one must monitor this flow of information and react accordingly. For instance, in an office/enterprise environment, one might want to prioritize video meeting traffic while limiting social media traffic.

Traffic classification is the act of labeling an exchange of packets between network hosts based on the application that generated it. For instance, you may want to identify traffic related to zoom/webex/skype/etc. calls, or traffic related to twitter/instagram/facebook/mastodon, out of all the traffic flowing through the network.
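
To make the definition concrete, here is a minimal sketch of the task in plain Python. It is not tcbench code: the Packet schema, the flow key, the labels and the size threshold are all assumptions made up for illustration, and toy_classifier is only a stand-in for a trained ML/DL model.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """A simplified packet record (hypothetical schema, for illustration only)."""
    timestamp: float
    src: str
    sport: int
    dst: str
    dport: int
    proto: str
    size: int


def flow_key(pkt):
    """Direction-agnostic key: both directions of an exchange map to the same flow."""
    endpoints = sorted([(pkt.src, pkt.sport), (pkt.dst, pkt.dport)])
    return (pkt.proto, *endpoints)


def group_into_flows(packets):
    """Group packets into flows, keeping the packets of each flow in time order."""
    flows = defaultdict(list)
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


def toy_classifier(flow):
    """Stand-in for a trained model: label a flow from its mean packet size.

    The threshold and the label names are arbitrary assumptions.
    """
    mean_size = sum(p.size for p in flow) / len(flow)
    return "video-meeting" if mean_size >= 1000 else "social-media"


if __name__ == "__main__":
    packets = [
        Packet(0.0, "10.0.0.2", 50000, "93.184.216.34", 443, "tcp", 1200),
        Packet(0.1, "93.184.216.34", 443, "10.0.0.2", 50000, "tcp", 1400),
        Packet(0.2, "10.0.0.2", 50001, "198.51.100.7", 443, "tcp", 120),
    ]
    for key, flow in group_into_flows(packets).items():
        print(key, "->", toy_classifier(flow))
```

The point of the sketch is the shape of the problem: the input is a set of packets grouped into flows, and the output is one application label per flow.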

Motivations

The academic literature is rich in methods and proposals for TC. Yet code artifacts are scarce, and public datasets do not share common conventions of use.

We designed tcbench with the following goals in mind:

| Goal | State of the art | tcbench |
| --- | --- | --- |
| Data curation | There are a few public datasets for TC, yet no common format/schema, cleaning process, or standard train/val/test folds. | An (opinionated) curation of datasets to create easy-to-use parquet files with associated train/val/test folds. |
| Code | The TC literature has no reference code base for ML/DL modeling. | tcbench is open source with an easy-to-use CLI based on click. |
| Model tracking | Most ML frameworks require integration with cloud environments and subscription services. | tcbench uses aimstack to save training metrics on local servers; these can later be explored via its web UI or aggregated into report summaries using tcbench. |

Install

Create a conda environment and install tcbench

conda create -n tcbench python=3.10 pip
conda activate tcbench
python -m pip install tcbench

For the developer version

python -m pip install "tcbench[dev]"
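
After installing, a quick way to confirm that the package landed in the active environment is to ask Python's standard library for its installed version. The snippet below is only a sketch using importlib.metadata; the helper name installed_version is ours and is not part of tcbench.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("tcbench")
    if found is None:
        print("tcbench is not installed in this environment")
    else:
        print(f"tcbench version: {found}")
```

Run it with the conda environment activated; if it reports that the package is missing, the pip command most likely ran against a different interpreter.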

Features and roadmap

tcbench is still under development, but (as suggested by its name) ultimately aims to be a reference framework for benchmarking multiple ML/DL solutions related to TC.

At the current stage, tcbench offers:

  • Integration with 4 datasets, namely ucdavis-icdm19, mirage19, mirage22 and utmobilenet21. You can use these datasets and their curated versions independently from tcbench. Check out the dataset install process and dataset loading tutorial.

  • Good support for flowpic input representation.

  • Initial support for 1d packet time series (based on network packet properties) input representation.

  • Data augmentation functionality for flowpic input representation.

  • Modeling via XGBoost, vanilla DL supervision and contrastive learning (via SimCLR or SupCon).
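
For readers unfamiliar with the two input representations listed above, the following sketch shows the general idea in plain Python. A flowpic is a 2D histogram of a flow's packets, with arrival time on one axis and packet size on the other; a packet time series keeps per-packet properties of the first few packets. This is not tcbench's implementation: the function names, the defaults (32x32 resolution, 15 s window, 1500-byte maximum size, 10 packets) and the choice of features are assumptions for illustration.

```python
def build_flowpic(timestamps, sizes, resolution=32, window=15.0, max_size=1500):
    """Return a resolution x resolution grid of packet counts.

    Rows index packet size, columns index arrival time relative to the
    first packet. Packets arriving after `window` seconds are ignored.
    """
    if len(timestamps) != len(sizes):
        raise ValueError("timestamps and sizes must have the same length")
    pic = [[0] * resolution for _ in range(resolution)]
    if not timestamps:
        return pic
    t0 = min(timestamps)
    for ts, size in zip(timestamps, sizes):
        rel = ts - t0
        if rel >= window:
            continue  # outside the observation window
        col = min(int(rel / window * resolution), resolution - 1)
        row = min(int(min(size, max_size) / max_size * resolution), resolution - 1)
        pic[row][col] += 1
    return pic


def build_packet_series(timestamps, sizes, n_packets=10):
    """Return zero-padded series for the first `n_packets` packets of a flow.

    Only packet size and inter-arrival time are kept here; other packet
    properties could be added in the same way.
    """
    ts_head = list(timestamps[:n_packets])
    size_head = list(sizes[:n_packets])
    # The first packet has no predecessor, so its inter-arrival time is 0.
    iat = ([0.0] + [b - a for a, b in zip(ts_head, ts_head[1:])])[: len(ts_head)]
    pad = n_packets - len(ts_head)
    return {"pkt_size": size_head + [0] * pad, "iat": iat + [0.0] * pad}


if __name__ == "__main__":
    ts = [0.0, 0.1, 7.5, 20.0]
    sz = [100, 1500, 750, 60]
    for row in build_flowpic(ts, sz, resolution=4):
        print(row)
    print(build_packet_series(ts, sz, n_packets=5))
```

A flowpic can be fed to image-style models (and augmented like an image), while the 1d series suits models that consume short sequences or tabular features.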

More exciting features, including more datasets and algorithms, will come in the next months.

Stay tuned ;)!

Papers

Download files

Source Distribution

tcbench-0.0.22.tar.gz (96.5 kB)

Uploaded Source

Built Distribution

tcbench-0.0.22-py3-none-any.whl (115.1 kB)

Uploaded Python 3

File details

Details for the file tcbench-0.0.22.tar.gz.

File metadata

  • Download URL: tcbench-0.0.22.tar.gz
  • Upload date:
  • Size: 96.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.13

File hashes

Hashes for tcbench-0.0.22.tar.gz
Algorithm Hash digest
SHA256 3d4cbf9e4403d3abf42534517002a3344d4a18ee564e5a95c6dcdbbee4cfb8ed
MD5 2ff0beaa776ac63828de953c269267a4
BLAKE2b-256 f2ab09cc49069893c1241e7fd6cc1516e99e4f469294dab89ac2989e43a22873

File details

Details for the file tcbench-0.0.22-py3-none-any.whl.

File metadata

  • Download URL: tcbench-0.0.22-py3-none-any.whl
  • Upload date:
  • Size: 115.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.13

File hashes

Hashes for tcbench-0.0.22-py3-none-any.whl
Algorithm Hash digest
SHA256 72b3b1b8d421eca6c8c1c6dee826ec7564a924a7c2e633fdb884089c7986fa9a
MD5 b27e29a632b44482d8acc8130106711a
BLAKE2b-256 ad691efc51e4044b22a48e046a28a20c673dfca957059d64215ee98c26b375f6
