FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic

FlowPrint

This repository contains the code for FlowPrint by the authors of the NDSS FlowPrint [1] paper [PDF]. Please cite FlowPrint when using it in academic publications. The master branch provides FlowPrint as an out-of-the-box tool. For the original experiments from the paper, please check out the NDSS branch.

Introduction

FlowPrint introduces a semi-supervised approach for fingerprinting mobile apps from (encrypted) network traffic. We automatically find temporal correlations among destination-related features of network traffic and use these correlations to generate app fingerprints. These fingerprints can later be reused to recognize known apps or to detect previously unseen apps. The main contribution of this work is to create network fingerprints without prior knowledge of the apps running in the network.
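To make the idea of temporal correlation between destinations more concrete, here is a minimal sketch. It is not FlowPrint's implementation: the function names are hypothetical, a simple Jaccard overlap of per-window activity stands in for the cross-correlation, and connected components stand in for the fingerprint construction. Only the default values (batch of 300 s, window of 30 s, threshold of 0.1) are taken from the usage text.

```python
from collections import defaultdict
from itertools import combinations


def activity_vectors(flows, batch_start, window=30.0, batch=300.0):
    """Map each destination to a 0/1 activity vector over the windows of one batch."""
    n_slots = int(batch // window)
    vectors = defaultdict(lambda: [0] * n_slots)
    for timestamp, destination in flows:
        slot = int((timestamp - batch_start) // window)
        if 0 <= slot < n_slots:
            vectors[destination][slot] = 1
    return dict(vectors)


def correlation(a, b):
    """Jaccard overlap of two activity vectors (an assumed stand-in measure)."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0


def group_destinations(vectors, threshold=0.1):
    """Group destinations linked by a correlation above the threshold."""
    parent = {d: d for d in vectors}

    def find(d):
        # Follow parent links to the group representative (with path halving).
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    for a, b in combinations(vectors, 2):
        if correlation(vectors[a], vectors[b]) > threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for d in vectors:
        groups[find(d)].add(d)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical flows: (timestamp in seconds, destination).
flows = [
    (0.0, "api.example.com:443"), (5.0, "cdn.example.com:443"),
    (65.0, "api.example.com:443"), (70.0, "cdn.example.com:443"),
    (130.0, "ads.example.net:443"), (200.0, "ads.example.net:443"),
]
vectors = activity_vectors(flows, batch_start=0.0)
groups = group_destinations(vectors)
print(groups)
```

In this toy input the two example.com destinations are always active in the same windows and end up in one group, while the third destination never overlaps with them and forms its own group.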

Installation

The easiest way to install FlowPrint is using pip:

pip install flowprint

Manually

If you would like to install FlowPrint manually, please make sure you have installed the required dependencies.

Dependencies

This code is written in Python3 and depends on the following libraries:

  • Cryptography
  • Matplotlib
  • NetworkX
  • Numpy
  • Pyshark or tshark (works with both backends, tshark is much faster)
  • Scikit-learn

To install these, use the following command:

pip install -U cryptography matplotlib networkx numpy pyshark scikit-learn

If you'd like to use the tshark backend, please install tshark. On Ubuntu this can be done using:

sudo apt install tshark

Note that FlowPrint will try to use tshark first. If tshark cannot be found, FlowPrint falls back to pyshark and displays a warning message.
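If you want to know in advance which backend will be available on a machine, a check along the following lines may help. This is a sketch that only looks for the tshark executable on the PATH; the function name pick_backend is hypothetical and the code is not part of FlowPrint.

```python
import shutil


def pick_backend(which=shutil.which):
    """Return 'tshark' if the executable is on the PATH, otherwise 'pyshark'."""
    # shutil.which returns the full path of the executable, or None if absent.
    return "tshark" if which("tshark") else "pyshark"


print(pick_backend())
```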

Usage

usage: flowprint.py [-h]
                    (--detection [FLOAT] | --fingerprint [FILE] | --recognition)
                    [-b BATCH] [-c CORRELATION] [-s SIMILARITY] [-w WINDOW]
                    [-p PCAPS...] [-r READ...] [-o WRITE]

FlowPrint: Semi-Supervised Mobile-App
Fingerprinting on Encrypted Network Traffic

Arguments:
  -h, --help                 show this help message and exit

FlowPrint mode (select up to one):
  --fingerprint [FILE]       run in raw fingerprint generation mode (default)
                             outputs to terminal or json FILE
  --detection   FLOAT        run in unseen app detection mode with given
                             FLOAT threshold
  --recognition              run in app recognition mode

FlowPrint parameters:
  -b, --batch       FLOAT    batch size in seconds       (default=300)
  -c, --correlation FLOAT    cross-correlation threshold (default=0.1)
  -s, --similarity  FLOAT    similarity threshold        (default=0.9)
  -w, --window      FLOAT    window size in seconds      (default=30)

Flow data input/output (either --pcaps or --read required):
  -p, --pcaps  PATHS...      path to pcap(ng) files to run through FlowPrint
  -r, --read   PATHS...      read preprocessed data from given files
  -o, --write  PATH          write preprocessed data to given file
  -i, --split  FLOAT         fraction of data to select for testing (default= 0)
  -a, --random FLOAT         random state to use for split          (default=42)

Train/test input (for --detection/--recognition):
  -t, --train PATHS...       path to json files containing training fingerprints
  -e, --test  PATHS...       path to json files containing testing fingerprints

Running FlowPrint requires three steps:

  1. Preprocessing: transform .pcap files to flows that FlowPrint can interpret.
$ python3 -m flowprint --pcaps <data.pcap> --write <flows.p>
  2. Fingerprinting: extract fingerprints from flows.
$ python3 -m flowprint --read <flows.p> --fingerprint <fingerprints.json> --split 0.5
  3. Application: use FlowPrint to recognize apps or detect previously unknown apps.
$ python3 -m flowprint --train <fingerprints.train.json> --test <fingerprints.test.json> --recognition
$ python3 -m flowprint --train <fingerprints.train.json> --test <fingerprints.test.json> --detection 0.1
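The three steps above can also be scripted. The sketch below only assembles the command lines shown above and, if asked, runs them one after another with subprocess. The helper names and the file names are placeholders. In particular, the train and test file names are passed in explicitly, because the naming of the files produced by --split is assumed here rather than documented.

```python
import subprocess
import sys


def build_commands(pcap, flows, fingerprints, train, test, split=0.5, threshold=0.1):
    """Assemble the command lines for preprocessing, fingerprinting and application."""
    base = [sys.executable, "-m", "flowprint"]
    return [
        # Step 1: preprocess the pcap file into flows.
        base + ["--pcaps", pcap, "--write", flows],
        # Step 2: extract fingerprints and split them into train and test data.
        base + ["--read", flows, "--fingerprint", fingerprints, "--split", str(split)],
        # Step 3a: recognize known apps.
        base + ["--train", train, "--test", test, "--recognition"],
        # Step 3b: detect previously unseen apps with the given threshold.
        base + ["--train", train, "--test", test, "--detection", str(threshold)],
    ]


def run_pipeline(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Placeholder file names; replace them with your own paths.
commands = build_commands(
    "data.pcap", "flows.p", "fingerprints.json",
    "fingerprints.train.json", "fingerprints.test.json",
)
for command in commands:
    print(" ".join(command[1:]))
```

Calling run_pipeline(commands) executes the steps. It is left out of the example so that the script can be inspected without FlowPrint installed.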

References

[1] van Ede, T., Bortolameotti, R., Continella, A., Ren, J., Dubois, D. J., Lindorfer, M., Choffnes, D., van Steen, M. & Peter, A. (2020, February). FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic. In 2020 NDSS. The Internet Society.

Bibtex

@inproceedings{vanede2020flowprint,
  title={{FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic}},
  author={van Ede, Thijs and Bortolameotti, Riccardo and Continella, Andrea and Ren, Jingjing and Dubois, Daniel J. and Lindorfer, Martina and Choffnes, David and van Steen, Maarten and Peter, Andreas},
  booktitle={NDSS},
  year={2020},
  organization={The Internet Society}
}
