
Convenience functionality to annotate and reduce wireshark captures.


SharkReduce 🦈🔻

Utility to convert wireshark captures to useful datasets.

Installation

pip install sharkreduce

Steps to create reduced dataset

Step 1: Export capture using tshark

tshark -t e -Ndmnt -r all.pcapng -T fields \
  -e frame.time \
  -e eth.src_resolved \
  -e eth.dst_resolved  \
  -e _ws.col.Source \
  -e _ws.col.Destination \
  -e _ws.col.Protocol \
  -e tcp.len \
  -e udp.length \
  -e tcp.srcport \
  -e tcp.dstport \
  -e udp.srcport \
  -e udp.dstport \
  -e _ws.col.Time '(eth.type == 0x800) or (eth.type == 0x86dd)' > reduceme.tsv
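To sanity-check the export before handing it to SharkReduce, the file can be read with Python's standard library. The sketch below assumes the columns appear in the same order as the -e options above (tshark's -T fields output is tab-separated by default). The COLUMNS list, the read_capture helper and the sample line are illustrative and not part of SharkReduce.

```python
import csv
import io

# Column names in the order of the -e options passed to tshark (assumed layout).
COLUMNS = [
    "frame.time", "eth.src_resolved", "eth.dst_resolved",
    "_ws.col.Source", "_ws.col.Destination", "_ws.col.Protocol",
    "tcp.len", "udp.length", "tcp.srcport", "tcp.dstport",
    "udp.srcport", "udp.dstport", "_ws.col.Time",
]

def read_capture(handle):
    """Yield one dict per exported packet; fields tshark left empty stay ''."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Pad short rows so every column name has a value.
        row = row + [""] * (len(COLUMNS) - len(row))
        yield dict(zip(COLUMNS, row))

# Hypothetical single-line sample standing in for reduceme.tsv.
sample = (
    "t0\taa:bb\tcc:dd\t10.0.0.2\t10.0.0.1\tTCP\t120\t\t"
    "50000\t443\t\t\t1700000000.000001\n"
)
rows = list(read_capture(io.StringIO(sample)))
print(rows[0]["_ws.col.Protocol"], rows[0]["tcp.len"])
```

For a real capture, replace the io.StringIO sample with open("reduceme.tsv", newline="").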

Step 2: Fill device id file

SharkReduce requires a file that maps MAC addresses to devices in order to produce readable reduced output.

Run it once to pre-fill the file with all MAC addresses from your capture:

python3 -m sharkreduce -t 60000000 -n devices.yaml reduceme.tsv reduced.tsv

Note: -t 60000000 sets the bin width in microseconds, so SharkReduce will group connections into reduced bins of up to 60 seconds.

Note: On this first run, the command creates an empty reduced.tsv!
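The values passed to -t in this README work out to microseconds (60000000 µs = 60 s). As a rough illustration of fixed-width binning, and not a description of SharkReduce's internals, the sketch below converts seconds to a -t value and assigns epoch timestamps to bins by floor division. Both function names are hypothetical.

```python
MICROSECONDS_PER_SECOND = 1_000_000

def seconds_to_t(seconds):
    """Convert a bin width in seconds to the microsecond value used with -t."""
    return int(seconds * MICROSECONDS_PER_SECOND)

def bin_index(epoch_seconds, bin_width_us):
    """Return the index of the fixed-width bin an epoch timestamp falls into."""
    timestamp_us = int(round(epoch_seconds * MICROSECONDS_PER_SECOND))
    return timestamp_us // bin_width_us

width = seconds_to_t(60)
print(width)                    # 60000000
print(bin_index(119.5, width))  # 1
print(bin_index(120.0, width))  # 2
```

Under this assumption, two packets land in the same bin only if their timestamps fall within the same 60-second window.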

The command will create a file called devices.yaml which looks like this:

devices: []
ignore: {}
unclassified:
  some-mac-address:
    - dns-name
    - ip-address
    - etc.

For each unclassified MAC address, you will see some aliases (IP address, hostname) that help you determine which device the MAC address belongs to.

If the MAC address belongs to a device you want to analyse, think of a good name and create an entry under devices. Otherwise, create an entry under ignore. Do this for all unclassified MAC addresses:

devices:
  SuspiciousDevice:
    - some-mac-address
    - some-hostname
  OtherInterestingDevice:
    - other-mac
ignore:
  - router-mac-address
  - broadcast address
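To make the role of this file concrete, here is a minimal sketch of how such a mapping could be turned into an alias lookup. It is an assumption about usage, not SharkReduce's actual implementation: the config dict is a hand-written stand-in for what the YAML above would parse to (parsing itself is omitted), and build_alias_lookup is a hypothetical helper.

```python
def build_alias_lookup(config):
    """Map every alias (MAC, hostname, ...) to a device name, or None if ignored."""
    lookup = {}
    # "devices" maps a chosen device name to a list of its aliases.
    for name, aliases in (config.get("devices") or {}).items():
        for alias in aliases:
            lookup[alias] = name
    # "ignore" lists aliases that should be dropped from the output.
    for alias in (config.get("ignore") or []):
        lookup[alias] = None
    return lookup

# Stand-in for the parsed devices.yaml shown above.
config = {
    "devices": {
        "SuspiciousDevice": ["some-mac-address", "some-hostname"],
        "OtherInterestingDevice": ["other-mac"],
    },
    "ignore": ["router-mac-address", "broadcast address"],
}

lookup = build_alias_lookup(config)
print(lookup["some-hostname"])       # SuspiciousDevice
print(lookup["router-mac-address"])  # None
```

An alias that is missing from the lookup entirely would correspond to a MAC address that is still unclassified.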

Step 3: Create initial reduced output

Re-run the command from step 2:

python3 -m sharkreduce -t 60000000 -n devices.yaml reduceme.tsv reduced.tsv

This time, reduced.tsv should contain data, with devices labelled by the readable names you assigned in devices.yaml.

Step 4: Create/fill activity file

To fill the active_use column in the reduced output correctly, SharkReduce needs to know when each device was in active use. You provide this in an activity file, which you can initialise like this:

python3 -m sharkreduce --init-activity -t 3600000000 -n devices.yaml reduceme.tsv reduced.tsv

This command reads reduced.tsv and creates a file called activity.yaml, in which you can annotate your activity per device in intervals of about 1 hour (-t 3600000000).

Set the active field to True for each interval in which you were actively using the device.

Step 5: Re-run sharkreduce with filled activity file

If activity.yaml exists, SharkReduce uses its content to fill the active_use column, so you only need to re-run:

python3 -m sharkreduce -t 60000000 -n devices.yaml reduceme.tsv reduced.tsv

Note: If you change devices.yaml, you have to redo steps 3, 4 and 5.

Step 6: Party! 🥳

Your reduced capture, annotated with device names and activity, is now stored in reduced.tsv.


