Parse pcap files and extract flow-related information

pcap-parser

pcap-parser is a Python tool that parses pcap files and extracts flow-related network traffic information. It extracts hostnames from DNS, TLS SNI, DHCP, and reverse DNS lookups, then aggregates packets into flows with statistics.

Features

  • Parse .pcap and .pcapng files using tshark
  • Hostname enrichment from DNS queries, TLS SNI, DHCP, and reverse DNS
  • Device metadata extraction (OUI vendor, HTTP user-agent)
  • Flow aggregation with packet counts, byte counts, and inter-arrival times
  • Domain extraction from hostnames
  • Persistent IP-to-hostname cache across runs
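
To illustrate what "domain extraction from hostnames" can mean, here is a minimal sketch that keeps the last two DNS labels of a hostname. The function name `naive_registered_domain` is hypothetical and not part of pcap-parser. The heuristic is an assumption: the tool itself may use a different method, such as a public suffix list.

```python
def naive_registered_domain(hostname: str) -> str:
    """Return the last two DNS labels of a hostname.

    Hypothetical helper for illustration only. This heuristic is wrong for
    multi-label public suffixes (for example "a.b.co.uk" yields "co.uk").
    """
    # Normalize case, drop a trailing root dot, and discard empty labels.
    labels = [part for part in hostname.strip().lower().rstrip(".").split(".") if part]
    if len(labels) <= 2:
        return ".".join(labels)
    return ".".join(labels[-2:])


if __name__ == "__main__":
    print(naive_registered_domain("api.example.com"))
```

A production-grade version would consult the Public Suffix List rather than counting labels.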

Requirements

  • Python 3.9+
  • tshark (bundled with Wireshark; some Linux distributions ship it as a separate package)
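
Both requirements can be checked before installing. The snippet below is a small sketch using only the standard library. The helper names `python_ok` and `find_tshark` are hypothetical, and it assumes the tool locates tshark through PATH, which the README does not state.

```python
import shutil
import sys
from typing import Optional

MIN_PYTHON = (3, 9)  # minimum version listed under Requirements


def python_ok(version_info=sys.version_info) -> bool:
    """True if the given interpreter version meets the stated minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def find_tshark() -> Optional[str]:
    """Return the full path to tshark if it is on PATH, else None."""
    return shutil.which("tshark")


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("tshark:", find_tshark() or "not found on PATH")
```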

Installation

pip install git+https://github.com/nyu-mlab/pcap-parser.git

Or for development:

git clone https://github.com/nyu-mlab/pcap-parser.git
cd pcap-parser
pip install -e ".[dev]"

Usage

Parse pcap files

Parse a single file:

pcap-parse output.csv /path/to/capture.pcap

Parse all pcap files in a directory:

pcap-parse output.csv /path/to/pcap_directory/
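
To preview which files a directory run might pick up, the sketch below lists capture files by extension. It is an illustration, not the tool's own discovery logic: the function name `find_capture_files` is hypothetical, and the non-recursive, extension-based matching is an assumption based on the .pcap and .pcapng formats listed under Features.

```python
from pathlib import Path
from typing import List

CAPTURE_SUFFIXES = {".pcap", ".pcapng"}


def find_capture_files(path: str) -> List[Path]:
    """Return capture files for a single file path or a directory.

    Assumption: directories are scanned one level deep (no recursion).
    """
    target = Path(path)
    if target.is_file():
        return [target] if target.suffix.lower() in CAPTURE_SUFFIXES else []
    if not target.is_dir():
        return []
    return sorted(
        entry
        for entry in target.iterdir()
        if entry.is_file() and entry.suffix.lower() in CAPTURE_SUFFIXES
    )


if __name__ == "__main__":
    for capture in find_capture_files("."):
        print(capture)
```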

Aggregate into flows

After parsing, aggregate packets into flows:

pcap-flow output.csv aggregated_flows.csv

Output

pcap-parse produces a CSV with columns including:

Column                          Description
frame.time_epoch                Packet timestamp
ip.src / ip.dst                 Source and destination IPs
tcp.srcport / tcp.dstport       TCP ports
udp.srcport / udp.dstport       UDP ports
_ws.col.Protocol                Protocol (TCP, UDP, DNS, TLS, etc.)
frame.len                       Packet length in bytes
src_hostname / dst_hostname     Resolved hostnames
dhcp_hostname                   DHCP-advertised hostname
eth.src.oui_resolved            Device vendor from MAC OUI
http.user_agent                 HTTP user-agent string
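
Most of the column names above are tshark field names, so a similar CSV can be approximated with tshark's field output mode. The sketch below only builds the command line and does not run it. The function name `build_tshark_command` is hypothetical, and the exact options pcap-parse passes to tshark are an assumption. The hostname columns (src_hostname, dst_hostname, dhcp_hostname) are left out because they do not appear to be raw tshark fields.

```python
from typing import List, Sequence

# tshark field names taken from the column list above.
FIELDS = (
    "frame.time_epoch",
    "ip.src",
    "ip.dst",
    "tcp.srcport",
    "tcp.dstport",
    "udp.srcport",
    "udp.dstport",
    "_ws.col.Protocol",
    "frame.len",
    "eth.src.oui_resolved",
    "http.user_agent",
)


def build_tshark_command(pcap_path: str, fields: Sequence[str] = FIELDS) -> List[str]:
    """Build a tshark command that prints the given fields as CSV with a header."""
    cmd = [
        "tshark", "-r", pcap_path,   # read packets from a capture file
        "-T", "fields",              # print selected fields only
        "-E", "header=y",            # first row holds the field names
        "-E", "separator=,",         # comma-separated output
        "-E", "quote=d",             # wrap values in double quotes
    ]
    for field in fields:
        cmd += ["-e", field]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_tshark_command("/path/to/capture.pcap")))
```

The resulting list can be passed to `subprocess.run` once tshark is installed.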

pcap-flow aggregates these into flows with start/end timestamps, byte counts, packet counts, and average inter-arrival times.
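
The aggregation step can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not pcap-flow's implementation: the function name `aggregate_flows`, the output key names, and the flow key (a unidirectional tuple of source IP, destination IP, source port, destination port, and transport) are all hypothetical. The input column names come from the column list above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def aggregate_flows(rows: Iterable[Dict[str, str]]) -> List[Dict[str, object]]:
    """Group packet rows into flows and compute per-flow statistics."""
    flows = defaultdict(lambda: {"timestamps": [], "bytes": 0})
    for row in rows:
        # Pick the transport based on which port columns are populated.
        if row.get("tcp.srcport"):
            transport, sport, dport = "tcp", row["tcp.srcport"], row.get("tcp.dstport", "")
        elif row.get("udp.srcport"):
            transport, sport, dport = "udp", row["udp.srcport"], row.get("udp.dstport", "")
        else:
            transport, sport, dport = "other", "", ""
        key = (row.get("ip.src", ""), row.get("ip.dst", ""), sport, dport, transport)
        flows[key]["timestamps"].append(float(row["frame.time_epoch"]))
        flows[key]["bytes"] += int(row["frame.len"])

    result = []
    for (src, dst, sport, dport, transport), flow in flows.items():
        ts = sorted(flow["timestamps"])
        count = len(ts)
        # Mean gap between consecutive packets; 0.0 for single-packet flows.
        avg_iat = (ts[-1] - ts[0]) / (count - 1) if count > 1 else 0.0
        result.append({
            "src": src, "dst": dst, "sport": sport, "dport": dport,
            "transport": transport, "start": ts[0], "end": ts[-1],
            "packets": count, "bytes": flow["bytes"], "avg_iat": avg_iat,
        })
    return result
```

Rows read from the pcap-parse output with `csv.DictReader` can be passed straight to this function.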

Running tests

pytest tests/ -v

License

MIT — see LICENSE for details.
