Parse pcap files and extract flow-related information
pcap-parser
Python tool to parse pcap files and extract flow-related network traffic information. Extracts hostnames from DNS, TLS SNI, DHCP, and reverse DNS lookups, then aggregates packets into flows with statistics.
Features
- Parse .pcap and .pcapng files using tshark
- Hostname enrichment from DNS queries, TLS SNI, DHCP, and reverse DNS
- Device metadata extraction (OUI vendor, HTTP user-agent)
- Flow aggregation with packet counts, byte counts, and inter-arrival times
- Domain extraction from hostnames
- Persistent IP-to-hostname cache across runs
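To illustrate what domain extraction from a hostname means, here is a minimal sketch. It is not the tool's implementation: `naive_domain` is a hypothetical helper that keeps only the last two labels, which gives the wrong answer for multi-part suffixes such as `co.uk`. A robust version would consult the Public Suffix List.

```python
def naive_domain(hostname):
    """Return the last two labels of a hostname (hypothetical helper).

    Assumption: a two-label domain is good enough for illustration.
    This is wrong for suffixes like "co.uk".
    """
    # Normalize: trim whitespace, drop a trailing root dot, lowercase.
    labels = hostname.strip().rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


print(naive_domain("api.Example.com."))
print(naive_domain("localhost"))
```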
Requirements
- Python 3.9+
- tshark (comes with Wireshark)
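Because parsing depends on tshark, it can help to confirm that the executable is on your PATH before running the tool. The following is a small sketch using only the Python standard library; `find_tshark` is a hypothetical helper and is not part of pcap-parser.

```python
import shutil


def find_tshark():
    """Return the full path to tshark, or None if it is not on PATH."""
    return shutil.which("tshark")


path = find_tshark()
if path is None:
    print("tshark not found; install Wireshark or add tshark to PATH")
else:
    print(f"tshark found at {path}")
```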
Installation
pip install git+https://github.com/nyu-mlab/pcap-parser.git
Or for development:
git clone https://github.com/nyu-mlab/pcap-parser.git
cd pcap-parser
pip install -e ".[dev]"
Usage
Parse pcap files
Parse a single file:
pcap-parse output.csv /path/to/capture.pcap
Parse all pcap files in a directory:
pcap-parse output.csv /path/to/pcap_directory/
Aggregate into flows
After parsing, aggregate packets into flows:
pcap-flow output.csv aggregated_flows.csv
Output
pcap-parse produces a CSV with columns including:
| Column | Description |
|---|---|
| frame.time_epoch | Packet timestamp |
| ip.src / ip.dst | Source and destination IPs |
| tcp.srcport / tcp.dstport | TCP ports |
| udp.srcport / udp.dstport | UDP ports |
| _ws.col.Protocol | Protocol (TCP, UDP, DNS, TLS, etc.) |
| frame.len | Packet length in bytes |
| src_hostname / dst_hostname | Resolved hostnames |
| dhcp_hostname | DHCP-advertised hostname |
| eth.src.oui_resolved | Device vendor from MAC OUI |
| http.user_agent | HTTP user-agent string |
pcap-flow aggregates these into flows with start/end timestamps, byte counts, packet counts, and average inter-arrival times.
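The aggregation described above can be sketched in plain Python. This is an illustration of the idea and not the code pcap-flow runs. It assumes the per-packet CSV uses the column names listed for pcap-parse, treats each direction as its own flow, and keys flows on source IP, destination IP, ports, and transport. The function name `aggregate_flows`, the output field names, and the sample rows are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows using the per-packet column names listed above.
SAMPLE_CSV = """\
frame.time_epoch,ip.src,ip.dst,tcp.srcport,tcp.dstport,udp.srcport,udp.dstport,frame.len
1.0,10.0.0.2,10.0.0.1,50000,443,,,60
1.5,10.0.0.2,10.0.0.1,50000,443,,,100
2.5,10.0.0.2,10.0.0.1,50000,443,,,40
2.0,10.0.0.2,8.8.8.8,,,40000,53,80
"""


def aggregate_flows(rows):
    """Group packet rows into unidirectional flows and summarize each one."""
    flows = defaultdict(list)
    for row in rows:
        # Pick whichever transport has ports populated (TCP first, then UDP).
        if row.get("tcp.srcport"):
            transport = "tcp"
            sport, dport = row["tcp.srcport"], row["tcp.dstport"]
        elif row.get("udp.srcport"):
            transport = "udp"
            sport, dport = row["udp.srcport"], row["udp.dstport"]
        else:
            transport, sport, dport = "other", "", ""
        key = (row["ip.src"], row["ip.dst"], sport, dport, transport)
        flows[key].append((float(row["frame.time_epoch"]), int(row["frame.len"])))

    summary = []
    for key, packets in flows.items():
        packets.sort()  # order by timestamp
        times = [t for t, _ in packets]
        # Inter-arrival times are gaps between consecutive packets.
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        summary.append({
            "flow": key,
            "start": times[0],
            "end": times[-1],
            "packets": len(packets),
            "bytes": sum(size for _, size in packets),
            "avg_inter_arrival": sum(gaps) / len(gaps) if gaps else 0.0,
        })
    return summary


flows = aggregate_flows(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for flow in flows:
    print(flow)
```

A single-packet flow has no gaps, so the sketch reports an average inter-arrival time of 0.0 for it; how pcap-flow handles that case is not stated here.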
Running tests
pytest tests/ -v
License
MIT — see LICENSE for details.