Python library for extracting ML-ready features from encrypted network traffic

JoyfulJay - Encrypted Traffic Feature Extraction

JoyfulJay is a Python library for extracting standardized, ML-ready features from encrypted network traffic. It operates on PCAP files and live network interfaces, producing feature vectors that capture timing, size, and protocol metadata patterns - all without decrypting any traffic.

Features

  • Encrypted Traffic Focus: Extract features proven effective for classifying TLS, QUIC, VPN, and Tor traffic
  • ML-Ready Output: Pandas DataFrames, NumPy arrays, CSV, JSON, or Parquet - ready for scikit-learn, PyTorch, etc.
  • Streaming Architecture: Process multi-GB PCAPs without loading them into memory
  • Live Capture: Real-time feature extraction from network interfaces
  • Remote Capture: Stream packets from remote devices over secure WebSocket (TLS/WSS)
  • Protocol Metadata: TLS handshake parsing, JA3/JA3S fingerprints, QUIC metadata
  • Traffic Fingerprinting: Detect Tor, VPN, and DoH traffic patterns
  • Tranalyzer Compatible: 387 features across 21 extractors, matching research-grade tools
  • Enterprise Ready: Kafka streaming, Prometheus metrics, mDNS discovery
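To make the streaming claim concrete, here is a minimal sketch of per-flow aggregation over a packet iterator. It is not JoyfulJay's implementation: the `Packet` record, `FlowStats`, `flow_key`, and `aggregate` are hypothetical names introduced for illustration. It only shows why memory can scale with the number of flows rather than the number of packets.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical minimal packet record (not a JoyfulJay type)."""
    ts: float      # capture timestamp in seconds
    src: str
    sport: int
    dst: str
    dport: int
    proto: int     # IP protocol number, e.g. 6 for TCP, 17 for UDP
    length: int    # packet length in bytes


@dataclass
class FlowStats:
    """Running statistics for one flow, updated one packet at a time."""
    first_ts: float
    last_ts: float
    packets: int = 0
    total_bytes: int = 0
    iat_sum: float = 0.0  # sum of inter-arrival times seen so far

    def update(self, pkt: Packet) -> None:
        if self.packets:  # an inter-arrival time needs a previous packet
            self.iat_sum += pkt.ts - self.last_ts
        self.last_ts = pkt.ts
        self.packets += 1
        self.total_bytes += pkt.length

    @property
    def duration(self) -> float:
        return self.last_ts - self.first_ts

    @property
    def mean_iat(self) -> float:
        return self.iat_sum / (self.packets - 1) if self.packets > 1 else 0.0


def flow_key(pkt: Packet) -> Tuple:
    """Direction-independent 5-tuple: order the two endpoints consistently."""
    a, b = (pkt.src, pkt.sport), (pkt.dst, pkt.dport)
    return (min(a, b), max(a, b), pkt.proto)


def aggregate(packets: Iterable[Packet]) -> Dict[Tuple, FlowStats]:
    """Consume packets one by one; only per-flow state is kept in memory."""
    flows: Dict[Tuple, FlowStats] = {}
    for pkt in packets:
        key = flow_key(pkt)
        stats = flows.get(key)
        if stats is None:
            stats = flows[key] = FlowStats(first_ts=pkt.ts, last_ts=pkt.ts)
        stats.update(pkt)
    return flows


sample = [
    Packet(0.0, "10.0.0.1", 50000, "10.0.0.2", 443, 6, 100),
    Packet(0.5, "10.0.0.2", 443, "10.0.0.1", 50000, 6, 1500),
    Packet(1.5, "10.0.0.1", 50000, "10.0.0.2", 443, 6, 60),
    Packet(2.0, "10.0.0.3", 1234, "10.0.0.2", 53, 17, 80),
]
flows = aggregate(iter(sample))
```

A production extractor would typically also expire idle flows with a timeout so that the flow table itself stays bounded; that is omitted here for brevity.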

Installation

pip install joyfuljay
# or
uv pip install joyfuljay

For optional features (same syntax works with uv pip):

# Extras are quoted so that shells such as zsh do not treat [] as a glob pattern.

# Fast parsing with dpkt
pip install "joyfuljay[fast]"

# High-speed capture with libpcap
pip install "joyfuljay[libpcap]"

# Kafka streaming output
pip install "joyfuljay[kafka]"

# Prometheus metrics
pip install "joyfuljay[monitoring]"

# mDNS server discovery
pip install "joyfuljay[discovery]"

# Connection graph analysis
pip install "joyfuljay[graphs]"

# Several extras in one install
pip install "joyfuljay[fast,kafka,monitoring,discovery,graphs]"

Quick Start

Python API

from joyfuljay import extract_features_from_pcap

# Extract features from a PCAP file
features_df = extract_features_from_pcap("capture.pcap")

print(features_df.shape)
print(features_df.columns.tolist())
print(features_df.head())

Command Line

# Extract features to CSV
jj extract capture.pcap -o features.csv

# Live capture for 60 seconds
jj live eth0 --duration 60 -o live_features.csv

# Output as JSON
jj extract capture.pcap -o features.json --format json

Feature Groups

Group              Features
Flow Metadata      5-tuple, duration, packet/byte counts
Timing             Inter-arrival time statistics, burst metrics
Size               Packet length statistics, payload bytes
TLS                Version, cipher suite, SNI, JA3/JA3S fingerprints
QUIC               Version, ALPN, connection IDs
Padding            Fixed-size detection, constant-rate detection
Fingerprint        Tor/VPN/DoH classification
TCP Analysis       Flags, handshake, sequence/window analysis
MAC/Layer 2        Source/dest MAC, VLAN, Ethernet type
ICMP               Type/code, echo success ratio
Connection Graphs  Fan-out, communities, centrality (requires [graphs])
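As one concrete example from the TLS row, a JA3 fingerprint is the MD5 hash of five ClientHello fields written as decimal numbers, with GREASE placeholder values removed. The sketch below follows the published JA3 definition. It is not taken from JoyfulJay's source; the function names and the sample field values are hypothetical.

```python
import hashlib
from typing import Iterable

# GREASE values (RFC 8701): 0x0A0A, 0x1A1A, ..., 0xFAFA.
# Clients insert them at random, so JA3 drops them before hashing.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def _join(values: Iterable[int], drop_grease: bool = True) -> str:
    """Join values as decimal numbers separated by '-'."""
    return "-".join(
        str(v) for v in values if not (drop_grease and v in GREASE)
    )


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build 'version,ciphers,extensions,curves,point_formats'."""
    return ",".join([
        str(version),
        _join(ciphers),
        _join(extensions),
        _join(curves),
        _join(point_formats, drop_grease=False),
    ])


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """MD5 hex digest of the JA3 string (a fingerprint, not a security hash)."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii"), usedforsecurity=False).hexdigest()


# Hypothetical ClientHello values, chosen only for illustration.
hello = dict(
    version=771,                    # 0x0303, the TLS 1.2 legacy version field
    ciphers=[0x1A1A, 4865, 4866],   # first entry is a GREASE value
    extensions=[0, 10, 11],
    curves=[29, 23],
    point_formats=[0],
)
fingerprint_text = ja3_string(**hello)   # "771,4865-4866,0-10-11,29-23,0"
fingerprint = ja3_hash(**hello)
```

JA3S applies the same idea to the ServerHello, using the version, the selected cipher, and the extension list.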

Remote Capture

Stream packets from a remote device (e.g., Android phone, Raspberry Pi) to your analysis machine:

# On the capture device - start server with TLS
jj serve wlan0 --tls-cert server.crt --tls-key server.key --announce

# On your machine - discover and connect
jj discover                    # Find servers on LAN
# Quote the URL: an unquoted & would send the command to the background
jj connect "jj://192.168.1.50:8765?token=xxx&tls=1" -o features.csv
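The connection URL packs the host, port, access token, and TLS flag into one string. The sketch below shows how such a URL breaks down using only the standard library. `parse_jj_url` is a hypothetical helper, not part of JoyfulJay, and reading `tls=1` as "TLS enabled" with TLS off by default is an assumption based on the example above.

```python
from urllib.parse import parse_qs, urlsplit


def parse_jj_url(url: str) -> dict:
    """Split a jj:// URL into host, port, token, and TLS flag."""
    parts = urlsplit(url)
    if parts.scheme != "jj":
        raise ValueError(f"expected a jj:// URL, got scheme {parts.scheme!r}")
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,
        "port": parts.port,
        # parse_qs returns a list per key; take the first value if present.
        "token": query.get("token", [None])[0],
        # Assumption: tls=1 enables TLS and any other value disables it.
        "tls": query.get("tls", ["0"])[0] == "1",
    }


info = parse_jj_url("jj://192.168.1.50:8765?token=xxx&tls=1")
print(info)
# {'host': '192.168.1.50', 'port': 8765, 'token': 'xxx', 'tls': True}
```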

Kafka Streaming

Stream features directly to Kafka for real-time pipelines:

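# Import path assumed: exported at the package top level, like extract_features_from_pcap
from joyfuljay import extract_features_streaming
from joyfuljay.output.kafka import KafkaWriter

# Send each extracted feature record to the "network-features" topic
with KafkaWriter("localhost:9092", topic="network-features") as writer:
    for features in extract_features_streaming("capture.pcap"):
        writer.write(features)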

Prometheus Metrics

Export processing metrics for monitoring:

from joyfuljay.monitoring import PrometheusMetrics, start_prometheus_server

metrics = PrometheusMetrics()
start_prometheus_server(9090)  # Scrape at http://localhost:9090/metrics

Requirements

  • Python 3.10+
  • scapy >= 2.5.0
  • pandas >= 2.0.0
  • numpy >= 1.24.0

Cross-Platform Support

Feature               Linux  macOS  Windows
PCAP file processing
Live capture                        ✅ (requires Npcap)

Check your system status with:

jj status

Documentation

Full documentation: https://joyfuljay.readthedocs.io

Citation

If you use JoyfulJay in academic research, please cite:

@software{joyfuljay2025,
  title = {{JoyfulJay}: Encrypted Traffic Feature Extraction Library},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/cenab/joyfuljay}
}

License

MIT License - see LICENSE for details.
