Python library for extracting ML-ready features from encrypted network traffic

JoyfulJay - Encrypted Traffic Feature Extraction

JoyfulJay is a Python library for extracting standardized, ML-ready features from encrypted network traffic. It operates on PCAP files and live network interfaces, producing feature vectors that capture timing, size, and protocol metadata patterns - all without decrypting any traffic.

Features

  • Encrypted Traffic Focus: Extract features proven effective for classifying TLS, QUIC, VPN, and Tor traffic
  • ML-Ready Output: Pandas DataFrames, NumPy arrays, CSV, JSON, or Parquet - ready for scikit-learn, PyTorch, etc.
  • Streaming Architecture: Process multi-GB PCAPs without loading them into memory
  • Live Capture: Real-time feature extraction from network interfaces
  • Remote Capture: Stream packets from remote devices over secure WebSocket (TLS/WSS)
  • Protocol Metadata: TLS handshake parsing, JA3/JA3S fingerprints, QUIC metadata
  • Traffic Fingerprinting: Detect Tor, VPN, and DoH traffic patterns
  • Tranalyzer Compatible: 387 features across 21 extractors, matching research-grade tools
  • Enterprise Ready: Kafka streaming, Prometheus metrics, mDNS discovery
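As background for the JA3/JA3S fingerprints listed above: in the publicly documented JA3 method, the fingerprint is the MD5 hash of a string built from five ClientHello fields (TLS version, cipher suites, extensions, supported groups, and EC point formats), with fields separated by commas, values within a field separated by dashes, and GREASE values dropped. The sketch below only illustrates that string construction. The function name and the sample values are hypothetical, and it is not JoyfulJay's own implementation.

```python
import hashlib

# GREASE values (RFC 8701): 0x0A0A, 0x1A1A, ..., 0xFAFA. JA3 ignores them.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_fingerprint(version, ciphers, extensions, curves, point_formats):
    """Return (ja3_string, ja3_md5) for already-parsed ClientHello fields."""

    def join(values):
        # Decimal values joined by dashes, with GREASE entries removed.
        return "-".join(str(v) for v in values if v not in GREASE)

    ja3_string = ",".join([
        str(version),
        join(ciphers),
        join(extensions),
        join(curves),
        join(point_formats),
    ])
    # MD5 is used as an identifier here, not for security.
    digest = hashlib.md5(ja3_string.encode("ascii"), usedforsecurity=False)
    return ja3_string, digest.hexdigest()


# Hypothetical sample values; 0x1A1A is a GREASE cipher and is skipped.
ja3_string, ja3_hash = ja3_fingerprint(
    version=771,
    ciphers=[0x1A1A, 4865, 4866, 49195],
    extensions=[0, 23, 65281, 10, 11],
    curves=[29, 23, 24],
    point_formats=[0],
)
print(ja3_string)
print(ja3_hash)
```

Because the hash covers only handshake metadata sent in the clear, it can be computed without decrypting any traffic.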

Installation

pip install joyfuljay
# or
uv pip install joyfuljay

For optional features (the same syntax works with uv pip). The quotes keep shells such as zsh from treating the square brackets as a glob pattern:

# Fast parsing with dpkt
pip install "joyfuljay[fast]"

# High-speed capture with libpcap
pip install "joyfuljay[libpcap]"

# Kafka streaming output
pip install "joyfuljay[kafka]"

# Prometheus metrics
pip install "joyfuljay[monitoring]"

# mDNS server discovery
pip install "joyfuljay[discovery]"

# Connection graph analysis
pip install "joyfuljay[graphs]"

# All optional features
pip install "joyfuljay[fast,kafka,monitoring,discovery,graphs]"

Quick Start

Python API

from joyfuljay import extract_features_from_pcap

# Extract features from a PCAP file
features_df = extract_features_from_pcap("capture.pcap")

print(features_df.shape)
print(features_df.columns.tolist())
print(features_df.head())

Command Line

# Extract features to CSV
jj extract capture.pcap -o features.csv

# Live capture for 60 seconds
jj live eth0 --duration 60 -o live_features.csv

# Output as JSON
jj extract capture.pcap -o features.json --format json
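Once `jj extract` has written a CSV, the file can be inspected with the standard library alone. The following is a minimal sketch: the column names (`src_ip`, `dst_ip`, `duration`, `total_packets`) and the helper names are hypothetical placeholders, not JoyfulJay's actual schema, so check the header row of your own output first.

```python
import csv
import io

# Stand-in for features.csv; the column names are hypothetical placeholders.
SAMPLE_CSV = """src_ip,dst_ip,duration,total_packets
10.0.0.2,93.184.216.34,1.25,42
10.0.0.2,1.1.1.1,0.40,8
"""


def load_rows(handle):
    """Read flow records from an open CSV file, one dict per row."""
    return list(csv.DictReader(handle))


def numeric_column(rows, name):
    """Convert one column to floats, skipping missing or empty cells."""
    return [float(row[name]) for row in rows if row.get(name) not in (None, "")]


rows = load_rows(io.StringIO(SAMPLE_CSV))
durations = numeric_column(rows, "duration")
print(len(rows), "flows, mean duration", sum(durations) / len(durations))
```

For a real file, replace the in-memory sample with `open("features.csv", newline="")` and pass the handle to `load_rows`.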

Feature Groups

Group | Features
Flow Metadata | 5-tuple, duration, packet/byte counts
Timing | Inter-arrival time statistics, burst metrics
Size | Packet length statistics, payload bytes
TLS | Version, cipher suite, SNI, JA3/JA3S fingerprints
QUIC | Version, ALPN, connection IDs
Padding | Fixed-size detection, constant-rate detection
Fingerprint | Tor/VPN/DoH classification
TCP Analysis | Flags, handshake, sequence/window analysis
MAC/Layer 2 | Source/dest MAC, VLAN, Ethernet type
ICMP | Type/code, echo success ratio
Connection Graphs | Fan-out, communities, centrality (requires [graphs])
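To make the timing, size, and padding groups more concrete, here is a rough sketch of how such statistics can be derived from nothing more than per-packet timestamps and lengths. The function name, the output keys, and the "modal size ratio" heuristic are illustrative assumptions and do not reflect JoyfulJay's actual feature names or algorithms.

```python
import statistics


def flow_stats(timestamps, sizes):
    """Summarize one flow from packet timestamps (seconds) and sizes (bytes)."""
    if len(timestamps) != len(sizes) or not timestamps:
        raise ValueError("need one size per timestamp and at least one packet")
    # Inter-arrival times: gaps between consecutive packets.
    iats = [b - a for a, b in zip(timestamps, timestamps[1:])]
    # Share of packets with the most common size; a value near 1.0
    # hints at fixed-size padding.
    modal_size = statistics.mode(sizes)
    return {
        "duration": timestamps[-1] - timestamps[0],
        "packet_count": len(sizes),
        "byte_count": sum(sizes),
        "iat_mean": statistics.fmean(iats) if iats else 0.0,
        "iat_std": statistics.pstdev(iats) if iats else 0.0,
        "size_mean": statistics.fmean(sizes),
        "size_max": max(sizes),
        "modal_size_ratio": sizes.count(modal_size) / len(sizes),
    }


# Hypothetical four-packet flow.
stats = flow_stats(
    timestamps=[0.0, 0.5, 1.0, 2.0],
    sizes=[514, 514, 514, 100],
)
for key, value in stats.items():
    print(f"{key}: {value}")
```

None of these values depend on payload contents, which is why this kind of feature remains available for encrypted flows.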

Remote Capture

Stream packets from a remote device (e.g., Android phone, Raspberry Pi) to your analysis machine:

# On the capture device - start server with TLS
jj serve wlan0 --tls-cert server.crt --tls-key server.key --announce

# On your machine - discover and connect
jj discover                    # Find servers on LAN
jj connect "jj://192.168.1.50:8765?token=xxx&tls=1" -o features.csv   # quote the URL: ? and & are shell metacharacters
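The connection string in the example follows ordinary URL syntax: a scheme, a host and port, and query parameters. The sketch below splits it with Python's standard `urllib.parse` to show which parts carry the address, the token, and the TLS flag. The helper name is hypothetical, and how JoyfulJay itself interprets these parameters is not stated here. Because `?` and `&` are special to most shells, quoting the URL on the command line is advisable.

```python
from urllib.parse import parse_qs, urlsplit


def parse_jj_url(url):
    """Split a jj:// connection URL into host, port, and query options."""
    parts = urlsplit(url)
    if parts.scheme != "jj":
        raise ValueError(f"expected a jj:// URL, got scheme {parts.scheme!r}")
    # parse_qs returns a list per key; keep the last value of each.
    options = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return {"host": parts.hostname, "port": parts.port, "options": options}


info = parse_jj_url("jj://192.168.1.50:8765?token=xxx&tls=1")
print(info)
```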

Kafka Streaming

Stream features directly to Kafka for real-time pipelines:

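# Assumption: extract_features_streaming is exported at the package top level,
# like extract_features_from_pcap; adjust the import if it lives elsewhere.
from joyfuljay import extract_features_streaming
from joyfuljay.output.kafka import KafkaWriter

with KafkaWriter("localhost:9092", topic="network-features") as writer:
    for features in extract_features_streaming("capture.pcap"):
        writer.write(features)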
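The loop above handles records one at a time as they are produced. If a downstream step prefers groups of records (for example, to flush or log periodically), a lazy batching helper keeps memory use bounded regardless of capture size. This is a generic sketch: `batched` and `fake_feature_stream` are hypothetical names, and whether `KafkaWriter` batches internally is not something the text above states.

```python
from itertools import islice


def batched(records, batch_size):
    """Yield lists of up to batch_size items from any iterable, lazily."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    # islice pulls at most batch_size items; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        yield batch


def fake_feature_stream(n):
    """Stand-in generator; a real pipeline would iterate the feature stream."""
    for i in range(n):
        yield {"flow_id": i}


batches = list(batched(fake_feature_stream(5), batch_size=2))
print([len(b) for b in batches])
```

Only one batch is held in memory at a time, so the approach works the same for a five-record stream and a multi-GB capture.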

Prometheus Metrics

Export processing metrics for monitoring:

from joyfuljay.monitoring import PrometheusMetrics, start_prometheus_server

metrics = PrometheusMetrics()
start_prometheus_server(9090)  # Scrape at http://localhost:9090/metrics
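To check what the endpoint exposes without running a full Prometheus server, the text exposition format can be parsed with a few lines of standard-library Python. This is a deliberately minimal sketch: the metric names in the sample are hypothetical (not JoyfulJay's actual metrics), the function name is an assumption, and the parser assumes sample lines carry no trailing timestamp.

```python
def parse_metrics(text):
    """Parse samples from Prometheus text format into a {name: value} dict."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# HELP" / "# TYPE" comments
        # The value is the last space-separated token (no timestamps assumed).
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


# Hypothetical scrape output; real metric names will differ.
SAMPLE = """\
# HELP example_packets_total Packets processed (hypothetical metric name).
# TYPE example_packets_total counter
example_packets_total 1024
example_flows_active{interface="eth0"} 3
"""

samples = parse_metrics(SAMPLE)
print(samples)
```

Against a live endpoint, the text could be fetched with `urllib.request.urlopen("http://localhost:9090/metrics").read().decode()` and passed to `parse_metrics`.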

Requirements

  • Python 3.10+
  • scapy >= 2.5.0
  • pandas >= 2.0.0
  • numpy >= 1.24.0

Cross-Platform Support

Feature | Linux | macOS | Windows
PCAP file processing | | |
Live capture | | | ✅ (requires Npcap)

Check your system status with:

jj status

Documentation

Full documentation: docs.joyfuljay.com

Citation

If you use JoyfulJay in academic research, please cite:

@software{joyfuljay2025,
  title = {{JoyfulJay}: Encrypted Traffic Feature Extraction Library},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/cenab/joyfuljay}
}

License

MIT License - see LICENSE for details.
