JoyfulJay - Encrypted Traffic Feature Extraction

JoyfulJay is a Python library for extracting standardized, ML-ready features from encrypted network traffic. It operates on PCAP files and live network interfaces, producing feature vectors that capture timing, size, and protocol metadata patterns - all without decrypting any traffic.

Features

  • Encrypted Traffic Focus: Extract features proven effective for classifying TLS, QUIC, VPN, and Tor traffic
  • ML-Ready Output: Pandas DataFrames, NumPy arrays, CSV, JSON, or Parquet - ready for scikit-learn, PyTorch, etc.
  • Streaming Architecture: Process multi-GB PCAPs without loading them into memory
  • Live Capture: Real-time feature extraction from network interfaces
  • Remote Capture: Stream packets from remote devices over secure WebSocket (TLS/WSS)
  • Protocol Metadata: TLS handshake parsing, JA3/JA3S fingerprints, QUIC metadata
  • Traffic Fingerprinting: Detect Tor, VPN, and DoH traffic patterns
  • Tranalyzer Compatible: 387 features across 21 extractors, matching research-grade tools
  • Enterprise Ready: Kafka streaming, Prometheus metrics, mDNS discovery
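
To make the JA3 item above concrete, here is a minimal sketch of how a JA3 fingerprint is conventionally derived from ClientHello fields: the five fields are written as decimal values, joined with `-` inside a field and `,` between fields, and the result is MD5-hashed. This illustrates the public JA3 scheme and is not JoyfulJay's own code; the function names `ja3_string` and `ja3_hash` are hypothetical.

```python
import hashlib

# GREASE placeholder values (RFC 8701): 0x0A0A, 0x1A1A, ..., 0xFAFA.
# JA3 ignores them so that randomised placeholders do not change the hash.
GREASE = {(b << 8) | b for b in range(0x0A, 0x100, 0x10)}


def ja3_string(version, ciphers, extensions, groups, point_formats):
    """Build the JA3 input string from ClientHello fields given as integers."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join([str(version), join(ciphers), join(extensions),
                     join(groups), join(point_formats)])


def ja3_hash(version, ciphers, extensions, groups, point_formats):
    """Return the 32-character hex MD5 digest of the JA3 string."""
    text = ja3_string(version, ciphers, extensions, groups, point_formats)
    # MD5 serves as a fingerprint here, not as a security primitive.
    return hashlib.md5(text.encode("ascii"), usedforsecurity=False).hexdigest()


print(ja3_string(769, [0x0A0A, 47, 53], [0, 10, 11], [23, 24], [0]))
```

Because GREASE values are dropped before hashing, two ClientHellos that differ only in their GREASE placeholders produce the same fingerprint.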

Installation

pip install joyfuljay
# or
uv pip install joyfuljay

For optional features (the same syntax works with uv pip). The extras are quoted so that shells such as zsh do not treat the square brackets as a glob pattern:

# Fast parsing with dpkt
pip install "joyfuljay[fast]"

# High-speed capture with libpcap
pip install "joyfuljay[libpcap]"

# Kafka streaming output
pip install "joyfuljay[kafka]"

# Prometheus metrics
pip install "joyfuljay[monitoring]"

# mDNS server discovery
pip install "joyfuljay[discovery]"

# Connection graph analysis
pip install "joyfuljay[graphs]"

# All optional features
pip install "joyfuljay[fast,kafka,monitoring,discovery,graphs]"

Quick Start

Python API

from joyfuljay import extract_features_from_pcap

# Extract features from a PCAP file
features_df = extract_features_from_pcap("capture.pcap")

print(features_df.shape)
print(features_df.columns.tolist())
print(features_df.head())

Command Line

# Extract features to CSV
jj extract capture.pcap -o features.csv

# Live capture for 60 seconds
jj live eth0 --duration 60 -o live_features.csv

# Output as JSON
jj extract capture.pcap -o features.json --format json

Feature Groups

Group              Features
Flow Metadata      5-tuple, duration, packet/byte counts
Timing             Inter-arrival time statistics, burst metrics
Size               Packet length statistics, payload bytes
TLS                Version, cipher suite, SNI, JA3/JA3S fingerprints
QUIC               Version, ALPN, connection IDs
Padding            Fixed-size detection, constant-rate detection
Fingerprint        Tor/VPN/DoH classification
TCP Analysis       Flags, handshake, sequence/window analysis
MAC/Layer 2        Source/dest MAC, VLAN, Ethernet type
ICMP               Type/code, echo success ratio
Connection Graphs  Fan-out, communities, centrality (requires [graphs])
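
As an illustration of what the Timing group measures, the sketch below computes inter-arrival time statistics and a simple burst count from one flow's packet timestamps. It is a minimal example under stated assumptions, not JoyfulJay's implementation: the function name, the output keys, and the 50 ms `burst_gap` threshold are all hypothetical.

```python
from statistics import mean, pstdev


def inter_arrival_stats(timestamps, burst_gap=0.05):
    """Summarise packet timing for one flow.

    timestamps: packet arrival times in seconds, in capture order.
    burst_gap:  illustrative threshold in seconds; a longer gap starts a new burst.
    """
    # Inter-arrival time = difference between consecutive packet timestamps.
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if not gaps:
        # Zero or one packet: there are no gaps to summarise.
        return {"iat_mean": 0.0, "iat_std": 0.0, "iat_min": 0.0,
                "iat_max": 0.0, "burst_count": len(timestamps)}
    return {
        "iat_mean": mean(gaps),
        "iat_std": pstdev(gaps),
        "iat_min": min(gaps),
        "iat_max": max(gaps),
        # Every gap above the threshold closes one burst and opens the next.
        "burst_count": 1 + sum(1 for gap in gaps if gap > burst_gap),
    }


stats = inter_arrival_stats([0.00, 0.01, 0.02, 1.00, 1.01])
print(stats)
```

None of these values depend on payload contents, which is why timing features remain available when the traffic is encrypted.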

Remote Capture

Stream packets from a remote device (e.g., Android phone, Raspberry Pi) to your analysis machine:

# On the capture device - start server with TLS
jj serve wlan0 --tls-cert server.crt --tls-key server.key --announce

# On your machine - discover and connect
jj discover                    # Find servers on LAN
jj connect "jj://192.168.1.50:8765?token=xxx&tls=1" -o features.csv
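
The connection URL carries the host, port, token, and TLS flag in one string. Because it contains `?` and `&`, it should be quoted in most shells; otherwise `&` ends the command early and runs it in the background. The sketch below shows how such a URL breaks down using only the standard library. It is not JoyfulJay's own parser: `parse_connect_url` is a hypothetical helper, and reading `tls=1` as a boolean is an assumption.

```python
from urllib.parse import parse_qs, urlsplit


def parse_connect_url(url):
    """Split a jj:// style URL into its connection parameters."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # parse_qs returns a list per key; take the first value if present.
        "token": query.get("token", [None])[0],
        # Assumption: tls=1 means "use TLS"; anything else means plain.
        "tls": query.get("tls", ["0"])[0] == "1",
    }


print(parse_connect_url("jj://192.168.1.50:8765?token=xxx&tls=1"))
```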

Kafka Streaming

Stream features directly to Kafka for real-time pipelines:

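# Assumption: extract_features_streaming is exported from the top-level
# package, like extract_features_from_pcap; adjust the import if it is not.
from joyfuljay import extract_features_streaming
from joyfuljay.output.kafka import KafkaWriter

with KafkaWriter("localhost:9092", topic="network-features") as writer:
    for features in extract_features_streaming("capture.pcap"):
        writer.write(features)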
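
When no Kafka broker is available, the same write-inside-`with` pattern can be exercised against a local stand-in. The sketch below is one way to do that and is not part of JoyfulJay: `JsonLinesWriter` and `fake_flows` are hypothetical names, and it assumes each streamed record is a JSON-serializable dict.

```python
import io
import json


class JsonLinesWriter:
    """Hypothetical stand-in: writes one JSON object per line to a text stream."""

    def __init__(self, stream):
        self._stream = stream
        self.count = 0

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self._stream.flush()
        return False  # do not swallow exceptions raised inside the block

    def write(self, features):
        self._stream.write(json.dumps(features, sort_keys=True) + "\n")
        self.count += 1


def fake_flows():
    """Placeholder generator standing in for a streaming extractor."""
    yield {"src_port": 443, "packets": 12}
    yield {"src_port": 53, "packets": 2}


buffer = io.StringIO()
with JsonLinesWriter(buffer) as writer:
    for features in fake_flows():
        writer.write(features)

print(buffer.getvalue())
```

Records are written one at a time as the generator yields them, so memory use stays flat regardless of how many flows the source produces.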

Prometheus Metrics

Export processing metrics for monitoring:

from joyfuljay.monitoring import PrometheusMetrics, start_prometheus_server

metrics = PrometheusMetrics()
start_prometheus_server(9090)  # Scrape at http://localhost:9090/metrics

Requirements

  • Python 3.10+
  • scapy >= 2.5.0
  • pandas >= 2.0.0
  • numpy >= 1.24.0

Cross-Platform Support

Feature               Linux  macOS  Windows
PCAP file processing
Live capture                        ✅ (requires Npcap)

Check your system status with:

jj status

Documentation

Full documentation: docs.joyfuljay.com

Citation

If you use JoyfulJay in academic research, please cite:

@software{joyfuljay2025,
  title = {{JoyfulJay}: Encrypted Traffic Feature Extraction Library},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/cenab/joyfuljay}
}

License

MIT License - see LICENSE for details.
