Python library for extracting ML-ready features from encrypted network traffic
JoyfulJay is a Python library for extracting standardized, ML-ready features from encrypted network traffic. It operates on PCAP files and live network interfaces, producing feature vectors that capture timing, size, and protocol metadata patterns - all without decrypting any traffic.
Features
- Encrypted Traffic Focus: Extract features proven effective for classifying TLS, QUIC, VPN, and Tor traffic
- ML-Ready Output: Pandas DataFrames, NumPy arrays, CSV, JSON, or Parquet - ready for scikit-learn, PyTorch, etc.
- Streaming Architecture: Process multi-GB PCAPs without loading them into memory
- Live Capture: Real-time feature extraction from network interfaces
- Remote Capture: Stream packets from remote devices over secure WebSocket (TLS/WSS)
- Protocol Metadata: TLS handshake parsing, JA3/JA3S fingerprints, QUIC metadata
- Traffic Fingerprinting: Detect Tor, VPN, and DoH traffic patterns
- Tranalyzer Compatible: 387 features across 21 extractors, matching research-grade tools
- Enterprise Ready: Kafka streaming, Prometheus metrics, mDNS discovery
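As background for the JA3/JA3S fingerprints listed above: the publicly documented JA3 method joins five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats) into one string and takes its MD5 digest. The sketch below illustrates that construction only; it is not JoyfulJay's implementation, the function names `ja3_string` and `ja3_hash` are hypothetical, the input values are made up, and the GREASE filtering that real implementations perform is omitted.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input: five comma-separated fields, with the
    decimal values inside each field joined by dashes."""
    fields = [
        str(version),
        "-".join(str(c) for c in ciphers),
        "-".join(str(e) for e in extensions),
        "-".join(str(g) for g in curves),
        "-".join(str(p) for p in point_formats),
    ]
    return ",".join(fields)


def ja3_hash(ja3: str) -> str:
    """Return the MD5 hex digest of a JA3 string (a fingerprint, not a security hash)."""
    return hashlib.md5(ja3.encode("ascii"), usedforsecurity=False).hexdigest()


# Hypothetical ClientHello values, for illustration only.
example = ja3_string(
    version=771,
    ciphers=[4865, 4866, 4867],
    extensions=[0, 10, 11, 13],
    curves=[29, 23, 24],
    point_formats=[0],
)
fingerprint = ja3_hash(example)
print(example)
print(fingerprint)
```

Because the digest depends only on values sent in cleartext during the handshake, it can be computed without decrypting any payload.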
Installation
pip install joyfuljay
# or
uv pip install joyfuljay
For optional features (the same syntax works with uv pip; the quotes stop shells such as zsh from treating the brackets as a glob pattern):
# Fast parsing with dpkt
pip install "joyfuljay[fast]"
# High-speed capture with libpcap
pip install "joyfuljay[libpcap]"
# Kafka streaming output
pip install "joyfuljay[kafka]"
# Prometheus metrics
pip install "joyfuljay[monitoring]"
# mDNS server discovery
pip install "joyfuljay[discovery]"
# Connection graph analysis
pip install "joyfuljay[graphs]"
# All optional features
pip install "joyfuljay[fast,kafka,monitoring,discovery,graphs]"
Quick Start
Python API
from joyfuljay import extract_features_from_pcap
# Extract features from a PCAP file
features_df = extract_features_from_pcap("capture.pcap")
print(features_df.shape)
print(features_df.columns.tolist())
print(features_df.head())
Command Line
# Extract features to CSV
jj extract capture.pcap -o features.csv
# Live capture for 60 seconds
jj live eth0 --duration 60 -o live_features.csv
# Output as JSON
jj extract capture.pcap -o features.json --format json
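A CSV file written by the commands above can also be read without pandas. The sketch below uses only the standard library and an inline stand-in for `features.csv`; the column names (`src_ip`, `dst_ip`, `duration`, `total_packets`) and the helper `load_numeric_features` are hypothetical, so take the real column names from the header row of your own output.

```python
import csv
import io

# Stand-in for the contents of features.csv. These column names are
# hypothetical and only illustrate a one-flow-per-row layout.
sample_csv = """src_ip,dst_ip,duration,total_packets
10.0.0.1,10.0.0.2,1.5,12
10.0.0.3,10.0.0.4,0.25,4
"""


def load_numeric_features(handle, numeric_columns):
    """Read flow rows and convert the selected columns to floats."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append([float(record[name]) for name in numeric_columns])
    return rows


matrix = load_numeric_features(io.StringIO(sample_csv), ["duration", "total_packets"])
print(matrix)
```

For a real file, pass `open("features.csv", newline="")` instead of the `io.StringIO` object.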
Feature Groups
| Group | Features |
|---|---|
| Flow Metadata | 5-tuple, duration, packet/byte counts |
| Timing | Inter-arrival time statistics, burst metrics |
| Size | Packet length statistics, payload bytes |
| TLS | Version, cipher suite, SNI, JA3/JA3S fingerprints |
| QUIC | Version, ALPN, connection IDs |
| Padding | Fixed-size detection, constant-rate detection |
| Fingerprint | Tor/VPN/DoH classification |
| TCP Analysis | Flags, handshake, sequence/window analysis |
| MAC/Layer 2 | Source/dest MAC, VLAN, Ethernet type |
| ICMP | Type/code, echo success ratio |
| Connection Graphs | Fan-out, communities, centrality (requires [graphs]) |
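To make the Flow Metadata, Timing, and Size rows concrete, here is a minimal sketch of how such per-flow statistics can be derived from packet headers alone. It is not JoyfulJay's code: the packet tuples, the function `flow_features`, and the output key names are assumptions for illustration, and flows are treated as unidirectional for brevity.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Each packet: (timestamp_s, src_ip, src_port, dst_ip, dst_port, protocol, length_bytes)
packets = [
    (0.00, "10.0.0.1", 50000, "10.0.0.2", 443, "TCP", 100),
    (0.10, "10.0.0.1", 50000, "10.0.0.2", 443, "TCP", 300),
    (0.30, "10.0.0.1", 50000, "10.0.0.2", 443, "TCP", 200),
    (0.05, "10.0.0.3", 50001, "10.0.0.2", 443, "TCP", 60),
]


def flow_features(packets):
    """Group packets by 5-tuple and compute timing and size statistics."""
    flows = defaultdict(list)
    for ts, *five_tuple, length in packets:
        flows[tuple(five_tuple)].append((ts, length))

    features = {}
    for key, items in flows.items():
        items.sort()  # order packets by timestamp
        times = [t for t, _ in items]
        sizes = [s for _, s in items]
        # Inter-arrival times: gaps between consecutive packets in the flow.
        iats = [later - earlier for earlier, later in zip(times, times[1:])]
        features[key] = {
            "duration": times[-1] - times[0],
            "packet_count": len(items),
            "byte_count": sum(sizes),
            "iat_mean": mean(iats) if iats else 0.0,
            "iat_std": pstdev(iats) if iats else 0.0,
            "size_mean": mean(sizes),
            "size_max": max(sizes),
        }
    return features


for key, stats in flow_features(packets).items():
    print(key, stats)
```

None of these values require payload access, which is why they remain available for encrypted traffic.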
Remote Capture
Stream packets from a remote device (e.g., Android phone, Raspberry Pi) to your analysis machine:
# On the capture device - start server with TLS
jj serve wlan0 --tls-cert server.crt --tls-key server.key --announce
# On your machine - discover and connect
jj discover # Find servers on LAN
jj connect "jj://192.168.1.50:8765?token=xxx&tls=1" -o features.csv
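The connection URL in the example above carries the host, port, token, and TLS flag in one string. The sketch below splits such a URL with the standard library to show what each part holds. It is not JoyfulJay's parser: the function `parse_connection_url` is hypothetical, and reading `tls=1` as "TLS enabled" is an assumption based on the example. Note that `?` and `&` are shell metacharacters, so quoting the URL on the command line is prudent.

```python
from urllib.parse import parse_qs, urlsplit


def parse_connection_url(url: str) -> dict:
    """Split a jj:// style URL into its components."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # parse_qs returns a list per key; take the first value if present.
        "token": query.get("token", [None])[0],
        # Assumption: "1" means TLS is requested, anything else means plain.
        "tls": query.get("tls", ["0"])[0] == "1",
    }


info = parse_connection_url("jj://192.168.1.50:8765?token=xxx&tls=1")
print(info)
```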
Kafka Streaming
Stream features directly to Kafka for real-time pipelines:
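# Assumed import path: the original snippet uses this function without importing it
from joyfuljay import extract_features_streaming
from joyfuljay.output.kafka import KafkaWriter

# Write each extracted feature record to the "network-features" topic
with KafkaWriter("localhost:9092", topic="network-features") as writer:
    for features in extract_features_streaming("capture.pcap"):
        writer.write(features)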
Prometheus Metrics
Export processing metrics for monitoring:
from joyfuljay.monitoring import PrometheusMetrics, start_prometheus_server
metrics = PrometheusMetrics()
start_prometheus_server(9090) # Scrape at http://localhost:9090/metrics
Requirements
- Python 3.10+
- scapy >= 2.5.0
- pandas >= 2.0.0
- numpy >= 1.24.0
Cross-Platform Support
| Feature | Linux | macOS | Windows |
|---|---|---|---|
| PCAP file processing | ✅ | ✅ | ✅ |
| Live capture | ✅ | ✅ | ✅ (requires Npcap) |
Check your system status with:
jj status
Documentation
Full documentation: docs.joyfuljay.com
Citation
If you use JoyfulJay in academic research, please cite:
@software{joyfuljay2025,
title = {{JoyfulJay}: Encrypted Traffic Feature Extraction Library},
year = {2025},
publisher = {GitHub},
url = {https://github.com/cenab/joyfuljay}
}
License
MIT License - see LICENSE for details.
File details
Details for the file joyfuljay-0.1.5.tar.gz.
File metadata
- Download URL: joyfuljay-0.1.5.tar.gz
- Upload date:
- Size: 287.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 33779e418870a4408e1758567b7e0c02628da36f0f6c30811af9d961eb5d949c |
| MD5 | c7074053e6ed22af9d71e1afe297b07c |
| BLAKE2b-256 | ee318c0855a916af74815d03e5448a411b1be59e34bc22900e994cbbd0c3ea85 |
File details
Details for the file joyfuljay-0.1.5-py3-none-any.whl.
File metadata
- Download URL: joyfuljay-0.1.5-py3-none-any.whl
- Upload date:
- Size: 269.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ef186674a047a1bc7f4069d07946c0e5a38dde29bf2a789f6dde8e8eaf9b2f5e |
| MD5 | ad64a90e668ad65541ecff71a1c2b0b4 |
| BLAKE2b-256 | 131b8d4d4cfada73314f2bd16a804a3bd576ab47f0d4f6322d7cfdf08a566d9e |
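The published digests can be used to check a downloaded file before installing it. The following is a minimal standard-library sketch; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local file path is whatever location you saved the download to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 with a digest copied from the table above."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("joyfuljay-0.1.5.tar.gz", "33779e418870a4408e1758567b7e0c02628da36f0f6c30811af9d961eb5d949c")` should return `True` for an intact copy of the source distribution. pip can enforce the same check through its hash-checking mode (`--require-hashes` with a requirements file).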