🦈 LLMShark
Comprehensive analysis tool for LLM streaming traffic from PCAP files
LLMShark is a powerful tool for analyzing Large Language Model (LLM) streaming traffic captured in PCAP files. It provides in-depth analysis of HTTP/SSE (Server-Sent Events) streaming sessions, extracting detailed timing statistics, detecting anomalies, and generating comprehensive reports.
✨ Features
🔍 Deep Analysis
- Time to First Token (TTFT) analysis
- Inter-Token Latency (ITL) measurement and statistics
- HTTP session reconstruction from PCAP files
- SSE chunk parsing and timing analysis
- Throughput and performance metrics
🚨 Anomaly Detection
- Large timing gap detection
- Silence period identification
- Statistical outlier detection
- Pattern anomaly recognition
- Configurable thresholds
📊 Comparison & Reporting
- Multi-capture comparison analysis
- Performance ranking and scoring
- Statistical significance testing
- HTML and JSON report generation
- Interactive visualizations (optional)
🎨 Beautiful CLI
- Rich terminal interface with colors and progress bars
- Multiple output formats (console, JSON, HTML)
- Batch processing capabilities
- Verbose and quiet modes
🚀 Installation
From PyPI (Recommended)
pip install llmshark
From Source
git clone https://github.com/llmshark/llmshark.git
cd llmshark
pip install -e .
Development Installation
git clone https://github.com/llmshark/llmshark.git
cd llmshark
pip install -e ".[dev]"
With Visualization Support
pip install "llmshark[viz]"
📋 Requirements
- Python 3.10 or higher
- Wireshark PCAP files containing HTTP/SSE traffic
- Root privileges may be required for live packet capture
Dependencies
- Core: scapy, pydantic, rich, typer, numpy, pandas, scipy
- Visualization (optional): matplotlib, seaborn, plotly
- Development (optional): pytest, black, ruff, mypy
🎯 Quick Start
Basic Analysis
# Analyze a single PCAP file
llmshark analyze capture.pcap
# Analyze multiple files with detailed output
llmshark analyze *.pcap --verbose
# Save results to files
llmshark analyze capture.pcap --output-dir ./results --format all
Comparison Analysis
# Compare multiple captures
llmshark analyze session1.pcap session2.pcap --compare
# Batch process directory
llmshark batch ./pcap_files/ --output-dir ./analysis_results
Quick File Information
# Get PCAP file information without full analysis
llmshark info capture.pcap
📖 Usage Examples
Single File Analysis
llmshark analyze llm_session.pcap --output-dir ./results --format html
Multi-File Comparison
llmshark analyze before_optimization.pcap after_optimization.pcap \
--compare --output-dir ./comparison --verbose
Batch Processing
llmshark batch ./captures/ --output-dir ./analysis \
--recursive --pattern "*.pcap"
Custom Configuration
llmshark analyze capture.pcap \
--detect-anomalies \
--format json \
--output-dir ./results \
--verbose
🏗️ Architecture
LLMShark is built with modern Python practices and consists of several key components:
llmshark/
├── models.py # Pydantic data models
├── parser.py # PCAP parsing and session extraction
├── analyzer.py # Statistical analysis and anomaly detection
├── comparator.py # Multi-capture comparison logic
├── visualization.py # Charts and HTML report generation
└── cli.py # Command-line interface
Key Models
- StreamSession: Complete HTTP streaming session
- StreamChunk: Individual SSE data chunk
- TimingStats: Comprehensive timing statistics
- AnalysisResult: Complete analysis results
- ComparisonReport: Multi-capture comparison results
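To make the relationship between a streaming session and its SSE chunks concrete, here is a minimal sketch of splitting a reassembled SSE payload into per-event chunks. The `Chunk` dataclass and `parse_sse_events` function are hypothetical names chosen for illustration; they are not LLMShark's actual `StreamChunk` model or parser, whose fields are not documented here.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for an SSE data chunk (not LLMShark's StreamChunk)."""

    timestamp: float  # capture time of the packet carrying the event, in seconds
    data: str         # concatenated "data:" field values of one event


def parse_sse_events(payload, timestamp):
    """Split an SSE payload into events and collect each event's data lines."""
    chunks = []
    normalized = payload.replace("\r\n", "\n")
    # Events are separated by a blank line.
    for block in normalized.split("\n\n"):
        data_lines = []
        for line in block.split("\n"):
            # Lines starting with ":" are comments (keep-alives) and are skipped.
            if line.startswith("data:"):
                value = line[5:]
                # The SSE format strips one leading space after the colon.
                data_lines.append(value[1:] if value.startswith(" ") else value)
        if data_lines:
            chunks.append(Chunk(timestamp=timestamp, data="\n".join(data_lines)))
    return chunks
```

A real parser would also need to handle events that span several TCP segments and assign each event its own timestamp; this sketch gives every event in the payload the same one.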
📊 Analysis Metrics
Timing Metrics
- TTFT (Time to First Token): Time from request to first response chunk
- ITL (Inter-Token Latency): Time between consecutive tokens
- Mean, Median, P95, P99: Summary statistics of the latency distributions
- Throughput: Tokens per second, bytes per second
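The timing metrics above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not LLMShark's implementation: `timing_metrics` and `percentile` are hypothetical helpers, timestamps are in seconds, and each chunk is assumed to carry exactly one token, which real SSE streams do not guarantee.

```python
from statistics import mean, median


def percentile(values, pct):
    """Linear-interpolation percentile over a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (pct / 100) * (len(ordered) - 1)
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def timing_metrics(request_ts, chunk_ts):
    """Compute TTFT, ITL statistics, and throughput from timestamps in seconds.

    Assumption: one token per chunk.
    """
    if not chunk_ts:
        raise ValueError("at least one chunk timestamp is required")
    # TTFT: request sent -> first response chunk received.
    ttft_ms = (chunk_ts[0] - request_ts) * 1000
    # ITL: gaps between consecutive chunks.
    itl_ms = [(b - a) * 1000 for a, b in zip(chunk_ts, chunk_ts[1:])]
    duration_s = chunk_ts[-1] - chunk_ts[0]
    return {
        "ttft_ms": ttft_ms,
        "mean_itl_ms": mean(itl_ms) if itl_ms else None,
        "median_itl_ms": median(itl_ms) if itl_ms else None,
        "p95_itl_ms": percentile(itl_ms, 95) if itl_ms else None,
        "p99_itl_ms": percentile(itl_ms, 99) if itl_ms else None,
        # Tokens after the first, divided by the time since the first token.
        "tokens_per_second": (len(chunk_ts) - 1) / duration_s if duration_s > 0 else None,
    }
```

For example, `timing_metrics(0.0, [0.25, 0.30, 0.35, 0.45])` yields a TTFT of about 250 ms and inter-token gaps of roughly 50, 50, and 100 ms.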
Quality Metrics
- Consistency: Variance and coefficient of variation
- Reliability: Gap detection and silence periods
- Performance: Comparative scoring across sessions
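As a rough illustration of the consistency metric, the coefficient of variation (CV) is the standard deviation divided by the mean, so a lower value indicates steadier token delivery. The `consistency` helper below is hypothetical, and the choice of population (rather than sample) statistics is an assumption; the source does not say which LLMShark uses.

```python
from statistics import mean, pstdev, pvariance


def consistency(itl_ms):
    """Return variance and coefficient of variation (CV) for ITL samples."""
    if len(itl_ms) < 2:
        raise ValueError("need at least two samples")
    avg = mean(itl_ms)
    return {
        "variance": pvariance(itl_ms),               # population variance, ms^2
        "cv": pstdev(itl_ms) / avg if avg else None,  # unitless; None if mean is 0
    }
```

Because CV is unitless, it allows sessions with very different average latencies to be compared on the same scale.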
Anomaly Detection
- Large Gaps: Configurable threshold for timing gaps
- Silence Periods: Detection of inactive periods
- Statistical Outliers: Z-score based outlier detection
- Pattern Analysis: Unusual behavior identification
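The gap and Z-score checks listed above can be sketched as follows. The function names and default thresholds are illustrative assumptions, not LLMShark's API; the Z-score uses the population standard deviation.

```python
from statistics import mean, pstdev


def find_large_gaps(itl_ms, gap_threshold_ms=1000.0):
    """Return indexes of inter-token gaps longer than the threshold."""
    return [i for i, gap in enumerate(itl_ms) if gap > gap_threshold_ms]


def find_zscore_outliers(itl_ms, z_threshold=3.0):
    """Return indexes whose absolute z-score exceeds the threshold."""
    if len(itl_ms) < 2:
        return []
    avg = mean(itl_ms)
    spread = pstdev(itl_ms)
    if spread == 0:
        return []  # all samples identical, so nothing stands out
    return [i for i, gap in enumerate(itl_ms) if abs(gap - avg) / spread > z_threshold]
```

One caveat of this approach: with the population standard deviation, a single extreme value among n samples can reach a Z-score of at most sqrt(n - 1), so a threshold of 3.0 cannot flag anything in captures with ten or fewer inter-token gaps.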
🔧 Configuration
Environment Variables
export LLMSHARK_LOG_LEVEL=INFO
export LLMSHARK_OUTPUT_DIR=./results
export LLMSHARK_ANOMALY_THRESHOLD=3.0
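For scripts that wrap LLMShark, the same variables can be read from Python. The sketch below only shows how these three variables might be read and typed; `load_settings` is a hypothetical helper, and the fallback values simply repeat the examples above rather than LLMShark's actual defaults, which are not documented here.

```python
import os


def load_settings(env=None):
    """Read the LLMSHARK_* variables from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return {
        "log_level": env.get("LLMSHARK_LOG_LEVEL", "INFO"),
        "output_dir": env.get("LLMSHARK_OUTPUT_DIR", "./results"),
        # Parsed as a float; the fallback is an assumption, not a documented default.
        "anomaly_threshold": float(env.get("LLMSHARK_ANOMALY_THRESHOLD", "3.0")),
    }
```

Passing a plain dictionary instead of relying on `os.environ` keeps the helper easy to test.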
Command Line Options
llmshark analyze --help
📈 Output Formats
Console Output
Rich terminal interface with:
- Summary statistics tables
- Performance insights
- Anomaly warnings
- Recommendations
JSON Output
{
"session_count": 5,
"total_tokens_analyzed": 1250,
"overall_timing_stats": {
"ttft_ms": 245.6,
"mean_itl_ms": 67.8,
"p95_itl_ms": 124.5
},
"key_insights": [...],
"recommendations": [...]
}
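A JSON report with this shape can be consumed from Python using only the standard library. The sketch below assumes the field names shown in the example above; `summarize_report` is a hypothetical helper, and the inline sample omits the elided `key_insights` and `recommendations` lists.

```python
import json


def summarize_report(report):
    """Build a one-line summary from a parsed JSON report dictionary."""
    stats = report["overall_timing_stats"]
    return (
        f"{report['session_count']} sessions, "
        f"{report['total_tokens_analyzed']} tokens, "
        f"TTFT {stats['ttft_ms']:.1f} ms, "
        f"mean ITL {stats['mean_itl_ms']:.1f} ms, "
        f"P95 ITL {stats['p95_itl_ms']:.1f} ms"
    )


# Inline sample mirroring the documented fields; a real report would be
# loaded from the output directory with json.load().
sample = json.loads("""
{
  "session_count": 5,
  "total_tokens_analyzed": 1250,
  "overall_timing_stats": {"ttft_ms": 245.6, "mean_itl_ms": 67.8, "p95_itl_ms": 124.5}
}
""")
print(summarize_report(sample))
```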
HTML Reports
- Interactive charts and graphs
- Detailed session breakdowns
- Comparison tables
- Exportable results
🧪 Testing
Run the test suite:
# Run all tests
pytest
# Run with coverage
pytest --cov=llmshark
# Run only unit tests
pytest -m unit
# Run only integration tests
pytest -m integration
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
git clone https://github.com/llmshark/llmshark.git
cd llmshark
pip install -e ".[dev]"
pre-commit install
Code Quality
- Code Formatting: black and ruff
- Type Checking: mypy
- Testing: pytest with coverage
- Pre-commit Hooks: Automated quality checks
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Scapy: For powerful packet analysis capabilities
- Pydantic: For robust data validation and modeling
- Rich: For beautiful terminal interfaces
- Typer: For an excellent CLI framework
🐛 Bug Reports & Feature Requests
Please use the GitHub Issues page to report bugs or request features.
📊 Performance
LLMShark is designed for efficiency:
- Streaming processing for large PCAP files
- Memory-efficient chunk processing
- Parallel analysis capabilities
- Optimized for Python 3.10+ features
🔮 Roadmap
- Real-time capture analysis
- WebUI dashboard
- Plugin system for custom analyzers
- Machine learning anomaly detection
- Distributed analysis capabilities
- Integration with monitoring systems
Made with ❤️ for the LLM and networking communities