Local self-learning RAG system for Wireshark logs
🕵️ Wireshark Local RAG Analyst
A fully local, self-learning Retrieval-Augmented Generation (RAG) pipeline for analyzing Wireshark .pcap logs using local LLMs — with built-in support for Hugging Face models, learning feedback, and REST API access.
🚀 Features
- 📁 Drop-in Folder Watcher: Automatically processes .pcap logs when dropped into a folder.
- 🔍 Protocol-Aware Preprocessing: Extracts HTTP, DNS, TCP, and other traffic.
- 🧠 Self-Learning Engine: Remembers helpful answers and improves responses over time.
- 📚 RAG Pipeline: Uses local vector database (FAISS) with sentence-transformer embeddings.
- 🤖 Local LLM Integration: Supports both:
  - 🧱 Ollama-based LLMs (e.g., LLaMA, Mistral, GPT4All)
  - 🤗 Hugging Face models (e.g., Mistral-7B, Zephyr, Falcon)
- 🌐 API Access (MCP-style): Query the system over HTTP from other clients.
- 💻 Extensible CLI: Flexible CLI and script interface for querying and development.
- 📦 PyPI-Ready: Fully packageable and publishable as a pip package.
- ✅ CI/CD Support: GitHub Actions pipelines for testing and PyPI publishing.
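To make the retrieval step of the RAG pipeline concrete, here is a minimal sketch of the idea in plain Python. It is not the project's implementation: a toy bag-of-words vector and a brute-force cosine ranking stand in for the sentence-transformer embeddings and the FAISS index named above, and the function names (`embed`, `cosine`, `retrieve`) and the sample packet summaries are hypothetical.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for a sentence-transformer: lowercase bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks most similar to the query (brute force, no index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Hypothetical packet summaries, as a preprocessing step might produce them.
chunks = [
    "http get /index.html status 200",
    "dns query example.com type a",
    "http get /missing status 404",
]
top = retrieve("http status 404", chunks, k=1)
```

In a real pipeline the retrieved chunks would be passed to the local LLM as context for the answer; a vector index such as FAISS replaces the brute-force sort once the number of chunks grows.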
📂 Project Structure
.
├── app/ # Core logic
├── scripts/ # CLI scripts
├── config/ # Config files
├── data/ # Vector DB & learned responses
├── logs/ # Input PCAPs
├── processed/ # Archived PCAPs
├── .github/workflows/ # CI/CD workflows
⚙️ Installation
- Clone the repo
git clone https://github.com/pkbythebay29/wireshark-local-rag-analyst.git
cd wireshark-local-rag-analyst
- Install Dependencies
pip install -r requirements.txt
- Configuration
Edit config/config.yaml:
protocol_filter: ["http", "dns"]
learning: true
llm_backend: "ollama" # or "huggingface"
llm_model: "llama3" # or "mistralai/Mistral-7B-Instruct-v0.1"
vector_db_path: "./data/faiss.index"
learned_store: "./data/learned_data.jsonl"
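As an illustration of how these settings could be checked before the pipeline starts, the sketch below validates an already-parsed configuration mapping (for example, the result of parsing config/config.yaml with a YAML library). The `AnalystConfig` class and `load_config` function are hypothetical, and the defaults are simply copied from the example values above; the project's own loader may work differently.

```python
from dataclasses import dataclass, field

# The two backends named in this README.
ALLOWED_BACKENDS = {"ollama", "huggingface"}

@dataclass
class AnalystConfig:
    """Hypothetical typed view of config/config.yaml."""
    protocol_filter: list = field(default_factory=lambda: ["http", "dns"])
    learning: bool = True
    llm_backend: str = "ollama"
    llm_model: str = "llama3"
    vector_db_path: str = "./data/faiss.index"
    learned_store: str = "./data/learned_data.jsonl"

    def __post_init__(self) -> None:
        if self.llm_backend not in ALLOWED_BACKENDS:
            raise ValueError(f"unsupported llm_backend: {self.llm_backend!r}")
        # Normalize protocol names so "HTTP" and "http" are treated the same.
        self.protocol_filter = [p.lower() for p in self.protocol_filter]

def load_config(raw: dict) -> AnalystConfig:
    """Build a config from a parsed mapping, rejecting unknown keys."""
    unknown = set(raw) - set(AnalystConfig.__dataclass_fields__)
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    return AnalystConfig(**raw)

cfg = load_config({
    "llm_backend": "huggingface",
    "llm_model": "mistralai/Mistral-7B-Instruct-v0.1",
    "protocol_filter": ["HTTP", "DNS"],
})
```

Rejecting unknown keys is a design choice that catches typos such as `llm_backed` early instead of silently falling back to a default.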
- Usage
wireshark-watch
or
python scripts/run_pipeline.py
Drop .pcap files into the logs/ folder — they will be processed automatically.
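The drop-in behaviour can be pictured with the following sketch, which scans a folder for .pcap files, hands each one to a callback, and then moves it to an archive folder. This is an assumption-laden illustration using only the standard library: the project lists Watchdog among its dependencies and may react to file-system events rather than scanning, and `process_new_pcaps` and its `handle` callback are hypothetical names.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable

def process_new_pcaps(logs_dir: Path, processed_dir: Path,
                      handle: Callable[[Path], None]) -> list:
    """Run `handle` on every .pcap in logs_dir, then archive it in processed_dir."""
    processed_dir.mkdir(parents=True, exist_ok=True)
    done = []
    for pcap in sorted(logs_dir.glob("*.pcap")):
        handle(pcap)  # e.g. extract packets and add them to the index
        shutil.move(str(pcap), str(processed_dir / pcap.name))
        done.append(pcap.name)
    return done

# Demonstration in a throwaway directory so no real captures are touched.
with tempfile.TemporaryDirectory() as tmp:
    logs, processed = Path(tmp) / "logs", Path(tmp) / "processed"
    logs.mkdir()
    (logs / "capture.pcap").write_bytes(b"")
    (logs / "notes.txt").write_text("ignored")
    seen = []
    archived = process_new_pcaps(logs, processed, lambda p: seen.append(p.name))
    remaining = sorted(p.name for p in logs.iterdir())
    moved = sorted(p.name for p in processed.iterdir())
```

Calling such a function in a loop with a short sleep gives a simple polling watcher; an event-based watcher avoids the polling delay at the cost of an extra dependency.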
- Ask questions
wireshark-query
or
python scripts/query_logs.py
Example questions:
- What HTTP requests failed with 404?
- Show DNS queries to suspicious domains.
- Were there any TCP handshakes that failed?
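When learning is enabled, answers that proved helpful are kept in the learned store (a .jsonl file in the example configuration). The README does not document that file's layout, so the sketch below is only one plausible shape: each line is assumed to be a JSON object with `question` and `answer` fields, and `remember` and `recall` are hypothetical helper names.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

def remember(store: Path, question: str, answer: str) -> None:
    """Append one helpful question/answer pair as a JSON line."""
    store.parent.mkdir(parents=True, exist_ok=True)
    with store.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"question": question, "answer": answer}) + "\n")

def recall(store: Path, question: str) -> Optional[str]:
    """Return the latest stored answer whose question matches, ignoring case."""
    if not store.exists():
        return None
    wanted = question.strip().lower()
    found = None
    with store.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if record["question"].strip().lower() == wanted:
                found = record["answer"]  # later lines override earlier ones
    return found

# Demonstration against a temporary file instead of ./data/learned_data.jsonl.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "data" / "learned_data.jsonl"
    miss = recall(store, "What HTTP requests failed with 404?")
    remember(store, "What HTTP requests failed with 404?", "GET /missing")
    remember(store, "What HTTP requests failed with 404?", "GET /missing, GET /old")
    hit = recall(store, "what http requests failed with 404?")
```

Exact matching is the simplest possible lookup; a real system would more likely compare embeddings so that rephrased questions also reuse earlier answers.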
- Use as a REST API (MCP server)
python -m app.mcp_server
curl -X POST http://localhost:8080/query \
-H "Content-Type: application/json" \
-d '{"query": "Show me failed DNS requests"}'
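The same request can be issued from Python with only the standard library. This sketch mirrors the curl call above (URL, header, and JSON body are taken from it); the function names are hypothetical, and because the response format is not documented here, the body is returned as raw text rather than parsed.

```python
import json
import urllib.request

def build_query_request(question: str,
                        url: str = "http://localhost:8080/query") -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )

def ask(question: str, timeout: float = 30.0) -> str:
    """Send the query to a running server and return the raw response text."""
    with urllib.request.urlopen(build_query_request(question), timeout=timeout) as resp:
        return resp.read().decode("utf-8")

# Building the request does not need a running server; calling ask() does.
req = build_query_request("Show me failed DNS requests")
```

With the server started via `python -m app.mcp_server`, `print(ask("Show me failed DNS requests"))` would print whatever the endpoint returns.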
- 🤗 Hugging Face Model Support
To use a Hugging Face model, including one you have forked or fine-tuned, set llm_backend to "huggingface" and llm_model to the model ID in config/config.yaml.
- 🙏 Acknowledgements
This project stands on the shoulders of open-source giants:
- Wireshark/tshark – For deep packet inspection
- FAISS – Vector search from Meta
- Sentence Transformers – Fast embeddings from Hugging Face
- Ollama – Local model hosting made easy
- Transformers – Hugging Face's interface to modern LLMs
- Watchdog, FastAPI, Uvicorn, and many more
Thank you to all the contributors making open-source incredible.
- 🗺️ Roadmap
See ROADMAP.md for planned features and timeline.
- 📜 License
MIT
File details
Details for the file wireshark_rag_analyst-0.1.0.tar.gz.
File metadata
- Download URL: wireshark_rag_analyst-0.1.0.tar.gz
- Upload date:
- Size: 8.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 607dba8d902b94840d6c4494676c02ca3bd781c90a620f4376d6575adfd6e872 |
| MD5 | 14adb5e318faf582c5f680b36a8170d1 |
| BLAKE2b-256 | 806daca48f7caf9b228adb48207f798deae6ab7abf9151db8aaa6237bb2769f1 |
File details
Details for the file wireshark_rag_analyst-0.1.0-py3-none-any.whl.
File metadata
- Download URL: wireshark_rag_analyst-0.1.0-py3-none-any.whl
- Upload date:
- Size: 8.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dc20bcdf8b69ef4c4afaaa242542985c07d92a0cc7df797f7bb8b03f6c7b7342 |
| MD5 | 31cfbe6b625e4eb18f2e562298f19d11 |
| BLAKE2b-256 | 458d9359de0ad3806f50e1464f8fcd0615a852f991594f91188d13ef18deeb6d |