YARA generator inspired by yarGen
yarobot
yarobot is a high-performance YARA rule generator inspired by the yarGen project. It automatically creates YARA rules from malware samples and reduces false positives by comparing candidate strings against a goodware database.
✨ Features
- Automated YARA Rule Generation: Create both simple and super rules from malware samples
- Advanced Scoring System: String scoring with goodware database comparison
- High-Performance Engine: Rust-based core (stringZZ) for fast file processing
- Multiple Interfaces: CLI, Python API, and web interface
- Intelligent Filtering: Automatic exclusion of common goodware strings for your specific dataset
- Super Rules: Automatic creation of rules that match multiple related samples
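The super rule idea can be illustrated with a small sketch. This is not yarobot's actual implementation: the function name `find_super_rule_candidates`, the input shape (a mapping from sample name to the set of strings kept for it), and the pairwise comparison are assumptions made for illustration. Only the notion of a minimum number of overlapping strings comes from this README.

```python
from itertools import combinations


def find_super_rule_candidates(sample_strings, min_overlap=5):
    """Return (sample_a, sample_b, shared_strings) for pairs sharing enough strings.

    sample_strings: dict mapping a sample name to the set of strings kept for it
                    (hypothetical input shape, chosen for this example).
    min_overlap:    minimum number of shared strings for a pair to qualify.
    """
    candidates = []
    for a, b in combinations(sorted(sample_strings), 2):
        shared = sample_strings[a] & sample_strings[b]
        if len(shared) >= min_overlap:
            candidates.append((a, b, sorted(shared)))
    return candidates


# Made-up sample data, for illustration only.
samples = {
    "dropper_a.bin": {"cmd.exe /c del", "http://c2.example/gate", "mutex_42", "svch0st"},
    "dropper_b.bin": {"cmd.exe /c del", "http://c2.example/gate", "mutex_42", "other"},
    "unrelated.bin": {"hello world", "svch0st"},
}
pairs = find_super_rule_candidates(samples, min_overlap=3)
```

Here only the two dropper samples share at least three strings, so they form the single candidate pair; a real generator would likely group more than two samples at once.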
🏗️ Architecture
flowchart TD
A[CLI] --> D
B[Web Upload] --> D
C[API Call] --> D
D[Token extraction] --> E[Scoring]
F[Goodware DB] --> E
E --> G[YARA Generator]
G --> H[Rule file]
G --> I[Web Display]
G --> J[API JSON]
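The three central stages of the flowchart (token extraction, scoring against the goodware database, YARA generation) can be sketched in a few lines of Python. Everything here is a simplified assumption rather than yarobot's real code: the function names, the 6-character minimum, the linear scoring formula, and the in-memory `Counter` standing in for the goodware database are all hypothetical.

```python
import re
from collections import Counter

# Printable ASCII runs of at least 6 characters (threshold chosen for the example).
ASCII_RUN = re.compile(rb"[\x20-\x7e]{6,}")


def extract_tokens(data):
    """Token extraction: collect distinct printable ASCII strings from raw bytes."""
    return {m.group().decode("ascii") for m in ASCII_RUN.finditer(data)}


def score_tokens(tokens, goodware_counts, base_score=10):
    """Scoring: penalise each token by how often it occurs in goodware."""
    return {t: base_score - goodware_counts.get(t, 0) for t in tokens}


def generate_rule(name, scores, min_score=5, max_strings=15):
    """YARA generation: keep the best-scoring tokens and emit rule text."""
    kept = sorted((t for t, s in scores.items() if s >= min_score),
                  key=lambda t: (-scores[t], t))[:max_strings]
    if not kept:
        raise ValueError("no string reached the minimum score")
    lines = [f"rule {name} {{", "    strings:"]
    for i, token in enumerate(kept, start=1):
        escaped = token.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{i} = "{escaped}" ascii')
    lines += ["    condition:", "        all of them", "}"]
    return "\n".join(lines)


# Stand-in goodware database: string -> number of goodware files containing it.
goodware = Counter({"KERNEL32.dll": 100})
sample = b"\x00\x01KERNEL32.dll\x00evil-beacon-id\x00\xff"
rule = generate_rule("sample_rule", score_tokens(extract_tokens(sample), goodware))
```

The common import name is scored far below the threshold and dropped, while the sample-specific string survives into the rule. The actual scoring and rule conditions in yarobot are presumably richer than this.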
🛠 Installation
1. Install from PyPI
pip install yarobot
2. Install from Source
# Clone repository
git clone https://github.com/ogre2007/yarobot
cd yarobot
# Install in development mode
pip install -e .
# Or install with all dependencies
pip install ".[dev]"
📖 Quick Start
1. First-Time Setup (optional but recommended)
# Create a goodware database
mkdir -p ./dbs
py -m yarobot.database create /path/to/goodware/files --recursive --opcodes
# The database will be saved in ./dbs/
2. Generate Your First Rules
# Basic rule generation
py -m yarobot.generate /path/to/malware/samples \
--output-rule-file my_rules.yar \
--author "Your Name" \
--ref "Case-001"
3. Launch Web Interface
# Start with your database
py -m yarobot.app -g ./dbs
# Access at http://localhost:5000
Then open http://localhost:5000 in a browser, or call the API directly:
curl -X POST -F "files=@tests\\data\\binary" http://localhost:5000/api/analyze -F "min_score=5" -F "get_opcodes=true"
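The same request can be built from Python using only the standard library. This is a sketch under assumptions: the helper name `build_analyze_request` is hypothetical, the endpoint and form fields are copied from the curl call, and the response format is not documented in this README, so the commented-out part simply prints the raw body.

```python
import uuid
import urllib.request


def build_analyze_request(file_name, file_bytes, base_url="http://localhost:5000",
                          min_score=5, get_opcodes=True):
    """Build (but do not send) a multipart POST with the same fields as the curl call."""
    boundary = uuid.uuid4().hex
    fields = {"min_score": str(min_score), "get_opcodes": str(get_opcodes).lower()}
    parts = []
    # Plain form fields.
    for key, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{key}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # The uploaded file, sent under the "files" field.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="files"; '
        f'filename="{file_name}"\r\nContent-Type: application/octet-stream'
        f"\r\n\r\n".encode() + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{base_url}/api/analyze",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


request = build_analyze_request("binary", b"MZ\x90\x00")
# To send it (requires the yarobot web app to be running):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```

Building the request separately from sending it keeps the example usable without a running server.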
4. Advanced Configuration
py -m yarobot.generate /malware/samples -g <goodware dbs path> \
--opcodes \
--recursive \
--author "My Security Team" \
--ref "Internal Investigation 2024" \
--superrule-overlap 5 \
--strings-per-rule 15
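For batch jobs it can be convenient to assemble this command from a script. The helper below is a hypothetical wrapper, not part of yarobot; it only uses flags that appear in this README, and it assumes they can be combined in a single invocation.

```python
import sys


def build_generate_command(samples_dir, goodware_dbs, output_rule_file,
                           author, ref, superrule_overlap=5, strings_per_rule=15,
                           opcodes=True, recursive=True):
    """Return the argv list for `python -m yarobot.generate` with the given options."""
    cmd = [sys.executable, "-m", "yarobot.generate", str(samples_dir),
           "-g", str(goodware_dbs),
           "--output-rule-file", str(output_rule_file),
           "--author", author,
           "--ref", ref,
           "--superrule-overlap", str(superrule_overlap),
           "--strings-per-rule", str(strings_per_rule)]
    if opcodes:
        cmd.append("--opcodes")
    if recursive:
        cmd.append("--recursive")
    return cmd


cmd = build_generate_command("/malware/samples", "./dbs", "my_rules.yar",
                             "My Security Team", "Internal Investigation 2024")
# To run it (requires yarobot to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with values such as the author name.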
5. Database Management
# (TODO) Update an existing database with new goodware samples
# py -m yarobot.database update /path/to/new/goodware --identifier corporate
# Create new database from scratch
py -m yarobot.database create /path/to/goodware --opcodes
🔧 Configuration Options
Rule Generation Options
- --min-size, --max-size: String length boundaries
- --min-score: Minimum string score threshold
- --opcodes: Enable opcode feature for additional detection capabilities
- --superrule-overlap: Minimum overlapping strings for super rule creation
- --recursive: Scan directories recursively
- --excludegood: Force exclusion of all goodware strings
- --oe: Only process executable file extensions
Database Options
- --identifier: Database identifier for multi-environment support
- --update: Update existing databases with new samples
- --only-executable: Only process executable file extensions
🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
TODOs
- Global project refactoring & packaging
- Token extraction rewritten in Rust
- Tests & CI/CD pipeline
- Multiplatform PyPI release
- HTTP service with web UI
- Store regex patterns in configuration
- Wide/ASCII token merging
- Token deduplication
- Fix/improve imphash/exports handling
- Include default databases
- Rule generation improvements
- Separate token extraction to stringZZ package
- Regexp generation
- LLM Scoring support
📄 License
This project is licensed under the GPLv3 License - see the LICENSE file for details.
🙏 Credits
- yarGen by Florian Roth (initial idea and implementation)
- PyO3 for Python-Rust integration
- goblin for binary parsing
📞 Support
- Issues: GitHub Issues
File details
Details for the file yarobot-0.5.0.tar.gz.
File metadata
- Download URL: yarobot-0.5.0.tar.gz
- Upload date:
- Size: 996.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.16 {"installer":{"name":"uv","version":"0.9.16","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 216892f05a60a2bbd4b91320c2eb064027b2e8f49d07e0fa8f5b67f0bb193968 |
| MD5 | e6980411fbb2abffa26c323c4e4a6a67 |
| BLAKE2b-256 | 80485db4ee308293f26eeaf6200a6ac73e1f74b8927820a04bd9f2e5322e5496 |
File details
Details for the file yarobot-0.5.0-py3-none-any.whl.
File metadata
- Download URL: yarobot-0.5.0-py3-none-any.whl
- Upload date:
- Size: 67.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.16 {"installer":{"name":"uv","version":"0.9.16","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 270b0d31aab7d48ad1dd16f427a6e9b0981dfd1b26790af9f8da7e3321c88939 |
| MD5 | 7feb8e37969c9e9c120117dc48e90976 |
| BLAKE2b-256 | 3aef6b2c8ac79a482b8b9c63c909fbaba09b2cefa536b260209ea419d2583076 |