YARA generator inspired by yarGen
yarobot
yarobot is a high-performance YARA rule generator inspired by yarGen, designed to automatically create quality YARA rules from malware samples while minimizing false positives through intelligent goodware database comparison.
🚀 Features
- Automated YARA Rule Generation: Create both simple and super rules from malware samples
- Intelligent Scoring System: Advanced string scoring with goodware database comparison
- High Performance: Core engine written in Rust for maximum speed
🛠 Installation
Install from PyPI
pip install yarobot
Build Prerequisites
- Python 3.11 or higher
- Rust toolchain (for building the native extension)
Install from Source
git clone https://github.com/ogre2007/yarobot
cd yarobot
pip install .
📖 Quick Start
Create Custom Goodware Database (if needed)
py -m yarobot.database create /path/to/goodware/files --recursive
Generate Rules from Malware Samples (cli)
py -m yarobot.generate /path/to/malware/samples --output-rule-file my_rules.yar
Start as web service
py -m yarobot.app [-g <goodware dbs path>]
Then open http://localhost:5000 in a browser, or call the API directly from any client:
curl -X POST -F "files=@tests\\data\\binary" http://localhost:5000/api/analyze -F "min_score=5" -F "get_opcodes=true"
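The same request can be sketched in Python using only the standard library. The endpoint path and the form field names (files, min_score, get_opcodes) are taken from the curl call above; the helper names build_multipart and analyze are hypothetical, and since the response format is not documented here, the sketch returns the raw response text rather than assuming a schema.

```python
import uuid
import urllib.request
from pathlib import Path


def build_multipart(fields, files):
    """Encode form fields and files as a multipart/form-data body.

    fields: dict mapping field name -> string value
    files:  list of (field_name, filename, data_bytes) tuples
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    for field_name, filename, data in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field_name}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        parts.append(header + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def analyze(sample_path, base_url="http://localhost:5000",
            min_score=5, get_opcodes=True):
    """POST one sample to /api/analyze and return the raw response text."""
    path = Path(sample_path)
    body, content_type = build_multipart(
        # Field names mirror the curl example; values are sent as strings.
        {"min_score": str(min_score), "get_opcodes": str(get_opcodes).lower()},
        [("files", path.name, path.read_bytes())],
    )
    request = urllib.request.Request(
        f"{base_url}/api/analyze",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

With the web service running locally, analyze("tests/data/binary") would send the same sample and options as the curl command.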
Advanced Configuration
py -m yarobot.generate /malware/samples -g <goodware dbs path> \
--opcodes \
--recursive \
--author "My Security Team" \
--ref "Internal Investigation 2024" \
--superrule-overlap 5 \
--strings-per-rule 15
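For scripted runs, the command above can be assembled from Python. This is a minimal sketch that only uses flags shown in this README; the function names are hypothetical, and combining --output-rule-file with the advanced flags in one call is an assumption. It uses sys.executable instead of the py launcher so that it also works outside Windows.

```python
import subprocess
import sys


def build_generate_command(samples_dir, goodware_dbs, output_rule_file,
                           author, ref,
                           superrule_overlap=5, strings_per_rule=15):
    """Return the argv list for a yarobot.generate run.

    Only flags that appear in this README are used; the defaults mirror
    the values from the example command.
    """
    return [
        sys.executable, "-m", "yarobot.generate", str(samples_dir),
        "-g", str(goodware_dbs),
        "--opcodes",
        "--recursive",
        "--author", author,
        "--ref", ref,
        "--superrule-overlap", str(superrule_overlap),
        "--strings-per-rule", str(strings_per_rule),
        "--output-rule-file", str(output_rule_file),
    ]


def run_generate(**options):
    """Run yarobot.generate and raise CalledProcessError on a non-zero exit."""
    command = build_generate_command(**options)
    return subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with values such as the author name, which contains spaces.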
Database Management
# Update existing database with new goodware samples
(TODO) py -m yarobot.database update /path/to/new/goodware --identifier corporate
# Create new database from scratch
py -m yarobot.database create /path/to/goodware --opcodes
🔧 Configuration Options
Rule Generation Options
- --min-size, --max-size: String length boundaries
- --min-score: Minimum string score threshold
- --opcodes: Enable opcode feature for additional detection capabilities
- --superrule-overlap: Minimum overlapping strings for super rule creation
- --recursive: Scan directories recursively
- --excludegood: Force exclusion of all goodware strings
- --oe: Only executable extensions
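To illustrate what an overlap threshold like --superrule-overlap means, here is a minimal sketch of grouping samples by shared strings. It is not yarobot's actual implementation: the function name and data layout are hypothetical, and the real engine may group samples differently.

```python
from collections import defaultdict


def find_superrule_candidates(strings_by_sample, min_overlap=5):
    """Group samples that share at least `min_overlap` strings.

    strings_by_sample: dict mapping sample name -> iterable of strings
    Returns a dict mapping frozenset(sample names) -> set of shared strings.
    """
    # Invert the mapping: for every string, which samples contain it?
    samples_by_string = defaultdict(set)
    for sample, strings in strings_by_sample.items():
        for string in set(strings):
            samples_by_string[string].add(sample)

    # Collect strings under the exact set of samples that share them.
    strings_by_group = defaultdict(set)
    for string, samples in samples_by_string.items():
        if len(samples) > 1:  # a super rule needs at least two samples
            strings_by_group[frozenset(samples)].add(string)

    # Keep only the groups with enough shared strings.
    return {
        group: shared
        for group, shared in strings_by_group.items()
        if len(shared) >= min_overlap
    }
```

In this sketch, raising the threshold yields fewer but more specific groups, since samples must have more strings in common before they are combined.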
Database Options
- --identifier: Database identifier for multi-environment support
- --update: Update existing databases with new samples
- --only-executable: Only process executable file extensions
🏗 Architecture
yarobot combines the performance of Rust with the flexibility of Python:
Core Components
- Rust Engine (yarobot-rs): High-performance file processing and string analysis
- Python Interface: CLI management, database operations, and rule formatting
- Scoring Engine: Intelligent string scoring with goodware comparison
- Rule Generator: YARA rule synthesis and optimization
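The idea behind scoring with goodware comparison can be sketched as follows. This is a deliberately simple, hypothetical heuristic and not yarobot's scoring algorithm: BASE_SCORE, the linear penalty, and the function name are assumptions made for illustration, while min_score, strings_per_rule, and excludegood mirror the CLI options of the same names.

```python
BASE_SCORE = 10  # hypothetical starting score for every candidate string


def select_strings(candidates, goodware_counts, min_score=5,
                   strings_per_rule=15, excludegood=False):
    """Score candidate strings against goodware counts and pick the best.

    candidates:      iterable of strings extracted from a malware sample
    goodware_counts: dict mapping string -> occurrences in goodware
    Returns a list of (score, string) pairs, best first.
    """
    scored = []
    for string in candidates:
        seen_in_goodware = goodware_counts.get(string, 0)
        if seen_in_goodware and excludegood:
            continue  # drop anything that appears in goodware at all
        # Assumed penalty: one point per goodware occurrence.
        score = BASE_SCORE - seen_in_goodware
        if score >= min_score:
            scored.append((score, string))
    # Highest score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:strings_per_rule]
```

Under these assumptions, a string that is common in goodware, such as a system DLL name, falls below the threshold, while a string never seen in goodware keeps its full score.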
Database Structure
- good-strings.db: Common strings from goodware samples
- good-opcodes.db: Opcode frequency database
- good-imphashes.db: Import hash database
- good-exports.db: Export function database
🤝 Contributing
We welcome contributions!
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
TODO
- dropzone mode
- http-service
- web interface
- fix/drop imphash/exports
- default databases
- rule generation rewriting
- tokenizer code separated in different package
- dex opcode extraction
📄 License
This project is licensed under the GPLv3 License - see the LICENSE file for details.
🙏 Acknowledgments
- Based on yarGen by Florian Roth
- Built with PyO3 for Python-Rust integration
- Uses goblin for binary parsing
📞 Support
- Issues: GitHub Issues