A Python library for parsing and manipulating YARA rules using Abstract Syntax Trees

YARAAST - YARA Abstract Syntax Tree

A powerful Python library and CLI tool for parsing, analyzing, and manipulating YARA rules through Abstract Syntax Tree (AST) representations.

Author: Marc Rivero | @seifreed
Email: mriverolopez@gmail.com
GitHub: https://github.com/seifreed/yaraast

Features

  • Parse YARA rules into a structured AST with multiple output formats
  • Analyze rules for optimization opportunities and best practices
  • Format and prettify YARA files with customizable styles
  • Validate syntax and semantic correctness
  • Generate comprehensive metrics and visualizations (complexity, strings, dependencies)
  • Support for large rulesets with thousands of rules
  • Extensible visitor pattern for custom analysis
  • Performance benchmarking and streaming for huge files
  • AST-based diff comparison between YARA files
  • LibYARA integration for compilation and scanning
  • Fluent API for programmatic rule construction
  • Roundtrip testing for serialization fidelity
  • Multi-file workspace analysis with dependency resolution
  • Export/import AST in JSON/YAML/Protobuf formats

Installation

pip install yaraast

From Source

git clone https://github.com/seifreed/yaraast
cd yaraast
pip install -r requirements.txt
pip install -e .

Quick Start

# Get help
yaraast --help

# Show version
yaraast --version

Command Reference

Core Commands

parse - Parse and Output YARA Files

# Parse and output in different formats
yaraast parse rule.yar                    # Default output
yaraast parse rule.yar --format json      # JSON representation
yaraast parse rule.yar --format yaml      # YAML representation
yaraast parse rule.yar --format tree      # Tree visualization

validate - Syntax Validation

# Validate YARA file syntax
yaraast validate ruleset.yar              # Check for syntax errors
yaraast validate *.yar                    # Validate multiple files

format - Code Formatting

# Format YARA files with consistent style
yaraast format input.yar output.yar       # Format to new file
yaraast format --help                     # See formatting options

fmt - In-place Formatting (like black)

# Format YARA files in place
yaraast fmt rule.yar                      # Format with default style
yaraast fmt --style compact rule.yar      # Use compact style
yaraast fmt --style readable rule.yar     # Use readable style
yaraast fmt --check rule.yar              # Check if formatting needed
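
The `--check` flag lends itself to a pre-commit style gate. The following is a minimal Python sketch, not part of yaraast. It assumes, without confirmation from this README, that `yaraast fmt --check` exits with a non-zero status when a file would be reformatted, as black does. The helper names `check_command` and `files_needing_format` are hypothetical.

```python
import subprocess
from typing import Callable, Iterable, List, Optional


def check_command(path: str, style: Optional[str] = None) -> List[str]:
    """Build the argv for `yaraast fmt --check`, optionally with --style."""
    argv = ["yaraast", "fmt", "--check"]
    if style is not None:
        argv += ["--style", style]
    argv.append(path)
    return argv


def _run(argv: List[str]) -> int:
    # Assumption: a non-zero exit status means "would reformat" (black-like).
    return subprocess.run(argv).returncode


def files_needing_format(
    paths: Iterable[str],
    run: Callable[[List[str]], int] = _run,
) -> List[str]:
    """Return the paths for which the check command reports a non-zero status."""
    return [p for p in paths if run(check_command(p)) != 0]
```

Passing the runner as a parameter keeps the command construction testable on a machine where yaraast is not installed.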

Analysis Commands

analyze - AST-Based Analysis

# Optimization analysis
yaraast analyze optimize ruleset.yar      # Find optimization opportunities

# Best practices analysis
yaraast analyze best-practices rule.yar   # Check best practices
yaraast analyze best-practices -v rule.yar # Verbose output with suggestions

metrics - Rule Metrics and Visualization

# Complexity metrics
yaraast metrics complexity rule.yar       # Analyze rule complexity

# String analysis
yaraast metrics strings rule.yar          # Analyze string patterns

# Visualizations
yaraast metrics tree rule.yar --output tree.html    # HTML tree visualization
yaraast metrics graph rule.yar            # Generate dependency graph
yaraast metrics patterns rule.yar         # String pattern analysis
yaraast metrics report rule.yar           # Comprehensive report

semantic - Semantic Validation

# Semantic validation beyond syntax
yaraast semantic rule.yar                 # Check semantic correctness
yaraast semantic *.yar --quiet            # Check multiple files quietly
yaraast semantic rule.yar --strict        # Treat warnings as errors

Development Commands

serialize diff - Compare YARA Files

# Show differences between files
yaraast serialize diff old.yar new.yar    # AST-based diff comparison
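
An AST-based diff compares parsed structure rather than text, so changes in whitespace or comments alone do not count as differences. The sketch below illustrates the idea with rules held as plain dictionaries keyed by rule name. These dictionaries are a stand-in chosen for illustration. They are not yaraast's node types, and `diff_rules` is a hypothetical helper, not the tool's diff algorithm or output format.

```python
from typing import Any, Dict, List


def diff_rules(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, List[str]]:
    """Compare two {rule_name: rule_tree} mappings structurally."""
    old_names, new_names = set(old), set(new)
    return {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
        # A rule is "changed" when it exists on both sides but its tree differs.
        "changed": sorted(n for n in old_names & new_names if old[n] != new[n]),
    }


old = {
    "detect_a": {"strings": {"$s1": "evil"}, "condition": "$s1"},
    "detect_b": {"strings": {"$x": "foo"}, "condition": "$x"},
}
new = {
    "detect_a": {"strings": {"$s1": "evil"}, "condition": "$s1 and filesize < 1MB"},
    "detect_c": {"strings": {"$y": "bar"}, "condition": "$y"},
}

print(diff_rules(old, new))
# {'added': ['detect_c'], 'removed': ['detect_b'], 'changed': ['detect_a']}
```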

roundtrip - Serialization Testing

# Test AST serialization/deserialization
yaraast roundtrip test rule.yar           # Verify round-trip consistency
yaraast roundtrip test rule.yar -v        # Verbose output
yaraast roundtrip serialize rule.yar      # Serialize to JSON/YAML
yaraast roundtrip deserialize ast.json    # Deserialize back to YARA
yaraast roundtrip pretty rule.yar         # Pretty print with style options
yaraast roundtrip pipeline rule.yar       # CI/CD pipeline format
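
Round-trip testing checks that serializing a tree and loading it back yields an equal tree. The sketch below shows the property on a plain dictionary using the standard `json` module. The dictionary layout and the `roundtrips` helper are illustrative assumptions, not yaraast's serialized schema or API.

```python
import json
from typing import Any


def roundtrips(tree: Any) -> bool:
    """True if serializing to JSON and loading it back yields an equal tree."""
    return json.loads(json.dumps(tree)) == tree


rule_tree = {
    "type": "Rule",
    "name": "example",
    "strings": [{"id": "$a", "value": "abc"}],
    "condition": {"type": "StringIdentifier", "name": "$a"},
}

assert roundtrips(rule_tree)
# A tuple comes back from JSON as a list, so this tree does not round-trip:
assert not roundtrips({"tags": ("x", "y")})
```

The second assertion shows the kind of fidelity loss a round-trip test is meant to catch: the data survives, but its type does not.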

serialize - Import/Export AST

# Serialize AST for storage or transmission
yaraast serialize export rule.yar --format json  # Export to JSON
yaraast serialize export rule.yar --format yaml  # Export to YAML
yaraast serialize import-ast ast.json     # Import from serialized format
yaraast serialize info rule.yar           # Show AST structure info
yaraast serialize validate ast.json       # Validate serialized format

Performance Commands

performance - Large Ruleset Tools

# Performance analysis and optimization
yaraast performance stream large.yar       # Stream processing for huge files
yaraast performance optimize rules/        # Get optimization recommendations

performance-check - Performance Analysis

# Check for performance issues
yaraast performance-check rule.yar        # Analyze performance issues

bench - Benchmarking Suite

# Run benchmarks
yaraast bench rule.yar                    # Default benchmarks
yaraast bench rule.yar --operations parse # Benchmark parsing only
yaraast bench rule.yar --iterations 10    # Custom iterations
yaraast bench *.yar --compare             # Compare performance across files

Integration Commands

libyara - LibYARA Integration

# Scan with LibYARA integration
yaraast libyara scan rule.yar target      # Scan files
yaraast libyara scan rule.yar target --optimize  # Use optimized compilation
yaraast libyara scan rule.yar target --stats     # Show scan statistics

# Optimize rules for LibYARA
yaraast libyara optimize rule.yar         # Optimize and show results
yaraast libyara optimize rule.yar --show-optimizations  # Detailed view

workspace - Multi-File Analysis

# Analyze directories with multiple YARA files
yaraast workspace analyze /path/to/rules  # Analyze all files in directory
yaraast workspace graph /path/to/rules    # Generate dependency graph
yaraast workspace resolve main.yar        # Resolve all includes
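
Include resolution amounts to finding each file's `include "..."` directives and ordering files so that dependencies come first. The sketch below does this with a regular expression and the standard library's `graphlib`. It is an illustration only: yaraast presumably works on the AST, while this version scans raw text, does not skip comments, and does not resolve relative paths. `find_includes` and `load_order` are hypothetical names.

```python
import re
from graphlib import TopologicalSorter
from typing import Dict, List

# Matches YARA include directives such as: include "common.yar"
INCLUDE_RE = re.compile(r'^\s*include\s+"([^"]+)"', re.MULTILINE)


def find_includes(source: str) -> List[str]:
    """Return the paths named by YARA include directives, in order."""
    return INCLUDE_RE.findall(source)


def load_order(sources: Dict[str, str]) -> List[str]:
    """Order files so that every file comes after the files it includes."""
    graph = {name: set(find_includes(text)) for name, text in sources.items()}
    # TopologicalSorter raises graphlib.CycleError on circular includes.
    return list(TopologicalSorter(graph).static_order())


sources = {
    "main.yar": 'include "common.yar"\ninclude "pe.yar"\nrule m { condition: true }',
    "pe.yar": 'include "common.yar"\nrule p { condition: true }',
    "common.yar": "rule c { condition: true }",
}

print(load_order(sources))
# ['common.yar', 'pe.yar', 'main.yar']
```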

Advanced Commands

fluent - Fluent API Examples

# Demonstrate fluent API usage
yaraast fluent examples                   # Show example rules
yaraast fluent conditions                 # Demonstrate condition builders
yaraast fluent string-patterns            # Show string pattern builders
yaraast fluent template                   # Generate rule template
yaraast fluent transformations            # Show AST transformations

optimize - Rule Optimization

# Optimize YARA rules
yaraast optimize input.yar output.yar     # Optimize rules
yaraast optimize rule.yar optimized.yar --show-changes  # Show what changed

Usage Examples

As a Python Library

from yaraast import Parser
from yaraast.visitors import OptimizationAnalyzer

# Parse YARA rules
parser = Parser()
with open('ruleset.yar', 'r') as f:
    ast = parser.parse(f.read())

# Analyze for optimizations
analyzer = OptimizationAnalyzer()
analyzer.visit(ast)
suggestions = analyzer.get_suggestions()

for suggestion in suggestions:
    print(f"{suggestion.rule}: {suggestion.message}")
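
The `analyzer.visit(ast)` call above points to a visitor-style traversal, which the feature list also names. For readers new to the pattern, here is a self-contained sketch. The node classes (`RuleFile`, `Rule`, `StringDef`) and the `Visitor` base class are toy stand-ins invented for this example. They are not yaraast's classes, and the dispatch scheme is an assumption modeled on Python's `ast.NodeVisitor`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StringDef:
    identifier: str
    value: str


@dataclass
class Rule:
    name: str
    strings: List[StringDef] = field(default_factory=list)


@dataclass
class RuleFile:
    rules: List[Rule] = field(default_factory=list)


class Visitor:
    """Dispatch to visit_<ClassName>, falling back to generic_visit."""

    def visit(self, node):
        method = getattr(self, f"visit_{type(node).__name__}", self.generic_visit)
        return method(node)

    def generic_visit(self, node):
        # Recurse into fields that hold lists of child nodes.
        for value in vars(node).values():
            if isinstance(value, list):
                for child in value:
                    self.visit(child)


class ShortStringFinder(Visitor):
    """Collect string definitions shorter than a threshold."""

    def __init__(self, min_length: int = 4):
        self.min_length = min_length
        self.findings: List[str] = []
        self._rule = ""

    def visit_Rule(self, node: Rule):
        self._rule = node.name  # remember which rule we are inside
        self.generic_visit(node)

    def visit_StringDef(self, node: StringDef):
        if len(node.value) < self.min_length:
            self.findings.append(f"{self._rule}: {node.identifier} is short")


tree = RuleFile([Rule("demo", [StringDef("$a", "ab"), StringDef("$b", "longer")])])
finder = ShortStringFinder()
finder.visit(tree)
print(finder.findings)
# ['demo: $a is short']
```

A custom analysis only overrides the `visit_*` methods it cares about, and the base class handles the rest of the traversal.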

Batch Processing

# Process multiple files
for file in *.yar; do
    yaraast validate "$file" && \
    yaraast fmt "$file" && \
    yaraast analyze optimize "$file" > "${file%.yar}_report.txt"
done
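
The same batch can be driven from Python when a shell loop is not convenient. This is a sketch that uses only subcommands documented in the Command Reference. It uses `fmt` for the formatting step because `format` is documented as taking an output file. It also assumes each command signals failure through a non-zero exit status. `commands_for`, `report_path`, and `process` are hypothetical helper names.

```python
import subprocess
from pathlib import Path
from typing import List


def commands_for(path: Path) -> List[List[str]]:
    """Commands to run for one file, using only documented subcommands."""
    return [
        ["yaraast", "validate", str(path)],
        ["yaraast", "fmt", str(path)],
        ["yaraast", "analyze", "optimize", str(path)],
    ]


def report_path(path: Path) -> Path:
    """rule.yar -> rule_report.txt, next to the input file."""
    return path.with_name(f"{path.stem}_report.txt")


def process(path: Path) -> bool:
    """Run the steps in order and stop at the first failure (like `&&`)."""
    *checks, analyze = commands_for(path)
    for argv in checks:
        if subprocess.run(argv).returncode != 0:
            return False
    # Only the analysis output is redirected into the report file.
    with report_path(path).open("w", encoding="utf-8") as out:
        return subprocess.run(analyze, stdout=out).returncode == 0
```

A driver loop would then be `for p in sorted(Path(".").glob("*.yar")): process(p)`.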

CI/CD Integration

# GitHub Actions example
- name: Validate YARA Rules
  run: |
    pip install yaraast
    yaraast validate rules/*.yar
    yaraast analyze best-practices rules/*.yar

Large Ruleset Analysis

# Analyze massive rulesets efficiently
yaraast performance stream huge_ruleset.yar | \
    yaraast analyze optimize - | \
    yaraast metrics --export-csv analysis.csv -

Complete Command List

Commands:
  analyze            AST-based analysis commands
  bench              Performance benchmarks for AST operations
  fluent             Fluent API demonstrations and examples
  fmt                Format YARA file in-place (like black for Python)
  format             Format a YARA file to new file
  libyara            LibYARA integration for scanning and optimization
  metrics            Analyze and visualize YARA metrics
  optimize           Optimize YARA rules for better performance
  parse              Parse YARA file and output in various formats
  performance        Performance tools for large rule collections
  performance-check  Analyze YARA rules for performance issues
  roundtrip          Round-trip serialization and pretty printing
  semantic           Perform semantic validation on YARA files
  serialize          AST serialization for export/import
  validate           Validate YARA file for syntax errors
  workspace          Multi-file analysis and dependency resolution

Real-World Usage

Processing Production Rulesets

The tool has been tested with production rulesets containing thousands of rules:

# Example: Analyzing a 10,000+ rule collection
$ yaraast analyze optimize master_yara.yar

Optimization Analysis: master_yara.yar

   Optimization
  Opportunities
┏━━━━━━━━┳━━━━━━━┓
┃ Impact ┃ Count ┃
┡━━━━━━━━╇━━━━━━━┩
│ High   │     0 │
│ Medium │  8184 │
│ Low    │  5962 │
└────────┴───────┘

Found 14146 optimization suggestions

Command Chaining

Many commands support piping and chaining:

# Parse, optimize, and format
yaraast parse rule.yar | \
    yaraast analyze optimize - | \
    yaraast format - > optimized.yar

# Validate and generate report
yaraast validate ruleset.yar && \
    yaraast metrics report ruleset.yar > report.txt

Output Formats

Most commands support multiple output formats:

  • text - Human-readable output (default)
  • json - JSON for programmatic processing
  • yaml - YAML for configuration files
  • csv - CSV for spreadsheet analysis
  • tree - Tree visualization for structure
  • html - HTML reports with styling

# Examples
yaraast parse rule.yar --format json
yaraast metrics rule.yar --format csv
yaraast analyze optimize rule.yar --format html > report.html
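
JSON output is the natural entry point for other programs. The sketch below calls the documented `yaraast parse ... --format json` command and decodes the result. It assumes the command writes only the JSON document to standard output, and it makes no assumption about the schema. `parse_as_json` is a hypothetical helper name.

```python
import json
import subprocess
from typing import Any, Callable, List


def _capture(argv: List[str]) -> str:
    """Run a command and return its stdout, raising if it exits non-zero."""
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout


def parse_as_json(path: str, run: Callable[[List[str]], str] = _capture) -> Any:
    """Call `yaraast parse PATH --format json` and decode the result."""
    # Assumption: stdout holds the JSON document and nothing else.
    return json.loads(run(["yaraast", "parse", path, "--format", "json"]))
```

The `run` parameter exists so the decoding step can be exercised with canned output, without invoking the CLI.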

Python Module Usage

The tool can be run as a Python module:

# Run as module
python -m yaraast --help
python -m yaraast analyze optimize rule.yar

# In Python scripts
from yaraast import Parser
from yaraast.cli import cli

# Use the parser
parser = Parser()
yara_code = 'rule example { condition: true }'
ast = parser.parse(yara_code)

# Or invoke CLI programmatically
cli(['analyze', 'optimize', 'rule.yar'])
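
When a script launches yaraast as a subprocess, running it as a module under the current interpreter keeps the call inside the same virtual environment. This is a small sketch. `module_command` and `run_module` are hypothetical helper names, and only the `python -m yaraast` form shown above is taken from this README.

```python
import subprocess
import sys
from typing import List


def module_command(*args: str) -> List[str]:
    """argv that runs yaraast as a module under the current interpreter."""
    return [sys.executable, "-m", "yaraast", *args]


def run_module(*args: str) -> int:
    """Run the command and return its exit status."""
    return subprocess.run(module_command(*args)).returncode
```

For example, `run_module("analyze", "optimize", "rule.yar")` mirrors the second shell command above.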

Requirements

  • Python 3.13 or higher
  • Dependencies: click, rich, attrs, PyYAML
  • Optional: yara-python for LibYARA integration
  • Optional: protobuf for binary serialization

License

This project is licensed under the MIT License with an attribution requirement.

License Summary

  • Free to use: You can use this software freely for any purpose (commercial or non-commercial)
  • Attribution required: You must include attribution to the original author when using this software
  • Attribution format: "YARA AST by Marc Rivero (@seifreed) - https://github.com/seifreed/yaraast"

Full License

See the LICENSE file for the complete license text.

Copyright (c) 2025 Marc Rivero (@seifreed)
