Iguanas: A Lightning-Fast Rule Generation Python Library
What is Iguanas?
Iguanas is a library built on top of Polars that streamlines the entire rule-based system development workflow, from raw data to production-ready rules, using Polars' multi-core processing for speed.
Built by the PSP Data Team at PayPal, Iguanas makes rule generation, evaluation, and selection both faster and simpler.
⚡ Key Features
- 🚀 Lightning Fast: Built on Polars for multi-core parallel processing
- 🎯 End-to-End: Generate, evaluate, combine, and select rules in one library
- 📦 Production Ready: Lightweight rule strings that deploy anywhere
- 🔧 Flexible: Sequential and parallel grid search strategies
- 🔗 Composable: Chain generation → evaluation → selection with a few function calls
- 🎓 Easy to Learn: Simple functional API with clear, consistent signatures
🛠️ What Can Iguanas Do?
⚙️ Rule Generation
Generate interpretable rules from labelled datasets using XGBoost tree extraction:
- rule_grid_search_sequential - Single-process grid search over weight transformations and scale_pos_weight values
- rule_grid_search_parallel_weights - Parallel grid search parallelised over weight transformations
- rule_grid_search_parallel_scales - Parallel grid search parallelised over scale_pos_weight values
- extract_rules - Extract rules from a fitted XGBoost model (with optional monotone constraints)
- extract_rule_by_max_gain - Extract the highest-gain rule path from a single tree
- extract_rule_with_monotone_constraints - Extract a rule path respecting monotone constraints
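To make the idea of tree extraction concrete, here is a minimal sketch that is not Iguanas' implementation. It walks a hand-written decision tree (a hypothetical nested dict, not an XGBoost model) and turns the path to the highest-scoring leaf into a conjunction of conditions. The tree layout, the helper `best_leaf_rule`, and the rule string syntax are all assumptions made for illustration.

```python
# Hypothetical tree: internal nodes split on `feature < threshold`,
# leaves carry a score. This layout is an assumption, not XGBoost's format.
TREE = {
    "feature": "income", "threshold": 60000,
    "yes": {"leaf": -0.4},                      # income < 60000
    "no": {                                     # income >= 60000
        "feature": "age", "threshold": 42,
        "yes": {"leaf": 0.1},                   # age < 42
        "no": {"leaf": 0.6},                    # age >= 42
    },
}


def best_leaf_rule(node, path=()):
    """Return (score, rule_string) for the highest-scoring leaf under `node`."""
    if "leaf" in node:
        rule = " & ".join(path) if path else "True"
        return node["leaf"], rule
    feat, thr = node["feature"], node["threshold"]
    yes = best_leaf_rule(node["yes"], path + (f"({feat} < {thr})",))
    no = best_leaf_rule(node["no"], path + (f"({feat} >= {thr})",))
    # Keep whichever branch leads to the higher leaf score.
    return max(yes, no, key=lambda pair: pair[0])


score, rule = best_leaf_rule(TREE)
print(rule)  # (income >= 60000) & (age >= 42)
```

The resulting string is only a path through one toy tree; the functions listed above operate on fitted XGBoost models and may select paths differently.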
📊 Metrics
Compute classification performance metrics for rule predictions:
- compute_metrics - Compute a full metrics table (accuracy, precision, recall, F-beta, TP/FP/TN/FN, flagged %) for a set of rules
- compute_single_metric - Compute a single scalar metric (accuracy, precision, recall or F-beta); optimised for hot-path evaluation
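As a reference for what these metrics mean, the sketch below computes them in plain Python for one rule's boolean predictions. It does not call Iguanas; the function names and the example rule "income >= 80000" are hypothetical, and the data is the small table from the Quick Start.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels and boolean rule predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and not p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and not p)
    return tp, fp, tn, fn


def rule_metrics(y_true, y_pred, beta=1.0):
    """Return accuracy, precision, recall, F-beta and flagged % for one rule."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    n = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta ** 2
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / n if n else 0.0,
        "precision": precision,
        "recall": recall,
        "f_beta": f_beta,
        "flagged_pct": 100.0 * (tp + fp) / n if n else 0.0,
    }


y = [0, 1, 0, 1, 0, 1, 1, 0]
income = [30000, 80000, 50000, 90000, 40000, 95000, 70000, 35000]
pred = [v >= 80000 for v in income]   # hypothetical rule: income >= 80000
m = rule_metrics(y, pred)
print(m["precision"], m["recall"])    # 1.0 0.75
```

Returning 0.0 when a denominator is zero is a choice made for this sketch; the library may handle that case differently.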
🔍 Rule Evaluation
Evaluate rules on data and filter by performance:
- apply_rules - Evaluate rule expressions on a DataFrame and return a boolean prediction matrix
- apply_and_filter_by_performance - Evaluate rules and filter by user-defined metric thresholds
- select_diverse_top_rules - Select top-performing rules while removing highly correlated duplicates
- apply_filter_and_deduplicate_rules - Complete end-to-end pipeline: evaluate → filter → deduplicate
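The threshold filtering step can be pictured with a short sketch. It assumes thresholds are dictionaries with "name", "operator" and "value" keys, as in the Quick Start; the operator table, the helper `passes_thresholds`, and the metric values are hypothetical, and only ">=" is confirmed by this README.

```python
import operator

# Assumed mapping from operator strings to comparisons.
OPS = {
    ">=": operator.ge, ">": operator.gt,
    "<=": operator.le, "<": operator.lt,
    "==": operator.eq,
}


def passes_thresholds(metrics, thresholds):
    """True if a rule's metrics satisfy every threshold."""
    return all(
        OPS[t["operator"]](metrics[t["name"]], t["value"])
        for t in thresholds
    )


thresholds = [
    {"name": "precision", "operator": ">=", "value": 0.6},
    {"name": "recall", "operator": ">=", "value": 0.5},
]
# Hypothetical metric values for three rules.
metrics_by_rule = {
    "rule_a": {"precision": 0.9, "recall": 0.55},
    "rule_b": {"precision": 0.7, "recall": 0.30},
    "rule_c": {"precision": 0.5, "recall": 0.95},
}
kept = [
    name for name, m in metrics_by_rule.items()
    if passes_thresholds(m, thresholds)
]
print(kept)  # ['rule_a']
```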
🔀 Rule Combination
Combine individual rules into compound rules to improve performance:
- combine_rules_full_search - Exhaustive search over all rule pairs
- combine_rules_cumulative - Incrementally combine rules with a running candidate
- combine_rules_greedy - Greedy combination selecting the best pair at each step
- combine_rules_beam_search - Beam search combination balancing quality and efficiency
- combine_rules_a_star - A* search combination using a heuristic cost function
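For intuition about greedy combination, the following sketch ORs rules onto a running candidate for as long as F1 improves. It is an illustration under stated assumptions, not the library's algorithm: the choice of OR as the combining operator, F1 as the objective, and the names `f1` and `greedy_or_combination` are all hypothetical.

```python
def f1(y_true, y_pred):
    """F1 score for binary labels and boolean predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def greedy_or_combination(y_true, rule_preds):
    """Greedily OR rules together while the combined F1 keeps improving."""
    chosen, combined, best = [], [False] * len(y_true), 0.0
    remaining = dict(rule_preds)
    while remaining:
        # Score every remaining rule when OR-ed onto the current candidate.
        scored = {
            name: f1(y_true, [c or p for c, p in zip(combined, preds)])
            for name, preds in remaining.items()
        }
        name = max(scored, key=scored.get)
        if scored[name] <= best:
            break  # no remaining rule improves the combination
        best = scored[name]
        combined = [c or p for c, p in zip(combined, remaining.pop(name))]
        chosen.append(name)
    return chosen, best


y = [0, 1, 0, 1, 0, 1, 1, 0]
rules = {
    "r1": [False, True, False, True, False, False, False, False],
    "r2": [False, False, False, False, False, True, True, False],
    "r3": [True, True, False, False, False, False, False, False],
}
chosen, score = greedy_or_combination(y, rules)
print(chosen, score)  # ['r1', 'r2'] 1.0
```

Greedy search evaluates far fewer candidates than an exhaustive search over pairs, at the cost of possibly missing a better combination that starts from a weaker first rule.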
✂️ Rule Selection
Deduplicate and prune rule sets:
- filter_rules_by_feature_overlap - Remove rules that share too many features with higher-importance rules
- filter_correlated_rules - Remove rules whose predictions are highly correlated
- select_best_rule_per_column_combination - Keep only the best-performing rule for each unique column combination
- extract_feature_names_from_rule - Parse a rule string and return the feature names it references
🔬 Rule Analysis
Inspect and report on rule sets:
- generate_rule_performance_report - Generate a combined performance and structure report for a rule set
- parse_conditions - Parse a rule expression into its constituent conditions
- parse_levels - Parse a rule expression into a structured level-by-level representation
- rebuild_from_levels - Reconstruct a rule string from a level representation
🖊️ Rule Formatting
Clean up rule expressions for display or logging:
- simplify_rule - Simplify a rule expression by removing redundant conditions
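One way to picture "removing redundant conditions" is to keep only the tightest bound per feature in a conjunction. The sketch below works on (feature, operator, value) tuples rather than rule strings, because the rule string syntax is not specified here; the tuple representation and the helper `simplify_conjunction` are assumptions.

```python
def simplify_conjunction(conditions):
    """Drop bounds implied by a tighter bound on the same feature.

    `conditions` is a list of (feature, op, value) tuples joined by AND,
    with op in {">", ">=", "<", "<="}.
    """
    lower, upper = {}, {}  # feature -> (tightness key, condition)
    for cond in conditions:
        feat, op, val = cond
        if op in (">", ">="):
            # Larger value is tighter; on ties the strict ">" wins.
            key, table = (val, op == ">"), lower
        elif op in ("<", "<="):
            # Smaller value is tighter; on ties the strict "<" wins.
            key, table = (-val, op == "<"), upper
        else:
            raise ValueError(f"unsupported operator: {op!r}")
        if feat not in table or key > table[feat][0]:
            table[feat] = (key, cond)
    kept = {c for _, c in lower.values()} | {c for _, c in upper.values()}
    # Preserve the original order and emit each surviving condition once.
    out = []
    for cond in conditions:
        if cond in kept and cond not in out:
            out.append(cond)
    return out


rule = [
    ("age", ">", 30),
    ("income", ">=", 50000),
    ("age", ">", 40),          # implies age > 30
    ("income", "<", 90000),
    ("income", ">=", 70000),   # implies income >= 50000
]
print(simplify_conjunction(rule))
# [('age', '>', 40), ('income', '<', 90000), ('income', '>=', 70000)]
```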
📐 Monotone Constraints
Infer feature directionality to guide rule generation:
- infer_monotone_constraints_from_correlations - Infer monotone constraints (±1) from feature–target correlations
- infer_monotone_constraints_from_stumps - Infer monotone constraints (±1) from decision stumps
⚖️ Sample Weight Transformations
Generate sample weight schedules to steer rule learning:
- generate_increasing_weights - Weights that increase with feature value (power, log families)
- generate_decreasing_weights - Weights that decrease with feature value (reciprocal families)
- generate_weights - Generate both increasing and decreasing weight schedules in one call
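To show what increasing and decreasing weight schedules might look like, here is a sketch with one power, one log, and one reciprocal transformation applied to a min-max scaled feature. The exact formulas, the scaling step, and the function names are assumptions; Iguanas may define its weight families differently.

```python
import math


def _scale(values):
    """Min-max scale values to [0, 1]; constant inputs map to 0."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in values]


def increasing_weights(values, power=2.0):
    """Weights that grow with the feature value (assumed formulas)."""
    scaled = _scale(values)
    return {
        "power": [1.0 + s ** power for s in scaled],
        "log": [1.0 + math.log1p(s) for s in scaled],
    }


def decreasing_weights(values):
    """Weights that shrink as the feature value grows (assumed formula)."""
    return {"reciprocal": [1.0 / (1.0 + s) for s in _scale(values)]}


income = [30000, 80000, 50000, 90000, 40000, 95000, 70000, 35000]
inc = increasing_weights(income)
dec = decreasing_weights(income)
print(inc["power"][5], dec["reciprocal"][5])  # 2.0 0.5 (largest income)
```

Passing such vectors as sample weights makes the model pay more attention to rows at one end of the feature's range, which is how a weight schedule can steer which rules are learned.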
🚀 Quick Start
import polars as pl
import numpy as np
from xgboost import XGBClassifier
from iguanas.weight_transformations import generate_weights
from iguanas.rule_generation import rule_grid_search_parallel_weights
from iguanas.rule_evaluation import apply_filter_and_deduplicate_rules
# 1. Load your data
X_train = pl.DataFrame({
    "age": [25, 45, 35, 50, 30, 55, 40, 28],
    "income": [30000, 80000, 50000, 90000, 40000, 95000, 70000, 35000],
})
y_train = pl.Series([0, 1, 0, 1, 0, 1, 1, 0])
# 2. Generate sample weight transformations
weights = generate_weights(X_train["income"])
# 3. Run a parallel grid search to extract rules
estimator = XGBClassifier(max_depth=2, n_estimators=5, random_state=42)
scale_pos_weights = np.logspace(0, 1, 5)
rules_df = rule_grid_search_parallel_weights(
    estimator, X_train, y_train,
    scale_pos_weights=scale_pos_weights,
    weights_train_vec=weights,
    n_jobs=-1,
)
# 4. Evaluate, filter, and deduplicate rules
R, metrics, selected_rules = apply_filter_and_deduplicate_rules(
    X_train, y_train, rules_df,
    metric_thresholds=[
        {"name": "precision", "operator": ">=", "value": 0.6},
        {"name": "recall", "operator": ">=", "value": 0.5},
    ],
    max_corr=0.8,
)
print(selected_rules)
📦 Installation
Requires Python 3.10 or higher.
pip install iguanas
Or install from source:
git clone https://github.com/paypal/iguanas.git
cd iguanas
pip install -e . # Install in editable/development mode
📚 Documentation
For detailed documentation, tutorials, and API reference, visit:
https://paypal.github.io/iguanas/
🎯 Use Cases
Iguanas is well suited to:
- Fraud Detection - Generate high-precision rules to flag suspicious transactions
- Risk Scoring - Build interpretable rule sets for credit or operational risk
- Compliance & Policy - Encode business policies as auditable rule expressions
- Anomaly Detection - Surface rare but meaningful patterns in labelled data
- Model Explainability - Extract human-readable rules from gradient boosted models
🏢 Used By
Iguanas powers rule-based systems at:
- PayPal (internal use)
🤝 Contributing
We welcome contributions! Please check out our contributing guidelines.
📄 License
Iguanas is licensed under the Apache License 2.0. See the LICENSE file for details.
🙏 Credits
Developed by the PSP Data Team at PayPal.
Built by data scientists, for data scientists