
RFX-Fuse: Breiman and Cutler's Unified ML Engine (CPU-only)

Project description

RFX-Fuse: Breiman and Cutler's Random Forests as a Forest Unified Learning and Similarity Engine - Extended with Native Explainable Similarity

License: MIT PyPI Python 3.9+ C++17 CUDA arXiv

RFX-Fuse (Random Forests X, where X stands for compression; Forest Unified Learning and Similarity Engine) delivers Breiman and Cutler's complete vision for Random Forests as a Forest Unified Learning and Similarity Engine, with native GPU and CPU support.

Breiman and Cutler designed Random Forests as more than an ensemble predictor. Their original implementation from the early 2000s included classification, regression, unsupervised learning, proximity-based similarity, outlier detection, missing value imputation, and visualization. Modern implementations such as scikit-learn's random forest (2010-2011) omit many of these features.

These capabilities make RFX-Fuse a unified learning and similarity engine. With just 1-2 model objects, it can achieve accuracy and output comparable to 3-5 mainstream industry tools. For example, for Time Series Regression with native explainable similarity, 1 model produces output comparable to 4 separate tools. Here, 1 model means 1 set of trees grown once.

Key Use Cases

Use Case | RFX-Fuse | Comparable Approach
Recommender Systems | 1–2 models | 5 tools (FAISS + XGBoost + SHAP + Isolation Forests + Custom Code)
Finance Explainability | 1 model | 3 tools (XGBoost + SHAP + Isolation Forests)
Time Series Regression | 1 model | 4 tools (XGBoost + SHAP + Isolation Forests + FAISS)
Imputation Validation | 1 model | Time series methods (general tabular: RFX-Fuse)
Anomaly Detection | 1 model | 3 tools (Isolation Forests + SHAP + Custom Code)

Novel Contributions

  1. Native Explainable Similarity: Breiman and Cutler's original similarity scoring via proximities gives output comparable to FAISS on retrieval, measured by NDCG and HR. Proximity Importance explains why two samples are similar.
Proximity Importance Example

Explanations available in arXiv paper.

  2. Imputation Quality Validation for General Tabular Data: rank imputation methods by how "real" the imputed data looks, without ground truth labels.
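
The proximities behind the first contribution can be illustrated with a minimal sketch. It assumes the textbook definition, where the proximity of two samples is the fraction of trees in which they land in the same leaf. The function names and the `leaf_ids` layout are hypothetical and are not the RFX-Fuse API; the library's own computation may differ (for example, in how out-of-bag samples are handled).

```python
def proximity_matrix(leaf_ids):
    """Return an n x n proximity matrix.

    leaf_ids[t][i] is the leaf index that sample i reaches in tree t
    (a hypothetical input layout chosen for this sketch).
    """
    n_trees = len(leaf_ids)
    n_samples = len(leaf_ids[0])
    counts = [[0] * n_samples for _ in range(n_samples)]
    for tree in leaf_ids:
        for i in range(n_samples):
            for j in range(n_samples):
                # Two samples are "close" in this tree if they share a leaf.
                if tree[i] == tree[j]:
                    counts[i][j] += 1
    # Normalize the co-occurrence counts by the number of trees.
    return [[c / n_trees for c in row] for row in counts]


def top_k_similar(prox, query, k):
    """Indices of the k samples most similar to `query`, excluding itself."""
    others = [j for j in range(len(prox)) if j != query]
    return sorted(others, key=lambda j: prox[query][j], reverse=True)[:k]


# Three trees, four samples.
leaf_ids = [
    [0, 0, 1, 1],
    [0, 1, 1, 2],
    [3, 3, 3, 4],
]
prox = proximity_matrix(leaf_ids)
print(top_k_similar(prox, query=0, k=2))
```

Samples 0 and 1 share a leaf in two of the three trees, so their proximity is 2/3, and sample 1 ranks first among the neighbours of sample 0.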

Comparable Tools Functionality Comparison

Feature RFX-Fuse XGBoost sklearn RF FAISS
Classification
Regression
Unsupervised
Overall importance
Local importance (per-sample) SHAP
Proximity/similarity scoring
Overall proximity importance
Local proximity importance
Top-K similar with explanations
Outlier detection with explanations
Missing value imputation
Weighted bootstrap sampling

Installation

From PyPI (GPU)

pip install rfx-fuse

From PyPI (CPU-only, no build tools required)

pip install rfx-fuse-cpu

Pre-built binary wheel -- no CMake, compiler, or CUDA needed.

From Source (GPU)

git clone https://github.com/chriskuchar/RFX-Fuse.git
cd RFX-Fuse
pip install -e .

From Source (CPU-only)

git clone https://github.com/chriskuchar/RFX-Fuse.git
cd RFX-Fuse
RFX_CPU_ONLY=1 pip install -e .

Prerequisites

  • Python 3.9+
  • CMake 3.12+ (source builds only)
  • C++ compiler with C++17 support (GCC 7+, Clang 5+) (source builds only)
  • OpenMP (usually included with compiler)
  • CUDA toolkit 12.8+ (GPU acceleration only)

Verify Installation

import RFXFuse as rfx
print(f"RFX-Fuse version: {rfx.__version__}")
print(f"CUDA enabled: {rfx.__cuda_enabled__}")

Examples

Each use case has a complete demonstration script in the examples/ folder:

Use Case | Demo Script | Description
Recommender Systems | examples/recommender_system/demo_recommender_system.py | MovieLens 25M: similarity retrieval + ranking with explanations
Finance Explainability | examples/classification/demo_loan_classification.py | Loan default prediction with 4-type explainability
Time Series Regression | examples/time_series/demo_time_series.py | Bike sharing: prediction + outlier detection
Imputation Validation | examples/data_imputation/demo_imputation.py | Rank imputation methods without ground truth
Anomaly Detection | examples/anomaly_detection/demo_anomaly_detection.py | Breiman-Cutler outlier detection
Sample Weights | examples/classification/demo_sample_weights.py | Weighted bootstrap sampling for classification & regression

Run an example:

cd examples/time_series
python demo_time_series.py
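
The sample-weights demo listed above is described as weighted bootstrap sampling. As a rough illustration of that idea only, the sketch below draws one bootstrap sample in which each row is picked with probability proportional to its weight, and records the rows left out of the bag. The function name and signature are hypothetical; how RFX-Fuse applies weights internally is not shown here.

```python
import random


def weighted_bootstrap(n_samples, weights, rng):
    """Draw n_samples row indices with replacement, proportional to weights.

    Returns (in_bag_indices, out_of_bag_indices).
    """
    # random.Random.choices samples with replacement using relative weights.
    indices = rng.choices(range(n_samples), weights=weights, k=n_samples)
    in_bag = set(indices)
    # Rows never drawn form the out-of-bag set for this tree.
    out_of_bag = [i for i in range(n_samples) if i not in in_bag]
    return indices, out_of_bag


rng = random.Random(42)
indices, oob = weighted_bootstrap(6, weights=[1, 1, 1, 5, 5, 5], rng=rng)
print(indices, oob)
```

Rows with larger weights appear in more bags, so they influence more trees; rows with zero weight are never drawn and are always out of bag.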

Industry Use Cases

Use Case 1: Recommender Systems

RFX-Fuse Unsupervised for retrieval + RFX-Fuse Supervised for re-ranking on MovieLens 25M.
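
Retrieval quality in this use case is reported with HR and NDCG. The sketch below shows how these two metrics are commonly computed for one query, assuming binary relevance and a cutoff `k`. The function names are illustrative, and the exact variants used in the demo script may differ.

```python
import math


def hit_rate_at_k(ranked, relevant, k):
    """1.0 if any relevant item appears in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def ndcg_at_k(ranked, relevant, k):
    """Normalized discounted cumulative gain with binary relevance."""
    # Each relevant hit at 0-based position p contributes 1 / log2(p + 2).
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    # The ideal ranking places all relevant items first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


ranked = ["movie_a", "movie_b", "movie_c", "movie_d"]
relevant = {"movie_b"}
print(hit_rate_at_k(ranked, relevant, k=3))
print(ndcg_at_k(ranked, relevant, k=3))
```

Averaging these per-query values over all test users gives the HR@K and NDCG@K figures typically quoted for a retriever.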

Recommender System Stage 1:

Recommender System Results Stage 1 Similarity Scoring

Explanations available in arXiv paper.



Recommender System Stage 2 Part 1:

Recommender System Results Stage 2 Supervised Modeling

Explanations available in arXiv paper.

Recommender System Stage 2 Part 2:

Recommender System Results Stage 2 Outlier Detection

Explanations available in arXiv paper.

Recommender System Stage 2 Part 3:

Recommender System Results Stage 2 Top K Retrieval

Explanations available in arXiv paper.

View Code →


Use Case 2: Finance Explainability

A single classifier provides the explanations needed to support regulatory compliance (ECOA, GDPR, Fair Lending).

Finance Explainability Results


Explanations available in arXiv paper.

View Code →


Use Case 3: Time Series Regression

RFX-Fuse Regressor on UCI Bike Sharing dataset with full explainability.

Time Series Results

Explanations available in arXiv paper.

View Code →


Use Case 4: Imputation Quality Validation

Novel capability for general tabular data. Rank imputation methods by how "real" the imputed data looks.

Imputation Validation Results

Explanations available in arXiv paper.

View Code →


Use Case 5: Anomaly Detection

Breiman-Cutler method: train on clean data only. The forest learns to separate real rows from synthetic rows, so anomalies are the points that receive a high P(synthetic).
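
In Breiman and Cutler's unsupervised mode, the synthetic class is created by sampling each feature independently from its own observed values, which keeps the marginal distributions but destroys the dependence between features. The sketch below builds such a real-versus-synthetic training set in plain Python. The function names are hypothetical, and this is an illustration of the idea rather than the RFX-Fuse implementation.

```python
import random


def make_synthetic(real_rows, rng):
    """Sample every feature independently from its empirical marginal."""
    n = len(real_rows)
    columns = list(zip(*real_rows))
    # Each synthetic column is drawn with replacement from the real column,
    # so per-feature distributions match but cross-feature structure is lost.
    synthetic_cols = [[rng.choice(col) for _ in range(n)] for col in columns]
    return [list(row) for row in zip(*synthetic_cols)]


def build_real_vs_synthetic(real_rows, seed=0):
    """Return (X, y) with label 0 for real rows and 1 for synthetic rows."""
    rng = random.Random(seed)
    synthetic = make_synthetic(real_rows, rng)
    X = [list(row) for row in real_rows] + synthetic
    y = [0] * len(real_rows) + [1] * len(synthetic)
    return X, y


real = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
X, y = build_real_vs_synthetic(real)
print(len(X), y)
```

A classifier trained on `(X, y)` then yields a P(synthetic) for any new row; rows that break the feature relationships seen in the clean data tend to score high.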

Anomaly Detection Results

Explanations available in arXiv paper.

View Code →

API Reference

For complete API documentation with all parameters, methods, and examples, see docs/API.md.

Performance

GPU Benchmarks

Environment: NVIDIA RTX 3060 (12GB), AMD Ryzen 7 5800X, 32GB RAM

Use Case | Train Size | Features | Trees | Training Time
Recommender (Unsup) | 59,047 (×2) | 23 | 1,000 | ~1,040s
Recommender (Sup) | 47,237 | 21 | 1,000 | 120s
Finance Classification | 46,396 | 15 | 500 | 69s
Bike Regression | 5,725 | 4 | 1,000 | 24s
Imputation Validation | 3,000 | 12 | 100 | 3.6s
Anomaly Detection | 15,000 | 8 | 100 | 112s

Training times include predictions, similarity scoring, proximity importance, local importance, and all explainability features where applicable.

CPU Benchmarks

Coming soon.

Methodology

For detailed methodology, see the arXiv paper listed under Citation.

Citation

@article{kuchar2026rfxfuse,
  author       = {Kuchar, Chris},
  title        = {RFX-Fuse: Breiman and Cutler's Unified ML Engine + Native Explainable Similarity},
  year         = {2026},
  journal      = {arXiv preprint arXiv:2511.19493},
  url          = {https://arxiv.org/html/2603.13234v1}
}

Acknowledgments

This work aims to implement the full unified learning and similarity engine that Dr. Leo Breiman and Dr. Adele Cutler created in their Fortran/Java implementation in the early 2000s.

Special thanks to Dr. Adele Cutler for generously sharing original Breiman-Cutler Random Forest source materials, which made this faithful restoration and extension possible.

Work in Progress

  • Multi-class classification support

Previous Work

License

MIT License - see LICENSE for details.
