Implementations of algorithms from Prof. Dr. Ethem Alpaydın's research papers and ML textbook (MIT Press).

neural-trees

PyTorch + sklearn implementations of the tree and mixture-of-experts algorithms from Alpaydın's research papers — the ones that never got a proper open-source home.

Why?

I was reading through Alpaydın's Introduction to Machine Learning and his papers and kept hitting the same wall: interesting algorithms, no usable Python code anywhere. The Soft Decision Tree paper (ICPR 2012) alone has hundreds of citations but the implementations floating around are incomplete, undocumented, or years out of date.

So I wrote them myself — clean, tested, and fully compatible with the sklearn API.

Covered so far:

Algorithm	Paper	Status
Soft Decision Trees	İrsoy, Yıldız, Alpaydın (ICPR 2012)	✅ PyTorch + sklearn API
Omnivariate Decision Trees	Yıldız & Alpaydın (IEEE TNN 2001)	✅
Hierarchical Mixture of Experts + Dropout	İrsoy & Alpaydın (Neurocomputing 2021)	✅ PyTorch
GAL: Grow and Learn Networks	Alpaydın (IJPRAI 1994)	✅
Combined 5×2cv F Test	Alpaydın (Neural Computation 1999)	✅ Gold-standard classifier comparison
McNemar's Test	—	✅
Naive Bayes (Gaussian/Bernoulli/Multinomial)	Textbook Ch. 3	✅
Distance-Weighted KNN + Condensed Nearest Neighbor (CNN)	Alpaydın (AIR 1997)	✅

Installation

pip install neural-trees

Or install from source:

git clone https://github.com/cgrtml/neural-trees.git
cd neural-trees
pip install -e ".[dev]"

Quick Start

Soft Decision Trees

The flagship algorithm. Unlike hard decision trees, every sample reaches every leaf with some probability — making the tree fully differentiable and trainable end-to-end with backpropagation.

from neural_trees import SoftDecisionTree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

sdt = SoftDecisionTree(depth=4, max_epochs=40, penalty_coef=1e-3)
sdt.fit(X_train, y_train)

print(f"Accuracy: {sdt.score(X_test, y_test):.4f}")

# Inspect what each leaf learned
leaf_distributions = sdt.get_leaf_distributions()  # shape: (n_leaves, n_classes)

# Inspect the split direction at each internal node
split_weights = sdt.get_split_weights()             # list of weight vectors

Key idea (İrsoy, Yıldız, Alpaydın, 2012):

At each internal node i:

$$p_i(\mathbf{x}) = \sigma(\mathbf{w}_i^\top \mathbf{x} + b_i)$$

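The probability $\mu_\ell(\mathbf{x})$ of reaching leaf $\ell$ is the product of the gate values along the path from the root, using $p_i(\mathbf{x})$ for one branch and $1 - p_i(\mathbf{x})$ for the other. With $Q_\ell(y)$ the class distribution stored at leaf $\ell$, the final prediction is: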

$$P(y \mid \mathbf{x}) = \sum_\ell \mu_\ell(\mathbf{x}) \cdot Q_\ell(y)$$
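The two formulas above can be traced by hand on a tiny tree. The following is a minimal pure-Python sketch, not the library's implementation: the helper names (`leaf_reach_probs`, `predict_proba`), the breadth-first node layout, the toy weights, and the convention that $p_i$ is the probability of the left branch are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def leaf_reach_probs(x, gates, depth):
    """Return mu_l(x) for every leaf of a complete binary tree.

    gates[i] = (w_i, b_i) for internal node i in breadth-first order.
    Assumption: p_i is the probability of taking the left branch.
    """
    probs = [1.0]   # reach probabilities of the nodes on the current level
    first = 0       # index of the first internal node on the current level
    for _ in range(depth):
        next_probs = []
        for k, reach in enumerate(probs):
            w, b = gates[first + k]
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
            next_probs.append(reach * p)          # left child
            next_probs.append(reach * (1.0 - p))  # right child
        first += len(probs)
        probs = next_probs
    return probs

def predict_proba(x, gates, leaf_dists, depth):
    """P(y | x) = sum over leaves of mu_l(x) * Q_l(y)."""
    mu = leaf_reach_probs(x, gates, depth)
    n_classes = len(leaf_dists[0])
    return [sum(m * q[c] for m, q in zip(mu, leaf_dists))
            for c in range(n_classes)]

# Toy depth-2 tree: 3 internal nodes, 4 leaves, 2 features, 2 classes.
gates = [([1.0, -1.0], 0.0), ([0.5, 0.5], -0.2), ([-1.0, 2.0], 0.1)]
leaf_dists = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]
x = [0.3, 0.7]

mu = leaf_reach_probs(x, gates, depth=2)
proba = predict_proba(x, gates, leaf_dists, depth=2)
print("leaf reach probabilities:", mu)
print("class probabilities:", proba)
```

Because each gate splits its incoming probability into $p_i$ and $1 - p_i$, the leaf reach probabilities always sum to 1, so the output is a valid class distribution whenever every leaf distribution is.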

Comparing Two Classifiers — The Gold Standard Test

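Alpaydın's Combined 5×2cv F Test (Neural Computation, 1999) is a well-calibrated way to compare two classifiers on a single dataset. It runs five replications of 2-fold cross-validation and combines all ten fold-wise differences into one F statistic, avoiding the inflated Type I error of the resampled paired t-test.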

from neural_trees.statistical_tests import combined_5x2cv_f_test
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)

result = combined_5x2cv_f_test(
    clf_A=DecisionTreeClassifier(),
    clf_B=SVC(kernel="rbf"),
    X=X, y=y,
    alpha=0.05
)
print(result)

StatisticalTestResult(
  test       = Alpaydın's Combined 5×2cv F Test
  statistic  = 12.4731
  p-value    = 0.0083
  alpha      = 0.05
  decision   = ✓ REJECT H0
  note       = Classifiers significantly differ
)

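Why not just use a t-test? The resampled paired t-test reuses training data across folds, so the per-fold differences are correlated and the false positive rate is inflated. Alpaydın's F test estimates the variance within each of the five 2-fold replications, where the two training sets do not overlap. Under the null hypothesis the resulting statistic is approximately F-distributed with 10 and 5 degrees of freedom.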
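For readers who want to see the arithmetic, here is a small sketch of the statistic itself, following the formula in the 1999 paper: $f = \sum_{i=1}^{5}\sum_{j=1}^{2} (p_i^{(j)})^2 \,/\, (2 \sum_{i=1}^{5} s_i^2)$, where $p_i^{(j)}$ is the difference in error rates on fold $j$ of replication $i$ and $s_i^2$ is the variance estimate within replication $i$. The function name `combined_5x2cv_f` and the differences in `diffs` are made up for illustration; this is not the library's code, and it compares against the tabulated 5% critical value of F(10, 5) instead of computing a p-value.

```python
def combined_5x2cv_f(diffs):
    """Combined 5x2cv F statistic from a 5x2 table of error-rate differences.

    diffs[i] = (p_i1, p_i2): difference between classifier A and B on the
    two folds of replication i.
    """
    if len(diffs) != 5 or any(len(row) != 2 for row in diffs):
        raise ValueError("expected 5 replications of 2 folds")
    numerator = 0.0
    variance_sum = 0.0
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2.0
        variance_sum += (p1 - mean) ** 2 + (p2 - mean) ** 2  # s_i^2
        numerator += p1 ** 2 + p2 ** 2
    if variance_sum == 0.0:
        raise ValueError("all replications have zero variance")
    return numerator / (2.0 * variance_sum)

# Upper 5% critical value of the F distribution with (10, 5) degrees of freedom.
F_CRIT_10_5 = 4.74

# Hypothetical error-rate differences (A minus B), for illustration only.
diffs = [(0.04, 0.06), (0.05, 0.03), (0.07, 0.05), (0.02, 0.06), (0.05, 0.04)]

f_stat = combined_5x2cv_f(diffs)
reject = f_stat > F_CRIT_10_5
print(f"f = {f_stat:.3f}, reject H0 at alpha=0.05: {reject}")
```

All ten squared differences enter the numerator, which is what distinguishes this test from the 5×2cv paired t-test, whose numerator uses only one of them.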

Hierarchical Mixture of Experts with Dropout

from neural_trees import HierarchicalMixtureOfExperts
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)

moe = HierarchicalMixtureOfExperts(
    depth=2,
    branching_factor=4,   # 4^2 = 16 expert leaves
    dropout_rate=0.3,     # Dropout on gating networks (İrsoy & Alpaydın, 2021)
    max_epochs=50,
    verbose=True,
)
moe.fit(X, y)
print(f"Training accuracy: {moe.score(X, y):.4f}")  # scored on the data used for fitting
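To make the "4^2 = 16 expert leaves" comment concrete, the sketch below computes the mixture weight of every leaf in a hierarchy with depth 2 and branching factor 4. It assumes each gating node is a softmax over linear scores without bias terms, which may differ from how the library parameterizes its gates; the names `leaf_weights` and `gate_params` are hypothetical, the weights are random, and dropout is left out entirely.

```python
import math
import random
from itertools import product

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def leaf_weights(x, gate_params, depth):
    """Map each root-to-leaf path to its mixture weight.

    gate_params[path] holds one weight vector per child of the gating
    node reached by `path` (a tuple of branch indices).
    """
    weights = {(): 1.0}
    for _ in range(depth):
        next_weights = {}
        for path, reach in weights.items():
            scores = [sum(wj * xj for wj, xj in zip(w, x))
                      for w in gate_params[path]]
            for child, g in enumerate(softmax(scores)):
                next_weights[path + (child,)] = reach * g
        weights = next_weights
    return weights

# Assumed toy setup: random gate weights, 3 input features.
rng = random.Random(0)
depth, branching, n_features = 2, 4, 3
gate_params = {}
for level in range(depth):
    for path in product(range(branching), repeat=level):
        gate_params[path] = [[rng.uniform(-1, 1) for _ in range(n_features)]
                             for _ in range(branching)]

x = [0.5, -1.2, 0.8]
w = leaf_weights(x, gate_params, depth)
print(f"{len(w)} leaves, weights sum to {sum(w.values()):.6f}")
```

The model's output would then be the sum of each expert's prediction weighted by its leaf weight, the same pattern as the soft decision tree but with a softmax at each gate in place of a sigmoid.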

GAL — Grow and Learn Networks

No need to specify architecture. The network grows when it can't learn and prunes itself when neurons become redundant.

from neural_trees.classical import GALNetwork
from sklearn.datasets import load_wine

X, y = load_wine(return_X_y=True)

gal = GALNetwork(
    initial_hidden=2,
    max_hidden=40,
    grow_threshold=0.15,
    prune_threshold=1e-4,
    max_epochs=100,
    verbose=True,
)
gal.fit(X, y)
print(f"Final hidden units: {gal.n_hidden_final_}")
print(f"Training accuracy: {gal.score(X, y):.4f}")  # scored on the data used for fitting
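The grow-and-prune idea can be illustrated with a very small exemplar-based sketch, loosely following the 1994 paper's description of a nearest-exemplar network that adds a unit when it misclassifies an input and later removes units that are no longer needed. This is an assumption-laden toy, not the library's `GALNetwork`: the function names are hypothetical, the data is made up, and parameters such as `grow_threshold` and `prune_threshold` are not modeled.

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest_label(x, units):
    """Label of the stored exemplar closest to x."""
    return min(units, key=lambda u: sq_dist(x, u[0]))[1]

def grow(X, y):
    """Add a unit whenever the current network misclassifies a sample."""
    units = []
    for x, label in zip(X, y):
        if not units or nearest_label(x, units) != label:
            units.append((x, label))
    return units

def prune(units):
    """Drop a unit if the remaining units still classify it correctly."""
    kept = list(units)
    for u in units:
        others = [v for v in kept if v is not u]
        if others and nearest_label(u[0], others) == u[1]:
            kept = others
    return kept

# Hypothetical 2-D toy data with two classes.
X = [(0.0, 0.0), (1.0, 1.0), (0.6, 0.5), (0.1, 0.1), (0.9, 1.0)]
y = [0, 1, 0, 0, 1]

grown = grow(X, y)
pruned = prune(grown)
print(f"units after growing: {len(grown)}, after pruning: {len(pruned)}")
print("training predictions:", [nearest_label(x, pruned) for x in X])
```

On this toy data the first exemplar becomes redundant once a later exemplar of the same class sits closer to the class boundary, so pruning removes it without changing any training prediction.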

Omnivariate Decision Trees

At each node, the tree automatically selects the best split type (univariate, linear via LDA, or nonlinear via an MLP) using cross-validation.

from neural_trees import OmnivariateDecisionTree
from sklearn.datasets import load_wine

X, y = load_wine(return_X_y=True)

odt = OmnivariateDecisionTree(max_depth=4, cv_folds=3)
odt.fit(X, y)

# See how many nodes used each split type
print(odt.get_split_type_distribution())
# {'univariate': 3, 'linear': 4, 'nonlinear': 1}

All sklearn-compatible

Every model follows the fit / predict / predict_proba / score interface:

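from neural_trees import SoftDecisionTree
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# X_train, X_test, y_train, y_test come from the Soft Decision Trees example above
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("sdt", SoftDecisionTree(depth=4, max_epochs=30)),
])
pipe.fit(X_train, y_train)
print(f"Accuracy: {pipe.score(X_test, y_test):.4f}")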

from sklearn.model_selection import GridSearchCV

param_grid = {"sdt__depth": [3, 4, 5], "sdt__penalty_coef": [1e-4, 1e-3, 1e-2]}
gs = GridSearchCV(pipe, param_grid, cv=5)
gs.fit(X_train, y_train)
print(gs.best_params_)

Notebooks

Notebook	Description
`01_soft_decision_trees.ipynb`	Training, visualization, comparison with CART
`02_classifier_comparison_tests.ipynb`	When to use which statistical test
`03_hierarchical_moe.ipynb`	HMoE training and expert specialization
`04_gal_network.ipynb`	Dynamic architecture growth/pruning
`05_omnivariate_trees.ipynb`	Node-level split type analysis

About the Author

Prof. Dr. Ethem Alpaydın, whose algorithms this library implements, is a widely cited machine learning researcher.

Professor Emeritus at Boğaziçi University (Istanbul), now at Özyeğin University
Author of Introduction to Machine Learning (MIT Press, 4 editions, 2004–2020) — used in hundreds of universities globally
Author of Machine Learning: The New AI (MIT Press, 2016)
PhD from EPFL (1990); research stays at UC Berkeley, MIT, and IDIAP
34,000+ citations on Google Scholar
IEEE Senior Member; Pattern Recognition journal editorial board

His 1999 paper on the Combined 5×2cv F Test is a widely used reference for classifier comparison. His Soft Decision Trees paper (2012) is an early and elegant proposal for differentiable tree models, predating much of the recent neural tree literature.

Citation

If you use this library in academic work, please cite the original papers:

@book{alpaydin2020introduction,
  title     = {Introduction to Machine Learning},
  author    = {Alpayd{\i}n, Ethem},
  year      = {2020},
  edition   = {4th},
  publisher = {MIT Press}
}

@article{irsoy2021dropout,
  title   = {Dropout Regularization in Hierarchical Mixture of Experts},
  author  = {\.{I}rsoy, O{\u{g}}uzhan and Alpayd{\i}n, Ethem},
  journal = {Neurocomputing},
  volume  = {419},
  pages   = {148--156},
  year    = {2021}
}

@inproceedings{irsoy2012soft,
  title     = {Soft Decision Trees},
  author    = {\.{I}rsoy, O{\u{g}}uzhan and Y{\i}ld{\i}z, Olcay Taner and Alpayd{\i}n, Ethem},
  booktitle = {Proceedings of the 21st International Conference on Pattern Recognition (ICPR)},
  year      = {2012}
}

@article{alpaydin1999combined,
  title   = {Combined 5x2cv {F} Test for Comparing Supervised Classification Learning Algorithms},
  author  = {Alpayd{\i}n, Ethem},
  journal = {Neural Computation},
  volume  = {11},
  number  = {8},
  pages   = {1885--1892},
  year    = {1999}
}

Roadmap

Things I'm planning to add:

Multiple Kernel Learning (Gönen & Alpaydın, JMLR 2011)
Localized Multiple Kernel Learning (ICML 2008)
Convolutional Soft Decision Trees (ICANN 2018)
Decision boundary visualization utilities
Benchmark comparison on UCI datasets

If you find a bug or want to implement one of these, open an issue.

License

MIT — see LICENSE.

Release history

0.1.1 (this version), released Apr 11, 2026

File details

Details for the file neural_trees-0.1.1.tar.gz.

File metadata

Download URL: neural_trees-0.1.1.tar.gz
Upload date: Apr 11, 2026
Size: 24.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for neural_trees-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`ef6b86bcbd76af3f810e89495b9be63d76ca76b9bca0b9a2bbc5bc194e50b9bb`
MD5	`3a789bd14561b4709b75d83bd8597f7c`
BLAKE2b-256	`4f59edd2310574bb8803d0d3b4b79066f7f86e2b27a22bc46029344ace21cf2b`

File details

Details for the file neural_trees-0.1.1-py3-none-any.whl.

File metadata

Download URL: neural_trees-0.1.1-py3-none-any.whl
Upload date: Apr 11, 2026
Size: 25.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for neural_trees-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3581fe59fc98277ebad3eddd99fa8f73df09448fc48cb7086719f36facf6c069`
MD5	`8e96d08bd54e197497f511e8d8bb3bfa`
BLAKE2b-256	`dec50209775292b830ebdc715817fcdf165e73550fc17030840f8587db6f69fa`
