Fit, score, and embed nested-dichotomy (ND) trees for multi-class classification.
ndscape
A nested dichotomy reduces a C-class problem to a tree of binary splits (e.g. {0,1,2} vs {3,4}, then {0} vs {1,2}, ...). ndscape lets you fit one, score it, or place a whole population of candidate trees in a 2-D "tree-space" to see how a property (accuracy, variance, ...) varies across tree structures.
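To make the reduction concrete, here is a minimal sketch of how the binary split probabilities combine into class probabilities: each class's probability is the product of the branch probabilities along its path from the root. The function name class_probabilities and the p_left dictionary are hypothetical, chosen for illustration; they are not part of ndscape's API, and the numbers are made up.

```python
# Illustrative only: class_probabilities and p_left are hypothetical names.
# A tree is a list of (left, right) splits; p_left[split] is the probability
# that a sample belongs to the left side of that split.

def class_probabilities(tree, p_left):
    # Collect every class label that appears in any split.
    classes = sorted({c for left, right in tree for c in left + right})
    probs = {c: 1.0 for c in classes}
    for left, right in tree:
        p = p_left[(left, right)]
        for c in left:
            probs[c] *= p          # class lies on the left branch
        for c in right:
            probs[c] *= 1.0 - p    # class lies on the right branch
    return probs

# The 5-class example from the text: {0,1,2} vs {3,4}, then {0} vs {1,2}, ...
tree = [((0, 1, 2), (3, 4)), ((0,), (1, 2)), ((1,), (2,)), ((3,), (4,))]
p_left = {tree[0]: 0.8, tree[1]: 0.5, tree[2]: 0.25, tree[3]: 0.9}  # made-up values

probs = class_probabilities(tree, p_left)
print(probs)
```

Because every split divides its probability mass between its two sides, the resulting class probabilities sum to 1.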
Install
pip install ndscape
pip install "ndscape[spatial]"   # adds Moran's I support (esda, libpysal)
pip install "ndscape[plot]"      # adds plotting (matplotlib, bokeh)
Quickstart
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import ndscape as nds
X, y = load_iris(return_X_y=True)
classes = sorted(set(y))
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
nd = nds.fit(X_train, y_train, classes=classes, base="lr")
nd.predict(X_test)
nd.score(X_test, y_test) # {"accuracy": ..., "logloss": ...}
classes is the list of class labels in your y (sorted(set(y)) works for
integer or string labels). nds.fit samples one ND tree automatically; pass
tree=... to use a specific one (see "A tree is..." below).
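For reference, the two metrics named in the score output can be sketched in plain Python as follows. This assumes the usual definitions (fraction of correct argmax predictions, and mean negative natural-log probability of the true class); whether ndscape computes them exactly this way is an assumption, and the helper names and data are illustrative.

```python
import math

# Illustrative helpers, not ndscape functions.
def accuracy(y_true, proba, classes):
    # Predict the class with the highest probability in each row.
    predicted = [classes[max(range(len(classes)), key=row.__getitem__)] for row in proba]
    return sum(p == t for p, t in zip(predicted, y_true)) / len(y_true)

def log_loss(y_true, proba, classes, eps=1e-15):
    index = {c: i for i, c in enumerate(classes)}
    total = 0.0
    for t, row in zip(y_true, proba):
        p = min(max(row[index[t]], eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)

classes = ["a", "b", "c"]
y_true = ["a", "b", "c", "a"]
proba = [                      # made-up predict_proba-style rows
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.5, 0.25, 0.25],
]
scores = {"accuracy": accuracy(y_true, proba, classes),
          "logloss": log_loss(y_true, proba, classes)}
print(scores)
```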
Use cases
You have a dataset and a binary classifier.
base can be the string "lr" or "decisiontree", or your own unfitted
scikit-learn estimator. A fresh clone of it is fit at every split.
from sklearn.svm import SVC
nd = nds.fit(X_train, y_train, classes=classes, base=SVC(probability=True, kernel="linear"))
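As a rough picture of what "fit at every split" involves, the sketch below builds the binary training set for one split: only samples whose label belongs to that split are kept, and the target records which side the label falls on. The helper name split_training_data and the 0 = left, 1 = right encoding are assumptions for illustration; ndscape's internals may differ.

```python
# Illustrative helper, not part of ndscape.
def split_training_data(X, y, split):
    left, right = split
    rows, targets = [], []
    for features, label in zip(X, y):
        if label in left:
            rows.append(features)
            targets.append(0)      # assumed encoding: 0 = left side
        elif label in right:
            rows.append(features)
            targets.append(1)      # assumed encoding: 1 = right side
        # samples from classes outside this split are skipped
    return rows, targets

X = [[0.1], [0.2], [0.3], [0.4], [0.5]]   # made-up features
y = [0, 1, 2, 3, 1]
rows, targets = split_training_data(X, y, ((0,), (1, 2)))
print(rows, targets)
```

A binary estimator trained on rows and targets then supplies the branch probability for that split.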
You have a train/test split and want a score.
nd = nds.ND(tree, classes).fit(X_train, y_train, base="lr")
nd.score(X_test, y_test) # {"accuracy": ..., "logloss": ...}
You already trained the per-split models yourself.
# models: a list in the same order as tree, or a {(left, right): model} dict
nd = nds.ND.from_trained(tree, classes, models=[fitted_model_1, fitted_model_2, ...])
nd.predict_proba(X_test)
You already scored a set of trees and want to see where they sit in tree-space.
trees, coords = nds.embed_trees(classes)
nds.spatial_autocorrelation(my_scores, coords) # {"I": ..., "p_sim": ...}
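For intuition about the "I" value, here is a minimal pure-Python sketch of Moran's I using binary k-nearest-neighbour weights. The weighting scheme is an assumption made for illustration (ndscape delegates to esda/libpysal and may weight neighbours differently), and the permutation-based p_sim is omitted.

```python
import math

# Illustrative implementation, not ndscape's.
def morans_i(values, coords, k=1):
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross, weight_sum = 0.0, 0.0
    for i in range(n):
        # Indices of the other points, nearest first.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        for j in neighbours[:k]:       # weight 1 for each of the k nearest
            cross += dev[i] * dev[j]
            weight_sum += 1.0
    return (n / weight_sum) * (cross / sum(d * d for d in dev))

coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
smooth = morans_i([1, 2, 3, 4], coords)        # neighbours have similar scores
alternating = morans_i([1, -1, 1, -1], coords) # neighbours have opposite scores
print(smooth, alternating)
```

A positive value means nearby trees tend to have similar scores; a negative value means neighbours tend to differ.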
You just want the whole picture: fit, score, and embed every candidate tree.
rows = nds.analyze(X_train, y_train, classes, X_test=X_test, y_test=y_test, base="lr")
# [{"tree": ..., "accuracy": ..., "logloss": ..., "coord": array([...])}, ...]
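Since each row is a plain dictionary, ordinary Python is enough to rank the results. The sketch below assumes rows shaped like the comment above; the values are made-up placeholders, and coord is shown as a tuple rather than an array to keep the example dependency-free.

```python
# Placeholder rows in the shape shown above; all numbers are made up.
rows = [
    {"tree": [((0,), (1, 2)), ((1,), (2,))], "accuracy": 0.91, "logloss": 0.30, "coord": (0.1, -0.4)},
    {"tree": [((1,), (0, 2)), ((0,), (2,))], "accuracy": 0.96, "logloss": 0.18, "coord": (0.5, 0.2)},
    {"tree": [((2,), (0, 1)), ((0,), (1,))], "accuracy": 0.93, "logloss": 0.25, "coord": (-0.6, 0.2)},
]

best = max(rows, key=lambda r: r["accuracy"])      # highest accuracy
ranked = sorted(rows, key=lambda r: r["logloss"])  # lowest log loss first

print(best["tree"])
print([r["accuracy"] for r in ranked])
```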
You want a picture of that tree-space.
nds.plot(rows, metric="accuracy", path="tree_space.png") # static PNG/PDF
nds.plot_interactive(rows, metric="accuracy", path="tree_space.html") # pan/zoom/hover
Both functions color each point by metric and mark the best tree with a
black x. They require the ndscape[plot] extra.
The embedding is slow to recompute and you want to reuse it.
rows = nds.analyze(X_train, y_train, classes, cache="embedding.joblib")
The MDS step in embed_trees/analyze is the slow part. Pass a .joblib path
as cache=: the first call computes the embedding and saves it there, and
later calls with the same path load it instead of recomputing.
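The compute-once, load-afterwards behaviour described above follows a common pattern, sketched here with the standard-library pickle module standing in for joblib. The helper load_or_compute and the stand-in slow_embedding function are hypothetical; they only illustrate the idea and are not how ndscape is necessarily implemented.

```python
import os
import pickle
import tempfile

# Illustrative helper, not part of ndscape.
def load_or_compute(path, compute):
    if os.path.exists(path):
        with open(path, "rb") as f:      # cache hit: reuse the saved result
            return pickle.load(f)
    result = compute()                   # cache miss: do the slow work once
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result

calls = []

def slow_embedding():
    calls.append(1)                      # track how often the slow step runs
    return [(0.0, 0.0), (1.0, 0.5)]      # stand-in for 2-D coordinates

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "embedding.pkl")
    first = load_or_compute(path, slow_embedding)
    second = load_or_compute(path, slow_embedding)

print(first == second, len(calls))
```

In this sketch the cache is keyed only by the file path, so a stale file would be reused even if the inputs changed; using a different path per class set avoids that.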
A tree is a list of (left, right) tuples of class labels, e.g.
[((0, 1), (2, 3)), ((0,), (1,)), ((2,), (3,))]. Use nds.all_trees(classes)
(exhaustive, for small C) or nds.sample_trees(classes, N) (for larger C)
to generate candidates.
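To see why exhaustive enumeration only suits small C, note that the number of distinct nested dichotomies over C classes is the double factorial (2C - 3)!!, counting mirror-image splits as the same tree. The sketch below computes that count and checks that a list of (left, right) tuples forms a complete tree. Both helper names are hypothetical, and whether ndscape treats the left/right order as significant is not stated here.

```python
# Illustrative helpers, not ndscape functions.
def count_trees(n_classes):
    # (2n - 3)!! = 1 * 3 * 5 * ... * (2n - 3)
    total = 1
    for k in range(3, 2 * n_classes - 2, 2):
        total *= k
    return total

def is_valid_tree(tree, classes):
    # Map each node (the union of a split's two sides) to its two children.
    splits = {frozenset(l) | frozenset(r): (frozenset(l), frozenset(r)) for l, r in tree}
    pending = [frozenset(classes)]       # class sets still waiting for a split
    used = 0
    while pending:
        node = pending.pop()
        if len(node) == 1:
            continue                     # a single class is a leaf
        if node not in splits:
            return False                 # a multi-class node was never split
        left, right = splits[node]
        if not left or not right or left & right:
            return False                 # sides must be non-empty and disjoint
        used += 1
        pending.extend([left, right])
    return used == len(tree)             # no stray or duplicate splits

tree = [((0, 1), (2, 3)), ((0,), (1,)), ((2,), (3,))]
print(is_valid_tree(tree, [0, 1, 2, 3]))
print([count_trees(c) for c in (3, 4, 5, 10)])
```

The count grows from 15 trees at C = 4 to over 34 million at C = 10, which is where sampling becomes the practical option.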
base accepts "lr", "decisiontree", or any unfitted scikit-learn
estimator with fit/predict_proba.