
Train decision trees, both univariate and multivariate.


Oh, My Trees

Oh, My Trees (ohmt) is a library for hyperplane-based Decision Tree induction. It lets you induce both Univariate (e.g., CART, C4.5) and Multivariate (e.g., OC1, Geometric) Decision Trees. It currently supports single-class classification trees and does not support categorical variables, as they don't play well with hyperplanes.
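
To make the univariate/multivariate distinction concrete, here is a minimal sketch in plain Python. It does not use ohmt: the helper `goes_left` and the toy points are hypothetical and only illustrate that both kinds of split test a point against a hyperplane, and that a univariate split is the special case with a single non-zero coefficient.

```python
from typing import List, Sequence


def goes_left(point: Sequence[float], coefficients: Sequence[float], bound: float) -> bool:
    """Return True when coefficients . point <= bound, i.e. the point falls on the 'left' side."""
    return sum(c * v for c, v in zip(coefficients, point)) <= bound


points: List[List[float]] = [[1.0, 5.0], [3.0, 1.0], [4.0, 4.0]]

# Univariate (axis-parallel) split: only feature 0 has a non-zero coefficient, i.e. x0 <= 2.
univariate = [goes_left(p, [1.0, 0.0], 2.0) for p in points]

# Multivariate (oblique) split: both features contribute, i.e. x0 + x1 <= 5.
multivariate = [goes_left(p, [1.0, 1.0], 5.0) for p in points]

print(univariate)    # [True, False, False]
print(multivariate)  # [False, True, False]
```

The same test also hints at why categorical variables are awkward here: a hyperplane needs numeric coordinates to compute the weighted sum.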

Quickstart

Installation

Installation through git:

git clone https://github.com/msetzu/oh-my-trees
mkvirtualenv -p python3.11 omt  # optional, creates virtual environment

cd oh-my-trees
pip install -r src/requirements.txt

or directly through pip:

pip install ohmt

Training trees

OMT follows the classic sklearn fit/predict interface.
You can find a full example in the examples notebook notebooks/examples.ipynb.

from ohmt.trees.multivariate import OmnivariateDT

dt = OmnivariateDT()
x = ...
y = ...

# trees all follow a similar sklearn-like training interface, with max_depth, min_samples, and min_eps as available parameters
dt.fit(x, y, max_depth=4)

OMT also offers a pruning toolkit, handled by trees.pruning.Gardener, which allows you to prune the induced tree. Find out more in the example notebook.

Induction algorithms

OMT offers several tree induction algorithms:

| Algorithm | Type | Reference | Info |
|---|---|---|---|
| C4.5 | Univariate | | |
| CART | Univariate | | |
| DKM | Univariate | | |
| OC1 | Multivariate | Paper | |
| Geometric | Multivariate | Paper | Only traditional SVM cut |
| Omnivariate | Multivariate | | Test all possible splits, pick the best one |
| Model tree | Multivariate | Paper | |
| Linear tree | Multivariate | Paper | |
| Optimal trees* | Multivariate | Paper | Mirror of Interpretable AI's implementation |

*Optimal trees mirror Interpretable AI's implementation, so you need to install the appropriate license to use them.
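
As a rough illustration of how the univariate algorithms differ and of what "test all possible splits, pick the best one" can mean, the sketch below scores candidate splits with three textbook impurity criteria: Gini (associated with CART), entropy (the basis of C4.5's gain ratio), and the DKM criterion. This is not ohmt code. Every function name here is hypothetical, and how ohmt scores splits internally is an assumption left open.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Sequence


def class_frequencies(labels: Sequence[int]) -> List[float]:
    """Relative frequency of each class that occurs in `labels`."""
    return [count / len(labels) for count in Counter(labels).values()]


def gini(labels: Sequence[int]) -> float:
    """Gini impurity, the criterion associated with CART."""
    return 1.0 - sum(p * p for p in class_frequencies(labels))


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy, the basis of the gain ratio used by C4.5."""
    return -sum(p * math.log2(p) for p in class_frequencies(labels))


def dkm(labels: Sequence[int]) -> float:
    """DKM criterion 2 * sqrt(p * (1 - p)), defined for binary labels."""
    p = sum(1 for label in labels if label == labels[0]) / len(labels)
    return 2.0 * math.sqrt(p * (1.0 - p))


def weighted_impurity(goes_left: Sequence[bool], labels: Sequence[int],
                      impurity: Callable[[Sequence[int]], float]) -> float:
    """Impurity of the two sides of a split, weighted by side size (empty sides are skipped)."""
    left = [label for label, flag in zip(labels, goes_left) if flag]
    right = [label for label, flag in zip(labels, goes_left) if not flag]
    return sum(len(side) / len(labels) * impurity(side) for side in (left, right) if side)


def best_split(candidates: Dict[str, List[bool]], labels: Sequence[int],
               impurity: Callable[[Sequence[int]], float]) -> str:
    """Return the name of the candidate split with the lowest weighted impurity."""
    return min(candidates, key=lambda name: weighted_impurity(candidates[name], labels, impurity))


labels = [0, 0, 1, 1]
candidates = {
    "x0 <= 2": [True, True, False, False],  # separates the two classes perfectly
    "x1 <= 3": [True, False, True, False],  # leaves both sides mixed
}
best = best_split(candidates, labels, gini)
print(best)  # x0 <= 2
```

Swapping `gini` for `entropy` or `dkm` in the last call changes the scoring criterion without touching the selection logic.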

Using Trees

You can get an explicit view of a tree by accessing:

  • tree.nodes: Dict[int, Node]: its nodes,
  • tree.parent: Dict[int, int] and tree.ancestors: Dict[int, List[int]]: the parent and ancestors of each node,
  • tree.descendants: Dict[int, List[int]]: the descendants of each node,
  • tree.depth: Dict[int, int]: the depth of each node.
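
To show how dictionaries of these shapes relate to each other, the sketch below derives ancestors, descendants, and depth from a parent map. It uses a hypothetical five-node tree and plain dictionaries rather than an ohmt tree. The node ids, the ordering of each list, and the choice of depth 1 for the root are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy tree: node 1 is the root; the map goes from child id to parent id.
parent: Dict[int, int] = {2: 1, 3: 1, 4: 2, 5: 2}
node_ids: List[int] = [1, 2, 3, 4, 5]

ancestors: Dict[int, List[int]] = {}
depth: Dict[int, int] = {}
for node in node_ids:
    chain: List[int] = []
    current = node
    while current in parent:  # walk up until the root, which has no parent entry
        current = parent[current]
        chain.append(current)
    ancestors[node] = chain          # closest ancestor first
    depth[node] = len(chain) + 1     # assumption: the root sits at depth 1

# A node's descendants are all nodes that list it among their ancestors.
descendants: Dict[int, List[int]] = {
    node: [other for other in node_ids if node in ancestors[other]]
    for node in node_ids
}

print(ancestors[4])    # [2, 1]
print(descendants[2])  # [4, 5]
print(depth[4])        # 3
```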

Trees can also be JSONized:

tree.json()

Growing your own Tree

Greedy trees share the same basic algorithmic core:

  • learning step: induce a node
  • if induction shall continue:
    • generate two children
    • recurse on each child

We incorporate this algorithm in Tree, where step implements the node induction. Most greedy induction algorithms can thus be implemented by simply overriding the step function:

    def step(self, parent_node: Optional[InternalNode],
             data: numpy.ndarray, labels: numpy.ndarray, classes: numpy.ndarray,
             direction: Optional[str] = None, depth: int = 1,
             min_eps: float = 0.000001, max_depth: int = 16, min_samples: int = 10,
             node_fitness_function: Optional[Callable] = None,
             node_hyperparameters: Optional[Dict] = None, **step_hyperparameters) -> Optional[Node]:
        ...  # node induction goes here
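
The pattern can be sketched without ohmt as follows. `ToyGreedyTree` is a hypothetical stand-in, not the library's Tree class: its `step` has a much simpler signature than the one above, nodes are plain dictionaries, and children are numbered 2n and 2n+1 purely for convenience. The defaults for max_depth and min_samples are borrowed from the signature above. The point is only that the recursion lives in the base class, so a new algorithm overrides `step` alone.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple


class ToyGreedyTree:
    """Hypothetical greedy tree: `grow` holds the recursion, `step` induces a single node."""

    def __init__(self) -> None:
        self.nodes: Dict[int, dict] = {}

    def step(self, data: List[Sequence[float]], labels: List[int]) -> Optional[Tuple[int, float]]:
        """Induce a node: return (feature index, threshold), or None to stop."""
        raise NotImplementedError

    def grow(self, data: List[Sequence[float]], labels: List[int], node_id: int = 1,
             depth: int = 1, max_depth: int = 16, min_samples: int = 10) -> None:
        can_split = depth < max_depth and len(labels) >= min_samples and len(set(labels)) > 1
        split = self.step(data, labels) if can_split else None  # learning step
        if split is not None:
            feature, threshold = split
            left = [i for i, row in enumerate(data) if row[feature] <= threshold]
            right = [i for i, row in enumerate(data) if row[feature] > threshold]
            if left and right:
                self.nodes[node_id] = {"feature": feature, "threshold": threshold}
                # Generate two children and recurse on each of them.
                for child_id, rows in ((2 * node_id, left), (2 * node_id + 1, right)):
                    self.grow([data[i] for i in rows], [labels[i] for i in rows],
                              child_id, depth + 1, max_depth, min_samples)
                return
        # No further split: store a leaf that predicts the majority label.
        self.nodes[node_id] = {"label": Counter(labels).most_common(1)[0][0]}


class MeanThresholdTree(ToyGreedyTree):
    """Overrides only `step`: always split feature 0 at its mean value."""

    def step(self, data, labels):
        return 0, sum(row[0] for row in data) / len(data)


data = [[1.0], [2.0], [8.0], [9.0]]
labels = [0, 0, 1, 1]
tree = MeanThresholdTree()
tree.grow(data, labels, min_samples=2)
print(tree.nodes)
# {1: {'feature': 0, 'threshold': 5.0}, 2: {'label': 0}, 3: {'label': 1}}
```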


