OC1 Oblique Decision Tree - Implementation of Murthy et al. (AAAI-1992)

These details have not been verified by PyPI

Project links

Project description

OC1: Oblique Classifier 1

A Python implementation of the OC1 oblique decision tree algorithm as described in:

"OC1: A randomized algorithm for building oblique decision trees"
Sreerama K. Murthy, Simon Kasif, Steven Salzberg, and Richard Beigel
AAAI-1992

Overview

OC1 (Oblique Classifier 1) builds decision trees using oblique hyperplanes (linear combinations of attributes) rather than axis-parallel splits. This often results in smaller and more accurate trees than traditional methods like C4.5 or CART, especially for datasets with diagonal or complex decision boundaries.

Key Features

Oblique Splits: Uses hyperplanes of the form ∑(aᵢxᵢ) + a_{d+1} = 0
Hill-Climbing Optimization: Sequential coefficient perturbation with local search
Randomization: Multiple random restarts to escape local minima
Impurity Measures: Sum Minority (SM) and Max Minority (MM)
Pruning: Impurity-based and Reduced Error Pruning (REP)
Evaluation Tools: Cross-validation, confusion matrix, classification report
Visualization: Decision boundary and hyperplane plotting (requires matplotlib)
Export Methods: JSON, dictionary, and DOT format for tree visualization

Team Members

GitHub Username	Name
HxRJILI	RJILI Houssam
Kim8x-srscb	Fatima-Ezzahrae AKEBLI
Yasseriads	Yasser

Installation

Prerequisites

Python 3.8 or higher
NumPy

Install from Source

# Clone the repository
git clone https://github.com/HxRJILI/Oblique-Classifier-1.git
cd Oblique-Classifier-1

# Create virtual environment (recommended)
python -m venv .venv
.venv\Scripts\activate  # Windows
# source .venv/bin/activate  # Linux/Mac

# Install in development mode
pip install -e .

# Or install dependencies directly
pip install numpy

Install Development Dependencies

pip install -r requirements-dev.txt

Quick Start

import numpy as np
from oc1 import ObliqueDecisionTree

# Create sample data
X = np.array([
    [0, 0], [0.2, 0.1], [0.1, 0.2],  # Class 0
    [1, 1], [0.8, 0.9], [0.9, 0.8],  # Class 1
])
y = np.array([0, 0, 0, 1, 1, 1])

# Train oblique decision tree
tree = ObliqueDecisionTree(
    max_depth=5,
    impurity_measure="sm",  # Sum Minority
    n_restarts=5,           # Random restarts
    random_state=42,
)
tree.fit(X, y)

# Make predictions
predictions = tree.predict(X)
print(f"Predictions: {predictions}")

# Get accuracy
accuracy = tree.score(X, y)
print(f"Accuracy: {accuracy:.2%}")

# View tree structure
print(tree.print_tree())

API Reference

ObliqueDecisionTree

from oc1 import ObliqueDecisionTree

tree = ObliqueDecisionTree(
    max_depth=None,           # Maximum tree depth (None = unlimited)
    min_samples_leaf=1,       # Minimum samples per leaf
    min_samples_split=2,      # Minimum samples to split
    impurity_measure="sm",    # "sm" (Sum Minority) or "mm" (Max Minority)
    max_iterations=100,       # Hill-climbing iterations per node
    n_restarts=5,             # Random restarts (1 = deterministic)
    random_state=None,        # Random seed for reproducibility
    impurity_threshold=0.0,   # Stop splitting threshold
    verbose=False,            # Enable verbose logging
    log_file=None,            # Optional log file path
)

Methods

Method	Description
`fit(X, y)`	Train the decision tree
`predict(X)`	Predict class labels
`predict_proba(X)`	Predict class probabilities
`score(X, y)`	Calculate classification accuracy
`get_depth()`	Get maximum tree depth
`get_n_leaves()`	Count number of leaf nodes
`get_hyperplanes()`	Get all splitting hyperplanes
`print_tree()`	Get string representation of tree
`prune(method, ...)`	Prune the tree
`to_dict()`	Export tree as dictionary
`to_json()`	Export tree as JSON string
`to_dot()`	Export tree in DOT format
`feature_importances_`	Get feature importance scores

Synthetic Datasets

from oc1.data import (
    make_diagonal_dataset,       # 45° diagonal boundary
    make_xor_dataset,            # XOR problem
    make_oblique_classification, # Custom angle boundary
    make_multiclass_oblique,     # Multi-class sectors
    make_3d_oblique,             # 3D oblique plane
)

# Example
X, y = make_diagonal_dataset(n_samples=100, random_state=42)

Evaluation Tools

from oc1.evaluation import (
    train_test_split,      # Split data into train/test sets
    cross_validate,        # K-fold cross-validation
    confusion_matrix,      # Classification confusion matrix
    classification_report, # Precision, recall, F1-score
)

Visualization (requires matplotlib)

from oc1.visualization import (
    plot_decision_boundary_2d,  # Plot 2D decision regions
    plot_hyperplanes_2d,        # Plot all hyperplanes
)

Usage Examples

Pruning

from oc1 import ObliqueDecisionTree
from oc1.evaluation import train_test_split
from oc1.data import make_diagonal_dataset

# Generate data
X, y = make_diagonal_dataset(n_samples=200, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)

# Build tree
tree = ObliqueDecisionTree(max_depth=10, random_state=42)
tree.fit(X_train, y_train)

# Prune using Reduced Error Pruning
tree.prune(method="rep", X_val=X_val, y_val=y_val)

# Or prune by impurity threshold
tree.prune(method="impurity", impurity_threshold=2.0)

Cross-Validation

from oc1 import ObliqueDecisionTree
from oc1.evaluation import cross_validate
from oc1.data import make_diagonal_dataset

X, y = make_diagonal_dataset(n_samples=200, random_state=42)

tree = ObliqueDecisionTree(max_depth=5, random_state=42)

# 5-fold cross-validation
results = cross_validate(tree, X, y, cv=5, random_state=42)

print(f"Mean accuracy: {results['test_score'].mean():.3f}")
print(f"Std accuracy: {results['test_score'].std():.3f}")

Logging

from oc1 import ObliqueDecisionTree
from oc1.data import make_diagonal_dataset

X, y = make_diagonal_dataset(n_samples=100, random_state=42)

# Enable verbose logging
tree = ObliqueDecisionTree(max_depth=5, verbose=True, random_state=42)
tree.fit(X, y)

# Or log to file
tree = ObliqueDecisionTree(
    max_depth=5,
    log_file="tree_construction.log",
    random_state=42
)
tree.fit(X, y)

# Get log summary
summary = tree.logger.get_log_summary()
print(f"Nodes created: {summary['nodes_created']}")

Visualization

from oc1 import ObliqueDecisionTree
from oc1.visualization import plot_decision_boundary_2d, plot_hyperplanes_2d
from oc1.data import make_diagonal_dataset
import matplotlib.pyplot as plt

X, y = make_diagonal_dataset(n_samples=100, random_state=42)

tree = ObliqueDecisionTree(max_depth=5, random_state=42)
tree.fit(X, y)

# Plot decision boundary
plot_decision_boundary_2d(tree, X, y)
plt.show()

# Plot hyperplanes
plot_hyperplanes_2d(tree, X)
plt.show()

Export Tree

from oc1 import ObliqueDecisionTree
import json

tree = ObliqueDecisionTree(max_depth=3, random_state=42)
tree.fit(X, y)

# Export as dictionary
tree_dict = tree.to_dict()

# Export as JSON
tree_json = tree.to_json(indent=2)
with open("tree.json", "w") as f:
    f.write(tree_json)

# Export as DOT (for Graphviz)
dot_string = tree.to_dot()
with open("tree.dot", "w") as f:
    f.write(dot_string)

Project Structure

Oblique-Classifier-1/
├── oc1/
│   ├── __init__.py              # Package exports
│   ├── core/
│   │   ├── node.py              # ObliqueTreeNode class
│   │   ├── tree.py              # ObliqueDecisionTree classifier
│   │   ├── splits.py            # Impurity measures, partitioning
│   │   ├── hill_climb.py        # Coefficient perturbation
│   │   └── logging.py           # TreeConstructionLogger
│   ├── data/
│   │   └── datasets.py          # Synthetic test datasets
│   ├── evaluation/
│   │   └── __init__.py          # Evaluation tools
│   ├── visualization/
│   │   └── __init__.py          # Plotting utilities
│   └── tests/
│       ├── task1_tests/         # Core tree construction tests
│       ├── task2_tests/         # Randomization tests
│       └── task3_tests/         # Pruning & evaluation tests
├── examples/
│   ├── task2_demo.py            # Task 2 demonstration
│   └── task3_demo.py            # Task 3 demonstration
├── requirements.txt             # Runtime dependencies
├── requirements-dev.txt         # Development dependencies
├── setup.py                     # Package installation
└── TESTING_GUIDE.md             # Testing documentation

Running Tests

# Run all tests (236 tests)
pytest oc1/tests/ -v

# Run specific task tests
pytest oc1/tests/task1_tests/ -v  # Core tree construction
pytest oc1/tests/task2_tests/ -v  # Randomization
pytest oc1/tests/task3_tests/ -v  # Pruning & evaluation

# Run with coverage report
pytest oc1/tests/ --cov=oc1 --cov-report=html

# Run specific test file
pytest oc1/tests/task1_tests/test_tree.py -v

Paper Fidelity

This implementation follows the original OC1 paper exactly:

Task 1: Core Tree Construction

Feature	Paper Section	Implementation
Hyperplane equation `∑(aᵢxᵢ) + a_{d+1} = 0`	Section 2	`evaluate_hyperplane()`
Partition rule (V > 0 → left)	Section 2	`partition_data()`
Sum Minority impurity	Section 2.4	`calculate_impurity()`
Max Minority impurity	Section 2.4	`calculate_impurity()`
Perturbation formula (Eq. 1): `Uⱼ = aₘxⱼᵐ - Vⱼ/xⱼᵐ`	Section 2.2	`compute_u_values()`
Sequential hill-climbing	Section 2.1	`hill_climb()`

Task 2: Randomization

Feature	Paper Section	Implementation
Random hyperplane initialization	Section 2.3	`initialize_hyperplane(method="random")`
K random trials	Section 2.3	`n_restarts` parameter
Multi-coefficient perturbation	Section 2	`perturb_multiple_coefficients()`
Random perturbation order	Section 2	`use_random_perturbation_order`

Task 3: Pruning & Evaluation

Feature	Paper Section	Implementation
Impurity-based pruning	Section 2.4	`prune(method="impurity")`
Reduced Error Pruning	Section 2.4	`prune(method="rep")`
Stopping criteria	Section 2.4	`max_depth`, `min_samples_leaf`, `impurity_threshold`

Why Oblique Trees?

Axis-parallel decision trees (like CART or C4.5) can only create splits perpendicular to feature axes. For datasets with diagonal decision boundaries, this requires many splits to approximate the true boundary.

Oblique trees use hyperplanes that can be oriented at any angle, often resulting in:

✅ Smaller trees (fewer nodes)
✅ Better accuracy on oblique data
✅ More interpretable decision boundaries

Axis-Parallel Tree (many splits)     Oblique Tree (one split)
        |                                    \
   -----+-----                                \
        |                                      \
   -----+-----           vs                     \
        |                                        \
   -----+-----                                    \

References

@inproceedings{murthy1992oc1,
  title={OC1: A randomized algorithm for building oblique decision trees},
  author={Murthy, Sreerama K and Kasif, Simon and Salzberg, Steven and Beigel, Richard},
  booktitle={Proceedings of the National Conference on Artificial Intelligence (AAAI)},
  pages={322--327},
  year={1992}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.3.0

Dec 15, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

oblique_classifier_1-0.3.0.tar.gz (38.1 kB view details)

Uploaded Dec 15, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

oblique_classifier_1-0.3.0-py3-none-any.whl (41.4 kB view details)

Uploaded Dec 15, 2025 Python 3

File details

Details for the file oblique_classifier_1-0.3.0.tar.gz.

File metadata

Download URL: oblique_classifier_1-0.3.0.tar.gz
Upload date: Dec 15, 2025
Size: 38.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for oblique_classifier_1-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`06f4d5bd793b5442ea9f89b4bdb6dc02a9192dbcdee54d93bc0bf0d6eeea425b`
MD5	`963daf390b04b32dcdf8f3f9201dcb4b`
BLAKE2b-256	`54131eb254f9626fca19b99ebe10ee446854277e4fd52975f3c1b2d8a838f657`

See more details on using hashes here.

File details

Details for the file oblique_classifier_1-0.3.0-py3-none-any.whl.

File metadata

Download URL: oblique_classifier_1-0.3.0-py3-none-any.whl
Upload date: Dec 15, 2025
Size: 41.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for oblique_classifier_1-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`445231a61597807a53905598de6f8bf5e9f63390df53be95d146f3475edf842c`
MD5	`600670080c968b4fe401681e1bb4adb8`
BLAKE2b-256	`6074622955b7d821afa9fd4368cda4a37ac02d74f27c3df12888c11d03b5b94e`

See more details on using hashes here.

oblique-classifier-1 0.3.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

OC1: Oblique Classifier 1

Overview

Key Features

Team Members

Installation

Prerequisites

Install from Source

Install Development Dependencies

Quick Start

API Reference

ObliqueDecisionTree

Methods

Synthetic Datasets

Evaluation Tools

Visualization (requires matplotlib)

Usage Examples

Pruning

Cross-Validation

Logging

Visualization

Export Tree

Project Structure

Running Tests

Paper Fidelity

Task 1: Core Tree Construction

Task 2: Randomization

Task 3: Pruning & Evaluation

Why Oblique Trees?

References

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes