Skip to main content

Data Mining package implementing the PrevhClassifier algorithm.

Project description

PrevhClassifier

This package implements the Prevh classification algorithm.

The algorithm is based in the follow research Pages 71-76.

Package Documentation.

Publish to PyPI and TestPyPI

User Guide

This package can be installed with the following command: pip install prevhlib

Python example:

import numpy as np
import pandas as pd
from __init__.PrevhClassifier import PrevhClassifier
from sklearn.preprocessing import StandardScaler, LabelEncoder

if __name__ == '__main__':
    iris = pd.read_csv('Datasets/iris.csv')

    X = iris.iloc[:, 0:4].values
    y = iris.iloc[:, 4].values
    header = (iris.columns[:-1], iris.columns[-1])

    prevh = PrevhClassifier(distance_algorithm="euclidean")
    prevh.fit(X, y, header=header, encoder=LabelEncoder(), scaler=StandardScaler())

    print(prevh)
    # Outputs: 
    # {  
    #    "dataset": {
    #       "header": {
    #          "features": "Index(['sepal length', 'sepal width', 'petal length', 'petal width'], dtype='object')",
    #          "classes": "class"
    #       },
    #       "encoder": "LabelEncoder()",
    #       "scaler": "StandardScaler()"
    #    },
    #    "distance": "euclidean"
    # }

    print(prevh.classify(np.array([5.1, 3.5, 1.4, 0.2]), K=3))
    # Outputs: (array(['Iris-setosa'], dtype=object), np.float64(0.2653212465045153))

    kfold_split_arguments = {
        "n_splits": 5,
        "random_state": 42,
        "shuffle": True
    }

    Evaluation_Results = prevh.evaluate(1, "kfold_cross_validation", kfold_split_arguments)

    print(Evaluation_Results.get_metrics())
    # Outputs:
    #       accuracy  precision    recall  f1-score
    # 0     0.966667   0.972222  0.962963  0.965899
    # 1     0.966667   0.969697  0.952381  0.958486
    # 2     0.966667   0.962963  0.966667  0.962848
    # 3     0.900000   0.911681  0.905556  0.907368
    # 4     0.966667   0.972222  0.972222  0.971014
    # Mean  0.953333   0.957757  0.951958  0.953123

    Evaluation_Results.plot_confusion_matrices()

Next Steps

  • In the next steps I will add support to other split, evaluation, encoder, and decoder methods.
  • I expect to in new versions to comparisons between other machine learn method and the prevh classifier.

Change Log

0.2.0 (27/09/2024)

MAJOR CHANGES

  • Update in the README.md
  • Added new R2, MSE, MAE metrics
  • Change in the confusion matrix plot function

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

prevhlib-0.2.0.tar.gz (22.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

prevhlib-0.2.0-py3-none-any.whl (22.9 kB view details)

Uploaded Python 3

File details

Details for the file prevhlib-0.2.0.tar.gz.

File metadata

  • Download URL: prevhlib-0.2.0.tar.gz
  • Upload date:
  • Size: 22.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for prevhlib-0.2.0.tar.gz
Algorithm Hash digest
SHA256 506fbc1b4f9016cc0dbfa89fd8ed308a8391b9b3abd2618be09ffb6a8acfda39
MD5 cdaa4c2318b762b3f01e33c061167165
BLAKE2b-256 64ad747b2e96a5d310a669094ba4f710588bf920758763f907024f21c05ac8b6

See more details on using hashes here.

File details

Details for the file prevhlib-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: prevhlib-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 22.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for prevhlib-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2d42bddac5c45e2b5de32edd8f59cdddcf967d4aac2c93080f690170a71e3ea4
MD5 5df6ad36ac98061f46b2ff7b5f1b69cd
BLAKE2b-256 1d452627ff86964b5c0c630b9e56702751ba883df3e883c67e3643e28a3389ff

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page