A binary classifier using Student's t-distribution for univariate and multivariate continuous data.

TDistributionClassifier

Author: Abdul Mofique Siddiqui
License: MIT
Install via pip:

pip install TDistributionClassifier

Import it in your Python code:

from TDistributionClassifier import TDistributionClassifier

Overview

TDistributionClassifier is a binary classifier for continuous data, either one-dimensional or multi-dimensional. It models each class with a Student's t-distribution, whose heavy tails make the fit less sensitive to outliers than a Gaussian model would be.


Installation

Install the package via pip:

pip install tdistributionclassifier

How It Works

  • Univariate Mode: For 1D features, each class is modeled using a univariate t-distribution.
  • Multivariate Mode: For multi-dimensional features, each class is modeled using a multivariate t-distribution.
  • Uses log-probabilities and the log-sum-exp trick for numerical stability.
  • Automatically detects the input dimensionality and selects the appropriate mode.
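
The log-sum-exp normalization mentioned above can be sketched as follows. This is a minimal illustration using only NumPy, not the package's actual code; the function name normalize_log_joint and the layout of its input (one column of log joint scores per class) are assumptions made for the example.

```python
import numpy as np


def normalize_log_joint(log_joint):
    """Turn per-class log joint scores into probabilities that sum to 1.

    log_joint has shape (n_samples, n_classes); entry [i, k] is assumed to be
    log p(x_i | class k) + log P(class k).
    """
    log_joint = np.asarray(log_joint, dtype=float)
    # Subtract the row-wise maximum so the largest exponent is exp(0) = 1.
    row_max = log_joint.max(axis=1, keepdims=True)
    log_norm = row_max + np.log(np.exp(log_joint - row_max).sum(axis=1, keepdims=True))
    return np.exp(log_joint - log_norm)


# Exponentiating -1000 directly underflows to 0.0 and would give 0/0.
scores = np.array([[-1000.0, -1001.0], [-2.0, -2.0]])
probs = normalize_log_joint(scores)
print(probs)
```

Working in log space matters most for far-out points, where both class densities are tiny and a direct ratio of densities would lose all precision.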

Getting Started

1. Import the package

from TDistributionClassifier import TDistributionClassifier

2. Initialize the classifier

clf = TDistributionClassifier()

3. Fit the model

clf.fit(X_train, y_train)
  • X_train: numpy array of shape (n_samples,) or (n_samples, n_features)
  • y_train: binary labels (0 or 1)

4. Predict class probabilities

probs = clf.predict_proba(X_test)
  • Returns a numpy array of shape (n_samples, 2) with class 0 and class 1 probabilities.

5. Predict class labels

labels = clf.predict(X_test)
  • Returns predicted class labels (0 or 1)
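
To make the steps above concrete for the 1D case, here is a rough sketch of what fitting and predicting could look like when written directly against scipy.stats.t. It is not the package's implementation: the helper names fit_1d and predict_proba_1d, the fixed degrees of freedom DF, the use of the sample mean and standard deviation as location and scale, and the class priors taken from label frequencies are all assumptions.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import t

DF = 5.0  # hypothetical degrees of freedom; the package may choose another value


def fit_1d(x, y):
    """Per-class location, scale, and log prior for labels 0 and 1."""
    params = {}
    for label in (0, 1):
        x_c = x[y == label]
        params[label] = (x_c.mean(), x_c.std(ddof=1), np.log(len(x_c) / len(x)))
    return params


def predict_proba_1d(params, x):
    """Posterior class probabilities, shape (n_samples, 2)."""
    log_joint = np.column_stack([
        t.logpdf(x, DF, loc=loc, scale=scale) + log_prior
        for loc, scale, log_prior in (params[0], params[1])
    ])
    # Normalize in log space, then exponentiate.
    return np.exp(log_joint - logsumexp(log_joint, axis=1, keepdims=True))


# Synthetic, well-separated classes centred at -2 and +2.
rng = np.random.default_rng(0)
x_train = np.concatenate([rng.normal(-2.0, 1.0, 100), rng.normal(2.0, 1.0, 100)])
y_train = np.repeat([0, 1], 100)

params = fit_1d(x_train, y_train)
probs = predict_proba_1d(params, np.array([-3.0, 3.0]))
labels = probs.argmax(axis=1)
```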

API Reference

TDistributionClassifier()

Initializes the classifier. No arguments required.


.fit(X, y)

Fits the model to the training data.

  • Parameters:
    • X: numpy array of training features. Shape: (n_samples,) or (n_samples, n_features)
    • y: binary class labels (0 or 1). Shape: (n_samples,)

.predict_proba(X)

Returns predicted class probabilities.

  • Input:
    • X: Features. Shape: (n_samples,) or (n_samples, n_features)
  • Output:
    • probs: array of shape (n_samples, 2) with [P(class=0), P(class=1)]

.predict(X)

Returns the predicted class label for each sample, i.e. the class with the higher predicted probability.

  • Input:
    • X: Features. Shape: (n_samples,) or (n_samples, n_features)
  • Output:
    • labels: array of shape (n_samples,), values are 0 or 1
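
The relationship between the two outputs can be illustrated with plain NumPy. The probability values below are made up for the example, and the tie-breaking shown (class 0 wins when both probabilities are equal) is a property of numpy.argmax, not a documented behaviour of the package.

```python
import numpy as np

# Hypothetical predict_proba output: columns are [P(class=0), P(class=1)].
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])

# The column index of the larger probability is the predicted label.
labels = probs.argmax(axis=1)
print(labels)
```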

Example Usage

from TDistributionClassifier import TDistributionClassifier
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Load data and binarize target
data = load_diabetes()
X = data.data
y = (data.target > 100).astype(int)  # Binary classification

# Split dataset (fixed seed so the split is reproducible)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Initialize and train
clf = TDistributionClassifier()
clf.fit(X_train, y_train)

# Predict
probs = clf.predict_proba(X_test)
preds = clf.predict(X_test)

# Fraction of test samples classified correctly
print("Accuracy:", (preds == y_test).mean())

Internals

  • PDF Estimation: Uses scipy.stats.t (univariate) or scipy.stats.multivariate_t (multivariate).
  • Regularization: Adds a small ridge term (1e-6 * I) to each covariance matrix to keep it invertible.
  • Numerical Stability: Class probabilities are normalized in log space using log-sum-exp.
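
As a sketch of how the regularization and the multivariate density fit together, the example below evaluates scipy.stats.multivariate_t on data whose raw covariance is singular. The helper name class_log_density, the value df=5.0, and the use of the regularized sample covariance directly as the shape matrix are assumptions for illustration; the package may estimate these quantities differently.

```python
import numpy as np
from scipy.stats import multivariate_t


def class_log_density(X_class, X_query, df=5.0, eps=1e-6):
    """Log-density of X_query under a multivariate t fitted to X_class.

    df=5.0 is a hypothetical choice; the package may pick it differently.
    """
    loc = X_class.mean(axis=0)
    cov = np.cov(X_class, rowvar=False)
    # Ridge term: keeps the matrix invertible even with collinear features.
    shape = cov + eps * np.eye(X_class.shape[1])
    return multivariate_t.logpdf(X_query, loc=loc, shape=shape, df=df)


rng = np.random.default_rng(0)
base = rng.normal(size=(50, 1))
# The second column duplicates the first, so the raw covariance is singular.
X_class = np.hstack([base, base])
log_dens = class_log_density(X_class, X_class[:5])
```

Note that for a t-distribution the shape matrix and the covariance differ by a factor of df / (df - 2), so plugging in the sample covariance is a simplification.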

Notes

  • Only supports binary classification (0 and 1).
  • Multivariate mode is triggered when the input has more than one feature.
  • If the data is not linearly separable, consider applying a feature transformation or dimensionality reduction before use.
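
The mode selection described in the second note could be expressed along these lines. The function detect_mode is hypothetical and not part of the package's API; in particular, treating a single-column 2D array as univariate is an assumption inferred from the "more than one feature" wording.

```python
import numpy as np


def detect_mode(X):
    """Return 'univariate' or 'multivariate' for a feature array X."""
    X = np.asarray(X)
    if X.ndim == 1:
        return "univariate"
    if X.ndim == 2:
        # A single column is treated as 1D here (an assumption).
        return "multivariate" if X.shape[1] > 1 else "univariate"
    raise ValueError(f"expected a 1D or 2D array, got {X.ndim} dimensions")


print(detect_mode(np.zeros(10)))       # shape (n_samples,)
print(detect_mode(np.zeros((10, 3))))  # shape (n_samples, n_features)
```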

Author

Abdul Mofique Siddiqui


License

This project is licensed under the MIT License.
