# VishuML

A machine learning library implementing algorithms from scratch.
A comprehensive machine learning library implementing fundamental algorithms from scratch in Python. This library provides educational implementations of popular ML algorithms without relying on external ML frameworks like scikit-learn.
## Features

🎯 **sklearn-compatible API** - works seamlessly with pandas DataFrames and CSV data!
VishuML implements the following machine learning algorithms:
### Supervised Learning

- **Linear Regression** - for continuous target prediction
- **Logistic Regression** - for binary classification
- **K-Nearest Neighbors (KNN)** - for classification and regression
- **Support Vector Machine (SVM)** - for binary classification with linear and RBF kernels
- **Decision Tree** - for classification using the CART algorithm
- **Naive Bayes** - Gaussian Naive Bayes for classification
- **Perceptron** - linear binary classifier
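The Decision Tree entry above mentions CART, which scores candidate splits by the drop in Gini impurity. As a from-scratch illustration of that idea (a sketch of the general technique, not VishuML's actual code):

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

# A pure node has impurity 0; a 50/50 binary node has impurity 0.5
print(gini(np.array([1, 1, 1, 1])))  # 0.0
print(gini(np.array([0, 0, 1, 1])))  # 0.5
```

CART picks the split whose weighted child impurities fall the most below the parent's impurity.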
### Unsupervised Learning

- **K-Means Clustering** - for data clustering

### Utilities
- Data splitting (train/test split)
- Evaluation metrics (accuracy, R², MSE)
- Distance functions
- Data normalization
- Confusion matrix
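The metrics listed above have simple closed forms. A from-scratch sketch in plain NumPy (illustrative only, independent of the actual `vishuml.utils` implementations):

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return float(np.mean((np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2))

def r2(y_true, y_pred):
    """R² = 1 - (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])   # 0.75: 3 of 4 correct
error = mse([1.0, 2.0], [1.0, 4.0])          # 2.0: mean of (0², 2²)
fit = r2([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # 1.0: perfect prediction
```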
## Installation

### From PyPI

```bash
pip install vishuml
```
### From Source

```bash
git clone https://github.com/vishuRizz/vishuml.git
cd vishuml
pip install -e .
```
## Quick Start

### 🚀 Works with pandas DataFrames (just like sklearn!)

```python
import pandas as pd
from vishuml import LinearRegression, LogisticRegression
from vishuml.utils import train_test_split, r2_score, accuracy_score

# Load your CSV data (just like sklearn!)
df = pd.read_csv('your_data.csv')
X = df[['feature1', 'feature2', 'feature3']]  # Select features
y = df['target']                              # Select target

# Train-test split (works with DataFrames!)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model (accepts DataFrames!)
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)
score = model.score(X_test, y_test)
print(f"R² Score: {score:.4f}")

# Classification example with real data
from vishuml import datasets as ds

X, y = ds.load_iris()

# Convert to a DataFrame for a realistic workflow
iris_df = pd.DataFrame(X, columns=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'])
iris_df['species'] = y

# sklearn-like feature selection
features = iris_df[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
target = (iris_df['species'] == 0).astype(int)  # Binary classification

X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.3)

classifier = LogisticRegression()
classifier.fit(X_train, y_train)
accuracy = classifier.score(X_test, y_test)
print(f"Accuracy: {accuracy:.4f}")
```
### Traditional NumPy Arrays

```python
import numpy as np
from vishuml import LinearRegression, KMeans

# NumPy arrays also work (backward compatibility)
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 6, 8, 10])

model = LinearRegression()
model.fit(X, y)
predictions = model.predict([[6], [7]])
print(f"Predictions: {predictions}")  # Should be close to [12, 14]

# Clustering example
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(k=2, random_state=42)
clusters = kmeans.fit_predict(X)
print(f"Cluster labels: {clusters}")
```
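For intuition, the straight-line fit in the example above can be cross-checked with NumPy's least-squares solver. This is a generic sketch of ordinary least squares, not necessarily how VishuML computes its fit internally:

```python
import numpy as np

X = np.array([[1], [2], [3], [4], [5]], dtype=float)
y = np.array([2, 4, 6, 8, 10], dtype=float)

# Prepend a bias column, then solve the least-squares problem A @ w ≈ y
A = np.hstack([np.ones((X.shape[0], 1)), X])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, slope = w
print(intercept, slope)  # ~0.0 and ~2.0, so predicting x=6 gives ~12
```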
## Algorithm Documentation

### Linear Regression

```python
from vishuml import LinearRegression

# Create and train model
model = LinearRegression(fit_intercept=True)
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)

# Get R² score
score = model.score(X_test, y_test)
```
### Logistic Regression

```python
from vishuml import LogisticRegression

# Create and train model
model = LogisticRegression(learning_rate=0.01, max_iterations=1000)
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)
probabilities = model.predict_proba(X_test)

# Get accuracy
accuracy = model.score(X_test, y_test)
```
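Logistic regression of this kind is typically trained by gradient descent on sigmoid outputs, which is what the `learning_rate` and `max_iterations` parameters above control. A self-contained sketch of the textbook update rule (an illustration, not VishuML's internal code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_step(w, X, y, lr):
    """One batch gradient-descent step on the mean logistic loss."""
    p = sigmoid(X @ w)             # predicted probabilities
    grad = X.T @ (p - y) / len(y)  # gradient of the mean log-loss
    return w - lr * grad

# Tiny separable example: bias column + one feature
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w = np.zeros(2)
for _ in range(2000):
    w = gradient_step(w, X, y, lr=0.5)

preds = (sigmoid(X @ w) >= 0.5).astype(int)
print(preds)  # [0 0 1 1]
```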
### K-Nearest Neighbors

```python
from vishuml import KNearestNeighbors

# For classification
knn_clf = KNearestNeighbors(k=3, task_type='classification')
knn_clf.fit(X_train, y_train)
predictions = knn_clf.predict(X_test)

# For regression
knn_reg = KNearestNeighbors(k=5, task_type='regression')
knn_reg.fit(X_train, y_train)
predictions = knn_reg.predict(X_test)
```
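The core of KNN classification is just distances plus a majority vote over the `k` nearest training points. A minimal from-scratch sketch of the idea (not VishuML's implementation):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances to x
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.5, 0.5])))  # 0
print(knn_predict(X_train, y_train, np.array([5.5, 5.5])))  # 1
```

For regression, the vote is replaced by the mean of the neighbors' targets.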
### Support Vector Machine

```python
from vishuml import SupportVectorMachine

# Linear SVM
svm_linear = SupportVectorMachine(C=1.0, kernel='linear')
svm_linear.fit(X_train, y_train)

# RBF SVM
svm_rbf = SupportVectorMachine(C=1.0, kernel='rbf', gamma=1.0)
svm_rbf.fit(X_train, y_train)
predictions = svm_rbf.predict(X_test)
decision_scores = svm_rbf.decision_function(X_test)
```
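The `rbf` kernel referenced above is conventionally K(x, x') = exp(-gamma · ||x - x'||²), with `gamma` matching the constructor parameter. A small sketch, assuming VishuML follows this standard definition:

```python
import numpy as np

def rbf_kernel(x1, x2, gamma=1.0):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    diff = np.asarray(x1, float) - np.asarray(x2, float)
    return float(np.exp(-gamma * np.sum(diff ** 2)))

same = rbf_kernel([0, 0], [0, 0])   # 1.0: identical points
near = rbf_kernel([0, 0], [1, 0])   # exp(-1), about 0.3679
```

Larger `gamma` makes the kernel decay faster, so the decision boundary bends more tightly around the training points.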
### Decision Tree

```python
from vishuml import DecisionTree

# Create and train model
tree = DecisionTree(max_depth=5, min_samples_split=2, min_samples_leaf=1)
tree.fit(X_train, y_train)

# Make predictions
predictions = tree.predict(X_test)
accuracy = tree.score(X_test, y_test)
```
### Naive Bayes

```python
from vishuml import NaiveBayes

# Create and train model
nb = NaiveBayes()
nb.fit(X_train, y_train)

# Make predictions
predictions = nb.predict(X_test)
probabilities = nb.predict_proba(X_test)
```
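Gaussian Naive Bayes scores each class with a log prior plus per-feature Gaussian log-likelihoods, then predicts the highest-scoring class. A sketch of that textbook computation (not VishuML's actual code):

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Per-feature Gaussian likelihood used by Gaussian Naive Bayes."""
    return np.exp(-((x - mean) ** 2) / (2 * var)) / np.sqrt(2 * np.pi * var)

def class_log_score(x, prior, means, variances):
    """log P(class) + sum of per-feature log-likelihoods (naive independence)."""
    return np.log(prior) + np.sum(np.log(gaussian_pdf(x, means, variances)))

# Toy setup: class 0 centered at 0, class 1 centered at 5, unit variance
x = np.array([4.8])
s0 = class_log_score(x, 0.5, np.array([0.0]), np.array([1.0]))
s1 = class_log_score(x, 0.5, np.array([5.0]), np.array([1.0]))
print(int(s1 > s0))  # 1: x is far more likely under class 1
```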
### Perceptron

```python
from vishuml import Perceptron

# Create and train model
perceptron = Perceptron(learning_rate=0.01, max_iterations=1000)
perceptron.fit(X_train, y_train)

# Make predictions
predictions = perceptron.predict(X_test)
decision_scores = perceptron.decision_function(X_test)
```
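The classic perceptron rule updates the weights only on misclassified points: w += lr · yᵢ · xᵢ with labels in {-1, +1}. A from-scratch sketch of that rule (the library's own label convention may differ):

```python
import numpy as np

def perceptron_fit(X, y, lr=0.1, epochs=20):
    """Train a perceptron with the classic mistake-driven update (labels ±1)."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * (xi @ w) <= 0:   # misclassified (or on the boundary)
                w += lr * yi * xi
    return w

X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([-1, -1, 1, 1])
w = perceptron_fit(X, y)
preds = np.sign(np.hstack([np.ones((4, 1)), X]) @ w)
print(preds)  # [-1. -1.  1.  1.]
```

On linearly separable data like this, the rule is guaranteed to converge to a separating hyperplane.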
### K-Means Clustering

```python
from vishuml import KMeans

# Create and train model
kmeans = KMeans(k=3, init='k-means++', random_state=42)
kmeans.fit(X)

# Get cluster labels
labels = kmeans.labels

# Or predict for new data
new_labels = kmeans.predict(X_new)

# Transform to distance space
distances = kmeans.transform(X)
```
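Each k-means iteration alternates two steps (Lloyd's algorithm): assign every point to its nearest centroid, then recompute each centroid as the mean of its points. A minimal sketch of one such step, not VishuML's internal code:

```python
import numpy as np

def kmeans_step(X, centroids):
    """One Lloyd iteration: nearest-centroid assignment, then mean update."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    new_centroids = np.array(
        [X[labels == j].mean(axis=0) for j in range(len(centroids))]
    )
    return labels, new_centroids

X = np.array([[1.0, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
labels, centroids = kmeans_step(X, X[[0, 3]])  # seed with two data points
print(labels)     # [0 0 0 1 1 1]
print(centroids)  # [[ 1.  2.] [10.  2.]]
```

Running the step until assignments stop changing gives the final clustering.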
## Utility Functions

```python
from vishuml.utils import (
    train_test_split, accuracy_score, r2_score,
    mean_squared_error, euclidean_distance,
    normalize, confusion_matrix
)

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Evaluate predictions
accuracy = accuracy_score(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)

# Normalize features
X_normalized = normalize(X)

# Confusion matrix
cm = confusion_matrix(y_true, y_pred)
```
## Sample Datasets

The library includes sample datasets in CSV format:

- `datasets/iris.csv` - Classic iris flower classification dataset
- `datasets/housing.csv` - Housing price regression dataset
- `datasets/wine.csv` - Wine quality classification dataset
```python
import pandas as pd

# Load sample datasets
iris_data = pd.read_csv('datasets/iris.csv')
housing_data = pd.read_csv('datasets/housing.csv')
wine_data = pd.read_csv('datasets/wine.csv')
```
## Examples

Check out the `examples/` directory for Jupyter notebook tutorials demonstrating each algorithm:

- `examples/linear_regression_example.ipynb`
- `examples/logistic_regression_example.ipynb`
- `examples/knn_example.ipynb`
- `examples/svm_example.ipynb`
- `examples/decision_tree_example.ipynb`
- `examples/naive_bayes_example.ipynb`
- `examples/perceptron_example.ipynb`
- `examples/kmeans_example.ipynb`
## Development

### Setup Development Environment

```bash
git clone https://github.com/vishuRizz/vishuml.git
cd vishuml
pip install -e ".[dev]"
```
### Running Tests

```bash
pytest tests/ -v --cov=vishuml
```
### Code Formatting

```bash
black vishuml/
flake8 vishuml/
```
## Requirements
- Python >= 3.7
- NumPy >= 1.19.0
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
## Educational Purpose
This library is designed for educational purposes to help understand how machine learning algorithms work under the hood. For production use, consider using mature libraries like scikit-learn, which are more optimized and feature-complete.
## Author

Vishu - [vishuRizz on GitHub](https://github.com/vishuRizz)
## Acknowledgments
- Inspired by scikit-learn's API design
- Algorithms implemented based on standard textbook descriptions
- Built for educational and learning purposes
## Download Files
### vishuml-0.1.2.tar.gz (source distribution)

- Size: 26.0 kB
- Uploaded using Trusted Publishing: No
- Uploaded via: twine/6.1.0 on CPython/3.12.9

| Algorithm | Hash digest |
|---|---|
| SHA256 | `32cb6c482274642e796a978bf076e7945d1fe4f9ad97ac408702fc25dedacb9f` |
| MD5 | `b6a4101835215ee4d161412d052259e3` |
| BLAKE2b-256 | `4d2abd4c24252195d563659023a9a01251d8c0dcf78767d58e2064f83655f5ad` |
### vishuml-0.1.2-py3-none-any.whl (built distribution, Python 3)

- Size: 30.9 kB
- Uploaded using Trusted Publishing: No
- Uploaded via: twine/6.1.0 on CPython/3.12.9

| Algorithm | Hash digest |
|---|---|
| SHA256 | `4cf9a94742b0dddca778b362691cc413fa7dec284f1e2a662c7e288f6752d472` |
| MD5 | `556377fc15003afb70e8b26929d710f1` |
| BLAKE2b-256 | `e413ddc665cb077d46ca7003fc06eefbc4e1cdfd951080c72b03d76ed1d3cd73` |