A Python package implementing fundamental machine learning algorithms from scratch with NumPy.

These details have not been verified by PyPI

Project links

Homepage

Project description

Machine Learning Package

ek_ml_package is a collection of fundamental machine learning algorithms implemented fully from scratch using Python and NumPy only. Designed basically for learners, educators and researchers, this package emphasizes understanding the core mechanics behind popular machine learning techniques without relying on high-level frameworks like scikit-learn, Pytorch or TensorFlow.

Key Features

Introduction to Machine Learning
- Introduction
Supervised Learning Algorithms:
Unsupervised Learning Algorithms:
- Principal Component Analysis (PCA)
- K-Means Clustering
Utilities:
- Common Activation Functions (ReLU, Sigmoid, Tanh, etc.)
- Loss Functions (Mean Squared Error, Cross-Entropy, etc.)

Why Use ek_ml_package?

From Scratch Implementation: Each algorithm is built with Python and NumPy only, offering transparent and educational insight into how machine learning models function internally.
No External Dependencies: Avoids reliance on heavy machine learning libraries to maintain simplicity and promote hands-on experimentation.
Learning-Focused: Perfect for students and practitioners wanting to deepen their understanding of machine learning algorithms beyond black-box usage.
Extensible & Customizable: Easily adapt and extend the base code for research, projects or tailored applications.

Installation

Install the latest stable version from PyPI using:

pip install ek_ml_package

For the latest development version, clone the repository and install manually:

git clone https://github.com/ekbarkacha/ek_ml_package.git
cd ek_ml_package
pip install -r requirements.txt
pip install -e .

Usage Example: Supervised Learning

Linear Regression

from ek_ml_package.linear_regression import LinearRegression
import numpy as np

# Generate some toy data
X = np.random.rand(100, 3)
y = X @ np.array([1.5, -2.0, 1.0]) + 0.5 + np.random.randn(100) * 0.1

# Initialize and train model with minibatch gradient descent
model = LinearRegression(lr=0.01, epochs=500, method='minibatch', batch_size=16, momentum=0.9)
model.fit(X, y, validation_split=0.2)

# Predict and check MSE on training data
predictions = model.predict(X)
mse = np.mean((y - predictions) ** 2)
print(f"Train MSE: {mse:.4f}")

# Plot loss
model.plot_loss()

Logistic Regression

import numpy as np
from ek_ml_package.logistic_regression import LogisticRegression

# Generate some synthetic data
np.random.seed(0)
X = np.random.randn(100, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Train logistic regression
model = LogisticRegression(lr=0.1, epochs=500, batch_size=16, random_seed=42)
model.fit(X, y, validation_split=0.2)

# Predict probabilities
y_probs = model.predict(X)

# Convert probabilities to classes
y_pred = model.convert_probabilities_to_classes(y_probs)

# Compute accuracy
acc = model.accuracy(y, y_probs)
print(f"Training accuracy: {acc:.2f}")

# Plot loss curve
model.plot_loss()

KNN Classification

from ek_ml_package.knn import KNNClassification
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

knn = KNNClassification(k=5)
knn.fit(X_train, y_train)
print("Test accuracy:", knn.accuracy(X_test, y_test))

Gaussian Discriminant Analysis (LDA and QDA)

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from ek_ml_package.gaussian_discriminant_analysis import QDA, LDA

# Generate 2D data with 4 classes
X, y = make_classification(n_samples=500, n_features=2, n_redundant=0,
                           n_informative=2, n_clusters_per_class=1,
                           n_classes=4, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train and evaluate LDA
lda = LDA()
lda.fit(X_train, y_train)
y_pred_lda = lda.predict(X_test)
print(f"LDA Accuracy: {lda.accuracy(y_test, y_pred_lda):.4f}")

# Train and evaluate QDA
qda = QDA()
qda.fit(X_train, y_train)
y_pred_qda = qda.predict(X_test)
print(f"QDA Accuracy: {qda.accuracy(y_test, y_pred_qda):.4f}")

Perceptron

import numpy as np
from ek_ml_package.perceptron import Perceptron

# Generate some simple linearly separable data
X = np.array([
    [2, 3],
    [1, 1],
    [2, 1],
    [3, 2],
    [-1, -1],
    [-2, -3],
    [-3, -2],
    [-2, -1]
])

# Labels must be -1 or 1
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])

# Initialize the Perceptron model
model = Perceptron(max_iter=100, tol=1e-5, random_state=42)

# Train the model
model.fit(X, y)

# Make predictions on training data
predictions = np.array([model.predict(x) for x in X])

# Calculate accuracy
acc = model.accuracy(y, predictions)
print(f"Training accuracy: {acc:.2f}%")

# Predict a new sample
new_sample = np.array([0.5, 1.5])
predicted_label = model.predict(new_sample)
print(f"Prediction for new sample {new_sample}: {predicted_label}")

NeuralNetwork

import numpy as np
from ek_ml_package.neural_network import NeuralNetwork, Layer

# Define the network architecture as a list of Layers
architecture = [
    Layer(units=4, activation=None),      # Input layer (4 features)
    Layer(units=10, activation='relu'),  # Hidden layer with 10 neurons and ReLU activation
    Layer(units=3, activation='softmax') # Output layer with 3 neurons (e.g. for 3-class classification)
]

# Initialize the Neural Network
nn = NeuralNetwork(architecture, criterion='ce', learning_rate=0.01, random_seed=42)

# Optional: View model summary
nn.summary()

# Generate dummy data: 100 samples, 4 features
X_train = np.random.randn(100, 4)
y_train = np.random.randint(0, 3, size=(100,))  # Multiclass labels 0,1,2

# Train the network for 50 epochs with batch size 16
history = nn.train(X_train, y_train, epochs=50, batch_size=16)

# Predict class labels on new data
X_test = np.random.randn(10, 4)
predictions = nn.predict(X_test)

print("Predictions:", predictions)

# Evaluate accuracy on training set
accuracy = nn.score(X_train, y_train)
print(f"Training Accuracy: {accuracy:.4f}")

Usage Example: Unsupervised Learning

PCA

from ek_ml_package.pca import PCA
import numpy as np

# Sample data: 5 samples, 3 features
X = np.array([
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 1.0],
    [2.2, 2.9, 0.3],
    [1.9, 2.2, 0.8],
    [3.1, 3.0, 0.4]
])

# Instantiate PCA to keep 2 principal components
pca = PCA(n_component=2)

# Fit PCA model
pca.fit(X)

# Transform data to lower dimension
X_proj = pca.transform()
print("Projected Data:\n", X_proj)

# Optionally reconstruct approximate original data
X_reconstructed_std = pca.inverse_transform()

# Convert back to original scale
X_reconstructed = pca.unstandardize(X_reconstructed_std)
print("Reconstructed Data (approx):\n", X_reconstructed)

# Reconstruction error
mse = np.mean((X - X_reconstructed)**2)
print(f"Reconstruction MSE: {mse:.4f}")

# Explained variance by each component
print("Explained Variance (%):", pca.explained_variance)
print("Cumulative Explained Variance (%):", pca.cum_explained_variance)

Kmeans

from ek_ml_package.kmeans import Kmeans
import numpy as np

# Create sample data
X = np.array([
    [1.0, 2.0],
    [1.5, 1.8],
    [5.0, 8.0],
    [8.0, 8.0],
    [1.0, 0.6],
    [9.0, 11.0]
])

# Initialize KMeans with 2 clusters, using kmeans++ initialization
kmeans = Kmeans(k=2, max_iters=100, initialization="kmean++", random_state=42)

# Fit model to data
kmeans.fit(X)

# Get cluster labels for input data
labels = kmeans.labels
print("Cluster labels:", labels)

# Access centroids
print("Centroids:\n", kmeans.centroids)

# Compute inertia (sum of squared distances)
print("Inertia:", kmeans.inertia)

# Predict cluster of new points
new_points = np.array([[0, 0], [10, 10]])
predicted_labels = kmeans.predict(new_points)
print("Predicted clusters for new points:", predicted_labels)

Documentation

Extensive documentation and tutorials are available to guide you through the theory and practical implementations of each algorithm:

View Documentation

The documentation covers:

Intuition and theory behind each algorithm
Step-by-step derivations and key concepts

For full implementations, check out the Jupyter notebooks in the notebooks folder:

Hands-on code from scratch using Python & NumPy
Visualizations, training steps, and outputs
Aligned with each theory doc (e.g., linear_regression.md ↔linear_regression.ipynb)

License

This project is licensed under the MIT License - see the LICENSE file for details.

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

This version

0.1.0

Jun 8, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ek_ml_package-0.1.0.tar.gz (21.8 kB view details)

Uploaded Jun 8, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

ek_ml_package-0.1.0-py3-none-any.whl (22.3 kB view details)

Uploaded Jun 8, 2025 Python 3

File details

Details for the file ek_ml_package-0.1.0.tar.gz.

File metadata

Download URL: ek_ml_package-0.1.0.tar.gz
Upload date: Jun 8, 2025
Size: 21.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.7

File hashes

Hashes for ek_ml_package-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`a983bb38ae2f7f928b713009fe7e2631a54cca6dfe48b5523e3afc319e885b0f`
MD5	`b32cbd0de4dab9c40309f27580053160`
BLAKE2b-256	`72e3a1bd1f2c76117c5634167375438c84fbecf1d2a4bd8cc22f1028e614fca5`

See more details on using hashes here.

File details

Details for the file ek_ml_package-0.1.0-py3-none-any.whl.

File metadata

Download URL: ek_ml_package-0.1.0-py3-none-any.whl
Upload date: Jun 8, 2025
Size: 22.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.7

File hashes

Hashes for ek_ml_package-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d28866e571aec8504760b8cb6d9d6bc09be4f8b4f043d8c04a12687bfc6b3b99`
MD5	`94e88a2b3c09de3b62e2deafd35e6e0a`
BLAKE2b-256	`8d32b353198703aa57f700bb699f94ebf34203c498031acc59c31eb8b783cdcf`

See more details on using hashes here.

ek-ml-package 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Machine Learning Package

Key Features

Why Use ek_ml_package?

Installation

Usage Example: Supervised Learning

Linear Regression

Logistic Regression

KNN Classification

Gaussian Discriminant Analysis (LDA and QDA)

Perceptron

NeuralNetwork

Usage Example: Unsupervised Learning

PCA

Kmeans

Documentation

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes