
mini_scikit_learn

mini_scikit_learn is a minimalistic clone of scikit-learn, designed to provide essential machine learning functionality with minimal overhead. It depends only on numpy and offers a range of basic and advanced models, utility functions, metrics, data transformers, and model selection tools.

Features

Models

  • Linear Models: Implementations of linear regression, logistic regression, etc. (a usage sketch follows this list).
  • Tree-Based Models: Decision trees for classification and regression.
  • K-Nearest Neighbors: KNN for classification and regression.
  • Random Forest: Ensemble method for classification and regression.
  • SVM: Support Vector Machines for classification and regression (under testing).
  • Naive Bayes: Various naive Bayes classifiers.
  • Neural Networks: Basic feedforward neural networks trained with backpropagation, with customizable layers and activation functions.
  • Ensembling Techniques: Advanced techniques such as voting, stacking, and boosting (AdaBoost, Gradient Boosting).
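
All models share the same fit/predict interface (see System Architecture below). As a quick illustration, the sketch here trains a classifier on a tiny toy dataset. The mini_scikit_learn.linear_model module path is taken from the usage example further down this page, but the LogisticRegression class name and its default constructor are assumptions based on the feature list above.

import numpy as np
from mini_scikit_learn.linear_model import LogisticRegression  # class name assumed from the feature list

# Tiny, linearly separable toy dataset
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression()  # default hyperparameters assumed
clf.fit(X, y)               # every model exposes fit(X, y)
print(clf.predict(X))       # and predict(X) to produce labels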

Utility Functions

  • Cross Validation: Functions for k-fold cross-validation, train-test splitting, etc.
  • Train-Test Split: Simple utility to split datasets into training and testing sets (see the sketch after this list).
  • K-Folds: Functionality to split data into k folds for cross-validation.
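
The import path and keyword arguments for train_test_split below match the full usage example later on this page; the k-folds helper is not shown here because its exact name is not documented on this page.

import numpy as np
from mini_scikit_learn.utils import train_test_split

X = np.arange(20).reshape(10, 2)  # 10 samples, 2 features
y = np.arange(10) % 2             # alternating binary labels

# Hold out part of the data for testing, with a fixed seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
print(X_train.shape, X_test.shape)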

Metrics

  • Accuracy: Measure the accuracy of predictions (see the sketch after this list).
  • Precision: Calculate the precision for classification models.
  • Recall: Compute the recall for classification models.
  • F1 Score: Calculate the F1 score for classification models.
  • Mean Squared Error (MSE): Compute the mean squared error for regression models.
  • Mean Absolute Error (MAE): Compute the mean absolute error for regression models.
  • Log Loss: Calculate the logistic loss for classification models.
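
Metrics take the true labels (or targets) first and the predictions second, as in the usage example below. The accuracy_score import path appears later on this page; the mean_squared_error function name is an assumption based on the list above.

import numpy as np
from mini_scikit_learn.metrics import accuracy_score, mean_squared_error  # mean_squared_error name assumed

y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])
print(accuracy_score(y_true, y_pred))  # 0.8: four of the five labels match

y_reg_true = np.array([1.0, 2.0, 3.0])
y_reg_pred = np.array([1.5, 2.0, 2.0])
print(mean_squared_error(y_reg_true, y_reg_pred))  # (0.25 + 0.0 + 1.0) / 3 ≈ 0.417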

Data Transformers

  • Encoders: Various encoding techniques for categorical data.
  • Imputers: Strategies for handling missing values, including SimpleImputer, IterativeImputer, and KNNImputer.
  • Scalers: MinMaxScaler and StandardScaler for feature scaling (see the sketch after this list).
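
Transformers follow the fit/transform API described in the System Architecture section below. In the sketch, the StandardScaler class name comes from the list above, but the mini_scikit_learn.preprocessing module path is an assumption.

import numpy as np
from mini_scikit_learn.preprocessing import StandardScaler  # module path is an assumption

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

scaler = StandardScaler()
scaler.fit(X)                   # learn the per-feature mean and standard deviation
X_scaled = scaler.transform(X)  # standardize each column to zero mean and unit variance
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))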

Model Selection

  • Grid Search: Exhaustive search over specified parameter values for an estimator.
  • Random Search: Random search over specified parameter values for an estimator (see the sketch after this list).
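
A complete GridSearch workflow appears in the Example Usage section below. The RandomSearch sketch here assumes a constructor and accessor that mirror GridSearch (estimator plus parameter grid, then get_best_params); the class name and that symmetry are assumptions, so treat it as illustrative only.

import numpy as np
from mini_scikit_learn.model_selection import RandomSearch  # class name assumed from the list above
from mini_scikit_learn.linear_model import LinearRegression

# Toy regression data
X = np.random.rand(50, 3)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * np.random.randn(50)

model = LinearRegression()
param_grid = {'alpha': [0.1, 0.01, 0.001]}

# Assumed to mirror the GridSearch constructor shown in the usage example below
search = RandomSearch(model, param_grid)
search.fit(X, y)
print(search.get_best_params())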

System Architecture

The design of mini_scikit_learn is heavily inspired by scikit-learn, relying on inheritance from abstract base classes such as Estimator and Predictor. All models and transformers follow the fit, predict, and transform API, which keeps the library consistent and easy to use.

Core Components

  • Estimator: Base class for all estimators in the library. Defines the fit method.
  • Predictor: Base class for all predictors, extending Estimator with the predict method.
  • Transformer: Base class for all transformers, extending Estimator with the transform method.

By adhering to these interfaces, mini_scikit_learn ensures that all components can be used interchangeably, promoting modularity and ease of integration.
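
As an illustration of how these interfaces compose, the sketch below defines a custom transformer on top of the Transformer base class. The mini_scikit_learn.base module path is an assumption; only the class names and the fit/transform contract come from this section.

import numpy as np
from mini_scikit_learn.base import Transformer  # module path is an assumption

class MeanCenterer(Transformer):
    """Toy transformer: subtracts the per-feature mean learned during fit."""

    def fit(self, X, y=None):
        self.mean_ = np.asarray(X).mean(axis=0)  # learn state in fit
        return self                              # returning self allows chained calls

    def transform(self, X):
        return np.asarray(X) - self.mean_        # apply the learned state

X = np.array([[1.0, 4.0], [3.0, 8.0]])
centered = MeanCenterer().fit(X).transform(X)
print(centered)  # columns now have zero mean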

Installation

To install mini_scikit_learn, you can use pip:

pip install mini_scikit_learn

Requirements

  • Python 3
  • numpy

Example Usage

from mini_scikit_learn.model_selection import GridSearch
from mini_scikit_learn.linear_model import LinearRegression
from mini_scikit_learn.metrics import accuracy_score
from mini_scikit_learn.datasets import load_iris
from mini_scikit_learn.utils import train_test_split

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Define the model
model = LinearRegression()

# Define parameter grid
param_grid = {'alpha': [0.1, 0.01, 0.001]}

# Perform grid search
grid_search = GridSearch(model, param_grid)
grid_search.fit(X_train, y_train)

# Get the best parameters found by the search
best_params = grid_search.get_best_params()
print("Best Parameters:", best_params)

# Refit a model with the best parameters and evaluate it
# (assumes the estimator accepts its hyperparameters as keyword arguments, as in scikit-learn)
best_model = LinearRegression(**best_params)
best_model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, best_model.predict(X_test))
print(f"Accuracy: {accuracy}")

Documentation

For more detailed documentation and examples, please refer to the official mini_scikit_learn documentation.

Contributing

Contributions are welcome! Please fork the repository and submit pull requests.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

Inspired by the simplicity and efficiency of scikit-learn. This project aims to provide a lightweight alternative for quick prototyping and educational purposes.

