MKYZ is a Python library for ML and data science tasks.

MKYZ - Machine Learning Library

MKYZ is a comprehensive Python machine learning library designed to simplify data processing, model training, evaluation, and visualization tasks. Built on top of scikit-learn, it provides a unified API for common ML workflows.

✨ Features

Core Capabilities

🔄 Data Preparation - Automatic handling of missing values, outliers, and categorical encoding
🎯 Model Training - Support for 20+ classification, regression, and clustering algorithms
📊 AutoML - Automatic model selection and hyperparameter optimization
📈 Evaluation - Comprehensive metrics with 10 cross-validation strategies
🎨 Visualization - 40+ built-in plotting functions for EDA and model results

New in v0.2.0

💾 Model Persistence - Save and load models with metadata
🔧 Feature Engineering - Polynomial, datetime, lag, and rolling features
📝 Auto Reports - Generate HTML reports with one line of code
⚡ Parallel Processing - Built-in utilities for faster training
🛡️ Robust Error Handling - Custom exceptions for better debugging
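
To make the lag and rolling features mentioned above concrete, here is a minimal standard-library sketch. It is not MKYZ's implementation; the helpers `lag_feature` and `rolling_mean` and the `sales` data are hypothetical and only illustrate the idea.

```python
from statistics import mean

def lag_feature(values, lag):
    """Shift a series by `lag` steps; the first `lag` entries have no history."""
    return [None] * lag + list(values[:len(values) - lag])

def rolling_mean(values, window):
    """Mean over the trailing `window` values; None until the window is full."""
    return [
        mean(values[i - window + 1:i + 1]) if i >= window - 1 else None
        for i in range(len(values))
    ]

sales = [10, 12, 11, 15, 14]
lag_1 = lag_feature(sales, 1)    # previous observation for each row
roll_3 = rolling_mean(sales, 3)  # 3-step trailing average
```

Both features use only past and current values, so they can be added to time-ordered data without leaking future information.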

📦 Installation

pip install mkyz

From Source

git clone https://github.com/mmustafakapici/mkyz.git
cd mkyz
pip install -e .

Dependencies

pandas, scikit-learn, numpy, matplotlib, seaborn, plotly, xgboost, lightgbm, catboost, rich, mlxtend

🚀 Quick Start

Basic Usage (Original API)

import mkyz

# 1. Prepare data
data = mkyz.prepare_data('dataset.csv', target_column='price')

# 2. Train model ('price' is a continuous target, so this is a regression task)
model = mkyz.train(data, task='regression', model='rf')

# 3. Make predictions
predictions = mkyz.predict(data, model)

# 4. Evaluate
scores = mkyz.evaluate(data, predictions)
print(scores)

# 5. Visualize
mkyz.visualize(data)

AutoML - Find the Best Model

import mkyz

data = mkyz.prepare_data('dataset.csv', target_column='target')

# Automatically train and compare all models
best_model = mkyz.auto_train(
    data, 
    task='classification',
    optimize_models=True,
    optimization_method='bayesian'
)
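
Conceptually, automatic model selection means scoring several candidates the same way and keeping the best one. The sketch below does this with plain scikit-learn on synthetic data. It is an illustration under assumptions, not how `auto_train` works internally: the `candidates` dict is hypothetical and only borrows the model keys listed in this README.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for a prepared dataset.
X, y = make_classification(n_samples=200, n_features=8, random_state=42)

# Hypothetical candidate set, keyed like the model tables in this README.
candidates = {
    "rf": RandomForestClassifier(n_estimators=50, random_state=42),
    "lr": LogisticRegression(max_iter=1000),
    "dt": DecisionTreeClassifier(random_state=42),
}

# Score each candidate with 5-fold cross-validation (mean accuracy).
scores = {
    key: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for key, model in candidates.items()
}
best_key = max(scores, key=scores.get)
print(best_key, round(scores[best_key], 4))
```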

New Modular API (v0.2.0)

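import mkyz
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Configure globally
mkyz.set_config(random_state=42, n_jobs=-1, verbose=1)

# Load data flexibly
df = mkyz.load_data('data.csv')  # Also supports Excel, JSON, Parquet

# Validate dataset
validation = mkyz.validate_dataset(df, target_column='target')
if not validation['is_valid']:
    print(validation['issues'])

# Feature Engineering
fe = mkyz.FeatureEngineer()
df = fe.create_datetime_features(df, 'date_column')
df = fe.create_polynomial_features(df, ['age', 'income'], degree=2)

# Split features and target (plain pandas; assumes load_data returns a DataFrame)
X = df.drop(columns=['target'])
y = df['target']

# Select best features
selected = mkyz.select_features(X, y, k=10, method='mutual_info')

# Hold out a test set and fit an estimator (plain scikit-learn; assumes the
# remaining columns are numeric and that mkyz accepts scikit-learn estimators)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Advanced Cross-Validation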
results = mkyz.cross_validate(
    model, X, y,
    cv=mkyz.CVStrategy.STRATIFIED,
    n_splits=5,
    return_train_score=True
)
print(f"Mean accuracy: {results['mean_test_score']:.4f}")

# Save trained model
mkyz.save_model(model, 'models/my_model', metadata={'version': '1.0'})

# Load model later
model = mkyz.load_model('models/my_model.joblib')

# Generate comprehensive report
report = mkyz.ModelReport(model, X_test, y_test, task='classification')
report.generate()
report.export_html('reports/model_report.html')
print(report.summary())
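
Saving a model together with metadata can be pictured as writing two files: the serialized object and a small description next to it. The sketch below uses only the standard library (`pickle` and `json`). MKYZ's own files use a `.joblib` extension, so this is a different, simplified mechanism; the helpers `save_with_metadata` and `load_with_metadata` are hypothetical.

```python
import json
import pickle
import tempfile
from pathlib import Path

def save_with_metadata(obj, base_path, metadata):
    """Hypothetical helper: pickle `obj` and write metadata to a JSON sidecar."""
    base = Path(base_path)
    base.parent.mkdir(parents=True, exist_ok=True)
    base.with_suffix(".pkl").write_bytes(pickle.dumps(obj))
    base.with_suffix(".json").write_text(json.dumps(metadata))

def load_with_metadata(base_path):
    """Return the unpickled object together with its metadata dict."""
    base = Path(base_path)
    obj = pickle.loads(base.with_suffix(".pkl").read_bytes())
    metadata = json.loads(base.with_suffix(".json").read_text())
    return obj, metadata

# A plain dict stands in for a fitted model.
fake_model = {"weights": [0.1, 0.2, 0.3]}
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "models" / "my_model"
    save_with_metadata(fake_model, target, {"version": "1.0"})
    restored, meta = load_with_metadata(target)
```

Pickle-based formats execute code on load, so only load model files from sources you trust.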

📚 Documentation

Modules Overview

Module	Description
`mkyz.core`	Configuration, exceptions, base classes
`mkyz.data`	Data loading, preprocessing, feature engineering
`mkyz.evaluation`	Metrics, cross-validation, reporting
`mkyz.persistence`	Model saving and loading
`mkyz.utils`	Logging and parallel processing utilities

Detailed Guides

🔧 Supported Models

Classification

Model	Key	Description
Random Forest	`rf`	Ensemble of decision trees
Logistic Regression	`lr`	Linear classification
SVM	`svm`	Support Vector Machine
KNN	`knn`	K-Nearest Neighbors
Decision Tree	`dt`	Single decision tree
Naive Bayes	`nb`	Probabilistic classifier
Gradient Boosting	`gb`	Boosted trees
XGBoost	`xgb`	Extreme Gradient Boosting
LightGBM	`lgbm`	Light Gradient Boosting
CatBoost	`catboost`	Categorical Boosting

Regression

Model	Key	Description
Random Forest	`rf`	Ensemble regressor
Linear Regression	`lr`	OLS regression
SVR	`svm`	Support Vector Regression
KNN	`knn`	K-Nearest Neighbors
Decision Tree	`dt`	Single decision tree

Clustering

Model	Key	Description
K-Means	`kmeans`	Centroid-based
DBSCAN	`dbscan`	Density-based
Agglomerative	`agglomerative`	Hierarchical
GMM	`gmm`	Gaussian Mixture
Mean Shift	`mean_shift`	Mode-seeking

Dimensionality Reduction

Model	Key	Description
PCA	`pca`	Principal Component Analysis
SVD	`svd`	Truncated SVD
NMF	`nmf`	Non-negative Matrix Factorization

📊 Cross-Validation Strategies

from mkyz import cross_validate, CVStrategy

# Available strategies
strategies = [
    CVStrategy.KFOLD,              # Standard K-Fold
    CVStrategy.STRATIFIED,         # Stratified K-Fold (default)
    CVStrategy.TIME_SERIES,        # Time Series Split
    CVStrategy.GROUP,              # Group K-Fold
    CVStrategy.REPEATED,           # Repeated K-Fold
    CVStrategy.REPEATED_STRATIFIED,# Repeated Stratified
    CVStrategy.LEAVE_ONE_OUT,      # Leave-One-Out
    CVStrategy.SHUFFLE,            # Shuffle Split
    CVStrategy.STRATIFIED_SHUFFLE  # Stratified Shuffle
]

# Usage (model is an estimator; X and y are the features and target)
results = cross_validate(model, X, y, cv=CVStrategy.TIME_SERIES, n_splits=5)
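
To show why a time-series strategy differs from plain K-Fold, the sketch below prints the index splits produced by scikit-learn's `TimeSeriesSplit`. That `CVStrategy.TIME_SERIES` maps onto this class is an assumption based on the "Time Series Split" comment above; the README does not confirm it.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(8).reshape(-1, 1)  # eight observations in time order

splits = [
    (train.tolist(), test.tolist())
    for train, test in TimeSeriesSplit(n_splits=3).split(X)
]
for train_idx, test_idx in splits:
    # Every training index precedes every test index, so no future data leaks in.
    print(train_idx, "->", test_idx)
```

The training window grows with each split while the test window always lies after it, which is the property that makes this strategy suitable for forecasting tasks.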

🔧 Configuration

import mkyz

# View current config
print(mkyz.get_config().to_dict())

# Update config
mkyz.set_config(
    random_state=42,
    n_jobs=-1,
    cv_folds=5,
    verbose=1,
    dark_mode=True
)
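
A global configuration with `get`/`set` accessors and a `to_dict`-style export can be modeled with a dataclass. The following is a standard-library sketch of that pattern only; `SketchConfig`, `get_sketch_config`, and `set_sketch_config` are hypothetical names, and MKYZ's real config object may be built differently.

```python
from dataclasses import asdict, dataclass, replace

@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for a global settings object."""
    random_state: int = 42
    n_jobs: int = -1
    cv_folds: int = 5
    verbose: int = 1

_config = SketchConfig()

def get_sketch_config():
    """Return the current global configuration."""
    return _config

def set_sketch_config(**overrides):
    """Replace the global config; unknown setting names raise TypeError."""
    global _config
    _config = replace(_config, **overrides)

set_sketch_config(cv_folds=10, verbose=0)
current = asdict(get_sketch_config())
```

Using an immutable dataclass means every update produces a new object, so a misspelled setting name fails loudly instead of being silently ignored.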

Available Settings

Setting	Default	Description
`random_state`	42	Random seed for reproducibility
`n_jobs`	-1	Parallel jobs (-1 = all CPUs)
`cv_folds`	5	Default CV folds
`test_size`	0.2	Fraction of the data held out as the test set
`verbose`	1	Verbosity level
`optimization_method`	'grid_search'	'grid_search' or 'bayesian'
`missing_value_strategy`	'mean'	'mean', 'median', 'mode', 'drop'
`outlier_strategy`	'remove'	'remove', 'cap', 'keep'
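
The strategy names in the table can be read as follows. This standard-library sketch is an interpretation, not MKYZ's code: the helpers `impute` and `handle_outliers` are hypothetical, and the 1.5 × IQR rule used to define an outlier is an assumption, since the README does not say how outliers are detected.

```python
from statistics import mean, median, mode, quantiles

def impute(values, strategy="mean"):
    """Fill None entries using 'mean', 'median', or 'mode'; 'drop' removes them."""
    present = [v for v in values if v is not None]
    if strategy == "drop":
        return present
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](present)
    return [fill if v is None else v for v in values]

def handle_outliers(values, strategy="remove"):
    """Apply the 1.5 * IQR rule: 'remove' drops, 'cap' clips, 'keep' is a no-op."""
    if strategy == "keep":
        return list(values)
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    if strategy == "cap":
        return [min(max(v, low), high) for v in values]
    return [v for v in values if low <= v <= high]

ages = [22, 25, None, 27, 30, 24, 120]
filled = impute(ages, "median")               # the None becomes the median
trimmed = handle_outliers(filled, "remove")   # the extreme value is dropped
capped = handle_outliers(filled, "cap")       # the extreme value is clipped
```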

🛡️ Error Handling

import mkyz
from mkyz import (
    MKYZError,           # Base exception
    DataValidationError, # Data issues
    ModelNotTrainedError,# Model not fitted
    UnsupportedTaskError,# Invalid task type
    PersistenceError     # Save/load failures
)

try:
    model = mkyz.load_model('nonexistent.joblib')
except PersistenceError as e:
    print(f"Failed to load model: {e}")
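
A shared base exception lets callers catch every library error with one `except` clause. The sketch below shows that pattern with hypothetical classes (`SketchError`, `SketchPersistenceError`) and a hypothetical loader. It assumes MKYZ's exceptions derive from `MKYZError`, as the "Base exception" comment suggests, but does not reproduce the library's code.

```python
class SketchError(Exception):
    """Base class, playing the role MKYZError plays in the import list above."""

class SketchPersistenceError(SketchError):
    """Raised when saving or loading fails."""

def load_model_sketch(path):
    """Hypothetical loader that wraps low-level I/O errors in a library exception."""
    try:
        with open(path, "rb") as fh:
            return fh.read()
    except OSError as exc:
        # Chain the original error so the traceback keeps the root cause.
        raise SketchPersistenceError(f"cannot load {path!r}") from exc

try:
    load_model_sketch("nonexistent.joblib")
except SketchError as err:  # catching the base class also catches subclasses
    caught = err
```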

📈 Visualization

import mkyz

# EDA visualizations
mkyz.visualize(data, plot_type='histogram')
mkyz.visualize(data, plot_type='correlation')
mkyz.visualize(data, plot_type='box')

# Available plot types:
# histogram, bar, box, violin, pie, scatter, line,
# heatmap, pair, swarm, strip, kde, ridge, density,
# joint, regression, residual, qq, ecdf, dendrogram...

🤝 Contributing

Contributions are welcome! Please read our Contributing Guide.

# Clone the repository
git clone https://github.com/mmustafakapici/mkyz.git
cd mkyz

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👤 Author

Mustafa Kapıcı

Email: m.mustafakapici@gmail.com
GitHub: @mmustafakapici

🙏 Acknowledgments

Built on top of scikit-learn
Boosting models from XGBoost, LightGBM, CatBoost
Visualization powered by Matplotlib, Seaborn, Plotly

Made with ❤️ in Turkey

Release history

Version	Release date
0.2.3	Jan 15, 2026
0.2.2	Jan 15, 2026
0.2.1 (this version)	Jan 15, 2026
0.1.1	Oct 26, 2024
0.1	Oct 1, 2024

Download files

Download the file for your platform.

Source Distribution

mkyz-0.2.1.tar.gz (59.1 kB view details)

Uploaded Jan 15, 2026 Source

Built Distribution

mkyz-0.2.1-py3-none-any.whl (63.6 kB view details)

Uploaded Jan 15, 2026 Python 3

File details

Details for the file mkyz-0.2.1.tar.gz.

File metadata

Download URL: mkyz-0.2.1.tar.gz
Upload date: Jan 15, 2026
Size: 59.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for mkyz-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`437e6c4899412613ad0d949a25b2ee3503010e32a99881ea21cd45bd6fba46ab`
MD5	`7597de779906174f7a3b175906288d15`
BLAKE2b-256	`29b71e9c15323f8c81511b98dbf2d860fd29b8431d97600eebc3493407198d7e`
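
A downloaded file can be checked against the SHA256 digest in the table with Python's `hashlib`. The sketch below is a generic approach, not something the project prescribes; the helper `sha256_of` is hypothetical, and a throwaway file is used so the example runs without the real archive.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path, chunk_size=65536):
    """Stream a file through SHA-256 so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demonstration on a throwaway file. For a real check, pass the path of the
# downloaded mkyz-0.2.1.tar.gz and compare with the SHA256 digest listed above.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"hello")
    digest = sha256_of(sample)
    matches = digest == hashlib.sha256(b"hello").hexdigest()
```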

File details

Details for the file mkyz-0.2.1-py3-none-any.whl.

File metadata

Download URL: mkyz-0.2.1-py3-none-any.whl
Upload date: Jan 15, 2026
Size: 63.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for mkyz-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fe8586d37f65056b467e56edf294a918971601dc46f4eef733adc866afee2cb0`
MD5	`2cd66fbea325b50e0e7bcabafe49140a`
BLAKE2b-256	`067988f862c890ab7e65a71ffeb87845ce21580151667d4d84b7abf9b8fd7e7a`
