High-performance machine learning library with scikit-learn compatibility - Pure Rust implementation
Sklears Python Bindings
Python bindings for the sklears machine learning library, providing a high-performance, scikit-learn compatible interface through PyO3.
Latest release: 0.1.0-rc.1 (February 2026). See the workspace release notes for highlights and upgrade guidance.
Features
- Drop-in replacement for scikit-learn's most common algorithms
- Pure Rust implementation with ongoing performance optimization
- Full NumPy array compatibility with zero-copy operations where possible
- Comprehensive error handling with Python exceptions
- Memory-safe operations with automatic reference counting
- Scikit-learn compatible API for easy migration
Supported Algorithms
Linear Models
- LinearRegression - Ordinary least squares linear regression
- Ridge - Ridge regression with L2 regularization
- Lasso - Lasso regression with L1 regularization
- ElasticNet - Elastic-net regularization
- BayesianRidge - Bayesian ridge regression
- ARDRegression - Automatic Relevance Determination regression
- LogisticRegression - Logistic regression for classification
Ensemble Methods
- GradientBoostingClassifier - Gradient boosting for classification
- GradientBoostingRegressor - Gradient boosting for regression
- AdaBoostClassifier - Adaptive boosting classifier
- VotingClassifier - Voting ensemble classifier
- BaggingClassifier - Bagging ensemble classifier
Neural Networks
- MLPClassifier - Multi-layer perceptron classifier
- MLPRegressor - Multi-layer perceptron regressor
Naive Bayes
- GaussianNB - Gaussian Naive Bayes
- MultinomialNB - Multinomial Naive Bayes
- BernoulliNB - Bernoulli Naive Bayes
- ComplementNB - Complement Naive Bayes
Clustering
- KMeans - K-Means clustering algorithm
- DBSCAN - Density-based spatial clustering
Preprocessing (coming soon)
- StandardScaler - Standardize features by removing the mean and scaling to unit variance
- MinMaxScaler - Scale features to a given range
- LabelEncoder - Encode target labels with values between 0 and n_classes-1
Tree Models (coming soon)
- RandomForestClassifier - Random forest for classification
- DecisionTreeClassifier - Decision tree for classification
Model Selection
- train_test_split - Split arrays into random train and test subsets
- KFold - K-Fold cross-validator
- StratifiedKFold (coming soon) - Stratified K-Fold cross-validator
- cross_val_score (coming soon) - Evaluate metric(s) by cross-validation
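To make the K-Fold entry concrete, here is a minimal pure-Python sketch of how unshuffled K-fold indices can be generated. The helper name `k_fold_indices` is hypothetical and is not part of sklears. The fold sizing follows scikit-learn's documented rule (the first `n_samples % n_splits` folds get one extra sample), and it is an assumption that sklears' `KFold` behaves the same way.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) index lists for unshuffled K-fold splitting.

    Hypothetical helper for illustration only; not the sklears API.
    """
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    # The first (n_samples % n_splits) folds receive one extra sample.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test_idx = list(range(start, stop))
        # Everything outside the current test window is training data.
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(f"train={train_idx} test={test_idx}")
```

Each sample appears in exactly one test fold, so averaging a metric over the folds uses every observation once for evaluation.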
Metrics (coming soon)
accuracy_score- Classification accuracymean_squared_error- Mean squared error for regressionmean_absolute_error- Mean absolute error for regressionr2_score- R² (coefficient of determination) scoreprecision_score- Precision for classificationrecall_score- Recall for classificationf1_score- F1 score for classificationconfusion_matrix- Confusion matrix for classificationclassification_report- Text report of classification metrics
Installation
Prerequisites
- Python 3.9 or later
- NumPy
- Rust 1.70 or later
- PyO3 and Maturin for building
Building from Source
- Clone the repository:
    git clone https://github.com/cool-japan/sklears.git
    cd sklears/crates/sklears-python
- Install Maturin:
    pip install maturin
- Build and install the package:
    maturin develop --release
- Or build a wheel:
    maturin build --release
    pip install target/wheels/sklears-*.whl
Quick Start
import numpy as np
import sklears as skl
# Generate sample data
X = np.random.randn(100, 4)
y = np.random.randn(100)
# Train a linear regression model
model = skl.LinearRegression()
model.fit(X, y)
predictions = model.predict(X)
# Calculate R² score
score = model.score(X, y)
print(f"R² score: {score:.3f}")
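For readers unfamiliar with the metric, the sketch below computes R² by hand in pure Python. It assumes that `score` on a regressor follows the scikit-learn convention of returning the coefficient of determination, R² = 1 - SS_res / SS_tot; the README does not spell this out. The function `r2_by_hand` is a hypothetical name used only for illustration.

```python
def r2_by_hand(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (illustrative helper)."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
score = r2_by_hand(y_true, y_pred)
print(f"R² score: {score:.3f}")
```

Because the Quick Start targets are random noise unrelated to `X`, a score close to zero is expected there; a value near 1 would only appear when the features actually explain the target.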
Performance Comparison
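The following script times fit plus predict for sklears and scikit-learn on the same synthetic regression problem. Timings depend on hardware, build flags, and data size, so treat the output as indicative only: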
import time
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
import sklears as skl
from sklearn.linear_model import LinearRegression as SklearnLR
# Generate data
X, y = make_regression(n_samples=10000, n_features=100, noise=0.1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Sklears: time fit + predict (perf_counter is a monotonic, high-resolution clock)
start = time.perf_counter()
model = skl.LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
sklears_time = time.perf_counter() - start
# Scikit-learn: same workload on the same data
start = time.perf_counter()
sklearn_model = SklearnLR()
sklearn_model.fit(X_train, y_train)
sklearn_predictions = sklearn_model.predict(X_test)
sklearn_time = time.perf_counter() - start
print(f"Sklears time: {sklears_time:.4f}s")
print(f"Sklearn time: {sklearn_time:.4f}s")
print(f"Speedup: {sklearn_time / sklears_time:.2f}x")
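A single timed run is sensitive to warm-up effects and background load. One common remedy, sketched here with the standard library only, is to repeat the workload and keep the fastest run. The helper `best_of` and the stand-in `workload` function are hypothetical; in practice the workload would wrap the `fit` and `predict` calls from the script above.

```python
import time


def best_of(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) over `repeats` calls to fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(timings)


def workload():
    # Stand-in for model.fit(...) followed by model.predict(...).
    return sum(i * i for i in range(10_000))


elapsed = best_of(workload, repeats=3)
print(f"Best of 3: {elapsed:.6f}s")
```

Comparing the best-of-N time for each library gives a steadier speedup figure than comparing two one-off measurements.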
API Compatibility
The sklears Python bindings are designed to be API-compatible with scikit-learn. Most existing scikit-learn code should work with minimal changes:
Before (scikit-learn):
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
After (sklears):
import sklears as skl
# Available classes and functions
model = skl.LinearRegression()
X_train, X_test, y_train, y_test = skl.train_test_split(X, y)
# Note: StandardScaler, MinMaxScaler, LabelEncoder - coming soon
# Note: mean_squared_error, r2_score, accuracy_score, etc. - coming soon
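Until the metric functions ship, predictions can still be evaluated with a few lines of plain Python. The sketch below is a stopgap under that assumption, not the planned sklears implementation. The function names mirror scikit-learn's, but these versions accept only flat sequences and none of the optional parameters.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_score(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


mse = mean_squared_error([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
acc = accuracy_score([0, 1, 2, 3], [0, 2, 1, 3])
print(f"MSE: {mse}, accuracy: {acc}")
```

NumPy arrays returned by `predict` can be passed through `.tolist()` before calling these helpers.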
Memory Management
The bindings are designed to be memory-efficient:
- Zero-copy operations where possible using NumPy's C API
- Automatic memory management through PyO3's reference counting
- Efficient data structures using ndarray and sprs for sparse matrices
- Streaming support for large datasets that don't fit in memory
Error Handling
All Rust errors are properly converted to Python exceptions:
import sklears as skl
import numpy as np
try:
    # This will raise a ValueError if arrays have incompatible shapes
    model = skl.LinearRegression()
    model.fit(np.array([[1, 2], [3, 4]]), np.array([1, 2, 3]))  # Shape mismatch
except ValueError as e:
    print(f"Error: {e}")
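The mismatch in the example above is that `X` has 2 rows while `y` has 3 entries. The pure-Python sketch below shows the kind of check that produces such a `ValueError`. The helper `check_fit_shapes` and its error messages are hypothetical; the actual validation happens inside the Rust code and its wording may differ.

```python
def check_fit_shapes(X, y):
    """Raise ValueError when X and y disagree on the number of samples.

    Illustrative stand-in for the validation done inside fit().
    """
    n_rows = len(X)
    if n_rows == 0:
        raise ValueError("X must contain at least one sample")
    n_cols = len(X[0])
    if any(len(row) != n_cols for row in X):
        raise ValueError("all rows of X must have the same length")
    if len(y) != n_rows:
        raise ValueError(f"X has {n_rows} samples but y has {len(y)}")
    return n_rows, n_cols


try:
    check_fit_shapes([[1, 2], [3, 4]], [1, 2, 3])
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```

Running a check like this before calling `fit` can give clearer messages in data pipelines where shapes are assembled dynamically.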
System Information
Get information about your sklears installation:
import sklears as skl
# Version information
print(f"Version: {skl.get_version()}")
# Build information
build_info = skl.get_build_info()
for key, value in build_info.items():
    print(f"{key}: {value}")
# Note: get_hardware_info() and benchmark_basic_operations() - coming soon
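When it is unclear whether the package is installed at all, the version can also be read from the packaging metadata without importing sklears. This sketch uses only the standard library's `importlib.metadata`; the wrapper `installed_version` is a hypothetical helper, and it assumes the distribution is registered under the name `sklears`.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("sklears")
if version is None:
    print("sklears is not installed in this environment")
else:
    print(f"sklears {version} is installed")
```

Note that packaging metadata reports the PEP 440 normalized form, so a release tagged 0.1.0-rc.1 would appear as 0.1.0rc1.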
Examples
See the examples/ directory for comprehensive usage examples:
- python_demo.py - Complete demonstration of all features
- Performance comparison scripts
- Real-world use cases
Contributing
Contributions are welcome! Please see the main sklears repository for contribution guidelines.
License
This project is licensed under the Apache-2.0 license.
Acknowledgments
- Built with PyO3 for Rust-Python interoperability
- Compatible with NumPy arrays
- API inspired by scikit-learn