
High-performance machine learning library with scikit-learn compatibility - Pure Rust implementation


Sklears Python Bindings


Python bindings for the sklears machine learning library, providing a high-performance, scikit-learn compatible interface through PyO3.

Latest release: 0.1.0-rc.1 (February 2026). See the workspace release notes for highlights and upgrade guidance.

Features

  • Drop-in replacement for many of scikit-learn's most common algorithms (see Supported Algorithms for what is available today)
  • Pure Rust implementation with ongoing performance optimization
  • Full NumPy array compatibility with zero-copy operations where possible
  • Comprehensive error handling with Python exceptions
  • Memory-safe operations with automatic reference counting
  • Scikit-learn compatible API for easy migration

Supported Algorithms

Linear Models

  • LinearRegression - Ordinary least squares linear regression
  • Ridge - Ridge regression with L2 regularization
  • Lasso - Lasso regression with L1 regularization
  • ElasticNet - Elastic-net regularization
  • BayesianRidge - Bayesian ridge regression
  • ARDRegression - Automatic Relevance Determination regression
  • LogisticRegression - Logistic regression for classification

Ensemble Methods

  • GradientBoostingClassifier - Gradient boosting for classification
  • GradientBoostingRegressor - Gradient boosting for regression
  • AdaBoostClassifier - Adaptive boosting classifier
  • VotingClassifier - Voting ensemble classifier
  • BaggingClassifier - Bagging ensemble classifier

Neural Networks

  • MLPClassifier - Multi-layer perceptron classifier
  • MLPRegressor - Multi-layer perceptron regressor

Naive Bayes

  • GaussianNB - Gaussian Naive Bayes
  • MultinomialNB - Multinomial Naive Bayes
  • BernoulliNB - Bernoulli Naive Bayes
  • ComplementNB - Complement Naive Bayes

Clustering

  • KMeans - K-Means clustering algorithm
  • DBSCAN - Density-based spatial clustering

Preprocessing (coming soon)

  • StandardScaler - Standardize features by removing mean and scaling to unit variance
  • MinMaxScaler - Scale features to a given range
  • LabelEncoder - Encode target labels with values between 0 and n_classes-1
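
Until the scalers are available, standardization can be done by hand. The following is a minimal pure-Python sketch of what "removing the mean and scaling to unit variance" means; `standardize_columns` is a hypothetical helper for illustration, not part of sklears. It uses the population variance (dividing by n), which is also what scikit-learn's StandardScaler does.

```python
import math

def standardize_columns(rows):
    """Return rows with each column shifted to mean 0 and scaled to unit variance."""
    n = len(rows)
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n  # population variance
        std = math.sqrt(var)
        stds.append(std if std > 0 else 1.0)  # constant columns are only centered
    return [[(r[j] - means[j]) / stds[j] for j in range(n_cols)] for r in rows]

# Hypothetical sample data: the second column is constant
scaled = standardize_columns([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
print(scaled)
```

Constant columns are centered but not rescaled, which avoids a division by zero.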

Tree Models (coming soon)

  • RandomForestClassifier - Random forest for classification
  • DecisionTreeClassifier - Decision tree for classification

Model Selection

  • train_test_split - Split arrays into random train and test subsets
  • KFold - K-Fold cross-validator
  • StratifiedKFold (coming soon) - Stratified K-Fold cross-validator
  • cross_val_score (coming soon) - Evaluate metric(s) by cross-validation
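
To make the K-fold idea concrete, here is a small pure-Python sketch of how consecutive, unshuffled folds can be formed. `kfold_indices` is a hypothetical helper written for this README; whether sklears' KFold splits in exactly this way (for example with respect to shuffling) is an assumption, not something stated above.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) for consecutive, unshuffled folds."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)  # first `extra` folds get one more sample
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop

folds = list(kfold_indices(n_samples=10, n_splits=3))
for train, test in folds:
    print(f"train={train} test={test}")
```

Every sample lands in exactly one test fold, so each model is evaluated on data it was not trained on.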

Metrics (coming soon)

  • accuracy_score - Classification accuracy
  • mean_squared_error - Mean squared error for regression
  • mean_absolute_error - Mean absolute error for regression
  • r2_score - R² (coefficient of determination) score
  • precision_score - Precision for classification
  • recall_score - Recall for classification
  • f1_score - F1 score for classification
  • confusion_matrix - Confusion matrix for classification
  • classification_report - Text report of classification metrics
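
While these functions are pending, the simpler metrics are easy to compute directly. The sketch below gives pure-Python stand-ins for accuracy, mean squared error, and R²; the function names and sample values are hypothetical and chosen for illustration, and the final sklears implementations may differ in details such as input validation.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """1 - SS_res / SS_tot; assumes y_true is not constant."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(f"MSE: {mse(y_true, y_pred):.3f}")
print(f"R²: {r2(y_true, y_pred):.3f}")
print(f"Accuracy: {accuracy([0, 1, 1, 0], [0, 1, 0, 0]):.2f}")
```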

Installation

Prerequisites

  • Python 3.9 or later
  • NumPy
  • Rust 1.70 or later
  • PyO3 and Maturin for building

Building from Source

  1. Clone the repository:

    git clone https://github.com/cool-japan/sklears.git
    cd sklears/crates/sklears-python
    
  2. Install Maturin:

    pip install maturin
    
  3. Build and install the package:

    maturin develop --release
    
  4. Or build a wheel:

    maturin build --release
    pip install target/wheels/sklears-*.whl
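
After either build route, it can help to confirm that the module is visible to the interpreter you intend to use. The check below is a small sketch using only the standard library; `is_installed` is a hypothetical helper, and the module name `sklears` is taken from the Quick Start import.

```python
import importlib.util

def is_installed(module_name):
    """Return True if `module_name` can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None

if is_installed("sklears"):
    print("sklears is importable in this environment")
else:
    print("sklears not found; check that the build ran inside the active virtualenv")
```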
    

Quick Start

import numpy as np
import sklears as skl

# Generate sample data
X = np.random.randn(100, 4)
y = np.random.randn(100)

# Train a linear regression model
model = skl.LinearRegression()
model.fit(X, y)
predictions = model.predict(X)

# Calculate R² score
score = model.score(X, y)
print(f"R² score: {score:.3f}")

Performance Comparison

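The script below times fit and predict for both libraries on the same synthetic regression problem. Results depend on hardware, data size, and build flags, so run it on your own machine: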

import time
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
import sklears as skl
from sklearn.linear_model import LinearRegression as SklearnLR

# Generate data
X, y = make_regression(n_samples=10000, n_features=100, noise=0.1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Sklears (perf_counter is a monotonic, high-resolution clock suited to timing)
start = time.perf_counter()
model = skl.LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
sklears_time = time.perf_counter() - start

# Scikit-learn
start = time.perf_counter()
sklearn_model = SklearnLR()
sklearn_model.fit(X_train, y_train)
sklearn_predictions = sklearn_model.predict(X_test)
sklearn_time = time.perf_counter() - start

print(f"Sklears time: {sklears_time:.4f}s")
print(f"Sklearn time: {sklearn_time:.4f}s")
print(f"Speedup: {sklearn_time / sklears_time:.2f}x")
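
A single timed run is sensitive to warm-up effects and background load. One common mitigation, sketched here with a hypothetical `best_of` helper, is to repeat the measurement and keep the fastest run. The workload in the example is a stand-in so that the snippet runs without either library installed.

```python
import time

def best_of(func, repeats=5):
    """Run `func` several times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)

# Stand-in workload; replace with e.g. `lambda: model.fit(X_train, y_train)`
elapsed = best_of(lambda: sum(i * i for i in range(10_000)))
print(f"best of 5: {elapsed:.6f}s")
```

Taking the minimum rather than the mean reduces the influence of occasional slow runs caused by other processes.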

API Compatibility

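The sklears Python bindings are designed to be API-compatible with scikit-learn. Code that uses only the classes and functions listed above should need little more than an import change; anything marked as coming soon still has to come from scikit-learn for now: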

Before (scikit-learn):

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

After (sklears):

import numpy as np
import sklears as skl

# Sample data so the snippet runs on its own
X = np.random.randn(100, 4)
y = np.random.randn(100)

# Available classes and functions
model = skl.LinearRegression()
X_train, X_test, y_train, y_test = skl.train_test_split(X, y)

# Note: StandardScaler, MinMaxScaler, LabelEncoder - coming soon
# Note: mean_squared_error, r2_score, accuracy_score, etc. - coming soon
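
During a migration it can be convenient to prefer sklears when it is installed and fall back to scikit-learn otherwise. The sketch below shows one way to do that with the standard library; `first_available` is a hypothetical helper, and it assumes that both modules expose `LinearRegression` under the same name, as the imports above suggest.

```python
import importlib

def first_available(candidates):
    """Import and return the first module in `candidates` that is installed."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of {candidates!r} could be imported")

try:
    # Prefer sklears; fall back to scikit-learn's linear models
    backend = first_available(["sklears", "sklearn.linear_model"])
    print(f"using {backend.__name__}")
except ImportError as exc:
    print(exc)
```

With a backend resolved, `backend.LinearRegression()` can be used the same way regardless of which library was found.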

Memory Management

The bindings are designed to be memory-efficient:

  • Zero-copy operations where possible using NumPy's C API
  • Automatic memory management through PyO3's reference counting
  • Efficient data structures using ndarray and sprs for sparse matrices
  • Streaming support for large datasets that don't fit in memory
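
Whether a NumPy array can be handed over without a copy generally depends on its dtype and memory layout. The sketch below assumes, without confirmation from the text above, that C-contiguous float64 input is the layout the bindings can borrow directly. It uses only NumPy to show when a conversion to that layout forces a copy. `as_c_float64` is a hypothetical helper.

```python
import numpy as np

def as_c_float64(a):
    """Return a C-contiguous float64 array and whether a copy was required."""
    out = np.ascontiguousarray(a, dtype=np.float64)
    copied = not np.shares_memory(out, a)
    return out, copied

X = np.random.randn(100, 4)                  # already C-contiguous float64
_, copied_plain = as_c_float64(X)            # can be reused as-is
_, copied_transposed = as_c_float64(X.T)     # Fortran-ordered view, needs a copy
_, copied_ints = as_c_float64(np.arange(6).reshape(2, 3))  # dtype conversion copies

print(copied_plain, copied_transposed, copied_ints)
```

Preparing data in the expected layout once, before repeated fit or predict calls, avoids paying for the conversion each time.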

Error Handling

Rust errors are converted to Python exceptions, so failures can be handled with an ordinary try/except:

import sklears as skl
import numpy as np

try:
    # This will raise a ValueError if arrays have incompatible shapes
    model = skl.LinearRegression()
    model.fit(np.array([[1, 2], [3, 4]]), np.array([1, 2, 3]))  # Shape mismatch
except ValueError as e:
    print(f"Error: {e}")
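
The same kind of mismatch can also be caught before calling into the library, which is useful in data pipelines that want to fail early with a custom message. This is a minimal sketch; `require_same_length` is a hypothetical helper and is not part of sklears.

```python
def require_same_length(X, y):
    """Raise ValueError when X and y disagree on the number of samples."""
    if len(X) != len(y):
        raise ValueError(
            f"X has {len(X)} samples but y has {len(y)}; they must match"
        )

try:
    require_same_length([[1, 2], [3, 4]], [1, 2, 3])
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```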

System Information

Get information about your sklears installation:

import sklears as skl

# Version information
print(f"Version: {skl.get_version()}")

# Build information
build_info = skl.get_build_info()
for key, value in build_info.items():
    print(f"{key}: {value}")

# Note: get_hardware_info() and benchmark_basic_operations() - coming soon

Examples

See the examples/ directory for comprehensive usage examples:

  • python_demo.py - Complete demonstration of all features
  • Performance comparison scripts
  • Real-world use cases

Contributing

Contributions are welcome! Please see the main sklears repository for contribution guidelines.

License

This project is licensed under the Apache-2.0 license.

Acknowledgments

  • Built with PyO3 for Rust-Python interoperability
  • Compatible with NumPy arrays
  • API inspired by scikit-learn
