Neo LS-SVM
Neo LS-SVM is a modern least-squares support vector machine implementation in Python that offers several benefits over sklearn's classic sklearn.svm.SVC classifier and sklearn.svm.SVR regressor:
- Linear complexity in the number of training examples with Orthogonal Random Features.
- Hyperparameter free: zero-cost optimization of the regularisation parameter γ and kernel parameter σ.
- Adds a new tertiary objective that minimizes the complexity of the prediction surface.
- Returns the leave-one-out residuals, leverage, and error for free after fitting.
- Learns an affine transformation of the feature matrix to optimally separate the target's bins.
- Can solve the LS-SVM both in the primal and dual space.
- Isotonically calibrated predict_proba based on the leave-one-out predictions.
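The leave-one-out bullet above rests on a standard identity for regularised least squares: with a fixed regularisation strength, the residual of example i under a model refitted without that example equals the ordinary residual divided by 1 - h_i, where h_i is the example's leverage. The sketch below checks this on a toy one-feature ridge regression without an intercept. It is an illustration under those simplifying assumptions, not Neo LS-SVM's implementation, and the function names are made up for this example.

```python
def ridge_loo_residuals(x, y, lam):
    """Leave-one-out residuals of 1-D ridge regression (no intercept) from a single fit."""
    s = sum(xi * xi for xi in x) + lam                 # regularised Gram term (a scalar here)
    w = sum(xi * yi for xi, yi in zip(x, y)) / s       # ridge coefficient
    residuals = [yi - w * xi for xi, yi in zip(x, y)]  # ordinary training residuals
    leverage = [xi * xi / s for xi in x]               # diagonal of the hat matrix
    loo = [e / (1.0 - h) for e, h in zip(residuals, leverage)]
    return loo, leverage


def brute_force_loo_residuals(x, y, lam):
    """Reference implementation: refit the model n times, leaving one example out each time."""
    out = []
    for i in range(len(x)):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        s = sum(xi * xi for xi in x_rest) + lam
        w = sum(xi * yi for xi, yi in zip(x_rest, y_rest)) / s
        out.append(y[i] - w * x[i])
    return out


# Toy data, chosen only for illustration.
x = [0.5, 1.0, 1.5, 2.0, 3.0]
y = [1.1, 1.9, 3.2, 3.9, 6.1]
fast, leverage = ridge_loo_residuals(x, y, lam=0.1)
slow = brute_force_loo_residuals(x, y, lam=0.1)
print(max(abs(a - b) for a, b in zip(fast, slow)))  # difference is at floating-point noise level
```

The single-fit version costs one pass over the data, while the brute-force version refits once per example; the same identity extends to the multi-feature case with the full hat matrix.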
Using
First, install this package with:
pip install neo-ls-svm
Then, you can import neo_ls_svm.NeoLSSVM as an sklearn-compatible binary classifier and regressor. Example usage:
from neo_ls_svm import NeoLSSVM
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from skrub import TableVectorizer # Vectorizes a pandas DataFrame into a NumPy array.
# Binary classification example:
X, y = fetch_openml("credit-g", return_X_y=True, as_frame=True, parser="auto")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)
model = make_pipeline(TableVectorizer(), NeoLSSVM())
model.fit(X_train, y_train)
print(model.score(X_test, y_test)) # 77.3% (compared to sklearn.svm.SVC's 70.7%)
# Regression example:
X, y = fetch_openml("ames_housing", return_X_y=True, as_frame=True, parser="auto")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)
model = make_pipeline(TableVectorizer(), NeoLSSVM())
model.fit(X_train, y_train)
print(model.score(X_test, y_test)) # 82.0% (compared to sklearn.svm.SVR's -11.8%)
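The feature list describes predict_proba as isotonically calibrated on the leave-one-out predictions. As a rough illustration of what isotonic calibration does, the sketch below fits a non-decreasing step function from raw decision scores to probabilities with the pool-adjacent-violators algorithm. The helper names and toy data are hypothetical, and Neo LS-SVM's own calibration may differ in detail.

```python
def isotonic_fit(scores, labels):
    """Pool-adjacent-violators: fit a non-decreasing step function to labels ordered by score."""
    blocks = []  # each block is [sum of labels, count, largest score in the block]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge backwards while the previous block's mean exceeds the newest block's mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    thresholds = [block[2] for block in blocks]
    values = [block[0] / block[1] for block in blocks]
    return thresholds, values


def isotonic_predict(thresholds, values, score):
    """Return the value of the first block whose upper score bound covers `score`."""
    for upper, value in zip(thresholds, values):
        if score <= upper:
            return value
    return values[-1]  # scores above every threshold get the last (largest) value


# Toy decision scores (standing in for leave-one-out predictions) and binary labels.
scores = [-1.2, -0.4, -0.1, 0.3, 0.8, 1.5]
labels = [0, 0, 1, 0, 1, 1]
thresholds, values = isotonic_fit(scores, labels)
print(values)                                       # [0.0, 0.0, 0.5, 1.0, 1.0]
print(isotonic_predict(thresholds, values, 0.0))    # 0.5
```

The out-of-order pair at scores -0.1 and 0.3 is pooled into one block with probability 0.5, which is what keeps the calibrated output monotone in the raw score.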
Benchmarks
We select all binary classification and regression datasets below 1M entries from the AutoML Benchmark. Each dataset is split into 85% for training and 15% for testing. We apply skrub.TableVectorizer as a preprocessing step for neo_ls_svm.NeoLSSVM and sklearn.svm.SVC and sklearn.svm.SVR to vectorize the pandas DataFrame training data into a NumPy array. Models are fitted only once on each dataset, with their default settings and no hyperparameter tuning.
Binary classification
ROC-AUC on 15% test set:
| dataset | LGBMClassifier | NeoLSSVM | SVC |
|---|---|---|---|
| ada | 90.9% (0.2s) | 90.9% (1.1s) | 83.1% (1.1s) |
| adult | 93.0% (1.9s) | 89.0% (6.5s) | / |
| amazon_employee_access | 85.6% (1.0s) | 64.5% (3.9s) | / |
| arcene | 78.0% (0.7s) | 66.0% (6.4s) | 82.0% (3.4s) |
| australian | 88.3% (0.3s) | 80.2% (0.6s) | 81.9% (0.0s) |
| bank-marketing | 93.5% (0.8s) | 91.0% (5.5s) | / |
| blood-transfusion-service-center | 62.0% (0.2s) | 69.9% (0.5s) | 69.7% (0.0s) |
| churn | 91.7% (0.9s) | 81.0% (1.4s) | 70.6% (0.8s) |
| click_prediction_small | 67.7% (1.0s) | 66.6% (4.5s) | / |
| jasmine | 86.1% (0.5s) | 79.7% (1.2s) | 85.3% (1.8s) |
| kc1 | 78.9% (0.4s) | 76.6% (0.7s) | 45.7% (0.2s) |
| kr-vs-kp | 100.0% (0.6s) | 99.2% (1.0s) | 99.4% (0.6s) |
| madeline | 93.1% (1.0s) | 64.9% (1.2s) | 82.5% (4.6s) |
| ozone-level-8hr | 91.2% (0.6s) | 91.6% (1.0s) | 72.8% (0.2s) |
| pc4 | 95.3% (0.5s) | 90.9% (0.6s) | 74.3% (0.1s) |
| phishingwebsites | 99.5% (0.5s) | 98.9% (1.9s) | 98.7% (2.7s) |
| phoneme | 95.6% (0.4s) | 93.5% (1.1s) | 91.2% (0.7s) |
| qsar-biodeg | 92.7% (0.4s) | 90.7% (0.7s) | 86.8% (0.1s) |
| satellite | 98.7% (0.4s) | 99.5% (1.1s) | 98.5% (0.1s) |
| sylvine | 98.5% (0.3s) | 97.1% (1.0s) | 96.5% (1.0s) |
| wilt | 99.5% (0.3s) | 99.8% (1.0s) | 98.9% (0.2s) |
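For reading the table above: ROC-AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A value of 50% therefore corresponds to random ranking, and a value below 50% (such as SVC's 45.7% on kc1) means the ranking is worse than random. A minimal pure-Python sketch of that definition, with a hypothetical helper name and toy data:

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [score for score, label in zip(scores, labels) if label == 1]
    negatives = [score for score, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and scores: three of the four positive/negative pairs are ordered correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75
```

This pairwise form is quadratic in the number of examples and is meant only to make the definition concrete; library implementations compute the same quantity from sorted scores.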
Regression
R² on 15% test set:
| dataset | LGBMRegressor | NeoLSSVM | SVR |
|---|---|---|---|
| abalone | 56.2% (0.2s) | 59.5% (1.4s) | 51.3% (0.2s) |
| boston | 91.7% (0.4s) | 87.9% (0.6s) | 35.1% (0.0s) |
| brazilian_houses | 55.9% (0.6s) | 88.3% (1.8s) | 5.4% (2.0s) |
| colleges | 58.5% (0.5s) | 42.6% (4.3s) | 40.2% (5.3s) |
| diamonds | 98.2% (0.4s) | 95.2% (5.5s) | / |
| elevators | 87.7% (0.4s) | 82.6% (3.0s) | / |
| house_16h | 67.7% (0.4s) | 52.8% (2.7s) | / |
| house_prices_nominal | 89.0% (0.3s) | 78.2% (1.1s) | -2.9% (0.3s) |
| house_sales | 89.2% (0.5s) | 77.8% (2.6s) | / |
| mip-2016-regression | 59.2% (0.5s) | 32.5% (0.8s) | -27.3% (0.1s) |
| moneyball | 93.2% (0.2s) | 91.2% (0.7s) | 0.8% (0.1s) |
| pol | 98.7% (0.4s) | 75.2% (2.3s) | / |
| quake | -10.7% (0.3s) | -0.1% (0.9s) | -10.7% (0.0s) |
| sat11-hand-runtime-regression | 78.3% (0.5s) | 61.7% (1.3s) | -56.3% (1.0s) |
| sensory | 29.2% (0.2s) | 3.7% (0.5s) | 16.4% (0.0s) |
| socmob | 79.6% (0.2s) | 70.7% (0.6s) | 30.8% (0.0s) |
| space_ga | 70.3% (0.4s) | 43.7% (0.8s) | 35.9% (0.1s) |
| tecator | 98.3% (0.2s) | 99.3% (0.6s) | 78.5% (0.0s) |
| us_crime | 62.8% (0.6s) | 63.0% (1.2s) | 6.7% (0.2s) |
| wine_quality | 45.6% (0.3s) | -7.8% (1.3s) | 16.4% (0.5s) |
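The negative entries in the table above are possible because the coefficient of determination compares the model's squared error with that of always predicting the mean of the test targets: 100% is a perfect fit, 0% matches the mean predictor, and anything below 0% is worse than it. A minimal sketch of the definition, using a hypothetical helper and toy numbers:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y_true, [1.0, 2.0, 3.0, 4.0])   # exact predictions
baseline = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
reversed_fit = r_squared(y_true, [4.0, 3.0, 2.0, 1.0])  # predictions in the wrong order
print(perfect, baseline, reversed_fit)  # 1.0 0.0 -3.0
```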
Contributing
Prerequisites
1. Set up Git to use SSH
- Generate an SSH key and add the SSH key to your GitHub account.
- Configure SSH to automatically load your SSH keys:
cat << EOF >> ~/.ssh/config
Host *
  AddKeysToAgent yes
  IgnoreUnknown UseKeychain
  UseKeychain yes
EOF
2. Install Docker
- Install Docker Desktop.
- Enable Use Docker Compose V2 in Docker Desktop's preferences window.
- Linux only:
- Export your user's user id and group id so that files created in the Dev Container are owned by your user:
cat << EOF >> ~/.bashrc
export UID=$(id --user)
export GID=$(id --group)
EOF
3. Install VS Code or PyCharm
- Install VS Code and VS Code's Dev Containers extension. Alternatively, install PyCharm.
- Optional: install a Nerd Font such as FiraCode Nerd Font and configure VS Code or configure PyCharm to use it.
Development environments
The following development environments are supported:
- ⭐️ GitHub Codespaces: click on Code and select Create codespace to start a Dev Container with GitHub Codespaces.
- ⭐️ Dev Container (with container volume): click on Open in Dev Containers to clone this repository in a container volume and create a Dev Container with VS Code.
- Dev Container: clone this repository, open it with VS Code, and run Ctrl/⌘ + ⇧ + P → Dev Containers: Reopen in Container.
- PyCharm: clone this repository, open it with PyCharm, and configure Docker Compose as a remote interpreter with the dev service.
- Terminal: clone this repository, open it with your terminal, and run docker compose up --detach dev to start a Dev Container in the background, and then run docker compose exec dev zsh to open a shell prompt in the Dev Container.
Developing
- This project follows the Conventional Commits standard to automate Semantic Versioning and Keep A Changelog with Commitizen.
- Run poe from within the development environment to print a list of Poe the Poet tasks available to run on this project.
- Run poetry add {package} from within the development environment to install a run time dependency and add it to pyproject.toml and poetry.lock. Add --group test or --group dev to install a CI or development dependency, respectively.
- Run poetry update from within the development environment to upgrade all dependencies to the latest versions allowed by pyproject.toml.
- Run cz bump to bump the package's version, update the CHANGELOG.md, and create a git tag.