cblearn

Comparison-based Machine Learning in Python

:warning: cblearn is a work in progress. The API may change and bugs may appear. Please help us by opening an issue. :construction:

Comparison-based Learning algorithms are the Machine Learning algorithms to use when training data contains similarity comparisons ("A and B are more similar than C and D") instead of data points.

:eyes: VSS 2022: Please find an example of psychophysical scaling with triplets and ordinal embedding here :eyes:

Triplet comparisons from human observers help model the perceived similarity of objects. These human triplets are collected in studies, asking questions like "Which of the following bands is most similar to Queen?" or "Which colour appears most similar to the reference?".
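To make the triplet idea concrete, here is a minimal, library-independent sketch that simulates such answers from known coordinates. The helper `make_triplets` and the ordering convention (anchor, nearer object, farther object) are assumptions made for this illustration and are not part of cblearn's API.

```python
import math
import random


def make_triplets(points, n_triplets, seed=0):
    """Sample (anchor, near, far) index triplets from known coordinates.

    Hypothetical helper: it plays the role of an observer who always
    answers according to the Euclidean distance between the points.
    """
    rng = random.Random(seed)
    triplets = []
    while len(triplets) < n_triplets:
        anchor, a, b = rng.sample(range(len(points)), 3)
        d_a = math.dist(points[anchor], points[a])
        d_b = math.dist(points[anchor], points[b])
        if d_a == d_b:
            continue  # a tie carries no ordering information
        near, far = (a, b) if d_a < d_b else (b, a)
        triplets.append((anchor, near, far))
    return triplets


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0), (4.0, 4.0)]
triplets = make_triplets(points, n_triplets=5)
for anchor, near, far in triplets:
    print(f"Object {anchor} is more similar to {near} than to {far}.")
```

Real studies replace the distance computation with a human judgement, so the coordinates are unknown and have to be estimated from the triplets.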

This library provides an easy-to-use interface to comparison-based learning algorithms. It works hand in hand with scikit-learn:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score

from cblearn.datasets import make_random_triplets
from cblearn.embedding import SOE
from cblearn.metrics import QueryScorer

X = load_iris().data
triplets = make_random_triplets(X, result_format="list-order", size=1000)

estimator = SOE(n_components=2)
# Measure the fit with scikit-learn's cross-validation
scores = cross_val_score(estimator, triplets, cv=5)
print(f"The 5-fold CV triplet error is {sum(scores) / len(scores)}.")

# Estimate the scale on all triplets
embedding = estimator.fit_transform(triplets)
print(f"The embedding has shape {embedding.shape}.")
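As a rough illustration of what a triplet error measures, the sketch below counts the fraction of triplets whose ordering an embedding violates. It is written in plain Python under the assumption that a triplet `(i, j, k)` states "object i is closer to j than to k"; the function `triplet_error` is a hypothetical helper for this example, not cblearn's scorer.

```python
import math


def triplet_error(embedding, triplets):
    """Fraction of triplets (i, j, k) with d(i, j) >= d(i, k) in the embedding."""
    violated = sum(
        math.dist(embedding[i], embedding[j]) >= math.dist(embedding[i], embedding[k])
        for i, j, k in triplets
    )
    return violated / len(triplets)


# Three objects on a line; the last triplet contradicts their distances.
embedding = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
triplets = [(0, 1, 2), (2, 1, 0), (0, 2, 1)]
error = triplet_error(embedding, triplets)
print(f"{error:.3f}")  # prints "0.333"
```

An error of 0 means the embedding agrees with every comparison. Evaluating on held-out triplets, as cross-validation does, indicates how well the embedding generalizes to unseen comparisons.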

Please try the Examples.

Getting Started

Install cblearn as described here and try the examples.

Find a theoretical introduction to comparison-based learning, the datatypes, algorithms, and datasets in the User Guide.

Features

Datasets

cblearn provides utility functions that simplify loading and converting your comparison datasets. In addition, some functions download and load real-world comparison datasets.

| Dataset | Query | #Object | #Response | #Triplet |
|---|---|---:|---:|---:|
| Vogue Cover | Odd-out Triplet | 60 | 1,107 | 2,214 |
| Nature Scene | Odd-out Triplet | 120 | 3,355 | 6,710 |
| Car | Most-Central Triplet | 60 | 7,097 | 14,194 |
| Material | Standard Triplet | 100 | 104,692 | 104,692 |
| Food | Standard Triplet | 100 | 190,376 | 190,376 |
| Musician | Standard Triplet | 413 | 224,792 | 224,792 |
| Things Image Testset | Odd-out Triplet | 1,854 | 146,012 | 292,024 |
| ImageNet Images v0.1 | Rank 2 from 8 | 1,000 | 25,273 | 328,549 |
| ImageNet Images v0.2 | Rank 2 from 8 | 50,000 | 384,277 | 5M |
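The #Response and #Triplet columns differ because a single response can imply several triplets. The sketch below shows one plausible conversion that reproduces the ratios in the table (2 triplets per odd-one-out response, 13 per "Rank 2 from 8" response). Both functions are hypothetical helpers written for this illustration; how cblearn performs the conversion internally is not stated here.

```python
def odd_one_out_to_triplets(odd, a, b):
    """The two remaining objects are closer to each other than to the odd one."""
    return [(a, b, odd), (b, a, odd)]


def rank_to_triplets(reference, ranked, others):
    """Convert a 'select and rank the most similar objects' response.

    `ranked` holds the selected objects, most similar first;
    `others` holds the shown objects that were not selected.
    """
    triplets = []
    for pos, near in enumerate(ranked):
        # Each selected object beats all lower-ranked and all unselected ones.
        for far in list(ranked[pos + 1:]) + list(others):
            triplets.append((reference, near, far))
    return triplets


odd_triplets = odd_one_out_to_triplets(odd=2, a=0, b=1)
rank_triplets = rank_to_triplets(reference=0, ranked=[3, 5], others=[1, 2, 4, 6, 7, 8])
print(len(odd_triplets), len(rank_triplets))  # prints "2 13"
```

With these counts, 1,107 odd-one-out responses give 2,214 triplets and 25,273 ranking responses give 328,549 triplets, matching the table rows.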

Embedding Algorithms

Algorithm Default Pytorch (GPU) Reference Wrapper
Crowd Kernel Learning (CKL) X X
FORTE X
GNMDS X X
Maximum-Likelihood Difference Scaling (MLDS) X MLDS (R)
Soft Ordinal Embedding (SOE) X X loe (R)
Stochastic Triplet Embedding (STE/t-STE) X X

Contribute

We welcome bug reports, questions, and suggestions as GitHub Issues, and code or documentation contributions as GitHub Pull Requests. Please see our Contributor Guide.

Authors and Acknowledgement

cblearn was initiated by current and former members of the Theory of Machine Learning group of Prof. Dr. Ulrike von Luxburg at the University of Tübingen. The lead developer is David-Elias Künstle.

We would like to thank all the contributors here on GitHub. This work has been supported by the Machine Learning Cluster of Excellence, funded by EXC number 2064/1 – Project number 390727645. The authors would like to thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting David-Elias Künstle.

License

This library is free to use under the terms of the MIT License. Please cite it appropriately if it contributes to your scientific publication. We would also appreciate a short email, though this is optional, telling us how you use the library.

Changelog

0.0.1

  • Versioning and publishing to PyPI

MIT License

Copyright (c) 2020-2021 The cblearn developers.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
