A lightweight toolbox for multilabel classification algorithms based on k-nearest neighbors
multilabel_knn
multilabel_knn is a lightweight toolbox for multilabel classification based on k-nearest neighbor algorithms [Doc].
The following algorithms are implemented:
- k-nearest neighbor classifier
- multilabel k-nearest neighbor classifier
- Binomial multilabel k-nearest neighbor classifier
- Binomial multilabel graph neighbor classifier
Usage
k-nearest neighbor algorithm (Predict a single label per sample)
import multilabel_knn as mlk
model = mlk.kNN(k=10, metric="cosine")  # k: number of neighbors, metric: distance metric {"euclidean", "cosine"}
model.fit(X, Y)  # X: 2D array of feature vectors (one row per sample). Y: label matrix, where Y[i,k] = 1 if sample i has label k.
Y_pred = model.predict(X_test)  # Y_pred[i,k] = 1 if sample i is predicted to have label k.
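The label matrix Y described in the comments above can be built from per-sample label lists. The following is a minimal sketch using NumPy only; the helper name to_label_matrix and the toy data are hypothetical and not part of multilabel_knn. Whether the package expects a dense array or a sparse matrix is not stated here, so check the docstring before passing Y to fit.

```python
import numpy as np


def to_label_matrix(label_lists, n_labels):
    """Return an (n_samples, n_labels) 0/1 matrix with Y[i, k] = 1 if sample i has label k.

    Hypothetical helper for illustration; not part of multilabel_knn.
    """
    Y = np.zeros((len(label_lists), n_labels), dtype=int)
    for i, labels in enumerate(label_lists):
        for k in labels:
            Y[i, k] = 1
    return Y


# Four samples, three possible labels; the third sample has no label.
label_lists = [[0], [1, 2], [], [2]]
Y = to_label_matrix(label_lists, n_labels=3)
print(Y)
```

A sample with several labels simply gets several ones in its row, which is the input format the multilabel classifiers below rely on.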
Multilabel kNN (Can predict multiple labels per sample) [1]
import multilabel_knn as mlk
model = mlk.multilabel_kNN(k=10, metric = "cosine")
model.fit(X, Y)
Y_pred = model.predict(X_test)
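To give a feel for what a multilabel kNN classifier computes, here is a rough sketch of the ML-KNN procedure from [1], written with brute-force Euclidean search in NumPy. It is not the package's implementation (which relies on faiss); the function names mlknn_fit and mlknn_predict, the smoothing parameter s = 1, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def _neighbors(Q, X, k, exclude_self=False):
    """Indices of the k nearest rows of X for each row of Q (squared Euclidean)."""
    d = ((Q[:, None, :] - X[None, :, :]) ** 2).sum(axis=2).astype(float)
    if exclude_self:
        np.fill_diagonal(d, np.inf)  # a training point is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]


def mlknn_fit(X, Y, k, s=1.0):
    """Estimate label priors and neighbor-count likelihoods as in ML-KNN [1]."""
    n, n_labels = Y.shape
    prior1 = (s + Y.sum(axis=0)) / (2 * s + n)  # P(label present)
    counts = Y[_neighbors(X, X, k, exclude_self=True)].sum(axis=1)  # (n, n_labels)
    c1 = np.zeros((k + 1, n_labels))  # count histogram for samples with the label
    c0 = np.zeros((k + 1, n_labels))  # count histogram for samples without it
    for j in range(k + 1):
        hit = counts == j
        c1[j] = (hit & (Y == 1)).sum(axis=0)
        c0[j] = (hit & (Y == 0)).sum(axis=0)
    like1 = (s + c1) / (s * (k + 1) + c1.sum(axis=0))
    like0 = (s + c0) / (s * (k + 1) + c0.sum(axis=0))
    return prior1, like1, like0


def mlknn_predict(X_train, Y_train, X_test, k, params):
    """Assign label l when P(present) * P(count | present) beats the alternative."""
    prior1, like1, like0 = params
    counts = Y_train[_neighbors(X_test, X_train, k)].sum(axis=1)
    cols = np.arange(Y_train.shape[1])
    p1 = prior1 * like1[counts, cols]
    p0 = (1 - prior1) * like0[counts, cols]
    return (p1 > p0).astype(int)


# Toy data: two clusters; label 2 is shared by every sample.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]], dtype=float)
Y = np.array([[1, 0, 1]] * 4 + [[0, 1, 1]] * 4, dtype=int)
X_test = np.array([[0.5, 0.5], [5.5, 5.5]])

params = mlknn_fit(X, Y, k=2)
Y_pred = mlknn_predict(X, Y, X_test, k=2, params=params)
print(Y_pred)
```

Each label is decided independently, which is why a sample can receive several labels at once. The brute-force distance matrix here scales quadratically with the number of samples, so it is only suitable for small illustrations.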
Binomial multilabel kNN (Can predict multiple labels per sample)
import multilabel_knn as mlk
model = mlk.binomial_multilabel_kNN(k=10, metric = "cosine")
model.fit(X, Y)
Y_pred = model.predict(X_test)
Binomial multilabel kNN is a modified version of multilabel kNN. It can perform well on data with a large number of samples and labels. See the docstring for details.
Binomial multilabel graph (Takes a graph as input. Can predict multiple labels per node)
import multilabel_knn as mlk
model = mlk.binomial_multilabel_graph()
model.fit(A, Y)  # A is the adjacency matrix of the training graph. A[i,j] = 1 if node i has a link to node j.
Y_pred = model.predict(B)  # B is the adjacency matrix of the bipartite network, where B[i,j] = 1 if node i has a link to node j in the training graph.
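The two matrices in the comments above can be assembled from edge lists. The sketch below assumes an undirected training graph and reads the rows of B as the nodes to be predicted and its columns as training nodes; both are assumptions, as are the helper name adjacency and the toy edges. Dense NumPy arrays are used for simplicity, and the package may expect a sparse format instead, so check the docstring.

```python
import numpy as np


def adjacency(edges, n_rows, n_cols, symmetric=False):
    """Build a 0/1 matrix M with M[i, j] = 1 for each edge (i, j).

    Hypothetical helper for illustration; not part of multilabel_knn.
    """
    M = np.zeros((n_rows, n_cols), dtype=int)
    for i, j in edges:
        M[i, j] = 1
        if symmetric:
            M[j, i] = 1  # undirected graph: mirror the edge
    return M


# Training graph: a path 0 - 1 - 2 - 3 over four nodes.
train_edges = [(0, 1), (1, 2), (2, 3)]
A = adjacency(train_edges, 4, 4, symmetric=True)

# Two new nodes; each pair is (new node, training node it links to).
test_edges = [(0, 0), (0, 1), (1, 3)]
B = adjacency(test_edges, 2, 4)

print(A)
print(B)
```

With this layout, B has one row per node to be predicted and one column per training node, so its column count matches the size of A.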
Install
Requirements: Python 3.7 or later
pip install multilabel_knn
multilabel_knn uses the faiss library, which comes in two versions, faiss-cpu and faiss-gpu. As the name suggests, faiss-gpu can leverage GPUs and is therefore faster if you have them. multilabel_knn uses faiss-cpu by default to avoid unnecessary GPU-related trouble. If you have GPUs compatible with faiss-gpu, you can benefit from GPU acceleration by installing faiss-gpu, either with conda:
conda install -c conda-forge faiss-gpu
or with pip:
pip install faiss-gpu
Don't forget to pass gpu_id as an init argument to enable the GPU.
Maintenance
Code Linting:
conda install -y -c conda-forge pre-commit
pre-commit install
Docstring: Sphinx format
Test:
python -m unittest tests/simple_test.py
Reference
[1] Zhang, Min-Ling, and Zhi-Hua Zhou. 2007. “ML-KNN: A Lazy Learning Approach to Multi-Label Learning.” Pattern Recognition 40 (7): 2038–48.