A lightweight toolbox for multilabel classification algorithms based on the k-nearest neighbors

multilabel_knn

multilabel_knn is a lightweight toolbox for multilabel classification based on k-nearest neighbor algorithms [Doc].

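The following algorithms are implemented: k-nearest neighbor (kNN), multilabel kNN [1], binomial multilabel kNN, and binomial multilabel graph.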

Usage

k-nearest neighbor algorithm (Predict a single label per sample)

import multilabel_knn as mlk
model = mlk.kNN(k=10, metric="cosine")  # k: number of neighbors; metric: distance metric, one of {"euclidean", "cosine"}
model.fit(X, Y)  # X: 2D array of feature vectors; Y: label matrix, where Y[i,k] = 1 if sample i has label k
Y_pred = model.predict(X_test)  # Y_pred[i,k] = 1 if sample i is predicted to have label k
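
The label matrix format described in the comments above can be illustrated with a small helper. This is a minimal sketch in plain Python; the helper name build_label_matrix and the toy data are hypothetical, and the container type that fit actually accepts (for example a NumPy array or a SciPy sparse matrix) is not stated here, so convert the result as needed.

```python
def build_label_matrix(labels_per_sample, n_labels):
    """Return Y as a list of rows, where Y[i][k] = 1 if sample i has label k."""
    Y = [[0] * n_labels for _ in labels_per_sample]
    for i, labels in enumerate(labels_per_sample):
        for k in labels:
            Y[i][k] = 1
    return Y


# Hypothetical toy data: 3 samples and 4 possible labels.
labels_per_sample = [[0], [1, 3], [2]]
Y = build_label_matrix(labels_per_sample, n_labels=4)
```

Each row corresponds to one sample and each column to one label, so a sample with several labels simply has several ones in its row.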

Multilabel kNN (Can predict multiple labels per sample) [1]

import multilabel_knn as mlk
model = mlk.multilabel_kNN(k=10, metric = "cosine")
model.fit(X, Y)
Y_pred = model.predict(X_test) 
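
To give an idea of how a multilabel kNN can assign several labels to one sample, here is a brute-force sketch of the ML-KNN decision rule from [1] in plain Python. It is not the implementation used by multilabel_knn: the function names, the Euclidean distance, the smoothing parameter s = 1, and the toy data are all assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance (enough for ranking neighbors)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def neighbor_label_counts(query, X, Y, k, exclude=None):
    """For each label, count how many of the k nearest training samples carry it."""
    candidates = [i for i in range(len(X)) if i != exclude]
    nearest = sorted(candidates, key=lambda i: sq_dist(query, X[i]))[:k]
    return [sum(Y[i][l] for i in nearest) for l in range(len(Y[0]))]


def mlknn_fit(X, Y, k, s=1.0):
    """Estimate label priors and the smoothed likelihoods of neighbor counts."""
    m, n_labels = len(X), len(Y[0])
    prior1 = [(s + sum(row[l] for row in Y)) / (2 * s + m) for l in range(n_labels)]
    c1 = [[0] * (k + 1) for _ in range(n_labels)]  # samples that have label l
    c0 = [[0] * (k + 1) for _ in range(n_labels)]  # samples that lack label l
    for i in range(m):
        counts = neighbor_label_counts(X[i], X, Y, k, exclude=i)
        for l, j in enumerate(counts):
            (c1 if Y[i][l] else c0)[l][j] += 1

    def smooth(c):
        return [[(s + cj) / (s * (k + 1) + sum(row)) for cj in row] for row in c]

    return prior1, smooth(c1), smooth(c0)


def mlknn_predict(x, X, Y, k, model):
    """Assign label l if having it is more probable a posteriori than lacking it."""
    prior1, like1, like0 = model
    counts = neighbor_label_counts(x, X, Y, k)
    return [
        int(prior1[l] * like1[l][j] > (1 - prior1[l]) * like0[l][j])
        for l, j in enumerate(counts)
    ]


# Hypothetical toy data: two clusters; the second cluster carries two labels.
X_train = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
Y_train = [[1, 0, 0]] * 3 + [[0, 1, 1]] * 3
model = mlknn_fit(X_train, Y_train, k=2)
pred_a = mlknn_predict([0.5, 0.5], X_train, Y_train, 2, model)
pred_b = mlknn_predict([10.5, 10.5], X_train, Y_train, 2, model)
```

Because each label is decided independently, a sample near the second cluster receives both of that cluster's labels, whereas a single-label kNN would have to pick one.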

Binomial multilabel kNN (Can predict multiple labels per sample)

import multilabel_knn as mlk
model = mlk.binomial_multilabel_kNN(k=10, metric = "cosine")
model.fit(X, Y) 
Y_pred = model.predict(X_test) 

Binomial multilabel kNN is a modified version of multilabel kNN. It can perform well on data with a large number of samples and labels. See the docstring for details.

Binomial multilabel graph (Takes a graph as input. Can predict multiple labels per node)

import multilabel_knn as mlk
model = mlk.binomial_multilabel_graph()
model.fit(A, Y)  # A is the adjacency matrix of the training graph, where A[i,j] = 1 if node i has a link to node j.
Y_pred = model.predict(B)  # B is the adjacency matrix of the bipartite network, where B[i,j] = 1 if node i has a link to node j in the training graph.
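
The two matrices described in the comments above could be built from edge lists as follows. This is a sketch under stated assumptions: the rows of B are taken to be the new nodes to classify and its columns the nodes of the training graph, the training graph is treated as undirected, and the helper name adjacency_matrix and the toy edges are hypothetical. The matrix type that fit and predict actually expect is not stated here.

```python
def adjacency_matrix(edges, n_rows, n_cols):
    """Return M as a list of rows, where M[i][j] = 1 if (i, j) is in edges."""
    M = [[0] * n_cols for _ in range(n_rows)]
    for i, j in edges:
        M[i][j] = 1
    return M


# Hypothetical training graph: a path over 4 nodes, stored in both directions.
train_edges = [(0, 1), (1, 2), (2, 3)]
symmetric_edges = train_edges + [(j, i) for i, j in train_edges]
A = adjacency_matrix(symmetric_edges, n_rows=4, n_cols=4)

# Hypothetical new nodes: (new node, training node) pairs.
test_edges = [(0, 0), (0, 1), (1, 3)]
B = adjacency_matrix(test_edges, n_rows=2, n_cols=4)
```

With this layout, A is square (training nodes by training nodes), while B has one row per new node and one column per training node.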

Install

Requirements: Python 3.7 or later

pip install multilabel_knn

multilabel_knn uses the faiss library, which comes in two versions, faiss-cpu and faiss-gpu. As the name suggests, faiss-gpu can leverage GPUs and is therefore faster if you have them. multilabel_knn uses faiss-cpu by default to avoid unnecessary GPU-related trouble. If you have GPUs compatible with faiss-gpu, you can benefit from GPU acceleration by installing faiss-gpu

with conda:

conda install -c conda-forge faiss-gpu

or with pip:

pip install faiss-gpu

Don't forget to pass gpu_id as an init argument to enable the GPU.
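
Before passing gpu_id, it may help to check whether the installed faiss build sees any GPU. The sketch below relies on faiss.get_num_gpus and falls back to 0 when faiss or that function is unavailable. The helper name count_faiss_gpus and the choice of device index 0 are assumptions, and the constructor call is left as a comment because the accepted values of gpu_id are not documented here.

```python
def count_faiss_gpus():
    """Return the number of GPUs reported by faiss, or 0 if it cannot be queried."""
    try:
        import faiss
    except ImportError:
        return 0
    get_num_gpus = getattr(faiss, "get_num_gpus", None)
    return int(get_num_gpus()) if callable(get_num_gpus) else 0


n_gpus = count_faiss_gpus()

# Only pass gpu_id when a GPU is visible; device index 0 is an assumption.
init_kwargs = {"gpu_id": 0} if n_gpus > 0 else {}

# import multilabel_knn as mlk
# model = mlk.kNN(k=10, metric="cosine", **init_kwargs)
```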

Maintenance

Code Linting:

conda install -y -c conda-forge pre-commit
pre-commit install

Docstring: Sphinx format

Test:

python -m unittest tests/simple_test.py

Reference

[1] Zhang, Min-Ling, and Zhi-Hua Zhou. 2007. “ML-KNN: A Lazy Learning Approach to Multi-Label Learning.” Pattern Recognition 40 (7): 2038–48.
