masksemi
Code for converting a label list into a scikit-learn-style semi-supervised label list, in which classes are encoded as integers and unlabeled samples are marked with -1.
Getting Started
Dependencies
You need Python 3.7 or later to use masksemi. You can find it at python.org.
You also need the numpy package, which is available from PyPI. If you have pip, just run:
pip install numpy
Installation
Clone this repo to your local machine using:
git clone https://github.com/caiocarneloz/masksemi.git
Features
Given a label list and a percentage, masksemi keeps that percentage of the labels and masks the rest as unlabeled. It also encodes the class names as integers so the result can be used with scikit-learn semi-supervised models. The percentage is applied per class, so every class keeps the given percentage of its samples as labeled data.
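The per-class masking described above can be sketched with numpy alone. This is an illustration of the idea, not the package's actual implementation: the function name mask_labels_sketch, the seed parameter, and the rounding rule (nearest count, at least one label per class) are all assumptions made for this example.

```python
import numpy as np


def mask_labels_sketch(labels, percentage, seed=None):
    """Hypothetical stand-in for maskData, written only for illustration."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)

    # Encode class names as integers 0..n_classes-1 (sorted order).
    classes, encoded = np.unique(labels, return_inverse=True)

    # Start with every sample unlabeled (-1), then reveal a few per class.
    masked = np.full(labels.shape[0], -1, dtype=int)
    for class_id in range(len(classes)):
        idx = np.flatnonzero(encoded == class_id)
        # Assumption: round to the nearest count, keep at least one label.
        n_keep = max(1, int(round(percentage * idx.size)))
        keep = rng.choice(idx, size=n_keep, replace=False)
        masked[keep] = class_id
    return masked


# Toy labels with unequal class sizes: 20, 10 and 30 samples.
labels = np.array(["a"] * 20 + ["b"] * 10 + ["c"] * 30, dtype=object)
masked = mask_labels_sketch(labels, 0.1, seed=0)
```

With a percentage of 0.1, this sketch keeps 2, 1 and 3 labels for the three toy classes, so the labeled share stays the same in each class regardless of class size.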
Usage
Consider the labels of the iris dataset:
array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
...,
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
...,
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
'Iris-virginica', 'Iris-virginica', ...], dtype=object)
The maskData function is called by passing labels and the percentage of labeled data:
masked_labels = maskData(labels, 0.1)
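The labels are encoded as integers and masked. With 50 samples per class and a percentage of 0.1, each class keeps 5 labels and every other entry is set to -1: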
array([-1, -1, -1, 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, -1, -1, -1,
-1, 0, 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, -1, -1, -1, -1, 1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, 1, -1, 1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, 2, -1, -1, -1, -1, 2, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 2, 2, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 2])
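A quick way to check a masked array like the one above is to count how many entries each value has. The snippet below is a small sketch that builds a hand-made stand-in array (3 classes, 50 samples each, 5 labeled per class) instead of calling maskData, so the positions of the labeled entries are an assumption and differ from the real output.

```python
import numpy as np

# Hand-made stand-in for a masked label array: -1 everywhere,
# then 5 labeled entries at the start of each 50-sample class block.
masked_labels = np.full(150, -1)
for class_id in range(3):
    start = class_id * 50
    masked_labels[start:start + 5] = class_id

# Count how many entries carry each value (-1 means unlabeled).
values, counts = np.unique(masked_labels, return_counts=True)
summary = {int(v): int(c) for v, c in zip(values, counts)}
print(summary)  # {-1: 135, 0: 5, 1: 5, 2: 5}

# Overall share of samples that kept their label.
labeled_fraction = float(np.mean(masked_labels != -1))
```

The same counting works on the array returned by maskData, since it uses the same encoding: class integers for labeled samples and -1 for unlabeled ones.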