Skip to main content

Code for converting a list of labels in a scikit-like semi-supervised labels.

Project description

masksemi

Code for converting a label list in a scikit-like semi-supervised label list.

Getting Started

Dependencies

You need Python 3.7 or later to use masksemi. You can find it at python.org.

You also need numpy package, which is available from PyPI. If you have pip, just run:

pip install numpy

Installation

Clone this repo to your local machine using:

git clone https://github.com/caiocarneloz/masksemi.git

Features

Given the label list and a certain percentage, mask the amount of unlabeled data based on percentage and also encode the data for scikit usage with semi-supervised models. The percentage split is considered by class. This way, all classes will have the given percentage as labeled data.

Usage

Considering iris dataset labels:

array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
        ...,
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
        ...,
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', 'Iris-virginica',
       'Iris-virginica', 'Iris-virginica', ...], dtype=object)

The maskData function is called by passing labels and the percentage of labeled data:

masked_labels = maskData(labels, 0.1)

The labels are encoded and masked:

array([-1, -1, -1,  0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  0,
       -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  0, -1, -1, -1,
       -1,  0,  0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
       -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  1, -1, -1, -1, -1,  1, -1,
       -1, -1, -1, -1, -1, -1, -1, -1,  1, -1,  1, -1, -1, -1, -1, -1, -1,
       -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  1, -1, -1, -1,
       -1, -1, -1, -1, -1, -1,  2, -1, -1, -1, -1,  2, -1, -1, -1, -1, -1,
       -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  2,  2, -1, -1,
       -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  2])

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

masksemi-0.0.1.tar.gz (2.2 kB view details)

Uploaded Source

File details

Details for the file masksemi-0.0.1.tar.gz.

File metadata

  • Download URL: masksemi-0.0.1.tar.gz
  • Upload date:
  • Size: 2.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.0 setuptools/51.0.0 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.9.1

File hashes

Hashes for masksemi-0.0.1.tar.gz
Algorithm Hash digest
SHA256 9292d9dba83a02bebda699b5a471dbcb1d165ddb815d6003fe9838ba6453e20b
MD5 926f16e0199d854c0a93c48745bf748d
BLAKE2b-256 99e539c2f7a25a2eb031bddd47a0e53fd1b3d6ccade57136821a12fe22213db7

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page