A Python library with tools for dealing with weak labels.
Project description
WeakLabelModel
A library for training multiclass classifiers with weak labels
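The weak-label setting the library addresses can be illustrated with plain NumPy: the observed (weak) label of each example is drawn according to a mixing matrix M conditioned on the hidden true class. The snippet below is only a hedged sketch of that idea with a made-up matrix; it does not use the library's own API.
import numpy as np
rng = np.random.default_rng(0)
# Illustrative mixing matrix M: entry M[w, c] is the probability of
# observing weak label w when the true class is c (columns sum to 1).
M = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
y_true = rng.integers(0, 3, size=10)                            # hidden true classes
y_weak = np.array([rng.choice(3, p=M[:, c]) for c in y_true])   # observed weak labels
print(list(zip(y_true, y_weak)))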
Installation
python3.7 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
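The steps above set up a development environment from the repository. Since a source distribution is also published on PyPI (see the file details at the end of this page), installing the released package directly should also work; note that only a .dev release appears to be available, so pip needs the --pre flag.
pip install --pre weaklabels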
Run code
See an example of a call inside one of the add_queue scripts.
Unittest
Run the unit tests by running the following script from a terminal:
./runtests.sh
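If the shell script is not convenient (for example on Windows), the standard-library test runner can be invoked directly; this assumes the tests follow the usual test_*.py naming convention, which may differ from what runtests.sh actually does.
python -m unittest discover -v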
Run Jupyter Notebooks
In order to run any notebook in this repository, first create a new kernel that will be available from the Jupyter Notebook.
# Load the virtual environment
source venv/bin/activate
# Create a new kernel
ipython kernel install --name "weaklabels" --user
# Open Jupyter
jupyter notebook
After opening or creating a notebook, you can select the "weaklabels" kernel via Kernel -> Change kernel -> weaklabels.
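To inspect or remove the kernel later, Jupyter's kernelspec tooling can be used:
# List installed kernels
jupyter kernelspec list
# Remove the "weaklabels" kernel created above
jupyter kernelspec uninstall weaklabels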
Usage
Current usage (may need updating); an example invocation is shown after the option list.
Usage: main.py [options]
Options:
-h, --help show this help message and exit
-p DATASETS, --datasets=DATASETS
List of datasets or toy examples to test, separated by commas with no spaces.
-s NS, --n-samples=NS
Number of samples if toy dataset.
-f NF, --n-features=NF
Number of features if toy dataset.
-c N_CLASSES, --n-classes=N_CLASSES
Number of classes if toy dataset.
-m N_SIM, --n-simulations=N_SIM
Number of times to run every model.
-l LOSS, --loss=LOSS Loss function to minimize: square (Brier score) or CE (cross entropy)
-u PATH_RESULTS, --path-results=PATH_RESULTS
Path to save the results
-r RHO, --rho=RHO Learning step for the Gradient Descent
-a ALPHA, --alpha=ALPHA
Alpha probability parameter
-b BETA, --beta=BETA Beta probability parameter
-g GAMMA, --gamma=GAMMA
Gamma probability parameter
-i N_IT, --n-iterations=N_IT
Number of iterations of Gradient Descent.
-e MIXING_MATRIX, --mixing-matrix=MIXING_MATRIX
Method to generate the mixing matrix M. One of the following: IPL, quasi-IPL, noisy, random_noise, random_weak
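As an illustration only (the dataset name and parameter values below are placeholders, not recommendations), a call combining several of these options might look like:
python main.py --datasets=iris --n-simulations=10 --loss=CE \
    --mixing-matrix=quasi-IPL --alpha=0.5 --beta=0.5 \
    --n-iterations=100 --path-results=results/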
Check that all datasets work
The Python script utils/data.py can be run to check that every dataset can be downloaded and preprocessed, and that a classifier can be trained on it. This is done by running the script as a main file:
python utils/data.py
This should output something like the following for each dataset:
Testing all datasets
Evaluating iris[61] dataset
----------------
Dataset description:
Dataset name: iris
Sample size: 150
Number of features: 4
Number of classes: 3
Logistic Regression score = 0.9733333333333334
Upload to PyPI
Create the files to distribute
python3.8 setup.py sdist
Ensure twine is installed
pip install twine
Upload the distribution files
twine upload dist/*
It may prompt for a username and password if these are not set in a .pypirc file in your home directory:
[pypi]
username = __token__
password = pypi-yourtoken
Check the README file for PyPI
In order to check that the README file is compliant with PyPI standards, install the following Python package
pip install readme-renderer
and run the following command
twine check dist/*
Contributors
- Jesus Cid Sueiro
- Miquel Perello Nieto
- Daniel Bacaicoa
Project details
Download files
Download the file for your platform.
Source Distribution
File details
Details for the file weaklabels-0.0.1.dev1.tar.gz.
File metadata
- Download URL: weaklabels-0.0.1.dev1.tar.gz
- Upload date:
- Size: 20.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/51.1.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.8.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9e003779dbc52c979bc820a78eb6aee91c6039be8492aaf87e776afa1355d0be |
| MD5 | 0e7e9d828ee8f60ab26b40d5374bfb2c |
| BLAKE2b-256 | 6f4b804aa0e9b36f31cc1efc898dbd4587375374e485a813d6a255dc93ed427c |