Library to generate toy data for machine learning experiments in the context of all-relevant feature selection.

All Relevant Feature Selection Generator Library (ARFS-Gen)

This repository contains a Python library for generating synthetic (toy) data for research papers that evaluate all-relevant feature selection methods, such as the one used in fri.

It creates datasets with a specified number of strongly relevant and weakly relevant features, as well as random noise features.

The newest revision also includes methods that generate data with privileged information.

It builds on existing methods from NumPy and scikit-learn.
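
The difference between strongly relevant, weakly relevant, and noise features can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the helpers `make_toy_features` and `residual_using` are hypothetical, and the construction (one feature carrying its own signal, two features sharing a second signal, independent noise columns) is only one assumed way to obtain these feature types.

```python
import numpy as np


def make_toy_features(n_samples=200, n_noise=3, seed=0):
    """Hypothetical helper: 1 strong, 2 weak, and n_noise irrelevant features."""
    rng = np.random.default_rng(seed)
    s_strong = rng.normal(size=n_samples)  # signal carried by a single feature
    s_shared = rng.normal(size=n_samples)  # signal carried by two features

    weak_a = s_shared
    weak_b = 2.0 * s_shared  # scaled copy, so it is redundant with weak_a
    noise = rng.normal(size=(n_samples, n_noise))

    # Column order: 0 = strong, 1 and 2 = weak, 3.. = noise
    X = np.column_stack([s_strong, weak_a, weak_b, noise])
    y = s_strong + s_shared
    return X, y


def residual_using(X, y, columns):
    """Least-squares residual norm when only the given columns are used."""
    X_kept = X[:, columns]
    coef, *_ = np.linalg.lstsq(X_kept, y, rcond=None)
    return float(np.linalg.norm(X_kept @ coef - y))


X, y = make_toy_features()
res_all = residual_using(X, y, [0, 1, 2, 3, 4, 5])
res_no_strong = residual_using(X, y, [1, 2, 3, 4, 5])
res_no_weak_a = residual_using(X, y, [0, 2, 3, 4, 5])
res_no_weak_pair = residual_using(X, y, [0, 3, 4, 5])
```

In this sketch, dropping the strong feature makes the fit worse, while dropping one of the two weak features does not, because the other one still carries the same signal. Dropping both weak features hurts again, which is why an all-relevant method should report them as relevant too.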

Install

The library is available on PyPI. Install via pip:

pip install arfs_gen

or clone this repository and use:

pip install .

Usage

The following example generates a simple regression dataset with a mix of strongly and weakly relevant features:

    # Import relevant method
    from arfs_gen import genRegressionData
    # Import model
    from sklearn.svm import LinearSVR

    # Specify parameters
    # Number of samples
    n = 100
    # Number of strongly and weakly relevant features
    strRel = 2
    strWeak = 2
    # Overall number of features (the rest are filled with random noise features)
    d = 10

    # Generate the data
    X, y = genRegressionData(
        n_samples=n,
        n_features=d,
        n_redundant=strWeak,
        n_strel=strRel,
        n_repeated=0,
        noise=0,
    )
    # Fit a model
    linsvr = LinearSVR()
    linsvr.fit(X, y)
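
After fitting, the coefficients of the linear model can be inspected to see which features it relies on. The sketch below is self-contained and therefore uses scikit-learn's `make_regression` as stand-in data rather than arfs_gen; the parameter values are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.svm import LinearSVR

# Stand-in data (assumption): 10 features, of which 4 carry signal.
X, y = make_regression(
    n_samples=100, n_features=10, n_informative=4, noise=0.0, random_state=0
)

model = LinearSVR(max_iter=10000, random_state=0)
model.fit(X, y)

# One coefficient per feature; a larger magnitude means the model leans on it more.
importance = np.abs(model.coef_)
ranking = np.argsort(importance)[::-1]
for idx in ranking:
    print(f"feature {idx}: |coef| = {importance[idx]:.3f}")
```

A single linear model may place its weight on only one feature out of a group of redundant ones, so a coefficient ranking like this is not by itself an all-relevant selection.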

Development

For dependency management we use Poetry.

If you have poetry installed, use

$ poetry install

inside the project folder to create a new venv and to install all dependencies. To enter the newly created venv use

$ poetry shell

to open a new shell inside it. Alternatively, run commands inside the venv with poetry run ...

Test

Run the test suite with poetry run pytest.

Files for arfs-gen, version 1.1.2

arfs_gen-1.1.2-py3-none-any.whl (11.3 kB), wheel, Python version py3
arfs_gen-1.1.2.tar.gz (10.6 kB), source distribution
