Library to generate toy data for machine learning experiments in the context of all-relevant feature selection.

All Relevant Feature Selection Generator Library (ARFS-Gen)

This repository contains a Python library that generates synthetic (toy) data for research papers evaluating all-relevant feature selection methods, such as the one used in fri.

It allows creating datasets with a specified number of strongly and weakly relevant features as well as random noise features.
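
To make the three feature types concrete, here is a minimal NumPy sketch of one way such data could be constructed. It is not the library's implementation: the helper names `make_toy_regression` and `residual_without` are hypothetical, and modelling weakly relevant features as rescaled copies of a hidden signal is an assumption chosen for illustration.

```python
import numpy as np


def make_toy_regression(n_samples=100, n_strong=2, n_weak=2, n_noise=6, seed=0):
    """Hypothetical helper (not part of arfs_gen) that builds toy data."""
    rng = np.random.default_rng(seed)
    # Strongly relevant: each column carries information no other column has.
    X_strong = rng.normal(size=(n_samples, n_strong))
    # Hidden signal that influences y but is not exposed as a column itself.
    hidden = rng.normal(size=(n_samples, 1))
    # Weakly relevant: rescaled copies of the hidden signal, so each one
    # can be replaced by any of the others (assumed construction).
    scales = rng.uniform(0.5, 2.0, size=(1, n_weak))
    X_weak = hidden * scales
    # Noise: independent of the target.
    X_noise = rng.normal(size=(n_samples, n_noise))
    y = X_strong.sum(axis=1) + hidden[:, 0]
    X = np.hstack([X_strong, X_weak, X_noise])
    return X, y


def residual_without(X, y, dropped_column):
    """Least-squares residual norm after removing one column from X."""
    X_reduced = np.delete(X, dropped_column, axis=1)
    coef, *_ = np.linalg.lstsq(X_reduced, y, rcond=None)
    return float(np.linalg.norm(X_reduced @ coef - y))


X, y = make_toy_regression()
loss_drop_strong = residual_without(X, y, 0)  # column 0 is strongly relevant
loss_drop_weak = residual_without(X, y, 2)    # column 2 is weakly relevant
```

In this sketch, dropping a strongly relevant column leaves a large residual, while dropping a single weakly relevant column leaves the fit essentially exact because the other weak column still carries the same signal.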

The newest revision also includes methods that generate data with privileged information.

It builds on existing methods from NumPy and scikit-learn.

Install

The library is available on PyPI. Install it via pip:

pip install arfs_gen

or clone this repository and use:

pip install .

Usage

In the following we generate a simple regression data set with a mix of strongly and weakly relevant features:

    # Import relevant method
    from arfs_gen import genRegressionData
    # Import model
    from sklearn.svm import LinearSVR

    # Specify parameters
    n = 100
    # Features
    strRel = 2
    strWeak = 2
    # Overall number of features (Rest will be filled by random features)
    d = 10

    # Generate the data
    X, y = genRegressionData(
        n_samples=n,
        n_features=d,
        n_redundant=strWeak,
        n_strel=strRel,
        n_repeated=0,
        noise=0,
    )
    # Fit a model
    linsvr = LinearSVR()
    linsvr.fit(X, y)
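
Because the set of relevant features is known for generated data, the output of a feature selection method can be scored against it. The following is a small sketch of such a comparison; `selection_scores` is a hypothetical helper, and the index lists are made-up illustrations, since the column order of the generated data is not specified here.

```python
def selection_scores(selected, relevant):
    """Precision, recall and F1 of a selected feature set (hypothetical helper)."""
    selected, relevant = set(selected), set(relevant)
    true_pos = len(selected & relevant)
    precision = true_pos / len(selected) if selected else 0.0
    recall = true_pos / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Assumed ground truth: 2 strongly and 2 weakly relevant features in columns 0-3.
relevant = [0, 1, 2, 3]
# Assumed selector output: three relevant features plus one noise feature.
selected = [0, 1, 2, 7]
precision, recall, f1 = selection_scores(selected, relevant)
```

For all-relevant feature selection, recall matters as much as precision: a method that finds only the strongly relevant features misses the weakly relevant ones.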

Development

We use Poetry for dependency management.

If you have poetry installed, use

$ poetry install

inside the project folder to create a new venv and install all dependencies. To enter the newly created venv, use

$ poetry shell

to open a new shell inside it. Alternatively, run commands inside the venv with poetry run ....

Test

Test it by running poetry run pytest.
