
Library to generate toy data for machine learning experiments in the context of all-relevant feature selection.


All Relevant Feature Selection Generator Library (ARFS-Gen)


This repository contains a Python library for generating synthetic (toy) data for research on all-relevant feature selection, for example to evaluate methods such as fri.

It allows creating datasets with a specified number of strongly and weakly relevant features as well as random noise features.

The newest revision also includes methods that generate data with privileged information.

It builds on existing methods from NumPy and scikit-learn.
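To make the distinction between strongly relevant, weakly relevant, and noise features concrete, here is a minimal sketch of how such a toy regression set could be built. It is not the library's implementation: `make_toy_regression` and its parameters are hypothetical names, and the sketch uses only the Python standard library so that it runs without any dependencies. Weak relevance is modelled here in the simplest possible way, as several scaled copies of one hidden signal, so each copy carries information about the target but can be replaced by any of the others.

```python
import random


def make_toy_regression(n_samples, n_strel, n_weak, n_noise, seed=0):
    """Hypothetical helper (not part of arfs_gen): build a toy regression set.

    Columns are ordered as [strongly relevant | weakly relevant | noise].
    """
    if n_weak == 1:
        # A single copy of the hidden signal could not be replaced by any
        # other feature, so it would be strongly relevant instead.
        raise ValueError("n_weak must be 0 or at least 2")
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        # Strongly relevant: each value enters the target directly and
        # cannot be recovered from any other column.
        strong = [rng.gauss(0, 1) for _ in range(n_strel)]
        # Weakly relevant: scaled copies of one hidden signal, so every
        # copy is informative but redundant given any other copy.
        hidden = rng.gauss(0, 1) if n_weak else 0.0
        weak = [(k + 1) * hidden for k in range(n_weak)]
        # Noise: independent of the target.
        noise = [rng.gauss(0, 1) for _ in range(n_noise)]
        X.append(strong + weak + noise)
        y.append(sum(strong) + hidden)
    return X, y


X, y = make_toy_regression(n_samples=100, n_strel=2, n_weak=2, n_noise=6)
print(len(X), len(X[0]))  # prints "100 10"
```

A feature selection method that only looks for a minimal predictive subset would be satisfied with one of the two weakly relevant columns, whereas an all-relevant method is expected to report both.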

Install

The library is available on PyPI. Install via pip:

pip install arfs_gen

or clone this repository and use:

pip install .

Usage

The following example generates a simple regression dataset with a mix of strongly and weakly relevant features and fits a linear model to it:

    # Import relevant method
    from arfs_gen import genRegressionData
    # Import model
    from sklearn.svm import LinearSVR

    # Number of samples
    n = 100
    # Number of strongly and weakly relevant features
    strRel = 2
    strWeak = 2
    # Overall number of features (the rest is filled with random noise features)
    d = 10

    # Generate the data
    X, y = genRegressionData(
        n_samples=n,
        n_features=d,
        n_redundant=strWeak,
        n_strel=strRel,
        n_repeated=0,
        noise=0,
    )
    # Fit a model
    linsvr = LinearSVR()
    linsvr.fit(X, y)

Development

For dependency management we use the poetry tool.

If you have poetry installed, use

$ poetry install

inside the project folder to create a new venv and to install all dependencies. To enter the newly created venv use

$ poetry shell

to open a new shell inside it. Alternatively, run commands inside the venv with poetry run ...

Test

Run the test suite with poetry run pytest.

