Library to generate toy data for machine learning experiments in the context of all-relevant feature selection.

All Relevant Feature Selection Generator Library (ARFS-Gen)

This repository contains a Python library for generating synthetic (toy) data for research on all-relevant feature selection, for example to evaluate methods such as fri.

It allows creating datasets with a specified number of strongly and weakly relevant features as well as random noise features.

The newest revision also includes methods that generate data with privileged information.

It builds on existing methods from NumPy and scikit-learn.
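To make the three feature categories concrete, here is a minimal NumPy sketch. It is not the library's implementation: the function name `make_toy_regression` and its parameters are hypothetical, and it assumes that weakly relevant features can be modeled as scaled copies of one hidden signal (so `n_weak` should be at least 2).

```python
import numpy as np


def make_toy_regression(n_samples, n_strong, n_weak, n_noise, seed=0):
    """Hypothetical helper: build X, y with three kinds of features."""
    rng = np.random.default_rng(seed)

    # Strongly relevant: each column carries information about y
    # that no other column provides.
    X_strong = rng.normal(size=(n_samples, n_strong))

    # Hidden signal that also drives y but is not itself a column of X.
    z = rng.normal(size=(n_samples, 1))

    # Weakly relevant: scaled copies of the hidden signal. Any one of them
    # can stand in for the others, so none is individually indispensable.
    scales = rng.uniform(0.5, 2.0, size=(1, n_weak))
    X_weak = z * scales

    # Irrelevant: pure noise, independent of y.
    X_noise = rng.normal(size=(n_samples, n_noise))

    X = np.hstack([X_strong, X_weak, X_noise])
    y = X_strong.sum(axis=1) + z[:, 0]
    return X, y


X, y = make_toy_regression(n_samples=100, n_strong=2, n_weak=2, n_noise=6)
print(X.shape, y.shape)  # (100, 10) (100,)
```

In this sketch the columns are ordered strong, weak, noise; no such ordering should be assumed for the data returned by the library.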

Install

The library is available on PyPI. Install it via pip:

pip install arfs_gen

or clone this repository and use:

pip install .
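One way to confirm the installation is to ask Python's standard library for the installed version. This is a small sketch; the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the same name used with pip above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("arfs_gen"))
```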

Usage

In the following we generate a simple regression data set with a mix of strongly and weakly relevant features:

    # Import relevant method
    from arfs_gen import genRegressionData
    # Import model
    from sklearn.svm import LinearSVR

    # Specify parameters
    n = 100
    # Features
    strRel = 2
    strWeak = 2
    # Overall number of features (Rest will be filled by random features)
    d = 10

    # Generate the data
    X, y = genRegressionData(
        n_samples=n,
        n_features=d,
        n_redundant=strWeak,
        n_strel=strRel,
        n_repeated=0,
        noise=0,
    )
    # Fit a model
    linsvr = LinearSVR()
    linsvr.fit(X, y)
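A single fitted model does not by itself reveal which features are strongly or weakly relevant. The sketch below illustrates the distinction with a leave-features-out check based on ordinary least squares. It uses hand-built stand-in data rather than the output of `genRegressionData`, and the helper `residual_without` is hypothetical.

```python
import numpy as np


def residual_without(X, y, dropped):
    """Least-squares residual norm when the columns in `dropped` are removed."""
    keep = [j for j in range(X.shape[1]) if j not in dropped]
    coef, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
    return float(np.linalg.norm(X[:, keep] @ coef - y))


# Stand-in data (NOT produced by arfs_gen): column 0 is strongly relevant,
# columns 1 and 2 are redundant copies of one signal, column 3 is noise.
rng = np.random.default_rng(0)
s = rng.normal(size=200)
z = rng.normal(size=200)
X = np.column_stack([s, z, 2.0 * z, rng.normal(size=200)])
y = s + z

# Compare the fit quality after removing different sets of columns.
for dropped in ([], [0], [1], [1, 2], [3]):
    print(dropped, round(residual_without(X, y, dropped), 3))
```

Dropping column 1 alone leaves the residual essentially at zero because column 2 carries the same signal, whereas dropping both, or dropping column 0, makes the residual jump. Removing the noise column changes nothing.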

Development

We use Poetry for dependency management.

If you have poetry installed, use

$ poetry install

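inside the project folder to create a new venv and install all dependencies. To enter the newly created venv with Poetry 1.x, use

$ poetry shell

which opens a new shell inside it. Poetry 2.x no longer ships this command by default; there, poetry env activate prints the activation command for your shell instead. Alternatively, run commands inside the venv with poetry run ....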

Test

Run the test suite with poetry run pytest.
