Library to generate toy data for machine learning experiments in the context of all-relevant feature selection.
All Relevant Feature Selection Generator Library (ARFS-Gen)
This repository contains a Python library for generating synthetic (toy) data for research papers that evaluate all-relevant feature selection methods, for example the one implemented in fri.
It allows creating datasets with a specified number of strongly and weakly relevant features as well as random noise features.
In the newest revision it also includes methods which generate data with privileged information.
It works by utilizing existing methods from numpy and scikit-learn.
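To illustrate the three kinds of features, here is a minimal sketch that uses only numpy. It is not the library's implementation: the function name make_toy_regression, its parameters, and the way weakly relevant features are built (scaled copies of one hidden signal) are assumptions made for this example.

```python
import numpy as np


def make_toy_regression(n_samples=100, n_strong=2, n_weak=2, n_noise=6, seed=0):
    """Hypothetical helper: toy regression data with three feature types."""
    rng = np.random.default_rng(seed)

    # Strongly relevant features: each one enters the target directly
    # and cannot be replaced by any other feature.
    X_strong = rng.normal(size=(n_samples, n_strong))

    # Weakly relevant features: scaled copies of one hidden signal,
    # so each copy is informative but redundant given the others.
    hidden = rng.normal(size=(n_samples, 1))
    scales = rng.uniform(0.5, 2.0, size=(1, n_weak))
    X_weak = hidden * scales

    # Noise features: independent of the target.
    X_noise = rng.normal(size=(n_samples, n_noise))

    # Noise-free linear target built from the strong features and the hidden signal.
    weights = rng.uniform(1.0, 2.0, size=n_strong)
    y = X_strong @ weights + hidden[:, 0]

    X = np.hstack([X_strong, X_weak, X_noise])
    return X, y


X, y = make_toy_regression()
print(X.shape, y.shape)
```

In this sketch the columns are ordered as strong, weak, then noise features, which makes it easy to check whether a selection method recovers the intended groups.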
Install
The library is available on PyPI.
Install via pip:
pip install arfs_gen
or clone this repository and use:
pip install .
Usage
In the following we generate a simple regression data set with a mix of strongly and weakly relevant features:
# Import relevant method
from arfs_gen import genRegressionData
# Import model
from sklearn.svm import LinearSVR
# Specify parameters
n = 100
# Features
strRel = 2
strWeak = 2
# Overall number of features (Rest will be filled by random features)
d = 10
# Generate the data
X, y = genRegressionData(
    n_samples=n,
    n_features=d,
    n_redundant=strWeak,
    n_strel=strRel,
    n_repeated=0,
    noise=0,
)
# Fit a model
linsvr = LinearSVR()
linsvr.fit(X, y)
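After fitting, it can be useful to look at which features the model relies on. The sketch below is self-contained and therefore uses make_regression from scikit-learn as a stand-in data source instead of arfs_gen; that substitution is an assumption made for this example. Ranking by coefficient magnitude is only a rough heuristic, not an all-relevant feature selection method, because a linear model may split weight arbitrarily among redundant features.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.svm import LinearSVR

# Stand-in data: 10 features, 2 of which are informative (assumption for this sketch).
X, y, true_coef = make_regression(
    n_samples=100,
    n_features=10,
    n_informative=2,
    noise=0.0,
    coef=True,
    random_state=0,
)

# Fit the same kind of model as in the usage example.
linsvr = LinearSVR(max_iter=10000, random_state=0)
linsvr.fit(X, y)

# Rank features by the absolute value of their learned coefficient.
ranking = np.argsort(-np.abs(linsvr.coef_))

print("true informative features:", np.flatnonzero(true_coef))
print("features ranked by |coef|:", ranking)
```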
Development
We use poetry for dependency management.
If you have poetry installed, use
$ poetry install
inside the project folder to create a new venv and to install all dependencies.
To enter the newly created venv use
$ poetry shell
to open a new shell inside.
Alternatively, run commands inside the venv with poetry run ...
Test
Run the tests with poetry run pytest.