
Multi armed bandit feature selection

Feature selector based on the Thompson sampling algorithm. Based on: https://epubs.siam.org/doi/pdf/10.1137/1.9781611976700.36

Description

This package searches for a subset of features that maximizes a chosen metric. It can be used for both regression and classification.

How it works

  1. Calculate the information relevance and information redundancy of every feature.
  2. Initialize a beta distribution for every feature.
  3. Sample from every beta distribution and select the desired number of features.
  4. Cross-validate the model on the selected features.
  5. Calculate the resulting score from the CV metric, the information relevance, and the information redundancy.
  6. Update the beta distributions.
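
The loop in steps 2 to 6 can be sketched in plain Python. This is only an illustration under stated assumptions, not the package's implementation: `thompson_select`, `toy_reward`, and `INFORMATIVE` are hypothetical names, the reward is assumed to lie in [0, 1], and the fractional beta update is one common choice that may differ from what mabfs does. Step 1 and the exploration coefficient are left out, and the `evaluate` callable stands in for the cross-validated score combined with relevance and redundancy.

```python
import random


def thompson_select(n_features, k, evaluate, steps=200, seed=0):
    """Toy Thompson-sampling feature selection (illustrative only).

    evaluate(subset) must return a reward in [0, 1]; here it is a
    placeholder for the CV score combined with relevance and redundancy.
    """
    rng = random.Random(seed)
    # Step 2: one Beta(alpha, beta) posterior per feature, starting uniform.
    alpha = [1.0] * n_features
    beta = [1.0] * n_features

    for _ in range(steps):
        # Step 3: draw one sample per feature and keep the k largest draws.
        draws = [rng.betavariate(alpha[i], beta[i]) for i in range(n_features)]
        subset = sorted(range(n_features), key=lambda i: draws[i], reverse=True)[:k]

        # Steps 4-5: score the subset (assumed reward in [0, 1]).
        reward = evaluate(subset)

        # Step 6: credit the reward to every feature that was in the subset.
        for i in subset:
            alpha[i] += reward
            beta[i] += 1.0 - reward

    # Rank features by posterior mean and return the k best, sorted by index.
    mean = [alpha[i] / (alpha[i] + beta[i]) for i in range(n_features)]
    best = sorted(range(n_features), key=lambda i: mean[i], reverse=True)[:k]
    return sorted(best)


# Hypothetical ground truth used only by the toy reward below.
INFORMATIVE = {0, 1, 2}


def toy_reward(subset):
    # Fraction of the chosen features that are truly informative.
    return len(INFORMATIVE.intersection(subset)) / len(subset)


selected = thompson_select(n_features=10, k=3, evaluate=toy_reward, steps=500)
print(selected)
```

Features that keep appearing in high-reward subsets accumulate a larger `alpha`, so their samples are more likely to rank in the top `k` in later rounds, while rarely rewarded features still get occasional draws.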

Usage

import pandas as pd
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import make_scorer
from sklearn.metrics import mean_absolute_error
from sklearn.linear_model import LinearRegression
from mabfs.ts_selector import ThompsonSamplingFeatureSelection

x, y, coef = make_regression(n_samples=1000,
                             n_features=500, 
                             n_informative=10, 
                             effective_rank=5, 
                             tail_strength=0.7,
                             noise=0.05, 
                             shuffle=True, 
                             bias=100,
                             coef=True,
                             random_state=666)

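# Wrap the arrays in pandas objects before passing them to the selector
x = pd.DataFrame(x)
y = pd.Series(y)

# Indices of the informative features (positive ground-truth coefficients)
true_features = np.where(coef > 0)[0]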

model = LinearRegression()
tsfs = ThompsonSamplingFeatureSelection(model=model, 
                                        scoring=make_scorer(mean_absolute_error),
                                        desired_number_of_features=10,
                                        x=x, 
                                        y=y, 
                                        cv_splits=3, 
                                        exploration_coef=0.3,
                                        optimization_steps=100000,
                                        is_regression=True,
                                        n_jobs=36
                                       )

tsfs.select_best_features()
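
Because the example above keeps `true_features`, one natural follow-up is to check how many informative features the selector recovered. The sketch below is hedged: the return format of `select_best_features()` is not documented here, so `selected` is assumed to be a list of column labels, and `recovery_stats` is a hypothetical helper, not part of mabfs.

```python
def recovery_stats(selected, true_features):
    """Compare a selected feature subset against the known informative one."""
    selected, truth = set(selected), set(true_features)
    hits = selected & truth
    # Guard against empty inputs to avoid division by zero.
    precision = len(hits) / len(selected) if selected else 0.0
    recall = len(hits) / len(truth) if truth else 0.0
    return {"hits": sorted(hits), "precision": precision, "recall": recall}


# Made-up example: 3 of the 4 selected columns are truly informative.
stats = recovery_stats(selected=[0, 3, 7, 9], true_features=[0, 3, 5, 9])
print(stats)  # {'hits': [0, 3, 9], 'precision': 0.75, 'recall': 0.75}
```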
