
A package for Statistical Inference for Feature Selection after OT-based Domain Adaptation

Statistical Inference for Feature Selection after Optimal Transport-based Domain Adaptation (AISTATS 2025)

This package implements statistical inference for feature selection (FS) after optimal transport-based domain adaptation (OT-based DA). The main idea is to leverage the selective inference (SI) framework and employ a divide-and-conquer approach to compute the $p$-value efficiently. By providing valid $p$-values for the selected features, the proposed method not only controls the false positive rate (FPR) in FS under DA but also maximizes the true positive rate (TPR), i.e., reduces the false negative rate (FNR). We believe this study represents a significant step toward controllable machine learning in the context of DA.

See the paper https://arxiv.org/abs/2410.15022 for more details.

Requirements

This package has the following requirements:

We recommend installing or updating Anaconda to the latest version and using Python 3 (we used Python 3.12.3).

NOTE: We use the scipy package (version 1.13.1) to solve the linear program with the simplex method. However, the stock package does not return the set of basic variables. We therefore slightly modified scipy so that it does: replace the two files '_linprog.py' and '_linprog_simplex.py' in the scipy.optimize module with the modified files in the folder 'files_to_replace' at https://github.com/NT-Loi/SFS_DA/tree/main/files_to_replace.
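
Since the note above names scipy 1.13.1, it may be worth confirming which version is installed before replacing any files. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of this package.

```python
from importlib import metadata

EXPECTED_SCIPY = "1.13.1"  # version named in the note above


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("scipy")
if found is None:
    print("scipy is not installed in this environment")
elif found != EXPECTED_SCIPY:
    print(f"scipy {found} found; the note above refers to {EXPECTED_SCIPY}")
else:
    print(f"scipy {found} matches the version named above")
```

Run it inside the environment you intend to modify, so that the reported version is the one whose files will be replaced.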

How to Automatically Replace the Files

If Using Anaconda

  • First, initialize and activate the target environment (to use your conda base environment, replace 'your-env-name' with 'base'), then run the replacement script:
$ conda init
$ conda activate your-env-name
$ python replace_scipy_linprog.py --env anaconda --dir files_to_replace

If Using System Python (Non-Anaconda)

  • Run the following command:

$ python replace_scipy_linprog.py --env python --dir files_to_replace
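
If you prefer to replace the files by hand, the step amounts to copying two files into the scipy.optimize directory. The sketch below is an illustration under that assumption and is not the contents of replace_scipy_linprog.py; the function name `replace_with_backup` and the `.bak` backup convention are hypothetical.

```python
import shutil
from pathlib import Path

# File names taken from the note in the Requirements section.
FILES = ("_linprog.py", "_linprog_simplex.py")


def replace_with_backup(src_dir, dst_dir, names=FILES):
    """Copy each file in `names` from src_dir into dst_dir, keeping a .bak copy."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    replaced = []
    for name in names:
        src, dst = src_dir / name, dst_dir / name
        if not src.is_file():
            raise FileNotFoundError(f"missing replacement file: {src}")
        if dst.is_file():
            backup = dst.with_name(dst.name + ".bak")
            if not backup.exists():  # never overwrite the first (original) backup
                shutil.copy2(dst, backup)
        shutil.copy2(src, dst)
        replaced.append(dst)
    return replaced
```

The destination would typically be the directory that holds scipy.optimize in the active environment, for example `Path(scipy.optimize.__file__).parent`. Keeping a backup makes it possible to restore the stock files later.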

Installation

This package can be installed using pip:

$ pip install sfs_da

Usage

We provide several Jupyter notebooks demonstrating how to use the sfs-da package in our examples directory.

  • Example for computing $p$-value for Lasso after DA
>> ex0_p_value_lasso_DA.ipynb
  • Example for computing $p$-value for Elastic Net after DA
>> ex1_p_value_elasticnet_DA.ipynb
  • Check the uniformity of the pivot
>> ex2_validity_of_p_value.ipynb
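
The idea behind the uniformity check is that a valid $p$-value is uniformly distributed on [0, 1] under the null hypothesis. The sketch below illustrates this with an ordinary two-sided z-test on simulated data, using only the standard library. It does not call sfs_da or compute the selective $p$-value from the paper, and the function names are hypothetical.

```python
import math
import random


def two_sided_p_value(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def ks_distance_to_uniform(p_values):
    """Kolmogorov-Smirnov distance between the empirical CDF and Uniform(0, 1)."""
    xs = sorted(p_values)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


rng = random.Random(0)
# Simulate test statistics under the null hypothesis (no selection step here).
p_values = [two_sided_p_value(rng.gauss(0.0, 1.0)) for _ in range(2000)]
print(f"KS distance to Uniform(0, 1): {ks_distance_to_uniform(p_values):.4f}")
```

A small distance is consistent with uniformity. For the $p$-values produced by this package, refer to the notebook listed above.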
