This is the authors' Python implementation of the paper 'Online Feature Screening for Data Streams With Concept Drift' by Dr. Mingyuan Wang and Dr. Adrian Barbu. It contains various feature selection methods.

Project description

Online-Feature-Screening-for-Datastream-with-Sparsity-Concept-Drifting

This is the authors' Python implementation of the paper "Online Feature Screening for Data Streams With Concept Drift" by Dr. Mingyuan Wang and Dr. Adrian Barbu.

If you use or build on our method, please cite the paper: doi.org/10.1109/TKDE.2022.3232752

This project enables well-known feature screening methods, including the Gini index, chi-square score, mutual information, Fisher score, and T-score, to handle streaming data, batch data, data with concept drift, and sparse data. It currently works only on binary classification data.
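
To illustrate what screening on a stream can look like, here is a minimal sketch of a T-score kept up to date from running sums, one mini-batch at a time. It is not the package's API: the class name RunningTScore and its methods are hypothetical, the score is the standard Welch t-statistic, and the sketch has no mechanism for forgetting old data under drift, so it may differ from the update rules in the paper.

```python
import numpy as np


class RunningTScore:
    """Per-feature Welch t-score from running sums (illustrative, not the package API)."""

    def __init__(self, n_features):
        # Row 0 holds statistics for the other class, row 1 for label 1.
        self.count = np.zeros(2)
        self.total = np.zeros((2, n_features))
        self.total_sq = np.zeros((2, n_features))

    def update(self, X, y):
        """Fold one mini-batch (X: n x p, y: n labels) into the running sums."""
        X = np.asarray(X, dtype=float)
        is_pos = np.asarray(y) == 1
        for row, mask in ((0, ~is_pos), (1, is_pos)):
            self.count[row] += mask.sum()
            self.total[row] += X[mask].sum(axis=0)
            self.total_sq[row] += (X[mask] ** 2).sum(axis=0)

    def scores(self):
        """Return |mean_1 - mean_0| / sqrt(var_1/n_1 + var_0/n_0) per feature."""
        n = self.count[:, None]
        mean = self.total / n
        # Unbiased variance from sums: (sum(x^2) - n * mean^2) / (n - 1).
        var = (self.total_sq - n * mean ** 2) / (n - 1)
        return np.abs(mean[1] - mean[0]) / np.sqrt(var[1] / n[1] + var[0] / n[0])


# Synthetic binary data in which only feature 2 depends on the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = rng.integers(0, 2, size=200)
X[:, 2] += 2.0 * y

screen = RunningTScore(n_features=5)
for start in range(0, 200, 50):  # feed the data as four mini-batches
    screen.update(X[start:start + 50], y[start:start + 50])

top_feature = int(np.argmax(screen.scores()))
```

Because only counts, sums, and sums of squares are stored, memory use does not grow with the length of the stream, and the scores after the last batch equal those computed on the full data at once.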

Installation

Prerequisites

  • Python 3.10 or newer
  • pip
  • numpy 2.2.4 or newer
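
The prerequisites above can be checked before installing with a short script. This is a sketch using only the standard library; the helper names version_tuple and check_prerequisites are made up for this example, and the version parsing is deliberately simple (it reads only the leading numeric parts).

```python
import re
import sys
from importlib import metadata


def version_tuple(version):
    """Turn a version string such as '2.2.4' or '2.3.0rc1' into a 3-tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def check_prerequisites():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or newer is required")
    try:
        numpy_version = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        problems.append("numpy is not installed")
    else:
        if version_tuple(numpy_version) < (2, 2, 4):
            problems.append(f"numpy {numpy_version} is older than 2.2.4")
    return problems


for problem in check_prerequisites():
    print(problem)
```

The script prints nothing when the Python and numpy versions both meet the stated minimums.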

Note

Although the package is designed to be OS-independent, it has only been tested on Windows. On other systems, you may need to use one of the installation methods listed below instead of pip install pyscreeningfs.

If you install from source (e.g., because no pre-built wheel is available for your system), you will need a C++ compiler compatible with your Python installation:

  • Windows: Microsoft Visual C++ Build Tools (part of Visual Studio, or standalone).
  • Linux: gcc and g++ (usually included or easily installed via your package manager, e.g., sudo apt-get install build-essential).
  • macOS: Xcode Command Line Tools (install with xcode-select --install).

Install via git clone

  1. Clone the repository
git clone https://github.com/yourusername/repo_name.git
  2. Navigate into the cloned repository directory
cd repo_name
  3. Install
pip install .

Install via download

  1. Download the repository
  2. Unpack to your own folder your_folder/repo_name
  3. Navigate into the unpacked repository directory
cd repo_name
  4. Install
pip install .

Install via pip (Currently unavailable)

If pre-built wheels are available for your system on PyPI (coming soon!), you can install directly:

pip install pyscreeningfs

Data

For sparse data in .svm format, download the dataset from https://www.sysnet.ucsd.edu/projects/url/ and place it in data/url_svmlight/

For any input data file, the label (Y/class) vector must contain only numeric values, and one of the labels must be 1.
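
The label requirement can be checked, or met by recoding, before data is passed to the package. The two helpers below are a sketch and not part of the package: check_binary_labels and to_one_vs_other are hypothetical names, and the two-class check is an assumption based on the binary-classification limitation stated earlier.

```python
import numpy as np


def check_binary_labels(y):
    """Validate a label vector: numeric, exactly two classes, one of which is 1."""
    y = np.asarray(y)
    if not np.issubdtype(y.dtype, np.number):
        raise ValueError("labels must be numeric")
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError(f"expected 2 classes, found {classes.size}")
    if 1 not in classes:
        raise ValueError("one of the two labels must be 1")
    return y


def to_one_vs_other(y, positive):
    """Recode arbitrary binary labels so that `positive` becomes 1 and the rest 0."""
    return (np.asarray(y) == positive).astype(int)
```

For example, to_one_vs_other(["spam", "ham", "spam"], "spam") yields the labels 1, 0, 1, which then pass check_binary_labels.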

Demo

For a demo, see testing.py in the root directory.

Download files

Source Distribution

pyscreeningfs-0.1.1.tar.gz (64.7 kB)

Built Distribution

pyscreeningfs-0.1.1-cp310-cp310-win_amd64.whl (89.3 kB, CPython 3.10, Windows x86-64)

File details

Details for the file pyscreeningfs-0.1.1.tar.gz.

File metadata

  • Download URL: pyscreeningfs-0.1.1.tar.gz
  • Size: 64.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for pyscreeningfs-0.1.1.tar.gz
Algorithm Hash digest
SHA256 7fba58c9f71f599abf01c4cb7779a034618491d86d1c62ea91a03dd3d679c075
MD5 ba952f513388a98d1e9c9abbbb207a70
BLAKE2b-256 f263fbc002bc35086667e1985d7636510651c4c620f482bd20e47a2380832659
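
A downloaded file can be compared against the SHA256 digest listed above using Python's standard hashlib module. This is a minimal sketch; the function names sha256_of and matches_published_hash are made up for this example.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 with a published digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling matches_published_hash with the path to the downloaded pyscreeningfs-0.1.1.tar.gz and the SHA256 value listed above returns True only if the file is byte-for-byte identical to the published one.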

File details

Details for the file pyscreeningfs-0.1.1-cp310-cp310-win_amd64.whl.

File hashes

Hashes for pyscreeningfs-0.1.1-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 ccd14a877d1d1fe463b42e07882bada17efb6d5610f84dc51bf4239535b042d7
MD5 75448372721ecfeb4114d3c7d75131d6
BLAKE2b-256 a747d825b03b60402d7f47ec95ef2d99ba930e1ded3bf6a186a80a4a95e8eeb6
