
Naive Feature Selection


NFS: Naive Feature Selection

This package solves the Naive Feature Selection problem described in the paper "Naive Feature Selection: Sparsity in Naive Bayes".

Installation

pip install git+https://github.com/aspremon/NaiveFeatureSelection

Usage

Minimal usage script

The DemoNFS.py script loads the 20 newsgroups text data set from scikit-learn and reports accuracy of Naive Feature Selection, followed by SVC using the selected features.

The package is compatible with scikit-learn's fit/transform paradigm. To demonstrate this, DemoNFS.py runs the same test using the pipeline module from scikit-learn and selects k by cross validation using GridSearchCV from sklearn.model_selection.
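The pipeline and cross validation pattern described above can be sketched as follows. This is only an illustration under stated assumptions, not the contents of DemoNFS.py: the package's class name and constructor are not shown in this description, so scikit-learn's SelectKBest(chi2) is used as a stand-in selector, and the tiny corpus, the step names and the grid of k values are all made up for the example. Any transformer that follows fit/transform and exposes a parameter k should slot into the "select" step in the same way.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Made-up toy corpus standing in for the 20 newsgroups data (assumption).
docs = [
    "nasa launch orbit shuttle",
    "rocket orbit moon mission",
    "satellite launch space station",
    "lunar mission spacecraft orbit",
    "doctor patient treatment disease",
    "drug therapy patient symptoms",
    "cancer treatment medicine doctor",
    "blood disease symptoms diet",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = space, 1 = med

# Vectorize, select k features, then classify on the selected features.
pipe = Pipeline([
    ("vect", TfidfVectorizer()),
    # Stand-in selector (assumption): the NFS transformer would go here.
    ("select", SelectKBest(chi2)),
    ("clf", SVC(kernel="linear")),
])

# Cross validate over the number of selected features k.
# The grid is hypothetical and kept below the vocabulary size of each fold.
search = GridSearchCV(pipe, {"select__k": [2, 4, 8]}, cv=2)
search.fit(docs, labels)

best_k = search.best_params_["select__k"]
print("Best cross validated k:", best_k)
```

Because the selector sits inside the pipeline, it is refit on each training fold, so the held-out fold never influences which features are kept.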

To run the DemoNFS.py script, type:

python DemoNFS.py

This should produce the following output:

Testing NFS ...
Loading 20 newsgroups dataset for categories:
['sci.med', 'sci.space']

Extracting features from the training data using a sparse vectorizer
n_samples: 1187, n_features: 21368

Extracting features from the test data using the same vectorizer
n_samples: 790, n_features: 21368

NFS accuracy:   0.843

Space features:
['aerospace', 'allen', 'ames', 'apollo', 'astronomy', 'billion', 'built', 'centaur', 'comet', 'command', 'commercial', 'cost', 'data', 'dc', 'dryden', 'earth', 'flight', 'funding', 'government', 'gravity', 'jupiter', 'landing', 'launch', 'launched', 'launches', 'lunar', 'mars', 'mary', 'mining', 'mission', 'missions', 'moon', 'nasa', 'orbit', 'orbital', 'pat', 'payload', 'planetary', 'program', 'project', 'proton', 'rocket', 'rockets', 'russian', 'satellite', 'satellites', 'shafer', 'shuttle', 'software', 'solar', 'space', 'spacecraft', 'ssto', 'station', 'sun', 'titan', 'vehicle']

Med features:
['allergic', 'banks', 'blood', 'brain', 'cadre', 'cancer', 'candida', 'chastity', 'diagnosed', 'diet', 'disease', 'diseases', 'doctor', 'doctors', 'drug', 'drugs', 'dsl', 'food', 'foods', 'geb', 'gordon', 'health', 'intellect', 'lyme', 'med', 'medical', 'medicine', 'msg', 'n3jxp', 'pain', 'patient', 'patients', 'pitt', 'seizures', 'shameful', 'skepticism', 'soon', 'surrender', 'symptoms', 'syndrome', 'therapy', 'treatment', 'yeast']

Pipeline accuracy:      0.843

Best cross validated k: 500
