Python implementation of Neighborhood Component Feature Selection (NCFS)
Neighborhood Component Feature Selection
This is a Python implementation of Neighborhood Component Feature Selection, originally introduced in Yang et al. 2012. NCFS is an embedded feature selection method that learns feature weights by maximizing prediction accuracy in a leave-one-out KNN classifier.
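To make the objective concrete, here is a minimal NumPy sketch of the quantity NCFS maximizes, following the formulation in Yang et al. 2012: the summed leave-one-out probability of correct classification, minus an L2 penalty on the weights. The function name `ncfs_objective` and the parameters `sigma` and `reg` are illustrative assumptions, not this package's API, and the package's internals may differ.

```python
import numpy as np


def ncfs_objective(X, y, w, sigma=1.0, reg=1.0):
    """Regularized leave-one-out objective (hypothetical helper, not the package API)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    w = np.asarray(w, dtype=float)

    # Weighted Manhattan distance: D[i, j] = sum_l w_l^2 * |x_il - x_jl|
    abs_diffs = np.abs(X[:, None, :] - X[None, :, :])
    D = abs_diffs @ (w ** 2)

    # Kernel values; zero the diagonal so a sample never picks itself (leave-one-out)
    K = np.exp(-D / sigma)
    np.fill_diagonal(K, 0.0)

    # P[i, j]: probability that sample i selects sample j as its reference point
    P = K / K.sum(axis=1, keepdims=True)

    # Probability that each sample is classified correctly
    same_class = y[:, None] == y[None, :]
    p_correct = (P * same_class).sum(axis=1)

    return p_correct.sum() - reg * np.sum(w ** 2)


# Feature 0 separates the classes; feature 1 is noise.
X_demo = np.array([[0.0, 0.5], [0.1, 0.1], [0.9, 0.4], [1.0, 0.2]])
y_demo = np.array([0, 0, 1, 1])
informative = ncfs_objective(X_demo, y_demo, w=[1.0, 0.0])
noisy = ncfs_objective(X_demo, y_demo, w=[0.0, 1.0])
print(informative > noisy)
```

Putting the weight on the informative feature gives a higher objective than putting it on the noise feature, which is the signal that gradient ascent on the weights exploits.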
Installation
The package can be installed with pip using the following command:
pip install ncfs
Example
from NCFS import NCFS
X, y = NCFS.toy_dataset()
feature_select = NCFS.NCFS()
feature_select.fit(X, y)
# Count the features whose learned weight exceeds 1
print(sum(feature_select.coef_ > 1))
Tests
To compare results to the original paper, run the following command:
python tests/generate_results.py
To perform unit tests ensuring accurate distance calculations, run:
python tests/test_distances.py
Comparison with Original Results
Distance metric
The original paper uses the Manhattan distance when calculating distances between samples. This implementation also defaults to the Manhattan distance, but weights comparable with the published results were only obtained using the Euclidean distance. Although the exact weights differed between the two metrics, the selected features did not. Unfortunately, the original paper did not link to the code used, and I've been unable to find a public implementation of the algorithm.
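As a rough illustration of the two metrics, the sketch below computes a feature-weighted Manhattan distance (squared weights, as in the original paper) and one plausible feature-weighted Euclidean counterpart. The function names and the exact form of the Euclidean variant are assumptions made for this example; the package may define its distances differently.

```python
import numpy as np


def weighted_manhattan(x, z, w):
    """sum_l w_l^2 * |x_l - z_l| (form used in the original paper)."""
    x, z, w = (np.asarray(a, dtype=float) for a in (x, z, w))
    return float(np.sum(w ** 2 * np.abs(x - z)))


def weighted_euclidean(x, z, w):
    """sqrt(sum_l w_l^2 * (x_l - z_l)^2) (assumed form, for illustration only)."""
    x, z, w = (np.asarray(a, dtype=float) for a in (x, z, w))
    return float(np.sqrt(np.sum(w ** 2 * (x - z) ** 2)))


x, z = [0.0, 0.0], [3.0, 4.0]
w = [1.0, 1.0]
print(weighted_manhattan(x, z, w))  # 7.0
print(weighted_euclidean(x, z, w))  # 5.0
```

Because the two metrics scale differently with the weights, the learned weight magnitudes can differ even when the ranking of features stays the same.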
Numerical stability
NCFS uses the original kernel function when calculating probabilities. However, with a large number of features, distances can easily grow large enough that the negative exponential rounds to zero. This leads to division by zero, and fitting fails. To get around this, small pseudocounts are added to distances when a division by zero would otherwise occur. To keep distances small, features should be scaled between 0 and 1 (enforced by NCFS).
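The failure mode and one possible workaround can be sketched as follows. This is an illustration under stated assumptions, not the package's actual code: `min_max_scale` and `loo_probabilities` are hypothetical helpers, and the pseudocount here is applied to the kernel values of rows that underflowed, which is only one way to realize the idea described above.

```python
import numpy as np


def min_max_scale(X):
    """Scale each feature to [0, 1]; constant features map to 0 (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero for constant columns
    return (X - lo) / span


def loo_probabilities(D, sigma=1.0, pseudocount=1e-10):
    """Leave-one-out reference probabilities from a distance matrix D (hypothetical helper)."""
    K = np.exp(-np.asarray(D, dtype=float) / sigma)
    np.fill_diagonal(K, 0.0)

    # Rows where every kernel value underflowed to 0 would divide by zero below
    underflowed = K.sum(axis=1, keepdims=True) == 0.0
    K = np.where(underflowed, K + pseudocount, K)
    np.fill_diagonal(K, 0.0)  # keep the leave-one-out constraint after the shift

    return K / K.sum(axis=1, keepdims=True)


# In float64, exp(-1000) underflows to exactly 0.0
D_large = np.full((3, 3), 1000.0)
np.fill_diagonal(D_large, 0.0)
P = loo_probabilities(D_large)
print(np.isfinite(P).all())
```

With the pseudocount, rows that underflowed fall back to a uniform distribution over the other samples instead of producing NaNs. Scaling features first keeps distances small enough that this fallback is rarely needed.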