A Small Package for Research Use
Dimension Reduction Function Research (drfr)
This package provides a Reduction Model and a Regression Model, which offer several methods for dimension reduction and for regression of data, respectively. It also includes several novelty detection methods for preprocessing.
Description of Each Model
Reduction Model
Supports the methods "NPPE", "UMAP", "LLE", "Hessian", "Spectral", "TSNE", and "Isomap", selected with the keyword argument tag of the function get_reduction().
For the tag "UMAP" to work, umap must be installed separately by following the instructions at https://github.com/lmcinnes/umap.
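As an illustration of how a tag keyword can select an embedding method, the sketch below maps some of the tags above to scikit-learn estimators. It is not drfr's implementation: the helper reduce_by_tag and its n_neighbors and n_components parameters are hypothetical, and "NPPE" and "UMAP" are left out because scikit-learn does not provide them.

```python
from sklearn import datasets, manifold


def reduce_by_tag(X, tag, n_neighbors=12, n_components=2):
    """Hypothetical helper: pick a scikit-learn manifold method by tag."""
    # Each entry is a factory, so only the selected estimator is constructed.
    builders = {
        "LLE": lambda: manifold.LocallyLinearEmbedding(
            n_neighbors=n_neighbors, n_components=n_components, method="standard"),
        "Hessian": lambda: manifold.LocallyLinearEmbedding(
            n_neighbors=n_neighbors, n_components=n_components,
            method="hessian", eigen_solver="dense"),
        "Spectral": lambda: manifold.SpectralEmbedding(
            n_neighbors=n_neighbors, n_components=n_components),
        "TSNE": lambda: manifold.TSNE(n_components=n_components),
        "Isomap": lambda: manifold.Isomap(
            n_neighbors=n_neighbors, n_components=n_components),
    }
    if tag not in builders:
        raise ValueError("unknown tag: " + repr(tag))
    return builders[tag]().fit_transform(X)


# Embed a small swiss roll into two dimensions.
X, color = datasets.make_swiss_roll(n_samples=300, noise=0.001, random_state=0)
Y = reduce_by_tag(X, tag="Isomap")
print(Y.shape)  # (300, 2)
```

Storing factories rather than estimator instances keeps the lookup cheap and makes it easy to add further tags.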
Regression Model
Supports the methods "lasso", "ridge", and "MARS", selected with the keyword argument tag of the function cal_regression(). The basis generator can be either one of the functions in BasisGenerator or a self-made function whose only positional argument is the data X.
Basis Generator
Contains several functions that serve as basis generators, each of the form
# generate_basis_function(X, p=basis_degree)
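A self-made basis generator following this form might look like the sketch below. The names generate_polynomial and ridge_fit_predict are hypothetical and not part of drfr; the ridge step is a plain closed-form solve in NumPy, included only to show how a basis matrix and regression weights fit together.

```python
import numpy as np


def generate_polynomial(X, p=2):
    """Hypothetical basis generator: columns [1, X, X**2, ..., X**p]."""
    X = np.asarray(X, dtype=float)
    columns = [np.ones((X.shape[0], 1))]  # constant term
    for degree in range(1, p + 1):
        columns.append(X ** degree)  # element-wise powers, no cross terms
    return np.hstack(columns)


def ridge_fit_predict(X, y, basis_generator, alpha=1e-8, **basis_kwargs):
    """Fit ridge weights w on basis(X) and return basis(X) @ w."""
    B = basis_generator(X, **basis_kwargs)
    # Closed-form ridge solution: w = (B^T B + alpha * I)^{-1} B^T y
    w = np.linalg.solve(B.T @ B + alpha * np.eye(B.shape[1]), B.T @ y)
    return B @ w


# X is the only positional argument of the generator; p is passed by keyword.
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 0] ** 2
y_hat = ridge_fit_predict(X, y, generate_polynomial, p=2)
print(generate_polynomial(X, p=2).shape)  # (50, 3)
```

Because the target here is itself a degree-2 polynomial, y_hat reproduces y almost exactly when alpha is small.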
Novelty Detector
The detector contains reimplementations of kernel PCA, diffusion map, and robust PCA (as used in Robust Hessian LLE). More methods can be found in the package pyod by Y. Zhao et al.
Use the argument method to choose a method. The available options are kpca, dmap, pca, lof, ocsvm, iforest, rforest, and rbhessian.
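To show how novelty scores can be turned into an inlier selection, here is a minimal sketch that scores points by their ordinary PCA reconstruction error and keeps those below a fraction of the maximum score. It is an assumption-laden stand-in, not drfr's robust PCA or any of its other detectors; the function pca_reconstruction_scores is hypothetical.

```python
import numpy as np


def pca_reconstruction_scores(X, n_components=2):
    """Hypothetical score: distance of each point to the top PCA subspace."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    residual = Xc - Xc @ V @ V.T
    return np.linalg.norm(residual, axis=1)


rng = np.random.default_rng(0)
# 200 points close to the plane z = 0, plus one point far away from it.
X = np.column_stack([rng.normal(size=(200, 2)), 0.01 * rng.normal(size=200)])
X = np.vstack([X, [0.0, 0.0, 5.0]])

scores = pca_reconstruction_scores(X, n_components=2)
outlier_quote = 0.8  # keep points scoring below 80% of the maximum score
inlier_ind = np.argwhere(scores < outlier_quote * scores.max()).flatten()
X_clean = X[inlier_ind]
print(X_clean.shape)  # (200, 3)
```

A threshold relative to the maximum score is simple but sensitive to a single extreme point; a quantile-based cut is a common alternative.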
Installation
pip install drfr
Usage
from drfr import ReductionModel, BasisGenerator, RegressionModel, NoveltyDetector
from sklearn import datasets
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
if __name__ == "__main__":
    N = 2000
    k = 24
    # samples_generator has been removed from recent scikit-learn releases;
    # make_swiss_roll is available directly from sklearn.datasets
    X, color = datasets.make_swiss_roll(n_samples=N, noise=0.001)
    basis_generator = None
    outlier_quote = 0.8
    poly_degree = 4
    tag_red = "NPPE"
    tag_reg = "MARS"

    # preprocessing: keep only points whose novelty score is below the threshold
    scores = NoveltyDetector.evaluate_novelty(X, labels=color, method="pca")
    inlier_ind = np.argwhere(scores < outlier_quote * scores.max()).flatten()
    X = X[inlier_ind]
    color = color[inlier_ind]

    # compute embedded result
    red_model = ReductionModel.ReductionModel()
    y_nppe = red_model.get_reduction(X, tag=tag_red)

    # compute regression weights w from X and y, then evaluate basis(X) * w
    reg_model = RegressionModel.RegressionModel()
    y_reg = reg_model.cal_regression(X, y_nppe, tag=tag_reg,
                                     basis_generator=BasisGenerator.generate_fourier,
                                     p=poly_degree)

    # draw results
    fig = plt.figure()
    ax = fig.add_subplot(311, projection='3d')
    ax.scatter(X[:, 1], X[:, 0], X[:, 2], c=color, cmap=plt.cm.Spectral)
    ax.set_title("Original data")

    ax = fig.add_subplot(312)
    ax.scatter(y_nppe[:, 1], y_nppe[:, 0], c=color, cmap=plt.cm.Spectral)
    plt.axis('tight')
    plt.xticks([])
    plt.yticks([])
    plt.title("Projected data with method " + tag_red)

    ax = fig.add_subplot(313)
    ax.scatter(y_reg[:, 1], y_reg[:, 0], c=color, cmap=plt.cm.Spectral)
    plt.axis('tight')
    plt.xticks([])
    plt.yticks([])
    plt.title(tag_red + " embedded data regressed by " + tag_reg + " model\n"
              + "with basis degree " + str(poly_degree))
    plt.show()