A Small Package for Research Use

Project description

Dimension Reduction Function Research (drfr)

This package provides a Reduction Model and a Regression Model, which offer several methods for dimension reduction and for regression of data, respectively. It also contains several novelty detection methods for preprocessing.

Description of Each Model

Reduction Model

Supports the methods "NPPE", "UMAP", "LLE", "Hessian", "Spectral", "TSNE", and "Isomap", selected through the keyword argument tag of the function get_reduction(). For the tag "UMAP" to work properly, UMAP must be installed following the instructions at https://github.com/lmcinnes/umap.
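To get a feel for what several of these tags refer to, the sketch below runs the scikit-learn estimators of the same name on a small swiss roll. It is an illustration under assumptions: the tag-to-estimator table is hypothetical and does not show how drfr implements get_reduction(), and "NPPE", "UMAP", and "TSNE" are left out.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding, SpectralEmbedding

# Small swiss roll so the dense eigen-solvers stay fast.
X, color = make_swiss_roll(n_samples=400, noise=0.0, random_state=0)

# Hypothetical tag -> estimator table (an assumption, not drfr's internals).
estimators = {
    "LLE": LocallyLinearEmbedding(n_neighbors=10, n_components=2,
                                  method="standard", eigen_solver="dense"),
    "Hessian": LocallyLinearEmbedding(n_neighbors=10, n_components=2,
                                      method="hessian", eigen_solver="dense"),
    "Spectral": SpectralEmbedding(n_components=2, n_neighbors=10,
                                  random_state=0),
    "Isomap": Isomap(n_neighbors=10, n_components=2),
}

# Each estimator maps the 3-D points to a 2-D embedding.
embeddings = {tag: est.fit_transform(X) for tag, est in estimators.items()}
for tag, Y in embeddings.items():
    print(tag, Y.shape)
```

Every embedding has one row per input sample and two columns, which is the shape the usage example further down indexes as y_nppe[:, 0] and y_nppe[:, 1].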

Regression Model

Supports the methods "lasso", "ridge", and "MARS", selected through the keyword argument tag of the function cal_regression(). The basis generator can be either one of the functions in BasisGenerator or a self-made function whose only positional argument is the data X.
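The idea of regressing on basis-expanded data can be sketched with plain NumPy. The following is a minimal closed-form ridge fit, assuming a basis generator that takes X as its only positional argument. The names ridge_on_basis and generate_monomials are hypothetical and are not part of drfr; the package's own solvers may differ.

```python
import numpy as np

def ridge_on_basis(X, y, basis_generator, alpha=1e-3, **basis_kwargs):
    """Fit ridge weights on basis(X) and return (weights, fitted values)."""
    Phi = basis_generator(X, **basis_kwargs)        # (n_samples, n_features)
    A = Phi.T @ Phi + alpha * np.eye(Phi.shape[1])  # regularized normal matrix
    w = np.linalg.solve(A, Phi.T @ y)               # solve A w = Phi^T y
    return w, Phi @ w

def generate_monomials(X, p=2):
    """Hypothetical self-made basis: bias column plus element-wise powers 1..p."""
    X = np.asarray(X, dtype=float)
    powers = [X ** d for d in range(1, p + 1)]
    return np.hstack([np.ones((X.shape[0], 1))] + powers)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
# Two-dimensional target that lies exactly in the span of the basis.
y = np.column_stack([X[:, 0] ** 2 - X[:, 1], 2.0 * X[:, 2]])

w, y_fit = ridge_on_basis(X, y, generate_monomials, alpha=1e-8, p=2)
max_err = np.abs(y_fit - y).max()
print(w.shape, max_err)
```

Because the target is built from the same monomials, the fitted values reproduce it almost exactly; with real embeddings the residual depends on the chosen basis and degree.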

Basis Generator

Contains several functions that serve as basis generators, each of the form

 # generate_basis_function(X, p=basis_degree)
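A self-made generator only has to follow the same calling convention. The sketch below is one hypothetical example built from sine and cosine features; the name generate_sincos and its exact feature layout are assumptions for illustration and need not match any function shipped in BasisGenerator.

```python
import numpy as np

def generate_sincos(X, p=2):
    """Hypothetical basis generator: [1, sin(kX), cos(kX)] for k = 1..p."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                       # treat 1-D input as one feature
    columns = [np.ones((X.shape[0], 1))]     # bias column
    for k in range(1, p + 1):
        columns.append(np.sin(k * X))
        columns.append(np.cos(k * X))
    return np.hstack(columns)

# 5 samples with 3 identical features, running from 0 to 1.
X = np.linspace(0.0, 1.0, 5).reshape(-1, 1) * np.ones((1, 3))
Phi = generate_sincos(X, p=4)
print(Phi.shape)  # (5, 25): 1 bias column + 2 * 4 * 3 trigonometric columns
```

A function like this could then be passed as the basis_generator keyword argument, with p forwarded as the basis degree.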

Novelty Detector

The detector contains reimplementations of kernel PCA, diffusion maps, and robust PCA (as used in Robust Hessian LLE). More methods can be found in the package pyod by Y. Zhao et al. Use the argument method to choose among kpca, dmap, pca, lof, ocsvm, iforest, rforest, and rbhessian.

Installation
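As a rough illustration of PCA-based novelty scoring, the sketch below scores each sample by its reconstruction error after projecting onto the leading principal components, then keeps the samples whose score is below a fraction of the maximum. This is plain PCA written with NumPy under stated assumptions, not the package's robust PCA or its evaluate_novelty() implementation; pca_reconstruction_scores is a hypothetical helper.

```python
import numpy as np

def pca_reconstruction_scores(X, n_components=2):
    """Distance of each centered sample from its projection onto the top PCs."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                  # (n_features, n_components)
    residual = Xc - Xc @ V @ V.T
    return np.linalg.norm(residual, axis=1)

rng = np.random.default_rng(0)
# Points close to a 2-D plane embedded in 3-D.
plane_basis = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]])
X = rng.normal(size=(200, 2)) @ plane_basis
X += 0.01 * rng.normal(size=X.shape)
# Push one sample off the plane along its normal direction.
X[0] += 3.0 * np.array([-0.5, 0.5, 1.0])

scores = pca_reconstruction_scores(X, n_components=2)
outlier_quote = 0.8
inlier_ind = np.argwhere(scores < outlier_quote * scores.max()).flatten()
print(scores.argmax(), inlier_ind.size)
```

Thresholding relative to scores.max() always discards at least the highest-scoring sample, so outlier_quote controls how aggressively the preprocessing step trims the data.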

pip install drfr

Usage

from drfr import ReductionModel, BasisGenerator, RegressionModel, NoveltyDetector
from sklearn import datasets
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

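if __name__ == "__main__":
    N = 2000
    k = 24
    # make_swiss_roll is exposed directly on sklearn.datasets;
    # the samples_generator submodule was removed in scikit-learn 0.24
    X, color = datasets.make_swiss_roll(n_samples=N, noise=0.001)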
    basis_generator = None
    outlier_quote = 0.8
    poly_degree = 4
    tag_red = "NPPE"
    tag_reg = "MARS"

    # preprocessing
    scores = NoveltyDetector.evaluate_novelty(X, labels=color, method="pca")
    inlier_ind = np.argwhere(scores < outlier_quote * scores.max()).flatten()
    X = X[inlier_ind]
    color = color[inlier_ind]

    # compute embedded result
    red_model = ReductionModel.ReductionModel()
    y_nppe = red_model.get_reduction(X, tag=tag_red)

    # compute regression weights w given X and y, then compute basis(X) * w
    reg_model = RegressionModel.RegressionModel()
    y_reg = reg_model.cal_regression(X, y_nppe, tag=tag_reg, basis_generator=BasisGenerator.generate_fourier,
                                     p=poly_degree)

    # draw results
    fig = plt.figure()
    ax = fig.add_subplot(311, projection='3d')
    ax.scatter(X[:, 1], X[:, 0], X[:, 2], c=color, cmap=plt.cm.Spectral)

    ax.set_title("Original data")
    ax = fig.add_subplot(312)
    ax.scatter(y_nppe[:, 1], y_nppe[:, 0], c=color, cmap=plt.cm.Spectral)
    plt.axis('tight')
    plt.xticks([]), plt.yticks([])
    plt.title('Projected data with method ' + tag_red)
    ax = fig.add_subplot(313)
    ax.scatter(y_reg[:, 1], y_reg[:, 0], c=color, cmap=plt.cm.Spectral)
    plt.axis('tight')
    plt.xticks([]), plt.yticks([])
    plt.title(tag_red + " embedded data regressed by " + tag_reg + " model\n" + "with basis degree " + str(poly_degree))
    plt.show()

Download files

Files for drfr, version 0.9.6
drfr-0.9.6-py3-none-any.whl (19.2 kB), file type: Wheel, Python version: py3
