
A package for fitting principal curves in Python

prinPy

pip install prinpy

Inspired by this R package, prinPy brings principal curves to Python.

What prinPy does

PrinPy has local and global algorithms for computing principal curves.

What is a Principal Curve?

A principal curve is a smooth n-dimensional curve that passes through the middle of a dataset. Principal curves are a dimensionality reduction tool analogous to a nonlinear principal component. They have uses in GPS data, image recognition, bioinformatics, and other fields.

Local Algorithms

Local algorithms work step by step. Starting at one end of the curve, a local algorithm attempts to build segments that meet an acceptable error threshold as it moves toward the other end. Once it can connect the current point to the end point, the algorithm terminates and a curve is interpolated through the segments. PrinPy currently has two local algorithms:

  1. CLPC-g (Greedy Constraint Local Principal Curve) [1]
  2. CLPC-s (One-Dimensional Search Constraint Local Principal Curve) [1]
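
The step-by-step idea can be illustrated with a small greedy sketch. This is not prinPy's implementation and not the CLPC-g algorithm of [1]; it is a simplified stand-in that assumes the points are already ordered along the curve. The names `point_segment_distance` and `greedy_segments` are hypothetical helpers introduced only for this example.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (all 2-D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    # Parameter of the orthogonal projection, clamped to the segment.
    t = 0.0 if seg_len2 == 0 else ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def greedy_segments(points, e_max):
    """Return the indices of the vertices of a greedy piecewise-linear fit."""
    vertices = [0]
    current, last = 0, len(points) - 1
    while current < last:
        nxt = current + 1
        # Try the farthest candidate first: once the end point can be
        # connected within e_max, the walk terminates.
        for candidate in range(last, current, -1):
            err = max(
                (point_segment_distance(points[k], points[current], points[candidate])
                 for k in range(current + 1, candidate)),
                default=0.0,
            )
            if err <= e_max:
                nxt = candidate
                break
        vertices.append(nxt)
        current = nxt
    return vertices

# An L-shaped, slightly noisy trace (assumed ordered along the curve).
points = [(0, 0), (1, 0.05), (2, -0.05), (3, 0), (3, 1), (3, 2), (3, 3)]
vertices = greedy_segments(points, e_max=0.1)
print(vertices)  # -> [0, 3, 6]
```

A smaller `e_max` yields more, shorter segments; a larger one yields fewer segments that follow the data less closely.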

CLPC-g is faster and is fine for simpler curves. CLPC-s can be much more accurate on more difficult curves, at the expense of speed. After fitting a curve, prinPy can project points onto the curve.
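
As a rough picture of what projecting onto a curve means, the sketch below finds the closest point on a piecewise-linear curve. It is an illustration under the assumption that the curve is approximated by a polyline; `project_to_polyline` is a hypothetical helper, not part of prinPy's API.

```python
def project_to_polyline(point, vertices):
    """Return (closest_point, squared_distance) of `point` on the polyline."""
    px, py = point
    best = None
    for (ax, ay), (bx, by) in zip(vertices, vertices[1:]):
        dx, dy = bx - ax, by - ay
        seg_len2 = dx * dx + dy * dy
        # Orthogonal projection onto the segment, clamped to its end points.
        t = 0.0 if seg_len2 == 0 else ((px - ax) * dx + (py - ay) * dy) / seg_len2
        t = max(0.0, min(1.0, t))
        qx, qy = ax + t * dx, ay + t * dy
        d2 = (px - qx) ** 2 + (py - qy) ** 2
        if best is None or d2 < best[1]:
            best = ((qx, qy), d2)
    return best

curve = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
closest, dist2 = project_to_polyline((0.5, 0.3), curve)
print(closest)  # -> (0.5, 0.0)
```

The projected point, or its position along the curve, is the one-dimensional summary of the original sample.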

Global Algorithms

Global algorithms, unlike local algorithms, are more like minimization problems. Given a dataset, a global algorithm might make an initial guess at a principal curve and adjust it from there.

The sole global algorithm as of now performs nonlinear principal component analysis. The global algorithm, called NLPCA in this package, is a neural network implementation [2]. This algorithm works by creating an autoassociative neural network with a "bottleneck" layer, which forces the network to learn the most important features of the data.
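
The bottleneck idea can be sketched in plain NumPy. This is not prinPy's NLPCA class; it is a minimal autoassociative network trained by gradient descent, and the layer sizes, learning rate, toy data, and iteration count are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: noisy points along a parabola, standardized per feature.
t = rng.uniform(-1.0, 1.0, size=200)
X = np.column_stack([t, t ** 2]) + rng.normal(scale=0.05, size=(200, 2))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# 2 inputs -> 8 mapping units -> 1 bottleneck unit -> 8 demapping units -> 2 outputs.
sizes = [2, 8, 1, 8, 2]
W = [rng.normal(scale=0.5, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
b = [np.zeros(n) for n in sizes[1:]]

def forward(X):
    """Return activations of every layer; tanh on mapping/demapping layers only."""
    a1 = np.tanh(X @ W[0] + b[0])   # mapping layer
    z = a1 @ W[1] + b[1]            # linear bottleneck: one score per sample
    a3 = np.tanh(z @ W[2] + b[2])   # demapping layer
    out = a3 @ W[3] + b[3]          # linear reconstruction of the input
    return a1, z, a3, out

def mse(X):
    return float(np.mean((forward(X)[3] - X) ** 2))

initial_loss = mse(X)
lr = 0.05
for _ in range(2000):
    a1, z, a3, out = forward(X)
    # Backpropagate the mean squared reconstruction error.
    d_out = 2.0 * (out - X) / X.size
    d_a3 = (d_out @ W[3].T) * (1.0 - a3 ** 2)
    d_z = d_a3 @ W[2].T
    d_a1 = (d_z @ W[1].T) * (1.0 - a1 ** 2)
    grads_W = [X.T @ d_a1, a1.T @ d_z, z.T @ d_a3, a3.T @ d_out]
    grads_b = [d_a1.sum(axis=0), d_z.sum(axis=0), d_a3.sum(axis=0), d_out.sum(axis=0)]
    for i in range(4):
        W[i] -= lr * grads_W[i]
        b[i] -= lr * grads_b[i]

final_loss = mse(X)
scores = forward(X)[1]  # shape (200, 1): the bottleneck value of each sample
```

Because every sample must pass through a single bottleneck unit, the network can only reconstruct the input by learning a one-dimensional parameterization of the data, which plays the role of the principal curve.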

Which one should I use?

The local algorithms will be better for tightly bunched data, such as digit recognition or GPS data. The global algorithm is better suited for "clouds" of data or sparsely represented data.

Quick-Start

View the quickstart notebook here. Docs will be coming soon!

# Example of local PC fitting
import numpy as np
import scipy.interpolate
from prinpy.local import CLPCG  # module path is an assumption; check the quickstart notebook

cl = CLPCG()  # create the solver

# CLPCG.fit() fits the principal curve. It takes x_data, y_data, and
# e_max, the maximum allowed error for each step. e_max is found
# through trial and error, but the authors recommend 1/4 to 1/2 of
# the data error.
cl.fit(xdata, ydata, e_max=.1)
cl.plot()  # plots the curve; optional axes can be passed

# Reconstruct the curve
tcks = cl.spline_ticks  # get spline ticks
xy = scipy.interpolate.splev(np.linspace(0, 1, 100), tcks)
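
For readers unfamiliar with `splev`, the following standalone sketch shows the kind of object it expects. It assumes that the spline ticks are a `(t, c, k)` tuple in the format produced by `scipy.interpolate.splprep`; that is an assumption about prinPy, and the vertex coordinates below are made-up illustration data.

```python
import numpy as np
from scipy.interpolate import splprep, splev

# Hypothetical segment vertices, standing in for the output of a local fit.
vx = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
vy = np.array([0.0, 0.8, 0.9, 0.1, -0.7])

# s=0 forces the parametric spline through every vertex; the curve
# parameter u runs from 0 at the first vertex to 1 at the last.
tck, u = splprep([vx, vy], s=0)

# Evaluate the curve at 100 evenly spaced parameter values.
x_new, y_new = splev(np.linspace(0, 1, 100), tck)
```

`splev` returns one array per coordinate, so `x_new` and `y_new` can be passed directly to a plotting call.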

References

[1] Dewang Chen, Jiateng Yin, Shiying Yang, Lingxi Li, Peter Pudney, Constraint local principal curve: Concept, algorithms and applications, Journal of Computational and Applied Mathematics, Volume 298, 2016, Pages 222-235, ISSN 0377-0427, https://doi.org/10.1016/j.cam.2015.11.041.

[2] Mark A. Kramer, Nonlinear principal component analysis using autoassociative neural networks, AIChE Journal, Volume 37, Issue 2, 1991, Pages 233-243.
