A package for fitting principal curves in Python

princurve

pip install princurve

Inspired by this R package, princurve brings principal curves to Python.

What princurve does

princurve has local and global algorithms for computing principal curves.

What is a Principal Curve?

A principal curve is a smooth one-dimensional curve that passes through the middle of an n-dimensional dataset. Principal curves are a dimensionality reduction tool, analogous to a nonlinear principal component. They have uses in GPS data processing, image recognition, bioinformatics, and more.

Local Algorithms

Local algorithms work step by step. Starting at one end of the curve, a local algorithm builds segments that each meet an acceptable error threshold as it moves toward the other end. Once the algorithm can connect the current point to the end point, it terminates and a curve is interpolated through the segments. princurve currently has two local algorithms:

  1. CLPC-g (Greedy Constraint Local Principal Curve) [1]
  2. CLPC-s (One-Dimensional Search Constraint Local Principal Curve) [1]

CLPC-g is faster and is fine for simpler curves. CLPC-s has the potential to be much more accurate on more difficult curves, at the expense of speed. After fitting a curve, princurve can project points onto the curve.
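To make the step-by-step idea concrete, here is a minimal sketch of greedy segment building under an error threshold. It is not the CLPC-g algorithm from [1] and not princurve's implementation: it assumes the points are already ordered along the curve, and the names greedy_segments and point_segment_distance are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    ab = b - a
    denom = float(ab @ ab)
    if denom == 0.0:
        return float(np.linalg.norm(p - a))
    # Clamp the projection parameter so the foot stays on the segment.
    t = np.clip(((p - a) @ ab) / denom, 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def greedy_segments(points, e_max):
    """Return vertex indices of a polyline through ordered points.

    Starting at the first point, each segment is extended as far as
    possible while every skipped point stays within e_max of it.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    vertices = [0]
    start = 0
    while start < n - 1:
        end = start + 1
        for cand in range(start + 2, n):
            worst = max(
                point_segment_distance(points[k], points[start], points[cand])
                for k in range(start + 1, cand)
            )
            if worst > e_max:
                break  # the segment would violate the threshold
            end = cand
        vertices.append(end)
        start = end
    return vertices

# An L-shaped path needs one vertex at the corner.
corner = [[0, 0], [1, 0], [2, 0], [2, 1], [2, 2]]
print(greedy_segments(corner, e_max=0.1))
```

A smaller e_max yields more, shorter segments; a larger one lets each segment skip over more points. A spline can then be interpolated through the selected vertices.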

Global Algorithms

Global algorithms, unlike local algorithms, are more like minimization problems. Given a dataset, a global algorithm might make an initial guess at a principal curve and adjust it from there.

The sole global algorithm as of now performs nonlinear principal component analysis. This algorithm, called NLPCA in this package, is a neural network implementation [2]. It works by creating an autoassociative neural network with a "bottleneck" layer, which forces the network to learn the most important features of the data.
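The bottleneck idea can be sketched in plain NumPy. The following is an illustration only, not princurve's NLPCA class: the layer width, learning rate, iteration count, and toy data are all assumptions chosen for the example. The network maps 2-D inputs through a tanh layer to a single bottleneck unit and back out again, and is trained by gradient descent to reproduce its own input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (an assumption for this sketch): a noisy, centred parabola in 2-D.
t = rng.uniform(-1.0, 1.0, size=200)
X = np.column_stack([t, t**2 - 1.0 / 3.0]) + rng.normal(scale=0.05, size=(200, 2))

n, d, h = X.shape[0], X.shape[1], 8  # h: width of the mapping/demapping layers

def init(rows, cols):
    return rng.normal(scale=0.5, size=(rows, cols)), np.zeros(cols)

(W1, b1), (W2, b2), (W3, b3), (W4, b4) = init(d, h), init(h, 1), init(1, h), init(h, d)

def forward(X):
    H1 = np.tanh(X @ W1 + b1)  # mapping layer
    Z = H1 @ W2 + b2           # one-unit bottleneck
    H2 = np.tanh(Z @ W3 + b3)  # demapping layer
    Y = H2 @ W4 + b4           # reconstruction of the input
    return H1, Z, H2, Y

lr, losses = 0.05, []
for _ in range(2000):
    H1, Z, H2, Y = forward(X)
    losses.append(float(np.mean((Y - X) ** 2)))
    # Backpropagate the mean squared reconstruction error.
    dY = 2.0 * (Y - X) / (n * d)
    dA3 = (dY @ W4.T) * (1.0 - H2**2)
    dZ = dA3 @ W3.T
    dA1 = (dZ @ W2.T) * (1.0 - H1**2)
    grads = [(W4, b4, H2.T @ dY, dY.sum(0)), (W3, b3, Z.T @ dA3, dA3.sum(0)),
             (W2, b2, H1.T @ dZ, dZ.sum(0)), (W1, b1, X.T @ dA1, dA1.sum(0))]
    for W, b, dW, db in grads:
        W -= lr * dW  # in-place, so forward() sees the updated weights
        b -= lr * db

_, Z, _, Y = forward(X)
print(f"reconstruction MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Because every input must pass through the single bottleneck value in Z, that value acts as a position along a one-dimensional curve, and the demapping half of the network traces the curve back in data space.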

Which one should I use?
The local algorithms are better for tightly bunched data, such as digit recognition or GPS data. The global algorithm is better suited to "clouds" of data or sparsely represented data.

Quick-Start

View the quickstart notebook here. Docs will be coming soon!

# Example of local PC fitting
import numpy as np
import scipy.interpolate
# CLPCG comes from princurve; its import path is not shown in this README.

cl = CLPCG()  # Create solver

# CLPCG.fit() fits the principal curve. It takes x_data, y_data, and
# e_max, the maximum allowed error for each step. e_max is found through
# trial and error, but 1/4 to 1/2 of the data error is what the authors
# recommend.
cl.fit(xdata, ydata, e_max=0.1)
cl.plot()  # Plots the curve; optional axes can be passed

# Reconstruct curve
tcks = cl.spline_ticks  # Get spline ticks
xy = scipy.interpolate.splev(np.linspace(0, 1, 100), tcks)
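Since the snippet above passes spline_ticks to scipy.interpolate.splev, a fitted curve can presumably be handled like any SciPy parametric spline. The sketch below shows one simple way to project points onto such a spline by dense sampling. It does not use princurve's own projection method: project_to_spline is a hypothetical helper, and the quarter circle built with splprep is a stand-in for a fitted curve.

```python
import numpy as np
from scipy.interpolate import splprep, splev

def project_to_spline(tck, points, n_samples=1000):
    """Approximate the projection of each point onto a parametric spline.

    The spline is sampled densely and the nearest sample is returned, so
    the result is only accurate to the sampling resolution.
    """
    u = np.linspace(0.0, 1.0, n_samples)
    curve = np.column_stack(splev(u, tck))  # shape (n_samples, dim)
    points = np.atleast_2d(np.asarray(points, dtype=float))
    # Pairwise distances between query points and curve samples.
    d = np.linalg.norm(points[:, None, :] - curve[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    return u[nearest], curve[nearest], d[np.arange(len(points)), nearest]

# Stand-in curve: an interpolating spline through a quarter circle.
theta = np.linspace(0.0, np.pi / 2, 20)
tck, _ = splprep([np.cos(theta), np.sin(theta)], s=0)

u_proj, xy_proj, dist = project_to_spline(tck, [[2.0, 0.0], [0.0, 0.5]])
print(u_proj, dist)
```

Raising n_samples tightens the approximation at the cost of memory, since the distance matrix has one entry per point and sample pair.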

References

[1] Dewang Chen, Jiateng Yin, Shiying Yang, Lingxi Li, Peter Pudney, Constraint local principal curve: Concept, algorithms and applications, Journal of Computational and Applied Mathematics, Volume 298, 2016, Pages 222-235, ISSN 0377-0427, https://doi.org/10.1016/j.cam.2015.11.041.

[2] Mark A. Kramer, Nonlinear principal component analysis using autoassociative neural networks, AIChE Journal, Volume 37, Issue 2, 1991, Pages 233-243.

[3] Hastie, T. and Stuetzle, W., Principal Curves, JASA, Vol. 84, No. 406 (Jun., 1989), pp. 502-516, DOI: 10.2307/2289936
