
The official implementation for CAMEL: Curvature Augmented Manifold Embedding and Learning


.. -*- mode: rst -*-

.. image:: docs/Camel_logo.png
   :width: 600
   :alt: CAMELlogo
   :align: center

|pypi_version|_

.. |pypi_version| image:: https://img.shields.io/pypi/v/camel-learn.svg
.. _pypi_version: https://pypi.python.org/pypi/camel-learn/

#################################################################
Curvature Augmented Manifold Embedding and Learning -- CAMEL
#################################################################

CAMEL is a Python tool for dimension reduction and data visualization. It can perform unsupervised, supervised, semi-supervised, metric, and inverse learning.


Theory and Reference

Detailed derivations and examples can be found in the arXiv paper: https://arxiv.org/abs/2403.14813

Detailed documentation and examples can be found at https://camel-learn.readthedocs.io/en/latest/


Installing

CAMEL Requirements:

  • Python 3.6 or greater
  • numpy
  • scikit-learn
  • numba
  • annoy
  • pandas

Recommended packages:

  • For plotting
    • matplotlib
  • For metrics evaluation
    • gap statistics, coranking, optics

Install Options

.. code:: bash

 pip install camel-learn

If pip has difficulty resolving the dependencies, install them manually first, for example with Anaconda. The author has tested an Anaconda setup on macOS 14 with M1 and M2 CPUs.
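
To see which of the required packages are still missing before installing them manually, a small check such as the following may help. It is only a sketch: the helper ``missing_dependencies`` is a hypothetical name and not part of CAMEL, and the list uses the usual import names of the requirements above (scikit-learn is imported as ``sklearn``).

.. code:: python

    import importlib.util

    # Import names of the required packages listed above.
    # Note that scikit-learn is imported as "sklearn".
    REQUIRED_MODULES = ["numpy", "sklearn", "numba", "annoy", "pandas"]


    def missing_dependencies(modules=REQUIRED_MODULES):
        """Return the module names that cannot be found in this environment."""
        return [name for name in modules if importlib.util.find_spec(name) is None]


    missing = missing_dependencies()
    if missing:
        print("Missing packages (by import name):", ", ".join(missing))
    else:
        print("All required packages are importable.")

Any name reported as missing can then be installed with Anaconda or pip before running ``pip install camel-learn`` again.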


How to use CAMEL

The camel package was inspired by and builds on several dimension reduction packages, such as UMAP, TriMAP, and PaCMAP, which follow the conventions of scikit-learn classes. The CAMEL API therefore uses a similar calling format.

  1. There is only one class, ``CAMEL()``.
  2. ``fit(X, y)`` and ``fit_transform(X, y)`` train the model and embed the data. ``X`` is the input feature data and ``y`` is the input label data. ``y`` is optional and may contain missing (NaN) values. These methods are mainly used for unsupervised, supervised, and semi-supervised learning.
  3. ``transform(Xnew, basis)`` embeds new test data ``Xnew`` using a model constructed from the basis dataset. The basis data is optional. This method is mainly used for metric learning, where the metric model has already been learned from training data, whether by supervised, unsupervised, or semi-supervised learning.
  4. ``inverse_transform(ynews, X, y)`` performs inverse embedding, that is, dimension augmentation from low to high dimensions. It assumes a forward embedding built from training data ``X`` (basis features) and ``y`` (the embedding of those basis features). The process can then be reversed to construct a feature-space vector for a new, unseen point in the low-dimensional space. This is analogous to generating samples from a latent space in a generative model.
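
The calling pattern in items 2 and 3 can be sketched as a small helper. This is an illustration under stated assumptions, not code from the package: ``embed_train_and_test`` is a hypothetical name, and ``reducer`` is assumed to be any object that behaves like ``CAMEL()`` as described above. The optional arguments (``y`` and the basis) are left out when not needed, so the sketch does not depend on their default values.

.. code:: python

    def embed_train_and_test(reducer, X_train, X_test, y_train=None):
        """Fit on training data, then embed unseen data with the learned model.

        `reducer` is assumed to expose fit_transform(X, y) and transform(Xnew, basis)
        as described in the list above; y and basis are optional there.
        """
        if y_train is None:
            # Unsupervised: only the features are used.
            train_embedding = reducer.fit_transform(X_train)
        else:
            # Supervised or semi-supervised: labels guide the embedding.
            train_embedding = reducer.fit_transform(X_train, y_train)

        # Embed new points with the model learned above (basis omitted, as it is optional).
        test_embedding = reducer.transform(X_test)
        return train_embedding, test_embedding


    # Possible usage with the real package (requires camel-learn to be installed):
    #   from camel import CAMEL
    #   train_emb, test_emb = embed_train_and_test(CAMEL(), X_train, X_test, y_train)

Passing the reducer in as an argument keeps the sketch independent of any particular constructor settings.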

CAMEL is easy to get started with. A basic unsupervised learning job, including plotting, takes about a dozen lines of code:

.. code:: python

    import matplotlib.pyplot as plt
    from camel import CAMEL
    from sklearn import datasets

    # Generate a 3D Swiss Roll; y is the position of each point along the roll
    X, y = datasets.make_swiss_roll(n_samples=50000, random_state=None)

    reducer = CAMEL()

    # Unsupervised embedding: only the features X are used
    X_embedding = reducer.fit_transform(X)

    y = y.astype(int)  # convert to integer categories for easy visualization

    # Visualization
    plt.figure(1)
    plt.scatter(X_embedding[:, 0], X_embedding[:, 1], c=y, cmap='jet', s=0.2)
    plt.title('CAMEL Embedding')
    plt.tight_layout()
    plt.show()

Once done, you will see the 2D embedding of the 3D Swiss Roll.

.. image:: docs/swiss_roll_unsupervised.png
   :width: 600
   :alt: swiss_roll_unsupervised
   :align: center
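
For the semi-supervised case, where ``y`` may contain missing (NaN) values, part of the labels can be hidden before fitting. The helper below is a minimal sketch: ``mask_labels`` is a hypothetical name, not a CAMEL function, and storing labels as floats so that NaN can mark the missing ones is an assumption about how the labels are prepared.

.. code:: python

    import numpy as np


    def mask_labels(y, missing_fraction=0.5, seed=0):
        """Return a float copy of y with a random fraction of entries set to NaN."""
        rng = np.random.default_rng(seed)
        # Float dtype is needed because integer arrays cannot hold NaN.
        y_masked = np.asarray(y, dtype=float).copy()
        n_missing = int(round(missing_fraction * y_masked.size))
        missing_idx = rng.choice(y_masked.size, size=n_missing, replace=False)
        y_masked[missing_idx] = np.nan
        return y_masked


    # Small demonstration: hide a quarter of eight labels.
    y_demo = mask_labels([0, 1, 2, 3, 4, 5, 6, 7], missing_fraction=0.25)
    print(y_demo)

    # Possible usage with the Swiss Roll data (requires camel-learn to be installed):
    #   X_embedding = CAMEL().fit_transform(X, mask_labels(y.astype(int)))

The original array is left untouched, so the full labels remain available for coloring the plot.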

Simple code examples are available in the test folder (more to come).

===
API
===

Several parameters control CAMEL's results and performance. Default values are provided so you can start quickly. The main parameters for fine-tuning CAMEL are described below.

  • ``n_components``: int, default=2. Dimension of the embedded space. Typical values are 2 or 3, but any positive integer is allowed.

  • ``n_neighbors``: int, default=10. Number of nearest neighbors per point used to form neighbor pairs for local structure preservation.

  • ``FP_number``: float, default=20. Number of further points per node (e.g., 20 further pairs per node). Further pairs are used for both local and global structure preservation.

  • ``tail_coe``: float, default=0.05. Controls the attractive force of neighbors, ``1/(1+tail_coe*dij)**2``. Smaller values give a flatter tail. Changing it is not recommended.

  • ``w_neighbors``: float, default=1.0. Weight coefficient for the attractive force of neighbors. Larger values give a stronger force for the same distance.

  • ``w_curv``: float, default=0.001. Weight coefficient for the attractive/repulsive force due to local curvature. Larger values give a stronger force for the same distance.

  • ``w_FP``: float, default=20. Weight coefficient for the repulsive force of far points. Larger values give a stronger force for the same distance.

  • ``lr``: float, default=1.0. Learning rate of the Adam optimizer used for the embedding. Changing it is not recommended.

  • ``num_iters``: int, default=400. Number of iterations for optimizing the embedding. 200 has been observed to be sufficient in most cases; 400 is used as a safety margin.

  • ``target_weight``: float, default=0.5. Weight of the target/label in supervised learning. 0 means no weight, which reduces to the unsupervised case; 1 means infinite weight (set to a large value in practice).

  • ``random_state``: int, optional. Random state for the CAMEL instance. Setting it makes results repeatable.
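
The effect of ``tail_coe`` can be checked numerically by evaluating the expression from the parameter list for a few distances. This is a sketch that takes the expression literally; ``neighbor_attraction`` is a hypothetical helper, and the way CAMEL applies this term internally may differ.

.. code:: python

    def neighbor_attraction(dij, tail_coe=0.05):
        """Evaluate 1 / (1 + tail_coe * dij) ** 2 as written in the parameter list."""
        return 1.0 / (1.0 + tail_coe * dij) ** 2


    # Compare the default coefficient with a value ten times larger.
    for tail_coe in (0.05, 0.5):
        values = [round(neighbor_attraction(d, tail_coe), 4) for d in (0.0, 1.0, 10.0)]
        print(f"tail_coe={tail_coe}: {values}")

At ``dij = 10`` the expression evaluates to about 0.44 for ``tail_coe=0.05`` and about 0.03 for ``tail_coe=0.5``, which illustrates why smaller values correspond to a flatter tail.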

Other settings can be found in the source code and will be covered in future documentation.
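
One way to keep track of the documented defaults while changing only a few values is a small parameter dictionary. The sketch below assumes that ``CAMEL()`` accepts the parameters above as keyword arguments of the same names; ``CAMEL_DEFAULTS`` and ``camel_params`` are hypothetical helpers, not part of the package.

.. code:: python

    # Defaults as documented in the parameter list above.
    CAMEL_DEFAULTS = {
        "n_components": 2,
        "n_neighbors": 10,
        "FP_number": 20,
        "tail_coe": 0.05,
        "w_neighbors": 1.0,
        "w_curv": 0.001,
        "w_FP": 20,
        "lr": 1.0,
        "num_iters": 400,
        "target_weight": 0.5,
    }


    def camel_params(**overrides):
        """Merge overrides into the documented defaults, rejecting unknown names."""
        allowed = set(CAMEL_DEFAULTS) | {"random_state"}  # random_state is optional
        unknown = sorted(set(overrides) - allowed)
        if unknown:
            raise ValueError(f"Unknown CAMEL parameter(s): {', '.join(unknown)}")
        params = dict(CAMEL_DEFAULTS)
        params.update(overrides)
        return params


    # A 3D embedding with fewer iterations and a fixed seed for repeatability.
    params = camel_params(n_components=3, num_iters=200, random_state=42)
    print(params)

    # Assumed usage (requires camel-learn to be installed):
    #   from camel import CAMEL
    #   reducer = CAMEL(**params)

Rejecting unknown names catches misspelled parameters early instead of passing them on to the constructor.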
