The official implementation of PaCMAP: Pairwise Controlled Manifold Approximation Projection

PaCMAP

PaCMAP (Pairwise Controlled Manifold Approximation Projection) is a dimensionality reduction method that can be used for visualization, preserving both the local and global structure of the data in the original space. PaCMAP optimizes the low-dimensional embedding using three kinds of pairs of points: neighbor pairs (pair_neighbors), mid-near pairs (pair_MN), and further pairs (pair_FP), whose numbers are n_neighbors, n_MN and n_FP respectively.

Previous dimensionality reduction techniques focus on either local structure (e.g. t-SNE, LargeVis and UMAP) or global structure (e.g. TriMAP), but not both. Their balance between global and local structure can be shifted by carefully tuning the relevant parameters, which mainly adjust the number of neighbors considered. Instead of attracting more neighbors to preserve global structure, PaCMAP dynamically uses a special group of pairs, the mid-near pairs, to first capture global structure and then refine local structure, so that both are preserved.
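To make the three kinds of pairs more concrete, the following is a minimal pure-Python sketch of how such pairs could be sampled for a small dataset. It is not the package's implementation. The function names (euclidean, sample_pairs) are hypothetical, the neighbor search is brute force, and the mid-near rule used here (draw 6 random candidates and keep the second closest) follows the description in the PaCMAP paper rather than anything stated in this README.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sample_pairs(X, n_neighbors, n_MN, n_FP, seed=0):
    """Illustrative sampling of neighbor, mid-near and further pairs.

    Assumes len(X) >= 7 and len(X) - 1 - n_neighbors >= n_FP.
    """
    rng = random.Random(seed)
    n = len(X)
    pair_neighbors, pair_MN, pair_FP = [], [], []
    for i in range(n):
        others = [j for j in range(n) if j != i]

        # Neighbor pairs: the n_neighbors closest points (brute force here).
        by_dist = sorted(others, key=lambda j: euclidean(X[i], X[j]))
        neighbors = by_dist[:n_neighbors]
        pair_neighbors.extend((i, j) for j in neighbors)

        # Mid-near pairs (assumed rule): draw 6 random candidates and
        # keep the second closest of them.
        for _ in range(n_MN):
            candidates = rng.sample(others, 6)
            candidates.sort(key=lambda j: euclidean(X[i], X[j]))
            pair_MN.append((i, candidates[1]))

        # Further pairs: random points that are not among the neighbors.
        non_neighbors = [j for j in others if j not in neighbors]
        pair_FP.extend((i, j) for j in rng.sample(non_neighbors, n_FP))
    return pair_neighbors, pair_MN, pair_FP


# Toy data: 30 random points in the plane.
data_rng = random.Random(42)
X = [(data_rng.random(), data_rng.random()) for _ in range(30)]
pair_neighbors, pair_MN, pair_FP = sample_pairs(X, n_neighbors=4, n_MN=2, n_FP=8)
print(len(pair_neighbors), len(pair_MN), len(pair_FP))
```

Each point contributes n_neighbors neighbor pairs, n_MN mid-near pairs and n_FP further pairs, so the three lists grow linearly with the number of points.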

Installation

Requirements:

  • numpy
  • scikit-learn (imported as sklearn)
  • annoy
  • numba

To install PaCMAP, you can use pip:

pip install pacmap
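After installing, a quick check such as the sketch below can confirm that the required modules are importable in the current environment. The helper name missing_modules is hypothetical, and the import names are assumed to match the requirement list above (scikit-learn is imported as sklearn).

```python
from importlib.util import find_spec

# Import names of the requirements listed above, plus pacmap itself.
REQUIRED = ["numpy", "sklearn", "annoy", "numba", "pacmap"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
print("missing:", ", ".join(missing) if missing else "none")
```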

Benchmarks

The following images are visualizations of two datasets, MNIST and Mammoth, generated by PaCMAP. They demonstrate PaCMAP's ability to preserve local and global structure, respectively.

MNIST

Mammoth

Parameters

The most important parameters are listed below. Changing these values will significantly affect the result of the dimensionality reduction.

  • n_neighbors: the number of neighbors considered in the k-Nearest Neighbor graph

  • MN_ratio: the ratio of the number of mid-near pairs to the number of neighbors, n_MN = $\lfloor$ n_neighbors * MN_ratio $\rfloor$

  • FP_ratio: the ratio of the number of further pairs to the number of neighbors, n_FP = $\lfloor$ n_neighbors * FP_ratio $\rfloor$
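The two formulas above can be checked with a few lines of Python. This is only an arithmetic sketch: the helper name pair_counts is hypothetical, and the parameter values are illustrative rather than the package defaults.

```python
import math


def pair_counts(n_neighbors, MN_ratio, FP_ratio):
    """Return (n_neighbors, n_MN, n_FP) using the floor formulas above."""
    n_MN = math.floor(n_neighbors * MN_ratio)
    n_FP = math.floor(n_neighbors * FP_ratio)
    return n_neighbors, n_MN, n_FP


print(pair_counts(10, 0.5, 2.0))  # (10, 5, 20)
print(pair_counts(15, 0.5, 2.0))  # (15, 7, 30): 7.5 is floored to 7
```

Because of the floor, a small n_neighbors combined with a small MN_ratio can produce n_MN = 0, in which case no mid-near pairs would be used.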

Reproducing the experiments

We provide the code we used to run the experiments for better reproducibility. The code is separated into three parts, in three folders:

  • data, which includes all the datasets we used, preprocessed into the file format each DR method uses
  • experiments, which includes all the scripts we used to produce the DR results
  • evaluation, which includes all the scripts we used to evaluate the DR results, as described in Section 8 of our paper

After downloading the code, you may need to set the location where you stored it in the scripts to make them fully functional.

Citation

If you use PaCMAP in your publication, or if you use the implementation in this repository, please cite our preprint:

@article{
    #TODO
}

License

Please see the license file.
