The official implementation of PaCMAP: Pairwise Controlled Manifold Approximation Projection
Project description
PaCMAP
PaCMAP (Pairwise Controlled Manifold Approximation) is a dimensionality reduction method that can be used for visualization, preserving both the local and the global structure of the data in the original space. PaCMAP optimizes the low-dimensional embedding using three kinds of pairs of points: neighbor pairs (pair_neighbors), mid-near pairs (pair_MN), and further pairs (pair_FP), whose numbers are n_neighbors, n_MN, and n_FP respectively.
Previous dimensionality reduction techniques focus on either local structure (e.g. t-SNE, LargeVis and UMAP) or global structure (e.g. TriMAP), but not both, although the balance between the two can be shifted by carefully tuning the parameter in these algorithms that mainly adjusts the number of neighbors considered. Instead of attracting more neighbors to preserve global structure, PaCMAP dynamically uses a special group of pairs, the mid-near pairs, to first capture global structure and then refine local structure, so that both global and local structure are preserved.
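To make the three kinds of pairs more concrete, here is a minimal brute-force sketch in plain Python. It is not the package's implementation: the function name `sample_pairs` is hypothetical, and the mid-near rule used here (draw six random candidates and keep the second closest) is an assumption about how such pairs can be chosen, not something stated in the text above.

```python
import math
import random


def sample_pairs(points, n_neighbors, n_MN, n_FP, seed=0):
    """Return (pair_neighbors, pair_MN, pair_FP) as lists of index pairs.

    Hypothetical helper for illustration only; it scans all points,
    so it is only suitable for tiny datasets.
    """
    rng = random.Random(seed)
    n = len(points)
    pair_neighbors, pair_MN, pair_FP = [], [], []
    for i in range(n):
        # Sort every other point by Euclidean distance to point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )

        # Neighbor pairs: the n_neighbors closest points.
        pair_neighbors += [(i, j) for j in others[:n_neighbors]]

        # Mid-near pairs (assumed rule): draw 6 random candidates
        # and keep the second closest of them.
        for _ in range(n_MN):
            candidates = rng.sample(others, 6)
            candidates.sort(key=lambda j: math.dist(points[i], points[j]))
            pair_MN.append((i, candidates[1]))

        # Further pairs: random points outside the nearest neighbors.
        non_neighbors = others[n_neighbors:]
        pair_FP += [(i, j) for j in rng.sample(non_neighbors, n_FP)]
    return pair_neighbors, pair_MN, pair_FP


# Toy data: 30 random points in the unit square.
data_rng = random.Random(42)
points = [(data_rng.random(), data_rng.random()) for _ in range(30)]
pair_neighbors, pair_MN, pair_FP = sample_pairs(points, n_neighbors=4, n_MN=2, n_FP=8)
print(len(pair_neighbors), len(pair_MN), len(pair_FP))
```

Each point contributes n_neighbors, n_MN and n_FP pairs respectively, which matches how the three counts are described above.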
Installation
Requirements:
- numpy
- scikit-learn (imported as sklearn)
- annoy
- numba
To install PaCMAP, you can use pip:
pip install pacmap
Benchmarks
The following images are visualizations of two datasets, MNIST and Mammoth, generated by PaCMAP. The two visualizations demonstrate PaCMAP's ability to preserve local and global structure, respectively.
Parameters
The list of the most important parameters is given below. Changing these values will affect the result of dimension reduction significantly.
- n_neighbors: the number of neighbors considered in the k-Nearest Neighbor graph
- MN_ratio: the ratio of the number of mid-near pairs to the number of neighbors, n_MN = $\lfloor$ n_neighbors * MN_ratio $\rfloor$
- FP_ratio: the ratio of the number of further pairs to the number of neighbors, n_FP = $\lfloor$ n_neighbors * FP_ratio $\rfloor$
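The two floor formulas above can be checked with a few lines of Python. This is only a sketch: `pair_counts` is a hypothetical helper, not part of the pacmap package, and the parameter values below are illustrative.

```python
import math


def pair_counts(n_neighbors, MN_ratio, FP_ratio):
    """Return (n_MN, n_FP) computed from the ratios with the floor rule."""
    n_MN = math.floor(n_neighbors * MN_ratio)
    n_FP = math.floor(n_neighbors * FP_ratio)
    return n_MN, n_FP


# Illustrative values: 10 neighbors, half as many mid-near pairs,
# twice as many further pairs.
print(pair_counts(10, 0.5, 2.0))  # (5, 20)

# The floor matters when the product is not an integer: 15 * 0.5 = 7.5.
print(pair_counts(15, 0.5, 2.0))  # (7, 30)
```

Because both counts scale with n_neighbors, changing n_neighbors alone changes the number of all three kinds of pairs.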
Reproducing the experiments
We have provided the code we used to run the experiments for better reproducibility. The code is separated into three parts, in three folders:
- data, which includes all the datasets we used, preprocessed into the file format each DR method uses
- experiments, which includes all the scripts we use to produce DR results
- evaluation, which includes all the scripts we use to evaluate DR results, as described in Section 8 of our paper
After downloading the code, you may need to set the location where you stored it in the scripts to make them fully functional.
Citation
If you use PaCMAP in your publication, or use the implementation in this repository, please cite our preprint:
@article{
#TODO
}
License
Please see the license file.
File details
Details for the file pacmap-0.2.tar.gz.
File metadata
- Download URL: pacmap-0.2.tar.gz
- Upload date:
- Size: 6.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 120b7312a1321496ce277fc59eedc0c68633cb9b0858f913941e31623970e33f
MD5 | 84283723b8914ce77cd441792ef09e55
BLAKE2b-256 | 1ce6e3f0471350ae009821022896054910efe311335940e5d6916b7ca9107c42
File details
Details for the file pacmap-0.2-py3-none-any.whl.
File metadata
- Download URL: pacmap-0.2-py3-none-any.whl
- Upload date:
- Size: 11.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | d3a65608e6351727c73ce87b5bc7cd6e1c43907927cc88d01604673dcbf0957c
MD5 | 861792fb0bf9ef37cd91716cf0b6f133
BLAKE2b-256 | 70bc90613c4efe968ea570b30a10148f10d0dcc451bd8747093189928202d94f