Minimum-Distortion Embedding

PyMDE

The official documentation for PyMDE is available at www.pymde.org.

This repository accompanies the monograph Minimum-Distortion Embedding.

PyMDE is a Python library for computing vector embeddings for finite sets of items, such as images, biological cells, nodes in a network, or any other abstract object.

What sets PyMDE apart from other embedding libraries is that it provides a simple but general framework for embedding, called Minimum-Distortion Embedding (MDE). With MDE, it is easy to recreate well-known embeddings and to create new ones, tailored to your particular application.

PyMDE is competitive in runtime with more specialized embedding methods. With a GPU, it can be even faster.

Overview

PyMDE can be enjoyed by beginners and experts alike. It can be used to:

  • visualize datasets, small or large;
  • generate feature vectors for supervised learning;
  • compress high-dimensional vector data;
  • draw graphs (up to orders of magnitude faster than packages like NetworkX);
  • create custom embeddings, with custom objective functions and constraints (such as having uncorrelated feature columns);
  • and more.

PyMDE is very young software, under active development. If you run into issues, or have any feedback, please reach out by filing a GitHub issue.

This README gives a very brief overview of PyMDE. Make sure to read the official documentation at www.pymde.org, which has in-depth tutorials and API documentation.

Installation

PyMDE is available on the Python Package Index, and on Conda Forge.

To install with pip, use

pip install pymde

Alternatively, to install with conda, use

conda install -c pytorch -c conda-forge pymde

PyMDE has the following requirements:

  • Python >= 3.7
  • numpy >= 1.17.5
  • scipy
  • torch >= 1.7.1
  • torchvision >= 0.8.2
  • pynndescent
  • requests
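To see which of these packages are already present in the current environment, a small script along the following lines may help. This is only a sketch: it relies on the standard-library importlib.metadata module (Python 3.8 or later), the helper name installed_versions is made up for this example, and it reports versions without comparing them against the minimums listed above.

```python
from importlib import metadata

# Distribution names taken from the requirements list above.
REQUIREMENTS = ["numpy", "scipy", "torch", "torchvision", "pynndescent", "requests"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIREMENTS).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```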

Getting started

Getting started with PyMDE is easy. For embeddings that work out of the box, we provide two main functions:

pymde.preserve_neighbors

which preserves the local structure of the original data, and

pymde.preserve_distances

which preserves pairwise distances or dissimilarity scores in the original data.

Arguments. The input to these functions is the original data, represented either as a data matrix in which each row is a feature vector, or as a (possibly sparse) graph encoding pairwise distances. The embedding dimension is specified by the embedding_dim keyword argument, which is 2 by default.

Return value. The return value is an MDE object. Calling the embed() method on this object returns an embedding, which is a matrix (torch.Tensor) in which each row is an embedding vector. For example, if the original input is a data matrix of shape (n_items, n_features), then the embedding matrix has shape (n_items, embedding_dim).

We give examples of using these functions below.

Preserving neighbors

The following code produces an embedding of the MNIST dataset (images of handwritten digits), in a fashion similar to LargeVis, t-SNE, UMAP, and other neighborhood-based embeddings. The original data is a matrix of shape (70000, 784), with each row representing an image.

import pymde

mnist = pymde.datasets.MNIST()
embedding = pymde.preserve_neighbors(mnist.data, verbose=True).embed()
pymde.plot(embedding, color_by=mnist.attributes['digits'])

Unlike most other embedding methods, PyMDE can compute embeddings that satisfy constraints. For example:

embedding = pymde.preserve_neighbors(mnist.data, constraint=pymde.Standardized(), verbose=True).embed()
pymde.plot(embedding, color_by=mnist.attributes['digits'])

The standardization constraint requires the embedding vectors to be centered and to have uncorrelated features.
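To make the constraint concrete, the following sketch checks both properties on a matrix with NumPy. It assumes that standardized also means unit variance in each column, as in the MDE monograph, so that the columns have zero mean and an identity covariance matrix. The helper name is_standardized and the tolerance are choices made for this example, and the whitened random matrix merely stands in for a real embedding.

```python
import numpy as np


def is_standardized(X, tol=1e-6):
    """Check that the columns of X have zero mean and identity covariance."""
    X = np.asarray(X, dtype=float)
    n_items, dim = X.shape
    centered = np.allclose(X.mean(axis=0), 0.0, atol=tol)
    covariance = X.T @ X / n_items
    uncorrelated = np.allclose(covariance, np.eye(dim), atol=tol)
    return bool(centered and uncorrelated)


# Stand-in for an embedding: start from correlated columns, then whiten them.
rng = np.random.default_rng(0)
raw = rng.normal(size=(1000, 2)) @ np.array([[2.0, 1.0], [0.0, 0.5]])
raw -= raw.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(raw.T @ raw / raw.shape[0])
whitened = raw @ eigvecs / np.sqrt(eigvals)

print(is_standardized(raw))       # the correlated input fails the check
print(is_standardized(whitened))  # the whitened matrix passes
```

To apply the check to a PyMDE embedding, convert the tensor first, for example with embedding.cpu().numpy(). A looser tolerance may be appropriate if the embedding was computed in single precision.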

Preserving distances

The function pymde.preserve_distances is useful when you're more interested in preserving the gross global structure than the local structure.

Here's an example that produces an embedding of an academic coauthorship network, from Google Scholar. The original data is a sparse graph on roughly 40,000 authors, with an edge between authors who have collaborated on at least one paper.

import pymde

google_scholar = pymde.datasets.google_scholar()
embedding = pymde.preserve_distances(google_scholar.data, verbose=True).embed()
pymde.plot(embedding, color_by=google_scholar.attributes['coauthors'], color_map='viridis', background_color='black')

More collaborative authors are colored brighter, and are near the center of the embedding.
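One rough way to judge how well an embedding of a data matrix preserves distances is to compare pairwise Euclidean distances before and after embedding. The sketch below does this with NumPy on toy data. The helper names are made up for this example, the measure is a simple diagnostic rather than the objective PyMDE optimizes, and it does not apply directly to graph inputs such as the coauthorship network, where the original distances come from the graph.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distances between all pairs of rows of X."""
    X = np.asarray(X, dtype=float)
    diffs = X[:, None, :] - X[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))


def mean_distance_error(original, embedding):
    """Average absolute gap between original and embedded pairwise distances."""
    d_original = pairwise_distances(original)
    d_embedding = pairwise_distances(embedding)
    upper = np.triu_indices(d_original.shape[0], k=1)  # count each pair once
    return float(np.abs(d_original[upper] - d_embedding[upper]).mean())


# Toy data: points that lie in a 2-D plane inside a 5-D space.
rng = np.random.default_rng(0)
flat = rng.normal(size=(50, 2))
original = np.hstack([flat, np.zeros((50, 3))])

print(mean_distance_error(original, flat))        # the plane itself loses nothing
print(mean_distance_error(original, 0.5 * flat))  # a shrunken copy distorts distances
```

Because the helper forms all pairs at once, it is only practical for a few thousand items; larger datasets would need a random sample of pairs.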

Example notebooks

We have several example notebooks that show how to use PyMDE on real (and synthetic) datasets.

Citing

To cite our work, please use the following BibTeX entry.

@article{agrawal2021minimum,
  author  = {Agrawal, Akshay and Ali, Alnur and Boyd, Stephen},
  title   = {Minimum-Distortion Embedding},
  journal = {arXiv},
  year    = {2021},
}

PyMDE was designed and developed by Akshay Agrawal.
