Minimum-Distortion Embedding

PyMDE

The official documentation for PyMDE is available at www.pymde.org.

This repository accompanies the monograph Minimum-Distortion Embedding.

PyMDE is a Python library for computing vector embeddings for finite sets of items, such as images, biological cells, nodes in a network, or any other abstract object.

What sets PyMDE apart from other embedding libraries is that it provides a simple but general framework for embedding, called Minimum-Distortion Embedding (MDE). With MDE, it is easy to recreate well-known embeddings and to create new ones, tailored to your particular application.

PyMDE is competitive in runtime with more specialized embedding methods. With a GPU, it can be even faster.

Overview

PyMDE can be enjoyed by beginners and experts alike. It can be used to:

  • visualize datasets, small or large;
  • generate feature vectors for supervised learning;
  • compress high-dimensional vector data;
  • draw graphs (in some cases orders of magnitude faster than packages like NetworkX);
  • create custom embeddings, with custom objective functions and constraints (such as having uncorrelated feature columns);
  • and more.

PyMDE is very young software, under active development. If you run into issues or have any feedback, please reach out by filing a GitHub issue.

This README gives a very brief overview of PyMDE. Make sure to read the official documentation at www.pymde.org, which has in-depth tutorials and API documentation.

Installation

PyMDE is available on the Python Package Index, and on Conda Forge.

To install with pip, use

pip install pymde

Alternatively, to install with conda, use

conda install -c pytorch -c conda-forge pymde

PyMDE has the following requirements:

  • Python >= 3.7
  • numpy >= 1.17.5
  • scipy
  • torch >= 1.7.1
  • torchvision >= 0.8.2
  • pynndescent
  • requests
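
As a quick sanity check before installing, the sketch below reports which of the packages listed above are already present in the current environment. It uses only the standard library; note that `importlib.metadata` requires Python 3.8 or newer, and the helper name `installed_versions` is made up for this illustration and is not part of PyMDE.

```python
from importlib import metadata  # standard library, Python 3.8+

# Distribution names copied from the requirements list above.
REQUIRED = ["numpy", "scipy", "torch", "torchvision", "pynndescent", "requests"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

The script only reports versions; comparing them against the minimum versions above is left to the reader.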

Getting started

Getting started with PyMDE is easy. For embeddings that work out of the box, we provide two main functions:

pymde.preserve_neighbors

which preserves the local structure of original data, and

pymde.preserve_distances

which preserves pairwise distances or dissimilarity scores in the original data.

Arguments. The input to these functions is the original data, represented either as a data matrix in which each row is a feature vector, or as a (possibly sparse) graph encoding pairwise distances. The embedding dimension is specified by the embedding_dim keyword argument, which is 2 by default.
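
To make the two input representations concrete, here is a small plain-Python sketch. It does not call PyMDE and does not use PyMDE's own graph type (see the official documentation for that); the helper `distance_edges` and its `max_distance` parameter are hypothetical names chosen for this illustration. It shows a data matrix with one feature vector per row, and the same items described as a graph whose weighted edges are pairwise distances, optionally sparsified by dropping distant pairs.

```python
import math

# Representation 1: a data matrix, one feature vector per row.
data_matrix = [
    [0.0, 0.0],
    [3.0, 4.0],
    [6.0, 8.0],
    [0.0, 1.0],
]


def distance_edges(rows, max_distance=None):
    """Return (i, j, distance) triples for all pairs i < j.

    Pairs farther apart than max_distance are dropped, which makes the
    resulting graph sparse.
    """
    edges = []
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            d = math.dist(rows[i], rows[j])  # Euclidean distance
            if max_distance is None or d <= max_distance:
                edges.append((i, j, d))
    return edges


# Representation 2: a graph whose weighted edges are pairwise distances.
full_graph = distance_edges(data_matrix)                      # every pair
sparse_graph = distance_edges(data_matrix, max_distance=5.5)  # nearby pairs only
```

Either representation describes the same four items; the graph form is useful when only some pairwise distances are known or worth keeping.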

Return value. The return value is an MDE object. Calling the embed() method on this object returns an embedding, which is a matrix (torch.Tensor) in which each row is an embedding vector. For example, if the original input is a data matrix of shape (n_items, n_features), then the embedding matrix has shape (n_items, embedding_dim).

We give examples of using these functions below.

Preserving neighbors

The following code produces an embedding of the MNIST dataset (images of handwritten digits), in a fashion similar to LargeVis, t-SNE, UMAP, and other neighborhood-based embeddings. The original data is a matrix of shape (70000, 784), with each row representing an image.

import pymde

mnist = pymde.datasets.MNIST()
embedding = pymde.preserve_neighbors(mnist.data, verbose=True).embed()
pymde.plot(embedding, color_by=mnist.attributes['digits'])

Unlike most other embedding methods, PyMDE can compute embeddings that satisfy constraints. For example:

embedding = pymde.preserve_neighbors(mnist.data, constraint=pymde.Standardized(), verbose=True).embed()
pymde.plot(embedding, color_by=mnist.attributes['digits'])

The standardization constraint requires the embedding vectors to be centered and to have uncorrelated features.
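
The sketch below shows one way to check these two properties on a small embedding given as a list of rows. It is an illustration only, not PyMDE code: the function names are made up here, and the unit-scale condition (that (1/n) XᵀX equals the identity, rather than merely being diagonal) is an assumption about the exact definition.

```python
def column_means(X):
    """Mean of each column of X, where X is a list of equal-length rows."""
    n = len(X)
    return [sum(row[k] for row in X) / n for k in range(len(X[0]))]


def second_moment(X):
    """Return (1/n) * X^T X as a nested list."""
    n, m = len(X), len(X[0])
    return [[sum(row[a] * row[b] for row in X) / n for b in range(m)]
            for a in range(m)]


def is_standardized(X, tol=1e-6):
    """True if columns have zero mean and (1/n) X^T X is the identity, up to tol."""
    centered = all(abs(mu) <= tol for mu in column_means(X))
    S = second_moment(X)
    identity = all(
        abs(S[a][b] - (1.0 if a == b else 0.0)) <= tol
        for a in range(len(S))
        for b in range(len(S))
    )
    return centered and identity


# Corners of a square: centered, unit second moment per column, uncorrelated.
square = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
# Two points on a diagonal: centered, but the columns are perfectly correlated.
line = [[1.0, 1.0], [-1.0, -1.0]]

print(is_standardized(square))
print(is_standardized(line))
```

An embedding returned as a torch.Tensor could be converted with its tolist() method before being passed in; a looser tolerance is advisable for single-precision results.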

Preserving distances

The function pymde.preserve_distances is useful when you're more interested in preserving global structure than local structure.

Here's an example that produces an embedding of an academic coauthorship network, from Google Scholar. The original data is a sparse graph on roughly 40,000 authors, with an edge between authors who have collaborated on at least one paper.

import pymde

google_scholar = pymde.datasets.google_scholar()
embedding = pymde.preserve_distances(google_scholar.data, verbose=True).embed()
pymde.plot(embedding, color_by=google_scholar.attributes['coauthors'], color_map='viridis', background_color='black')

More collaborative authors are colored brighter, and are near the center of the embedding.
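
To give a feel for what "preserving pairwise distances" means, the following plain-Python sketch scores an embedding by the average squared difference between original and embedded distances. This is only one illustrative measure, chosen for simplicity; it is not necessarily the objective that pymde.preserve_distances minimizes, and the function name is hypothetical.

```python
import math
from itertools import combinations


def mean_squared_distance_error(original, embedding):
    """Average of (embedded distance - original distance)^2 over all pairs."""
    pairs = list(combinations(range(len(original)), 2))
    total = 0.0
    for i, j in pairs:
        d_orig = math.dist(original[i], original[j])
        d_emb = math.dist(embedding[i], embedding[j])
        total += (d_emb - d_orig) ** 2
    return total / len(pairs)


# Three collinear points in 3-D.
original = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
# Embedded on a line with the same spacing: every distance is preserved.
faithful = [[0.0], [1.0], [2.0]]
# Embedded with the two outer points collapsed together: one distance is lost.
collapsed = [[0.0], [1.0], [0.0]]

print(mean_squared_distance_error(original, faithful))
print(mean_squared_distance_error(original, collapsed))
```

A lower score means the embedding reproduces the original distances more closely; the faithful embedding scores zero, while the collapsed one is penalized for the single pair whose distance it fails to reproduce.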

Example notebooks

We have several example notebooks that show how to use PyMDE on real (and synthetic) datasets.

Citing

To cite our work, please use the following BibTeX entry:

@article{agrawal2021minimum,
  author  = {Agrawal, Akshay and Ali, Alnur and Boyd, Stephen},
  title   = {Minimum-Distortion Embedding},
  journal = {arXiv},
  year    = {2021},
}

PyMDE was designed and developed by Akshay Agrawal.
