Python Implementation of the FastMap MDS technique.

Project description

FastMapy

Python implementation of the FastMap [1] MDS (multidimensional scaling) technique for embedding objects into vector spaces and for reducing the dimensionality of existing vector spaces. The general idea is that objects are embedded into a vector space based on a distance metric defined over them. The resulting vector space attempts to preserve the pairwise distances between objects under that metric.
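
To make the idea concrete, here is a minimal sketch of the FastMap projection as it is commonly described, not this package's actual code. The function name fastmap_embed and its arguments are hypothetical, and the pivot heuristic (start at the first object, then hop to the farthest object twice) is a simplifying assumption.

```python
import math


def fastmap_embed(objects, dist, dim):
    """Illustrative FastMap sketch: returns one `dim`-length coordinate list per object."""
    n = len(objects)
    coords = [[0.0] * dim for _ in range(n)]

    def residual_sq(i, j, k):
        # Squared distance between objects i and j after removing
        # the contribution of the first k embedding coordinates.
        d2 = dist(objects[i], objects[j]) ** 2
        for c in range(k):
            d2 -= (coords[i][c] - coords[j][c]) ** 2
        return max(d2, 0.0)

    for k in range(dim):
        # Pivot heuristic (an assumption): pick two far-apart objects.
        a = 0
        b = max(range(n), key=lambda j: residual_sq(a, j, k))
        a = max(range(n), key=lambda j: residual_sq(b, j, k))
        d_ab_sq = residual_sq(a, b, k)
        if d_ab_sq == 0.0:
            break  # no residual distance left; remaining coordinates stay 0
        d_ab = math.sqrt(d_ab_sq)
        for i in range(n):
            # Cosine-law projection of object i onto the line through a and b.
            coords[i][k] = (
                residual_sq(a, i, k) + d_ab_sq - residual_sq(b, i, k)
            ) / (2.0 * d_ab)
    return coords
```

Only the distance function is needed, so the same routine applies to strings, sets, or vectors. For points that already lie in a Euclidean plane, two output dimensions are enough to reproduce the pairwise distances.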

This package ships with common distance metrics that are ready to use over appropriate objects. For embedding string objects, these include Jaccard distance over strings shingled into character n-grams and Levenshtein edit distance. Euclidean distance and taxicab distance are also available for vector objects. Dictionary objects also work, assuming a sparse-vector-style dictionary of {index: count}, where index can be an actual vector index or a token, and count is its occurrence count.
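
As a rough illustration of two of these metrics, the sketch below computes Jaccard distance over character shingles and taxicab distance over sparse {index: count} dictionaries. The helper names shingles, jaccard_distance, and taxicab_sparse are hypothetical and are not the package's API. The handling of strings shorter than the shingle size is also an assumption.

```python
def shingles(text, size):
    """Return the set of character n-grams of length `size` in `text`."""
    if len(text) < size:
        # Assumption: a string shorter than the shingle size is one shingle.
        return {text}
    return {text[i:i + size] for i in range(len(text) - size + 1)}


def jaccard_distance(a, b, shingle_size=4):
    """1 - |A ∩ B| / |A ∪ B| over the shingle sets of two strings."""
    sa, sb = shingles(a, shingle_size), shingles(b, shingle_size)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def taxicab_sparse(u, v):
    """Taxicab (L1) distance between two sparse {index: count} dictionaries."""
    keys = u.keys() | v.keys()
    # Missing keys are treated as a count of zero.
    return sum(abs(u.get(k, 0) - v.get(k, 0)) for k in keys)
```

Because missing keys count as zero, the dictionary keys can be integer vector indices or tokens without any change to the code.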

Model building and object transformation can use multiprocessing, but by default both run serially on a single core.

Example

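import fastmap
from fastmap.distances import Jaccard

# Illustrative placeholder input; substitute your own collection of strings.
string_data = ["fastmap", "fastmapy", "fast map", "mapping", "embedding"]

# 8-dimensional target space; strings are shingled into 4-grams for Jaccard distance.
fm_model = fastmap.FastMap(dim=8, distance=Jaccard, dist_args={'shingle_size': 4})

# Fit the model on the strings and embed the same strings.
embedding = fm_model.fit_transform(string_data)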

The above example defines a FastMap model that uses Jaccard distance. The target vector space is 8-dimensional, and strings are shingled into 4-grams before the distance is computed. A collection of strings is then used to fit the model, and the same strings are transformed into 8-dimensional NumPy arrays.

References

[1] Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data - SIGMOD ’95. (1995). doi:10.1145/223784

Download files

Source Distribution

FastMapy-0.0.1.tar.gz (6.0 kB)

Uploaded Source

Built Distribution

FastMapy-0.0.1-py3-none-any.whl (21.7 kB)

Uploaded Python 3

File details

Details for the file FastMapy-0.0.1.tar.gz.

File metadata

  • Download URL: FastMapy-0.0.1.tar.gz
  • Upload date:
  • Size: 6.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2.post20191203 requests-toolbelt/0.9.1 tqdm/4.40.2 CPython/3.7.5

File hashes

Hashes for FastMapy-0.0.1.tar.gz
Algorithm Hash digest
SHA256 60233552f06be703f3878d57009325804e150fcab9ce00cabbeb6051887f84a5
MD5 cef96fc49a3dc70958245974672f4a31
BLAKE2b-256 8c4a3188b837fe820d0464861ba0f53ef77f8394ce1eee400a9dd929855a4717

File details

Details for the file FastMapy-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: FastMapy-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 21.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2.post20191203 requests-toolbelt/0.9.1 tqdm/4.40.2 CPython/3.7.5

File hashes

Hashes for FastMapy-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 e07007cb0403e4f43e8853b5b2b8c48a536f738d8fae6547657d2476ce28f187
MD5 252cd9ad0300a6dd7bb603ecf208d8c9
BLAKE2b-256 3afb541b7d78029a1d6958eacc63dee98292ed6501fde97484723db1aa9e7caa
