Well-Known Machine Learning Models in PyTorch with Strong GPU Acceleration.

PyCave

PyCave provides well-known machine learning models with strong GPU acceleration in PyTorch. Its goal is not to provide a comprehensive collection of models or neural network layers, but rather to complement other open-source libraries.

Features

PyCave currently includes the following models to be run on the GPU:

  • pycave.bayes.GMM: Gaussian Mixture Models, optionally trained via mini-batches if the GPU memory is too small to fit the data. Mini-batch training should not impact convergence. Initialization is performed using K-means, which can optionally be run on a subset of the data because K-means is comparatively slow.
  • pycave.bayes.MarkovModel: Markov Models able to learn transition probabilities from a sequence of discrete states.
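
To make "learning transition probabilities from a sequence of discrete states" concrete, here is a minimal sketch in plain Python. It is not PyCave's implementation: the function name estimate_transitions and the uniform fallback for states without outgoing transitions are assumptions made for illustration.

```python
from collections import Counter


def estimate_transitions(sequence, num_states):
    """Estimate a row-stochastic transition matrix from one state sequence."""
    # Count every observed (current state, next state) pair.
    counts = Counter(zip(sequence[:-1], sequence[1:]))
    matrix = []
    for src in range(num_states):
        row_total = sum(counts[(src, dst)] for dst in range(num_states))
        if row_total == 0:
            # Assumption: a state with no outgoing transitions gets a uniform row.
            row = [1.0 / num_states] * num_states
        else:
            row = [counts[(src, dst)] / row_total for dst in range(num_states)]
        matrix.append(row)
    return matrix


states = [0, 1, 1, 2, 0, 1, 2, 2, 0]
transitions = estimate_transitions(states, num_states=3)
```

Each row of transitions holds the estimated probabilities of moving from one state to every other state, so each row sums to 1.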

Roadmap

The following models are currently in development and will be published as soon as possible:

  • pycave.bayes.HMM: Hidden Markov Models, similar to the Gaussian Mixture Models but trained on sequences of datapoints to additionally learn transition probabilities.

Installation

PyCave is available on PyPI and can be installed as follows:

pip install pycave

Quickstart

PyCave's interface is modeled on Sklearn's. To train a GMM, initialize it as follows and fit it on a torch.Tensor, since PyCave is implemented entirely in PyTorch:

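import torch
from pycave.bayes import GMM

# Example data: 10,000 datapoints with 32 features each
data_tensor = torch.randn(10000, 32)

gmm = GMM(num_components=100, num_features=32, covariance='spherical')
gmm.fit(data_tensor)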

You can then use the GMM's instance methods for inference:

  • gmm.evaluate computes the negative log-likelihood of some data.
  • gmm.predict returns the indices of most likely components for some data.
  • gmm.sample draws a given number of samples from the GMM.
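
The quantities behind evaluate and predict can be sketched in plain Python for a spherical GMM. This is only an illustration of the math, not PyCave's code: the function names, the argument layout, and the choice to average the NLL over datapoints are assumptions.

```python
import math


def component_log_probs(x, weights, means, variances):
    """log(pi_k) + log N(x | mu_k, var_k * I) for every component k."""
    dim = len(x)
    result = []
    for pi, mu, var in zip(weights, means, variances):
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        log_gauss = -0.5 * (dim * math.log(2 * math.pi * var) + sq_dist / var)
        result.append(math.log(pi) + log_gauss)
    return result


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def evaluate(data, weights, means, variances):
    """Negative log-likelihood, averaged per datapoint (an assumption here)."""
    total = sum(
        log_sum_exp(component_log_probs(x, weights, means, variances))
        for x in data
    )
    return -total / len(data)


def predict(data, weights, means, variances):
    """Index of the most likely component for each datapoint."""
    indices = []
    for x in data:
        scores = component_log_probs(x, weights, means, variances)
        indices.append(max(range(len(scores)), key=scores.__getitem__))
    return indices


weights = [0.5, 0.5]
means = [[0.0, 0.0], [10.0, 10.0]]
variances = [1.0, 1.0]
points = [[0.1, -0.2], [9.5, 10.3]]
labels = predict(points, weights, means, variances)
nll = evaluate(points, weights, means, variances)
```

The most likely component is the one that maximizes the weighted component density, which is the same as maximizing the posterior responsibility.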

Benchmarks

In order to demonstrate the potential of PyCave, we compared the runtime of PyCave both on CPU and GPU against the runtime of Sklearn's Gaussian Mixture Model.

We train on 100k 128-dimensional datapoints sampled from a "ground truth" GMM with 512 components. PyCave's GMM and Sklearn's GMM should then minimize the negative log-likelihood (NLL) of the data. While PyCave's GMM worked well with random initialization, Sklearn required (single-pass) K-Means initialization to yield useful results. In both cases, the GMM converged when the per-datapoint NLL was below 1e-7.
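
The benchmark script itself is not shown here. As a rough sketch of what sampling from a "ground truth" GMM involves, the following plain-Python function first picks a component according to the mixture weights and then draws from that component's Gaussian. The function name sample_gmm, the spherical covariance, and the tiny example sizes are assumptions; the benchmark uses 512 components and 128 dimensions.

```python
import random


def sample_gmm(num_samples, weights, means, variances, seed=0):
    """Draw samples from a spherical GMM (illustrative, not the benchmark code)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        # Pick a component index with probability given by the mixture weights.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        std = variances[k] ** 0.5
        # Spherical covariance: every dimension shares the same variance.
        samples.append([rng.gauss(mu, std) for mu in means[k]])
    return samples


samples = sample_gmm(
    5,
    weights=[0.3, 0.7],
    means=[[0.0, 0.0], [5.0, 5.0]],
    variances=[1.0, 0.25],
    seed=42,
)
```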

Implementation    Training Time    Speedup Compared to Sklearn
Sklearn (CPU)     114.41s          x1
PyCave (CPU)      32.07s           x3.57
PyCave (GPU)      0.27s            x425.19

By moving to PyCave's GPU implementation of GMMs, you can therefore expect speedups by a factor of hundreds.
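
The speedup column is the Sklearn training time divided by each PyCave training time. A short check using the rounded times from the table:

```python
sklearn_cpu = 114.41  # seconds, from the table
timings = {"PyCave (CPU)": 32.07, "PyCave (GPU)": 0.27}

# Speedup = baseline time / implementation time.
speedups = {name: sklearn_cpu / seconds for name, seconds in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: x{factor:.2f}")
```

With the rounded GPU time of 0.27s the ratio comes out slightly below the table's x425.19. That gap is consistent with the table having been computed from an unrounded GPU time, although this is an assumption.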

For huge datasets, PyCave's GMM also supports mini-batch training on a GPU. We run PyCave's GMM on the same kind of data as described above, yet on 100 million instead of 100k datapoints. We use a batch size of 750k to train on a GPU.

Implementation              Training Time
PyCave (GPU, mini-batch)    247.95s

Even on this huge dataset, PyCave is able to fit the GMM in just over 4 minutes.
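
One way mini-batch training can leave convergence unaffected is to sum the EM sufficient statistics over all mini-batches and run the M-step only once per pass over the data. Whether PyCave does exactly this is not stated here, so the sketch below is an assumption. It uses 1-D data and plain Python, and all function names are hypothetical.

```python
import math


def e_step_stats(batch, weights, means, variances):
    """Per-component sufficient statistics for one batch of 1-D data."""
    k = len(weights)
    resp_sum, x_sum, xx_sum = [0.0] * k, [0.0] * k, [0.0] * k
    for x in batch:
        dens = [
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        ]
        total = sum(dens)
        for j in range(k):
            r = dens[j] / total  # responsibility of component j for x
            resp_sum[j] += r
            x_sum[j] += r * x
            xx_sum[j] += r * x * x
    return resp_sum, x_sum, xx_sum


def m_step(stats, num_points):
    """Update weights, means, and variances from accumulated statistics."""
    resp_sum, x_sum, xx_sum = stats
    weights = [r / num_points for r in resp_sum]
    means = [sx / r for sx, r in zip(x_sum, resp_sum)]
    variances = [sxx / r - m * m for sxx, r, m in zip(xx_sum, resp_sum, means)]
    return weights, means, variances


def em_epoch(data, params, batch_size):
    """One EM iteration; statistics are summed over mini-batches first."""
    k = len(params[0])
    totals = ([0.0] * k, [0.0] * k, [0.0] * k)
    for start in range(0, len(data), batch_size):
        stats = e_step_stats(data[start:start + batch_size], *params)
        for total, part in zip(totals, stats):
            for j in range(k):
                total[j] += part[j]
    return m_step(totals, len(data))


data = [-2.1, -1.9, -2.0, 1.8, 2.2, 2.0, -1.7, 2.1]
init = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
full = em_epoch(data, init, batch_size=len(data))
mini = em_epoch(data, init, batch_size=3)
```

Here full and mini agree up to floating-point rounding, because the summed statistics do not depend on how the data is split into batches.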

We ran the benchmark on 8 cores of an Intel Xeon E5-2630 at 2.2 GHz and a single GeForce GTX 1080 GPU with 11 GB of memory.

License

PyCave is licensed under the MIT License.
