A Python interface to the GenTen tensor decomposition library

pygenten: Python bindings for the GenTen package

The Python package pygenten provides Python bindings for the GenTen package. GenTen is a tool for computing Canonical Polyadic (CP, also called CANDECOMP/PARAFAC) decompositions of tensor data. It is geared towards the analysis of extreme-scale data and implements several parallel, scalable CP decomposition algorithms, including:

  • CP-ALS: The workhorse algorithm for Gaussian sparse or dense tensor data.
  • CP-OPT: CP decomposition of (sparse or dense) Gaussian data using a quasi-Newton optimization algorithm incorporating possible upper and lower bound constraints.
  • GCP: Generalized CP supporting arbitrary loss functions (Gaussian, Poisson, Bernoulli, ...), solved using quasi-Newton (dense tensors) or stochastic gradient descent (sparse or dense tensors) optimization methods.
  • Streaming GCP: A GCP algorithm that incrementally updates a GCP decomposition as new data is observed, suitable for in situ analysis of streaming data.
  • Federated GCP: A federated learning algorithm for GCP supporting asynchronous parallel communication.
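For readers new to the CP model that all of the algorithms above fit, the following is a minimal pure-Python sketch of how a CP decomposition represents a tensor: each entry is a weighted sum over components of products of factor-matrix entries. It does not use pygenten or GenTen; the function names `cp_entry` and `cp_full` and the example values are illustrative assumptions only.

```python
from itertools import product


def cp_entry(weights, factors, index):
    """Evaluate one entry of a CP model.

    weights: list of R component weights
    factors: one factor matrix per mode, each a list of rows with R columns
    index:   tuple holding one index per mode
    """
    total = 0.0
    for r, weight in enumerate(weights):
        term = weight
        for mode, i in enumerate(index):
            term *= factors[mode][i][r]
        total += term
    return total


def cp_full(weights, factors):
    """Return every entry of the model as a dict keyed by index tuple."""
    shape = [len(matrix) for matrix in factors]
    return {
        idx: cp_entry(weights, factors, idx)
        for idx in product(*(range(n) for n in shape))
    }


# Rank-2 model of a 2 x 3 x 2 tensor (arbitrary illustrative values).
weights = [1.0, 0.5]
A = [[1.0, 0.0], [2.0, 1.0]]
B = [[1.0, 1.0], [0.0, 2.0], [3.0, 0.0]]
C = [[1.0, 2.0], [1.0, 0.0]]

model = cp_full(weights, [A, B, C])
print(model[(1, 0, 0)])  # 1.0*2*1*1 + 0.5*1*1*2 = 3.0
```

A decomposition algorithm works in the opposite direction: given tensor data, it searches for weights and factor matrices whose model entries match the data under the chosen loss function.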

GenTen builds on Kokkos and Kokkos Kernels to support shared memory parallel programming models on a variety of contemporary architectures, including:

  • OpenMP for CPUs.
  • CUDA for NVIDIA GPUs.
  • HIP for AMD GPUs.
  • SYCL for Intel GPUs.

GenTen also supports distributed memory parallelism using MPI.

Installing pygenten

There are two general approaches for building pygenten:

  • Enable Python in the generic CMake build process described here.
  • Install using pip, which automates the CMake build to some degree.

Installing with pip

pygenten has experimental support for installation with pip from the source distribution on PyPI. Because GenTen/pygenten can be compiled for a wide variety of parallel architectures, binary distributions (a.k.a. wheels) are not currently provided, but might be in the future. The pip installation uses scikit-build-core as a CMake build backend for pip, which lets the user pass CMake defines that control the pygenten build process and determine which architectures and parallel programming models are enabled. We therefore recommend becoming familiar with the general CMake build process for GenTen, as described here, before continuing. In particular, BLAS and LAPACK libraries must be available in the build environment, either automatically discoverable by CMake or manually specified through LAPACK_LIBS.

Basic installation

A basic installation of pygenten can be done simply by:

pip install pygenten

This builds GenTen and pygenten for a CPU architecture with OpenMP parallelism, using the default compiler found in the user's path. During the build, CMake attempts to locate valid BLAS and LAPACK libraries in the user's environment. If it cannot find them, the user can customize the build by specifying LAPACK_LIBS as described below.

Customized installation

The build of GenTen/pygenten can be customized by passing CMake defines to specify compilers, BLAS/LAPACK libraries, host/device architectures, and enabled programming models. This is done by adding command-line arguments to pip of the form

--config-settings=cmake.define.SOME_DEFINE=value

Any CMake define accepted by GenTen/Kokkos/KokkosKernels can be passed this way. Since this is fairly verbose and GenTen can require several defines, several meta-options are provided to enable supported parallel programming models:

CMake Define       What it enables
PYGENTEN_MPI       Enable distributed parallelism with MPI. Sets the execution space to Serial by default.
PYGENTEN_OPENMP    Enable shared memory host parallelism using OpenMP.
PYGENTEN_SERIAL    No host shared memory parallelism. Useful for builds targeting GPU architectures or distributed memory parallelism.
PYGENTEN_CUDA      Enable CUDA parallelism for NVIDIA GPU architectures.
PYGENTEN_HIP       Enable HIP parallelism for AMD GPU architectures.
PYGENTEN_SYCL      Enable SYCL parallelism for Intel GPU architectures.

When enabling GPU architectures, one also needs to specify the corresponding architecture via Kokkos_ARCH_* defines described here.

For example, an MPI+CUDA build for a Volta V100 GPU architecture can be obtained with

pip install -v --config-settings=cmake.define.PYGENTEN_CUDA=ON --config-settings=cmake.define.Kokkos_ARCH_VOLTA70=ON --config-settings=cmake.define.PYGENTEN_MPI=ON pygenten
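Because each define needs its own --config-settings argument, long commands like the one above are easy to mistype. The following is a small sketch of a helper that assembles such a command from a dictionary of CMake defines. The helper name `pip_install_command` is hypothetical and is not part of pygenten; it only formats arguments in the form documented above.

```python
import shlex


def pip_install_command(defines, package="pygenten", verbose=True):
    """Build a pip command that forwards CMake defines via --config-settings."""
    cmd = ["pip", "install"]
    if verbose:
        cmd.append("-v")
    for name, value in defines.items():
        # CMake booleans are conventionally written ON/OFF.
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        cmd.append(f"--config-settings=cmake.define.{name}={value}")
    cmd.append(package)
    return cmd


# The MPI+CUDA build for a Volta V100 GPU from the text.
cmd = pip_install_command({
    "PYGENTEN_CUDA": True,
    "Kokkos_ARCH_VOLTA70": True,
    "PYGENTEN_MPI": True,
})
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to start the build, or printed and pasted into a shell.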

For MPI builds, pygenten assumes the MPI compiler wrappers mpicxx and mpicc are available in the user's path. If this is not correct, the user can specify the appropriate compiler by setting the appropriate CMake define, e.g., CMAKE_CXX_COMPILER. Furthermore, for CUDA builds, pygenten will build with the nvcc_wrapper script as the compiler as required by Kokkos, which calls g++ as the host compiler by default. This can be changed by setting the NVCC_WRAPPER_DEFAULT_COMPILER environment variable. Moreover, for MPI+CUDA, pygenten will set environment variables to override the compiler wrapped by mpicxx to use nvcc_wrapper, which currently works only with OpenMPI and MPICH. Finally, for MPI+HIP or MPI+SYCL builds, pygenten assumes the compiler wrappers call the appropriate device-enabled compiler, e.g., hipcc for AMD and icpx for Intel.
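Before starting a long MPI or CUDA build, it can help to confirm that the assumptions above hold in the current environment. The sketch below checks whether the mpicxx and mpicc wrappers are on the path and prepares an environment with NVCC_WRAPPER_DEFAULT_COMPILER set. The function names are hypothetical, and the host compiler name "g++-12" is only an example value, not a recommendation.

```python
import os
import shutil


def find_mpi_wrappers(names=("mpicxx", "mpicc")):
    """Map each wrapper name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def cuda_build_env(host_compiler, base_env=None):
    """Return a copy of the environment with nvcc_wrapper's host compiler set."""
    env = dict(os.environ if base_env is None else base_env)
    env["NVCC_WRAPPER_DEFAULT_COMPILER"] = host_compiler
    return env


missing = [name for name, path in find_mpi_wrappers().items() if path is None]
if missing:
    print("Not found on PATH:", ", ".join(missing))

# Example host compiler name; substitute the compiler available on your system.
env = cuda_build_env("g++-12")
```

The resulting `env` dictionary can be passed as the `env` argument of `subprocess.run` when invoking pip, so the setting applies only to that build.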

Installing numpy

pygenten relies on numpy, and both are compiled extension libraries that use OpenMP and BLAS/LAPACK. Module import errors can therefore occur if pygenten and numpy are compiled in very different environments, due to, e.g., symbol conflicts in libstdc++. This typically happens when pygenten is compiled with a much newer compiler than the one used to compile the numpy wheel. Furthermore, we have observed slower performance in pygenten in some cases when numpy is imported before pygenten, which we believe is due to inconsistent OpenMP and/or BLAS/LAPACK libraries between the two packages. Thus, the most robust way to use pygenten is to also install numpy from source, using the same build environment as pygenten. This can be done in a similar manner as pygenten by providing configure options to numpy through pip, e.g.,

pip install --no-binary numpy -Csetup-args=-Dblas=my_blas -Csetup-args=-Dlapack=my_lapack numpy

to specify the appropriate BLAS and LAPACK libraries (called my_blas and my_lapack in this case). Compilers can be specified through the CC, CXX, and FC environment variables. More details can be found here. However, we only recommend doing this if you see errors when importing pygenten or you observe slower performance than what would be observed with the genten command-line tool.
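One way to find out whether the import errors described above affect a given installation is to try the imports in a fresh interpreter, so that a failure cannot disturb the current session. The following sketch does this with the standard library only; the helper name `import_check` is hypothetical.

```python
import subprocess
import sys


def import_check(modules):
    """Import the given modules, in order, in a fresh Python interpreter.

    Returns (ok, stderr_text), where ok is True if every import succeeded.
    """
    code = "; ".join(f"import {name}" for name in modules)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr


# Try both import orders, since the order can matter as noted above.
for order in (["numpy", "pygenten"], ["pygenten", "numpy"]):
    ok, err = import_check(order)
    print(" -> ".join(order), "OK" if ok else "FAILED")
```

If either order fails, the captured `err` text typically names the missing symbol or library, which can guide the choice of compiler and BLAS/LAPACK for a source build of numpy.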

Download files

Source Distribution

pygenten-0.0.5.tar.gz (15.6 MB)

Uploaded Source

Built Distributions

pygenten-0.0.5-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (22.2 MB)

Uploaded CPython 3.13 manylinux: glibc 2.17+ x86-64

pygenten-0.0.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (22.2 MB)

Uploaded CPython 3.12 manylinux: glibc 2.17+ x86-64

pygenten-0.0.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (22.2 MB)

Uploaded CPython 3.11 manylinux: glibc 2.17+ x86-64

pygenten-0.0.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (22.2 MB)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

pygenten-0.0.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (22.2 MB)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64

pygenten-0.0.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (22.2 MB)

Uploaded CPython 3.8 manylinux: glibc 2.17+ x86-64

pygenten-0.0.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (22.2 MB)

Uploaded CPython 3.7m manylinux: glibc 2.17+ x86-64

File details

Details for the file pygenten-0.0.5.tar.gz.

File metadata

  • Download URL: pygenten-0.0.5.tar.gz
  • Upload date:
  • Size: 15.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.8.18

File hashes

Hashes for pygenten-0.0.5.tar.gz
Algorithm Hash digest
SHA256 fe55bfa54dd22db100663a6673840a9f3d9d0fcbb602123a37ead0d14f43e2de
MD5 e45a75ad3c249f766b578670ffee0dce
BLAKE2b-256 f6ab771f86b50df49bfca0c1f27f3118ccd0b56cd37dbe6b3566bdbcf2e7ffe2

See more details on using hashes here.
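As an illustration of how the digests listed above can be used, the sketch below computes the SHA256 digest of a downloaded file with Python's standard hashlib module and compares it with the published value. The helper names and the assumption that the file sits in the current directory are illustrative only.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest listed above for pygenten-0.0.5.tar.gz.
EXPECTED = "fe55bfa54dd22db100663a6673840a9f3d9d0fcbb602123a37ead0d14f43e2de"

# Assumes the source distribution was downloaded to the current directory.
sdist = "pygenten-0.0.5.tar.gz"
if os.path.exists(sdist):
    print("digest matches" if matches(sdist, EXPECTED) else "digest MISMATCH")
```

pip can also enforce digests itself through its hash-checking mode, in which each requirement in a requirements file carries a `--hash=sha256:...` option.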

File details

Details for the file pygenten-0.0.5-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pygenten-0.0.5-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 f857ebd5b17c61912ed52ab2d772e8de9450aa20d2c80325e0416241060715be
MD5 38e12d0745d59b6c0d41bc12350c3a0c
BLAKE2b-256 a2485fa3f32a3c2d796453c334e95400bc2ec73f421219c348d603168552590d

File details

Details for the file pygenten-0.0.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pygenten-0.0.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 5c84902191f9d4879eda013c91fad4c52f891b92663edcb3c15f49d8aaf3a9ce
MD5 260c0a83cfa1851fdd04aad28b68fc9e
BLAKE2b-256 20b60a473d337b8d85a0c1f20077517e582a4a25aa2ae5fb5eabbcbcf1c8f7e1

File details

Details for the file pygenten-0.0.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pygenten-0.0.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8fa2731fd62f770f9ac49a565ff4d784460c3e7dad60a98cbf92e635d65cc71e
MD5 8e30e6d41d8c1e80c71ed5cfa32ffafa
BLAKE2b-256 551f44ac5d7c9e4ca77057f3cbe450198e9b26de70c262b2eb4016e6d1d8b94f

File details

Details for the file pygenten-0.0.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pygenten-0.0.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 2055d4f9c8eaaf92de095ddc7910aafc548e2987f47f299c07f9f84018369d5e
MD5 5dd7725ab4dad8392074c9b8b3e8b68c
BLAKE2b-256 4549f83c2cfa552083a25c638e6cd294e46e6443aaadac65be355ab98322dc44

File details

Details for the file pygenten-0.0.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pygenten-0.0.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 bf1d3807f18a35dd1c94d1b5bcb97e28f104ba9491fcda2a56f0d9b0e9e58907
MD5 4d9a35ccd2b2a9133a987414a49c766f
BLAKE2b-256 13ad981fd3a78234459436e9adbc71a685955b4e419cac88cb972380fb82d0d0

File details

Details for the file pygenten-0.0.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pygenten-0.0.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 46b4359de2e2c0bf3f613306b6eda40a5491d7cea93c57e3d06e4775619492a0
MD5 ae4b70e6a6b14243af10d51929a41ab1
BLAKE2b-256 eb424f3466ca244a385fb4a06b1de61f2f14ebd60d54bfdae006571bf40a3904

File details

Details for the file pygenten-0.0.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pygenten-0.0.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 9f5174c9e95b074f6ca25523ca833752b5aa58ec9cdd0121a833298b8576de2f
MD5 0e05791a621ce5cfd714372cd8d37a77
BLAKE2b-256 4c305c48ecb3e59e31ab33676bc589fd24a693d2748191cc483c5cf75f88f22b
