A Python interface to the GenTen tensor decomposition library

pygenten: Python bindings for the GenTen package

The Python package pygenten provides Python bindings for the GenTen package. GenTen is a tool for computing Canonical Polyadic (CP, also called CANDECOMP/PARAFAC) decompositions of tensor data. It is geared towards analysis of extreme-scale data and implements several parallel, scalable CP decomposition algorithms, including:

  • CP-ALS: The workhorse algorithm for Gaussian sparse or dense tensor data.
  • CP-OPT: CP decomposition of (sparse or dense) Gaussian data using a quasi-Newton optimization algorithm incorporating possible upper and lower bound constraints.
  • GCP: Generalized CP supporting arbitrary loss functions (Gaussian, Poisson, Bernoulli, ...), solved using quasi-Newton (dense tensors) or stochastic gradient descent (sparse or dense tensors) optimization methods.
  • Streaming GCP: A GCP algorithm that incrementally updates a GCP decomposition as new data is observed, suitable for in situ analysis of streaming data.
  • Federated GCP: A federated learning algorithm for GCP supporting asynchronous parallel communication.
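For readers new to the CP model, the following minimal sketch shows what a rank-R CP decomposition of a 3-way tensor represents: each tensor entry is a weighted sum over R components of products of factor-matrix entries. It is plain Python and does not use pygenten or GenTen. The function name `cp_reconstruct` and the sample factors are hypothetical, chosen only for illustration.

```python
def cp_reconstruct(weights, A, B, C):
    """Rebuild a 3-way tensor from CP factors.

    X[i][j][k] = sum_r weights[r] * A[i][r] * B[j][r] * C[k][r]
    """
    rank = len(weights)
    return [
        [
            [
                sum(weights[r] * A[i][r] * B[j][r] * C[k][r] for r in range(rank))
                for k in range(len(C))
            ]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


# Hypothetical rank-2 factors for a 2 x 2 x 2 tensor (illustration only).
weights = [1.0, 2.0]
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0, 1.0], [2.0, 0.0]]
C = [[1.0, 0.0], [1.0, 3.0]]

X = cp_reconstruct(weights, A, B, C)
print(X[1][0][1])  # 6.0: only the second component contributes (2 * 1 * 1 * 3)
```

The algorithms listed above solve the inverse problem: given the tensor, they search for weights and factor matrices whose reconstruction fits the data under the chosen loss.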

GenTen builds on Kokkos and Kokkos Kernels to support shared memory parallel programming models on a variety of contemporary architectures, including:

  • OpenMP for CPUs.
  • CUDA for NVIDIA GPUs.
  • HIP for AMD GPUs.
  • SYCL for Intel GPUs.

GenTen also supports distributed memory parallelism using MPI.

Installing pygenten

There are two general approaches for building pygenten:

  • Enable python in the generic CMake build process described here.
  • Install using pip which automates the CMake build to some degree.

Installing with pip

pygenten has experimental support for installation using pip from the source distribution on PyPI. Binary wheels are also provided in limited circumstances (currently only Linux with OpenMP support, though more may be provided in the future), enabling immediate installation. The pip installation leverages scikit-build-core to provide a CMake build backend for pip, which allows the user to pass CMake defines that control the pygenten build process and determine which architectures and parallel programming models are enabled. We thus recommend becoming familiar with the general CMake build process for GenTen as described here before continuing. In particular, the user must have BLAS and LAPACK libraries available in their build environment that can either be automatically discovered by CMake or manually specified through LAPACK_LIBS.

Basic installation

A basic installation of pygenten can be done simply by:

pip install pygenten

This installs the binary wheel if one is available. Otherwise, it builds GenTen and pygenten for a CPU architecture with OpenMP parallelism, using a default compiler from the user's path. During the build of pygenten, CMake will attempt to locate valid BLAS and LAPACK libraries in the user environment. If these cannot be found, the user can customize the build by specifying LAPACK_LIBS as described below.

Note that when installing pygenten from a binary wheel, the repairwheel step that makes the wheel usable on a wide variety of architectures seems to make the included genten and related executables unusable. If you want to use GenTen outside of python, you should install it from source as described below.

Customized installation

To customize the GenTen/pygenten build, you must first instruct pip to compile from source by adding the --no-binary pygenten command-line argument. The build can then be customized by passing CMake defines to specify compilers, BLAS/LAPACK libraries, host/device architectures, and enabled programming models. This is done by adding command-line arguments to pip of the form

--config-settings=cmake.define.SOME_DEFINE=value

Any CMake define accepted by GenTen/Kokkos/KokkosKernels can be passed this way. Since this is fairly verbose and GenTen can require several defines, several meta-options are provided to enable supported parallel programming models:

CMake Define    | What it enables
----------------|----------------------------------------------------------------
PYGENTEN_MPI    | Enable distributed parallelism with MPI. Sets the execution space to Serial by default.
PYGENTEN_OPENMP | Enable shared memory host parallelism using OpenMP.
PYGENTEN_SERIAL | No host shared memory parallelism. Useful for builds targeting GPU architectures or distributed memory parallelism.
PYGENTEN_CUDA   | Enable CUDA parallelism for NVIDIA GPU architectures.
PYGENTEN_HIP    | Enable HIP parallelism for AMD GPU architectures.
PYGENTEN_SYCL   | Enable SYCL parallelism for Intel GPU architectures.

When enabling GPU architectures, one also needs to specify the corresponding architecture via Kokkos_ARCH_* defines described here.

For example, an MPI+CUDA build for a Volta V100 GPU architecture can be obtained with

pip install -v --no-binary pygenten --config-settings=cmake.define.PYGENTEN_CUDA=ON --config-settings=cmake.define.Kokkos_ARCH_VOLTA70=ON --config-settings=cmake.define.PYGENTEN_MPI=ON pygenten
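Because each define needs its own --config-settings argument, long commands like the one above can be easier to assemble programmatically. The following is a minimal sketch, not part of pygenten: the helper name `pip_install_command` is hypothetical, and it only builds the argument list using the flag forms shown above.

```python
import shlex


def pip_install_command(defines, package="pygenten", verbose=True):
    """Build a pip argument list for a source build with CMake defines."""
    cmd = ["pip", "install"]
    if verbose:
        cmd.append("-v")
    # Force a source build so that the CMake defines take effect.
    cmd += ["--no-binary", package]
    # One --config-settings argument per CMake define, in insertion order.
    for name, value in defines.items():
        cmd.append(f"--config-settings=cmake.define.{name}={value}")
    cmd.append(package)
    return cmd


cmd = pip_install_command(
    {"PYGENTEN_CUDA": "ON", "Kokkos_ARCH_VOLTA70": "ON", "PYGENTEN_MPI": "ON"}
)
print(shlex.join(cmd))  # prints the MPI+CUDA command shown above
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` to run the build, or printed and pasted into a shell.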

For MPI builds, pygenten assumes the MPI compiler wrappers mpicxx and mpicc are available in the user's path. If this is not correct, the user can specify the appropriate compiler by setting the appropriate CMake define, e.g., CMAKE_CXX_COMPILER. Furthermore, for CUDA builds, pygenten will build with the nvcc_wrapper script as the compiler as required by Kokkos, which calls g++ as the host compiler by default. This can be changed by setting the NVCC_WRAPPER_DEFAULT_COMPILER environment variable. Moreover, for MPI+CUDA, pygenten will set environment variables to override the compiler wrapped by mpicxx to use nvcc_wrapper, which currently works only with OpenMPI and MPICH. Finally, for MPI+HIP or MPI+SYCL builds, pygenten assumes the compiler wrappers call the appropriate device-enabled compiler, e.g., hipcc for AMD and icpx for Intel.
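Since an MPI build assumes that mpicxx and mpicc are on the user's path, it can help to check this before starting a long compile. The following sketch uses only the Python standard library. The helper name `find_tools` is hypothetical, and the check says nothing about which underlying compiler a wrapper invokes.

```python
import shutil


def find_tools(names, path=None):
    """Map each tool name to its resolved location, or None if not found."""
    return {name: shutil.which(name, path=path) for name in names}


# Look up the MPI compiler wrappers on the current PATH.
found = find_tools(["mpicxx", "mpicc"])
for name, location in found.items():
    print(f"{name}: {location or 'not found on PATH'}")
```

If a wrapper is missing, the appropriate compiler can instead be given through a CMake define such as CMAKE_CXX_COMPILER, as described above.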

Installing numpy

pygenten relies on numpy, and both are compiled extension libraries that leverage OpenMP and BLAS/LAPACK. Therefore, module import errors can occur if pygenten and numpy are compiled in very different environments, due to, e.g., symbol conflicts in libstdc++. This typically happens when pygenten is compiled with a much newer compiler than the one used to compile the numpy wheel. Furthermore, we have observed slower performance in pygenten in some cases when numpy is imported before pygenten, which we believe is due to inconsistent OpenMP and/or BLAS/LAPACK libraries between the two packages. Thus, the most robust way to use pygenten is to also install numpy from source, using the same build environment as pygenten. This can be done in a similar manner as pygenten by providing configure options to numpy through pip, e.g.,

pip install --no-binary numpy -Csetup-args=-Dblas=my_blas -Csetup-args=-Dlapack=my_lapack numpy

to specify the appropriate BLAS and LAPACK libraries (called my_blas and my_lapack in this case). Compilers can be specified through the CC, CXX, and FC environment variables. More details can be found here. However, we only recommend doing this if you see errors when importing pygenten or you observe slower performance than what would be observed with the genten command-line tool.
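To see whether an installation is affected by import errors, one option is to import the two packages in both orders, each time in a fresh interpreter so that earlier imports cannot mask a problem. The following is a minimal sketch using only the standard library. The helper name `check_import_order` is hypothetical, and the check detects import failures only, not the performance difference mentioned above.

```python
import subprocess
import sys


def check_import_order(first, second):
    """Import two modules in a fresh interpreter; return (ok, stderr)."""
    code = f"import {first}\nimport {second}\n"
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    return result.returncode == 0, result.stderr


for order in (("numpy", "pygenten"), ("pygenten", "numpy")):
    ok, err = check_import_order(*order)
    print(f"import {order[0]} then {order[1]}: {'ok' if ok else 'failed'}")
    lines = err.strip().splitlines()
    if not ok and lines:
        # Show only the last traceback line, typically the error message.
        print("  " + lines[-1])
```

If both orders succeed and performance matches the genten command-line tool, rebuilding numpy from source is likely unnecessary.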
