Converter matrix and type determination for a range of array formats, focusing on sparse arrays

sparseconverter

Format detection, identifiers and converter matrix for a range of numerical array formats (backends) in Python, focusing on sparse arrays.

Usage

Basic usage:

import numpy as np
import sparseconverter as spc

a1 = np.array([
    (1, 0, 3),
    (0, 0, 6)
])

# array conversion
a2 = spc.for_backend(a1, spc.SPARSE_GCXS)

# format determination
print("a1 is", spc.get_backend(a1), "and a2 is", spc.get_backend(a2))
a1 is numpy and a2 is sparse.GCXS

See examples/ directory for more!

Description

This library helps implement algorithms that accept a wide range of array formats as input, output, or for internal calculations. All dense and sparse array libraries already support format detection, creation, and export to and from various formats, but with different APIs, different sets of formats, and different sets of supported features (dtypes, shapes, device classes, etc.).

This project creates a unified API for all conversions between the supported formats and takes care of details such as reshaping, dtype conversion, and using an efficient intermediate format for multi-step conversions.
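The idea of routing a conversion through an efficient intermediate format can be pictured as a shortest-path search over a table of direct conversions. The following is a minimal conceptual sketch, not the library's implementation: the format names, the cost values, and the `cheapest_route` helper are all hypothetical and chosen only for illustration.

```python
import heapq

# Hypothetical direct-conversion costs between made-up format names.
# The numbers are illustrative only and are not taken from sparseconverter.
DIRECT_COST = {
    ("dense", "coo"): 5.0,
    ("coo", "dense"): 2.0,
    ("coo", "csr"): 1.0,
    ("csr", "coo"): 1.0,
    ("csr", "csc"): 1.5,
    ("csc", "csr"): 1.5,
    ("dense", "csc"): 9.0,
}


def cheapest_route(source, target):
    """Return (total_cost, [formats...]) for the cheapest chain of conversions."""
    # Dijkstra's algorithm: always expand the cheapest known route first.
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, fmt, path = heapq.heappop(queue)
        if fmt == target:
            return cost, path
        if fmt in settled:
            continue
        settled.add(fmt)
        for (left, right), step in DIRECT_COST.items():
            if left == fmt and right not in settled:
                heapq.heappush(queue, (cost + step, right, path + [right]))
    raise ValueError(f"no conversion route from {source!r} to {target!r}")


# The direct dense -> csc conversion costs 9.0 in this toy table,
# while going through coo and csr costs 5.0 + 1.0 + 1.5 = 7.5.
print(cheapest_route("dense", "csc"))
```

In this toy table the multi-step route is cheaper than the direct one, which is the situation where an intermediate format pays off.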

Features

  • Supports Python 3.10 - (at least) 3.14
  • Defines constants for format identifiers
  • Various sets to group formats into categories:
    • Dense vs sparse
    • CPU vs CuPy-based
    • nD vs 2D backends
  • Efficiently detect format of arrays, including support for subclasses
  • Get converter function for a pair of formats
  • Convert to a target format
  • Find most efficient conversion pair for a range of possible inputs and/or outputs

That way it can help to implement format-specific optimized versions of an algorithm, to specify which formats are supported by a specific routine, to adapt to availability of CuPy on a target machine, and to perform efficient conversion to supported formats as needed.
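Choosing the most efficient conversion pair can be thought of as minimizing a cost over all combinations of available input formats and formats a routine supports. The sketch below illustrates that selection with a hypothetical cost table; `TOY_COST`, `toy_cost`, and `pick_cheapest_pair` are assumptions made for this example and are not part of the sparseconverter API.

```python
import itertools

# Hypothetical relative costs; not measured and not taken from sparseconverter.
TOY_COST = {
    ("dense", "coo"): 5.0,
    ("dense", "csc"): 9.0,
    ("csr", "coo"): 1.0,
    ("csr", "csc"): 1.5,
}


def toy_cost(source, target):
    """Cost of converting source to target; identity is free, unknown is infinite."""
    if source == target:
        return 0.0
    return TOY_COST.get((source, target), float("inf"))


def pick_cheapest_pair(sources, targets):
    """Return the (source, target) pair with the lowest cost.

    Ties are resolved in favor of the pair that comes first in iteration order.
    """
    best = min(itertools.product(sources, targets), key=lambda pair: toy_cost(*pair))
    if toy_cost(*best) == float("inf"):
        raise ValueError("no known conversion between the given formats")
    return best


# Data is available as dense or csr; the routine accepts coo or csc.
print(pick_cheapest_pair(("dense", "csr"), ("coo", "csc")))
# If the routine already accepts an available format, no conversion is needed.
print(pick_cheapest_pair(("csr",), ("coo", "csr")))
```

A routine that declares its supported formats this way can accept whatever the caller has on hand and convert only when necessary.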

Supported array formats

Still TODO

  • PyTorch arrays
  • More detailed cost metric based on more real-world use cases and parameters.

Changelog

0.7.0 (in development)

  • No changes yet

0.6.0

0.5.0

0.4.0

0.3.4

0.3.3

0.3.2

0.3.1

  • Include version constraint for sparse.

0.3.0

  • Introduce conversion_cost() to obtain a value roughly proportional to the conversion cost between two backends.

0.2.0

  • Introduce result_type() to find the smallest NumPy dtype that accommodates all parameters. Parameters may be any valid arguments to numpy.result_type(...) as well as backend specifiers.
  • Support cupyx.scipy.sparse.csr_matrix with dtype=bool.

0.1.1

  • Initial release

Known issues

  • conda install -c conda-forge cupy on Python 3.7 and Windows 11 may install cudatoolkit 10.1 and cupy 8.3, which have sporadically produced invalid data structures for cupyx.scipy.sparse.csc_matrix for unknown reasons. This doesn't happen with current versions. Running the benchmark function benchmark_conversions() can help to debug such issues since it performs all pairwise conversions and checks for correctness.
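The principle behind checking all pairwise conversions can be shown with a small, pure-Python sketch. It is not how benchmark_conversions() is implemented: the three toy formats (nested lists, a coordinate dictionary, and CSR-style index lists) and every function name below are assumptions made for illustration.

```python
import itertools


def dense_to_coo(rows):
    """Store nonzero values in a {(row, col): value} dictionary plus the shape."""
    shape = (len(rows), len(rows[0]))
    entries = {(r, c): v
               for r, row in enumerate(rows)
               for c, v in enumerate(row) if v != 0}
    return shape, entries


def coo_to_dense(coo):
    (n_rows, n_cols), entries = coo
    rows = [[0] * n_cols for _ in range(n_rows)]
    for (r, c), v in entries.items():
        rows[r][c] = v
    return rows


def dense_to_csr(rows):
    """Build CSR-style (shape, indptr, indices, data) lists row by row."""
    indptr, indices, data = [0], [], []
    for row in rows:
        for c, v in enumerate(row):
            if v != 0:
                indices.append(c)
                data.append(v)
        indptr.append(len(indices))
    return (len(rows), len(rows[0])), indptr, indices, data


def csr_to_dense(csr):
    (n_rows, n_cols), indptr, indices, data = csr
    rows = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for k in range(indptr[r], indptr[r + 1]):
            rows[r][indices[k]] = data[k]
    return rows


def copy_dense(rows):
    return [list(row) for row in rows]


FROM_DENSE = {"dense": copy_dense, "coo": dense_to_coo, "csr": dense_to_csr}
TO_DENSE = {"dense": copy_dense, "coo": coo_to_dense, "csr": csr_to_dense}


def check_all_pairs(reference):
    """Return every (source, target) pair whose converted result differs."""
    failures = []
    for source, target in itertools.product(FROM_DENSE, repeat=2):
        in_source = FROM_DENSE[source](reference)
        # This toy version routes every conversion through the dense form.
        in_target = FROM_DENSE[target](TO_DENSE[source](in_source))
        if TO_DENSE[target](in_target) != reference:
            failures.append((source, target))
    return failures


reference = [[1, 0, 3], [0, 0, 6]]
print(check_all_pairs(reference))
```

An empty list means every pair reproduced the reference data; a non-empty list names the pairs worth investigating.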

Notes

This project is developed primarily for sparse data support in LiberTEM. For that reason it includes the backend CUDA, which denotes a NumPy array that is intended for execution on a CUDA device.
