CLA is a simple toy library for basic vector/matrix operations written in C + CUDA with a Python API.


C Linear Algebra (CLA) Library

Features • Quick Start • Build • Architecture

CLA is a simple toy library for basic vector/matrix operations in C. The project's main goal is to learn the foundations of CUDA and of Python bindings (using ctypes as a wrapper) through simple linear algebra operations (addition, subtraction, multiplication, broadcasting, etc.).

Features

C17 support, Python 3.13, CUDA 12.8;
Linux support;
Vector-vector operations;
Matrix-matrix operations;
Vector and matrix norms;
GPU device selection to run operations;
Querying CUDA information from the system (e.g., support, number of devices);
Memory management (CPU memory vs GPU memory), allowing copies between devices.

Quick Start

[!IMPORTANT]
To use the library, the CUDA Toolkit must be installed and available on the system. Even if you don't intend to use a GPU, the library uses the CUDA Runtime to query for CUDA-capable devices and will fail if cudart is unavailable.
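
The requirement above can be checked before installing anything. The snippet below is a minimal sketch, not part of CLA: it uses only the standard `ctypes` module to see whether a CUDA Runtime shared library can be located and loaded. The helper name `cuda_runtime_available` is hypothetical, and the way `pycla` itself locates cudart may differ.

```python
import ctypes
import ctypes.util


def cuda_runtime_available() -> bool:
    """Return True if a CUDA Runtime shared library can be located and loaded."""
    # find_library searches the usual system locations for libcudart.
    name = ctypes.util.find_library("cudart")
    if name is None:
        return False
    try:
        ctypes.CDLL(name)
    except OSError:
        # The file was found but could not be loaded (e.g., missing dependencies).
        return False
    return True


if __name__ == "__main__":
    print("CUDA Runtime found:", cuda_runtime_available())
```

A `False` result only means this particular lookup failed; a toolkit installed in a non-standard location may still work once its directory is added to the loader's search path.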

Installation

For the C-only API, obtain the latest binaries and headers from the releases tab on GitHub. For the Python API, use your favorite package manager (e.g., pip, uv) and install pycla from PyPI (e.g., pip install pycla).

C API

The C API provides structs (see cla/include/entities.h) and functions (see cla/include/vector_operations.h, cla/include/matrix_operations.h) that operate on those structs. The two main entities are Vector and Matrix. A vector or matrix can reside in either CPU memory (host memory, in CUDA's terminology) or GPU memory (device memory). The structs always keep their metadata on the CPU (e.g., shape, current device), which allows the CPU to coordinate most of the workflow. For an operation to run on the GPU, the entities must first be copied to the GPU's memory.

For a quick start, compile samples/c_api.c with: (i) gcc <filename>.c -lcla, if you installed the library system-wide (i.e., copied the headers to /usr/include/ and the shared library to /usr/lib/); or (ii) gcc -I <path-to-include> -L <path-to-root-with-libcla> <filename>.c -lcla. The library is listed after the source file because the linker resolves symbols in command-line order.

To run, make libcla.so findable by the executable (either add its directory to the LD_LIBRARY_PATH environment variable or place it in /usr/lib) and run the binary from the shell of your preference (e.g., ./a.out).
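
If the executable fails to start because the shared library cannot be found, a quick way to debug is to look for the file in the directories mentioned above. The following is a rough sketch using only the standard library; the helper name `find_shared_library` is hypothetical, and it only approximates the dynamic loader, which also consults other sources such as the ld.so cache.

```python
import os
from pathlib import Path


def find_shared_library(filename="libcla.so", extra_dirs=("/usr/lib",)):
    """Return the first path where `filename` exists, or None if it is not found."""
    # LD_LIBRARY_PATH is a colon-separated list of directories.
    env_dirs = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    for directory in [*env_dirs, *extra_dirs]:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_shared_library() or "libcla.so not found in the searched directories")
```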

Python API

The Python API provides an object-oriented interface to the low-level C API. All features of the C API are exposed by the Vector and Matrix classes. Sample Jupyter notebooks are available in the samples directory. The code below showcases the basic features of the API:

# Core entities
from pycla import Vector, Matrix

# Contexts for intensive computation
from pycla.core import ShareDestionationVector, ShareDestionationMatrix

# Vectors and matrices can be instantiated directly from Python lists/sequences
vector = Vector([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# e.g., for matrices:
# matrix = Matrix([[1, 2, 3, 4], [1, 2, 3, 4]])

# Vectors and matrices can be moved back and forth between CPU and GPU
#    with the `.to(...)` and `.cpu()` methods.
# Once an object is on the GPU, we cannot directly read its data from the CPU;
#    however, we can still retrieve its metadata (e.g., shape, device name)
vector.to(0)
print(vector)

# We can bring an object back to the CPU with either
#   .to(None) or .cpu() calls
vector.cpu()
print(vector)

# The Vector class overrides Python's built-in operators.
#  Most of the time, an operation returns a new Vector
#  instead of updating the current one in place.
result = vector ** 2
print(result)

# We can also directly release the memory allocated
#  for a vector with
vector.release()
del vector

# Whenever we apply an operation to a Vector/Matrix,
#   a new object is allocated in memory to store the result.
# The only exceptions are the in-place ('i') operations (e.g., *=, +=, -=),
#   which edit the object in place.
# However, for intensive computation, it is desirable to
#   waste as little memory and time as possible. Thus, the
#   ShareDestination{Vector,Matrix} contexts allow a single
#   shared object to hold the result of most operations on vectors and matrices.
a = Vector([1.0] * 10)
b = Vector([2.0] * 10)
with ShareDestionationVector(a, b) as result:
    op1 = a + b
    op2 = result * 2
    op3 = result / 2

# op1, op2 and op3 all refer to the
#  same underlying vector.
print(result)
print(Vector.has_shared_data(op1, result))
print(Vector.has_shared_data(op2, result))
print(Vector.has_shared_data(op3, result))
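
The distinction drawn in the comments above, between operators that allocate a new object and the in-place variants, follows Python's standard operator protocol. The sketch below illustrates it with a pure-Python stand-in; `ToyVector` is a hypothetical class written for this example and is not how pycla implements `Vector`.

```python
class ToyVector:
    """Pure-Python stand-in used only to illustrate operator semantics."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # Binary '+' allocates a new object for the result.
        return ToyVector(x + y for x, y in zip(self.values, other.values))

    def __iadd__(self, other):
        # Augmented '+=' overwrites the existing storage and returns self.
        self.values[:] = [x + y for x, y in zip(self.values, other.values)]
        return self


a = ToyVector([1.0, 2.0])
b = ToyVector([3.0, 4.0])

c = a + b          # new object
alias = a
a += b             # same object, updated in place

print(c is a)      # False
print(alias is a)  # True
print(a.values)    # [4.0, 6.0]
```

A shared-destination context goes one step further by making the allocating operators reuse a single result object, which is what the `has_shared_data` checks in the sample above are meant to show.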

Build

The whole library can be built using the make targets defined in the Makefile. All you have to do is make the required tools available on the system (CUDA 12.8, Python 3.13, gcc/g++ 17, CMake 4.0.0) and install the Python libraries for development (see py-dev-requirements). The table below describes the main targets that can be run with make <target>.

Target	Description
`all`	Prepare and compile the `CLA` library and install the library (`.so`) in `pycla.bin`
`test`	Run all unit tests for `cla` and `pycla`.
`release`	Run tests for `cla` and `pycla` and create release files (i.e., Python wheel and C zip file).
`clean`	Utility target that removes the CMake build directory.
`test-cla-memory-leak`	Run Valgrind and the CUDA compute sanitizer to check for memory leaks in the C API.
`test-pycla`	Run tests for the Python API only.
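
Before running any of the targets, it can help to confirm that the build tools are reachable. This is a small sketch using `shutil.which`; the command names in `REQUIRED_TOOLS` are assumptions about how the toolchain is exposed on PATH, and the check does not verify the versions listed above.

```python
import shutil

# Command names are assumptions about how the toolchain is exposed on PATH.
REQUIRED_TOOLS = ("make", "cmake", "gcc", "g++", "nvcc", "python3")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("Missing tools:", ", ".join(missing) if missing else "none")
```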

Architecture

The library is organized as simply as possible. The goal is to make a slight distinction between the C and Python APIs, while allowing the core code with CUDA to be flexible.

The C API provides a shared library named cla to be used by other programs/libraries at link time or at runtime. This C library is statically linked to the CUDA kernels/functions during the build.

The Python API wraps the cla library in a Python package named pycla, which dynamically loads the cla library at runtime. The CUDA runtime must be available to use CUDA-related functionality.

The aforementioned relationship is depicted in the diagram below:

flowchart LR
  cla("`cla`")
  pycla("`pycla`")
  cuda["CUDA code"]

  cla-.->|Static links| cuda
  pycla==>|Dynamic loads| cla
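
For readers unfamiliar with ctypes, the "dynamic loads" edge in the diagram works roughly as sketched below. The example loads the C math library as a stand-in, since libcla.so may not be installed; it is an illustration of the general technique on Linux and not pycla's actual loading code.

```python
import ctypes
import ctypes.util

# libm stands in for libcla.so here; pycla's real loading code may differ.
libm_name = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_name)

# Declare the C signature so ctypes converts arguments and results correctly.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))  # 1.0
```

Declaring `argtypes` and `restype` matters: without them ctypes assumes C `int` return values, which would give wrong results for functions returning floating-point values or pointers.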

Directory structure

The source code is organized as follows:

cla: source code for the C API;
- include: header files (.h, .cuh), with subdirectories for each module (e.g., cuda, vector, matrix);
- matrix: matrix module;
- vector: vector module;
- cuda: CUDA management code;
pycla: source code for the Python API;
- bin: wrapper for the cla shared library;
- core: core entities;

`cla` library

The following diagram shows the module/package organization.

flowchart TD
  vector("<strong>Vector Module</strong><br>Vector operations, norms, conversions.")
  matrix("<strong>Matrix Module</strong><br>Matrix operations, norms, conversions, vector-matrix operations.")
  cuda("<strong>CUDA Module</strong><br>Alternative operations for matrices and vectors with CUDA kernels.")

  subgraph cla
  matrix -->|Uses for Matrix-Vector operations| vector
  matrix -->|Uses for parallel operations| cuda
  vector -->|Uses for parallel operations| cuda
  end

`pycla` library

The following diagram shows the module/package organization.

flowchart TD
  core("<strong>Core Module</strong><br>Core entities.")
  wrapper("<strong>CLA Module</strong><br>CLA wrapper with ctypes.")

  subgraph pycla
  core -->|Uses| wrapper
  end
