Package for computing 2-point statistics (correlation functions and structure functions)
pairstat is a Python package that provides accelerated/parallelized routines for computing 2-point statistics from spatial data (e.g. 2-point correlation functions, structure functions).
The pairstat package was formerly known as pyvsf.
Motivation
2-point statistics are important for characterizing the properties of turbulence (they also come up in other contexts, like cosmology). Until now, there hasn’t been an easy-to-use package for computing these quantities.
The pairstat package is most useful for datasets where Fourier methods are problematic (e.g. you don’t have a regularly spaced periodic grid). Before developing pairstat, I performed similar calculations by processing the outputs of the scipy.spatial.distance.pdist and scipy.spatial.distance.cdist functions. This package implements equivalent functionality using more specialized C++ code in order to perform the calculation faster and with far less memory. [1] It also supports parallelization (more on that below).
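For context, the scipy-based approach described above can be sketched roughly as follows. This is only an illustration of that baseline, not pairstat's API: the helper name naive_structure_function, its arguments, and the choice of a second-order velocity structure function are assumptions made for the example.

```python
import numpy as np
from scipy.spatial.distance import pdist


def naive_structure_function(pos, vel, bin_edges):
    """Second-order structure function over all unique pairs (hypothetical helper).

    pos, vel:  arrays of shape (n_points, n_dims)
    bin_edges: monotonically increasing separation bin edges
    Returns (pair counts per bin, mean squared velocity difference per bin).
    """
    # pdist returns one value per unique pair: n*(n-1)/2 entries in total.
    separation = pdist(pos)
    # Squared magnitude of the velocity difference, in the same pair order.
    dv_sq = pdist(vel, metric="sqeuclidean")

    # Assign each pair to a separation bin; drop pairs outside the bin range.
    idx = np.digitize(separation, bin_edges) - 1
    nbins = len(bin_edges) - 1
    valid = (idx >= 0) & (idx < nbins)

    counts = np.bincount(idx[valid], minlength=nbins)
    sums = np.bincount(idx[valid], weights=dv_sq[valid], minlength=nbins)
    # Empty bins produce NaN rather than raising a warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        return counts, sums / counts
```

Note that both pdist calls materialize an array with one entry per pair of points, which is where the memory cost of this approach comes from.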
Installation
As long as you have a C++ compiler, the easiest way to get the package is by invoking
$ python -m pip install pairstat
The package is automatically compiled with OpenMP support if the compiler supports it. To confirm that pairstat was compiled with OpenMP support, you can check whether the output from the following command mentions OpenMP:
$ python -m pairstat
See our Installation Guide for more details (especially if the package wasn’t compiled with OpenMP support).
Key Features: Parallelism and Scalability
The key feature of this package is the support for parallelism. If a compatible compiler is used to build this package, it will automatically be built with OpenMP support for parallelizing calculations of structure functions and correlation functions.
Undocumented machinery also exists to help use this functionality to parallelize calculations across machines on a computing cluster (e.g. with MPI). We plan to document this machinery in the near future.
The other important feature is memory usage, which is independent of the number of points. A naive implementation of the equivalent calculation using scipy functionality has memory usage that scales with the number of pairs of points (i.e. the number of points squared for auto-correlation). In other words, this package is far more scalable than the alternative.
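To make the scaling argument concrete, here is a small sketch. It estimates the size of the array that scipy's pdist would return, and shows one generic way to bound temporary memory by processing blocks of points. The function names and the chunk parameter are hypothetical, and the blocking scheme is only an illustration of the idea, not a description of how pairstat is implemented internally.

```python
import numpy as np
from scipy.spatial.distance import cdist


def pdist_output_bytes(n_points, itemsize=8):
    """Bytes needed to hold every unique pair distance as float64."""
    return n_points * (n_points - 1) // 2 * itemsize


def chunked_pair_counts(pos, bin_edges, chunk=1024):
    """Histogram of unique-pair separations using bounded temporary memory.

    Only chunk*chunk distances exist at any one time, regardless of how
    many points are in pos (hypothetical helper for illustration).
    """
    n = len(pos)
    counts = np.zeros(len(bin_edges) - 1, dtype=np.int64)
    for i in range(0, n, chunk):
        a = pos[i:i + chunk]
        for j in range(i, n, chunk):
            b = pos[j:j + chunk]
            d = cdist(a, b)
            if i == j:
                # Diagonal block: keep each unique pair once, skip self-pairs.
                d = d[np.triu_indices(len(a), k=1)]
            counts += np.histogram(d, bins=bin_edges)[0]
    return counts
```

For example, pdist_output_bytes(100_000) evaluates to 39,999,600,000 bytes (about 40 GB) for a single array of distances, while the blocked version never holds more than chunk*chunk distances at once.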
Current Status
We are planning to replace the C++ and Cython logic with Rust before the 1.0 release. This rewrite will allow us to significantly improve the code quality.
Contributions and feature requests are welcome!
License
pairstat is dual-licensed under either the MIT license or the Apache License (Version 2.0).
Footnotes