No project description provided
Project description
persuit
Persuit is an experimental persistent homology (PH) library, which aims to speed up your PH pipeline by reducing the boundary matrix whilst it is being constructed. The library is written in Rust and Python bindings are provided via PyO3.
Warning Persuit is currently in alpha. The implementation of the standard algorithm is far from optimised, the API is not fixed and no tests have been written. Use at your own risk.
Motivation
Here is the rough outline of many PH pipelines:
- Read in data.
- Build up a basis for the columns of your boundary matrix.
- Sort this basis by the filtration time and dimension.
- Compute a sparse representation of the boundary matrix.
- Pass the sparse matrix to a PH library (e.g. Eirene or PHAT) to reduce the boundary matrix and hence compute pairings.
In many (although not all) pipelines, steps 1-4 takes significantly longer than step 5. Indeed, often step 4 alone can take longer than step 5. However, step 5 often still takes a significant time to complete. In such circumstances, Persuit might offer a speed up.
Consider the "standard algorithm" for persistence computation:
for each column j:
while ∃i < j with low(i) = low(j):
add col[i] to col[j]
endwhile
endfor
Note that in order to reduce column j
we only need the previous columns.
Therefore, we can start reducing j
as soon as it is available!
That is, we can compute PH while we compute the sparse boundary matrix and potentially hide some of the cost of the standard algorithm.
Main Idea
The two core functions provided by Persuit are std_persuit
and std_persuit_collected
.
Let's focus on std_persuit
.
pub fn std_persuit<C, T>(col_iterator: T) -> StandardAlgo<C, flume::IntoIter<(usize, C)>>
where
T: Iterator<Item = C> + Send + 'static,
C: Column + 'static,
{
let (tx, rx) = flume::unbounded();
thread::spawn(move || {
let mut enumerated_cols = col_iterator.enumerate();
while let Some(enum_col) = enumerated_cols.next() {
tx.send(enum_col).unwrap();
}
});
StandardAlgo::new(rx.into_iter())
}
Note that the output of this function is an iterator over the pairings computed by the standard algorithm.
Despite all of the annotations, the signature of std_persuit
is essentially
iterator over columns -> iterator over parings
The way this is achieved is:
- Spawn a new thread and setup a channel for communication.
- The new thread consumes columns from the input iterator and puts them on the channel.
- The main thread consumes columns off the channel and performs the standard algorithm on these columns, unaware they are coming from a different thread.
Since the new thread is constantly consuming the iterator, we construct the sparse matrix at the usual rate. Meanwhile, the main thread is reducing columns of this sparse matrix as soon as it can.
Currently, due to the overhead of inter-thread communication this is usually slower than the serial algorithm.
Usage
The main Python bindings are std_persuit
and std_persuit_serial
.
These functions take one argument: an iterator over sparse columns.
Currently, persuit
only supports ℤ₂ homology and hence your matrix should also have enties in ℤ₂.
The ith iterate of the provided iterator should be an increasing list of indices of the non-zero rows in the ith column of your boundary matrix.
See test.py
for an illustrative example.
TODO
- Try out
FnvHasMap
,IndexMap
andBTreeMap
for storinglow_inverse
- Try out linked lists and
bitvec
as Column types - Return (Lex-optimal) representatives
- Python bindings
- Implement other PH algorithms, e.g. clear and compress?
- Implement parallel over dimensions?
- Write a basic README
- Find example where this actually does speed up PH (need PH computation to be a significant portion of the time)
- Write tests (unit, integration + property)
- Benchmark
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file persuit-0.1.0.tar.gz
.
File metadata
- Download URL: persuit-0.1.0.tar.gz
- Upload date:
- Size: 17.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.14.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 61f44cde7c7d0e284884729dcba41f47c1967d691f0add50619015062c253af5 |
|
MD5 | adfc4bfa838f384c95501adf944cf1e6 |
|
BLAKE2b-256 | b56abd4e0bcedcab4f382815a5a47be2f4c1fc9290b97dddddb19834ee88f1cd |
File details
Details for the file persuit-0.1.0-cp310-cp310-manylinux_2_34_x86_64.whl
.
File metadata
- Download URL: persuit-0.1.0-cp310-cp310-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 230.8 kB
- Tags: CPython 3.10, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/0.14.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 18bc5a43bbe9e82831c502e6a16a5e9b3892db7859530465d95c10329f4d8f04 |
|
MD5 | d19b0193167c621300dbf0c2c412754f |
|
BLAKE2b-256 | f7f3589e3a8d35cb4ae14e01aae745c1643b89583e8bf47400eda72e83f63da4 |