Python package for building Aligned Hierarchies for sequential data streams
Project description
A Python package that builds aligned hierarchies for sequential data streams.
Documentation
See our website for a complete reference manual and introductory tutorials.
This example tutorial shows how to use the package from start to finish.
Summary
We introduce repytah, a Python package that constructs the aligned hierarchies representation that contains all possible structure-based hierarchical decompositions for a finite-length piece of sequential data aligned on a common time axis. In particular, this representation, introduced by Kinnaird [@Kinnaird_ah] with music-based data (like musical recordings or scores) as the primary motivation, is intended for sequential data where repetitions have particular meaning (such as a verse, chorus, motif, or theme). Although the original motivation for the aligned hierarchies representation was finding structure for music-based data streams, there is nothing inherent in the construction of these representations that limits repytah to only being used on sequential data that is music-based.
The repytah package builds these aligned hierarchies by first extracting repeated structures (of all meaningful lengths) from the self-dissimilarity matrix (SDM) for a piece of sequential data. repytah intentionally uses the SDM as the starting point for constructing the aligned hierarchies: an SDM cannot be reverse-engineered back to the original signal, which allows researchers to collaborate on signals that are protected by copyright or by privacy considerations. This package is a Python translation of the original MATLAB code by Kinnaird [-@Kinnaird_code] with additional documentation, and the code has been updated to leverage efficiencies in Python.
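To make the idea of an SDM concrete, here is a minimal sketch that builds one for a toy sequence of feature vectors. It is an illustration only: the function name `self_dissimilarity_matrix` is hypothetical, and cosine dissimilarity is an assumed choice of measure, not necessarily what repytah uses internally.

```python
import math


def self_dissimilarity_matrix(features):
    """Return the pairwise cosine-dissimilarity matrix for a list of feature vectors.

    Hypothetical helper for illustration; not part of the repytah API.
    """
    def cosine_dissimilarity(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        if norm_u == 0 or norm_v == 0:
            # Assumption: treat an all-zero vector as maximally dissimilar.
            return 1.0
        return 1.0 - dot / (norm_u * norm_v)

    n = len(features)
    return [
        [cosine_dissimilarity(features[i], features[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy sequence: time steps 0-1 are repeated at time steps 3-4.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
sdm = self_dissimilarity_matrix(features)
```

Entries near zero mark pairs of time steps with similar content. Only pairwise comparisons are stored, so the matrix can be shared without sharing the underlying signal.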
Problems Addressed
Sequential data streams often have repeated elements that build on each other, creating hierarchies. Therefore, the goal of repytah is to extract these repetitions and their relationships to each other in order to form aligned hierarchies.
To learn more about aligned hierarchies, see this paper by Kinnaird (ISMIR 2016) which introduces aligned hierarchies in the context of music-based data streams.
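As a rough illustration of what extracting repetitions can look like, the sketch below scans the diagonals of a thresholded SDM for runs of low dissimilarity, since a repeated segment shows up as a diagonal stripe. This is a simplified stand-in, not repytah's algorithm: the function name `find_diagonal_repeats` and the parameters `thresh` and `min_len` are assumptions made for this example.

```python
def find_diagonal_repeats(sdm, thresh=0.1, min_len=2):
    """Find maximal diagonal runs with dissimilarity <= thresh above the main diagonal.

    Returns a list of (first_start, second_start, length) tuples.
    Hypothetical helper for illustration; not part of the repytah API.
    """
    n = len(sdm)
    repeats = []
    for offset in range(1, n):  # diagonal where column - row == offset
        run_start = None
        # Iterate one step past the end of the diagonal to flush an open run.
        for i in range(n - offset + 1):
            is_match = i < n - offset and sdm[i][i + offset] <= thresh
            if is_match and run_start is None:
                run_start = i
            elif not is_match and run_start is not None:
                length = i - run_start
                if length >= min_len:
                    repeats.append((run_start, run_start + offset, length))
                run_start = None
    return repeats


# Toy SDM for the symbol sequence A B C A B (0 = identical, 1 = different).
labels = "ABCAB"
toy_sdm = [[0.0 if a == b else 1.0 for b in labels] for a in labels]
repeats = find_diagonal_repeats(toy_sdm)
print(repeats)  # [(0, 3, 2)]: the pair "AB" at step 0 repeats at step 3
```

A full extraction also has to handle overlapping repeats and repeats contained in longer ones, which this sketch ignores.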
Audience
People working with sequential data where repetitions have meaning will find repytah useful, including computational scientists, advanced undergraduate students, younger industry experts, and many others.
An example application of repytah is in Music Information Retrieval (MIR), i.e., in the intersection of music and computer science.
Installation
The latest stable release is available on PyPI, and you can install it by running:
pip install repytah
If you use Anaconda, you can install the package using conda-forge:
conda install -c conda-forge repytah
To build repytah from source, run python setup.py build.
Then, to install repytah, run python setup.py install.
Alternatively, you can download or clone the repository and use pip to handle dependencies:
unzip repytah.zip
pip install -e repytah-main
or
git clone https://github.com/smith-tinkerlab/repytah.git
pip install -e repytah
After running pip list, you should see repytah listed as an installed package:
repytah (0.x.x, /path/to/repytah)
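The same check can be done from Python. The sketch below uses only the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


repytah_version = installed_version("repytah")
if repytah_version is None:
    print("repytah is not installed in this environment")
else:
    print(f"repytah {repytah_version} is installed")
```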
Current and Future Work - Elements of the Package
- Aligned Hierarchies - This is the fundamental output of the package, from which derivatives can be built. The aligned hierarchies for a given sequential data stream are the collection of all possible hierarchical structure decompositions, aligned on a common time axis. To this end, we offer all possible structure decompositions in one cohesive object.
  - Includes walk-through file example.py using supplied input.csv
- Forthcoming Aligned sub-Hierarchies (AsH) - These are derivatives of the aligned hierarchies and are described in Aligned sub-Hierarchies: a structure-based approach to the cover song task
- Forthcoming Start-End and S_NL diagrams
- Forthcoming SuPP and MaPP representations
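To illustrate what "aligned on a common time axis" can mean in practice, the following sketch stores each group of repeats as a row of start markers over the same time steps, paired with the repeat length. This layout is an assumption chosen for illustration, and the function name `onset_rows` is hypothetical; the actual object returned by repytah may be organized differently.

```python
def onset_rows(repeat_groups, num_steps):
    """Turn (length, start_indices) groups into (length, row) pairs on a shared time axis.

    Each row has a 1 at every time step where a repeat of that group begins.
    Hypothetical helper for illustration; not part of the repytah API.
    """
    rows = []
    for length, starts in sorted(repeat_groups):
        row = [0] * num_steps
        for start in starts:
            if start < 0 or start + length > num_steps:
                raise ValueError("repeat does not fit on the time axis")
            row[start] = 1
        rows.append((length, row))
    return rows


# Toy sequence A B C A B: "A" and "B" each repeat, and so does the pair "AB".
groups = [(2, [0, 3]), (1, [0, 3]), (1, [1, 4])]
for length, row in onset_rows(groups, num_steps=5):
    print(length, row)
```

Because every row uses the same time axis, shorter repeats can be read off directly against the longer repeats that contain them.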
MATLAB code
The original code for this project was written in MATLAB by Katherine M. Kinnaird. It can be found here.
Acknowledgements
This code was developed as part of Smith College's Summer Undergraduate Research Fellowship (SURF) from 2019 to 2022 and has been partially funded by Smith College's CFCD funding mechanism. Additionally, as Kinnaird is the Clare Boothe Luce Assistant Professor of Computer Science and Statistical & Data Sciences at Smith College, this work has also been partially supported by the Henry Luce Foundation's Clare Boothe Luce Program.
Additionally, we would like to acknowledge and give thanks to Brian McFee and the librosa team. We significantly referenced the Python package librosa in our development process.
Citing
Please cite repytah using the following:
C. Jia et al., repytah: A Python package that builds aligned hierarchies for sequential data streams. Python package version 0.1.2, 2023. [Online]. Available: https://github.com/smith-tinkerlab/repytah.