An open-source Python library for audio time-scale modification.

PyTSMod

PyTSMod is an open-source library for Time-Scale Modification algorithms in Python 3. PyTSMod contains basic TSM algorithms such as Overlap-Add (OLA), Waveform-Similarity Overlap-Add (WSOLA), Time-Domain Pitch-Synchronous Overlap-Add (TD-PSOLA), and Phase Vocoder (PV-TSM). We are also planning to add more TSM algorithms and pitch shifting algorithms.

Full documentation is available at https://pytsmod.readthedocs.io

The implementations of the algorithms are based on the following papers and libraries:

TSM Toolbox: MATLAB Implementations of Time-Scale Modification Algorithms.
Jonathan Driedger, Meinard Müller.
Proceedings of the 17th International Conference on Digital Audio Effects (DAFx-14), 2014.

A review of time-scale modification of music signals.
Jonathan Driedger, Meinard Müller.
Applied Sciences, 6(2), 57, 2016.

DAFX: digital audio effects
Udo Zölzer.
John Wiley & Sons, 2011.

Installing PyTSMod

PyTSMod is hosted on PyPI. To install, run the following command in your Python environment:

$ pip install pytsmod

Alternatively, if you use Poetry, clone the repository and build the package with the following command:

$ poetry build

Requirements

The latest version of PyTSMod requires Python >= 3.8 and the following packages:

NumPy (>=1.20.0)
SciPy (>=1.8.0)
soundfile (>=0.10.0)

Using PyTSMod

Using OLA, WSOLA, and PV-TSM

OLA, WSOLA, and PV-TSM can be imported as a module and used directly in Python. For a basic result, only two parameters are needed: the input audio sequence x and the time stretching factor s. Here's a minimal example:

import numpy as np
import pytsmod as tsm
import soundfile as sf  # you can use other audio load packages.

x, sr = sf.read('/FILEPATH/AUDIOFILE.wav')
x = x.T
x_length = x.shape[-1]  # length of the audio sequence x.

s_fixed = 1.3  # stretch the audio signal by a factor of 1.3.
s_ap = np.array([[0, x_length / 2, x_length], [0, x_length, x_length * 1.5]])  # stretch the first half of the audio to twice its length and leave the second half unchanged.

x_s_fixed = tsm.wsola(x, s_fixed)
x_s_ap = tsm.wsola(x, s_ap)

Time stretching factor s

The time stretching factor s can be either a constant value (alpha) or a 2 x n array of anchor points, with the sample positions in the input signal in the first row and the corresponding sample positions in the output signal in the second row.
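To make the anchor-point layout concrete, here is a minimal sketch that builds the two rows from per-segment stretch factors. The helper name anchor_points and its arguments are hypothetical and not part of PyTSMod; the result can be wrapped in np.array(...) and passed as s, as in the example above.

```python
def anchor_points(boundaries, factors):
    """Build a 2 x n anchor-point table (hypothetical helper, not part of PyTSMod).

    boundaries: input sample positions, e.g. [0, 500, 1000]
    factors:    one stretch factor per segment between consecutive boundaries
    """
    if len(boundaries) != len(factors) + 1:
        raise ValueError("need exactly one factor per segment")
    # The first anchor maps input position 0 to output position 0,
    # as in the README example.
    output = [0]
    for start, end, factor in zip(boundaries, boundaries[1:], factors):
        # Each segment contributes its input length scaled by its factor.
        output.append(output[-1] + round((end - start) * factor))
    return [list(boundaries), output]


# Double the first half of a 1000-sample signal and keep the second half as is.
ap = anchor_points([0, 500, 1000], [2.0, 1.0])
print(ap)
```

With a 1000-sample input this reproduces the s_ap layout from the example above: the first row holds input positions and the second row the positions they map to in the output.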

Using TD-PSOLA

TD-PSOLA requires the estimated pitch of the source you want to modify. You also need to know the hop size and frame length used by the pitch tracking algorithm. Here's a minimal example:

import numpy as np
import pytsmod as tsm
import crepe  # you can use other pitch tracking algorithms.
import soundfile as sf  # you can use other audio load packages.

x, sr = sf.read('/FILEPATH/AUDIOFILE.wav')

_, f0_crepe, _, _ = crepe.predict(x, sr, viterbi=True, step_size=10)

x_double_stretched = tsm.tdpsola(x, sr, f0_crepe, alpha=2, p_hop_size=441, p_win_size=1470)  # hop_size and frame_length for CREPE step_size=10 with sr=44100
x_3keyup = tsm.tdpsola(x, sr, f0_crepe, beta=pow(2, 3/12), p_hop_size=441, p_win_size=1470)
x_3keydown = tsm.tdpsola(x, sr, f0_crepe, target_f0=f0_crepe * pow(2, -3/12), p_hop_size=441, p_win_size=1470)
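The p_hop_size value of 441 in the example follows from the pitch tracker's 10 ms step at a 44100 Hz sample rate. As a small sketch of that arithmetic (the helper name hop_size_in_samples is hypothetical, and it assumes the tracker's step is given in milliseconds):

```python
def hop_size_in_samples(step_size_ms, sr):
    """Convert a pitch tracker's step size in milliseconds to samples.

    Hypothetical helper, not part of PyTSMod or CREPE.
    """
    return round(step_size_ms * sr / 1000)


# A 10 ms step at 44100 Hz corresponds to 441 samples per pitch frame.
hop = hop_size_in_samples(10, 44100)
print(hop)
```

If your audio uses a different sample rate or your tracker a different step, recompute the hop size rather than reusing 441.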

Time stretching factor alpha

In this version, TD-PSOLA supports only a fixed time stretching factor, alpha.

Pitch shifting factor beta and target_f0

You can modify the pitch of the audio sequence in two ways. The first is beta, a fixed pitch shifting factor. The second is target_f0, a target pitch sequence to convert the source to. The two parameters cannot be used together.
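The example above expresses a shift of three semitones as pow(2, 3/12). The sketch below spells out that conversion and shows how the same ratio could scale a pitch track for target_f0. The helper name semitones_to_ratio and the f0 values are illustrative assumptions, not part of PyTSMod.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio for a shift of the given number of equal-tempered semitones.

    Hypothetical helper, not part of PyTSMod.
    """
    return 2.0 ** (semitones / 12.0)


beta_up = semitones_to_ratio(3)     # candidate value for beta (three keys up)
beta_down = semitones_to_ratio(-3)  # three keys down

# Illustrative pitch track in Hz; a real one would come from a pitch tracker.
f0 = [220.0, 221.5, 219.0]
# Scaling every frame by the same ratio gives a candidate target_f0 sequence.
target_f0 = [f * beta_down for f in f0]
```

Shifting up and then down by the same number of semitones multiplies to a ratio of 1, which is a quick sanity check on the factors.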

Using PyTSMod from the command line

From version 0.3.0, this package includes a command-line tool named tsmod, which can create the result file easily from a shell. To generate the WSOLA result of input.wav with stretching factor 1.3 and save to output.wav, please run:

$ tsmod wsola input.wav output.wav 1.3  # ola, wsola, pv, pv_int are available.

Currently, OLA, WSOLA, and Phase Vocoder (PV) are supported. TD-PSOLA is excluded because of the difficulty of passing extracted pitch data to it. Non-linear TSM is also not supported on the command line.

For more information, run tsmod with the -h or --help option to see detailed usage.
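If you want to drive the command-line tool from a script, one option is to assemble the same arguments shown above and hand them to subprocess. This is only a sketch: build_tsmod_command is a hypothetical helper, the argument order is copied from the example command, and the call is skipped when tsmod or input.wav is not present.

```python
import shutil
import subprocess
from pathlib import Path

# Algorithm names listed in the command-line example above.
SUPPORTED = ("ola", "wsola", "pv", "pv_int")


def build_tsmod_command(algorithm, input_path, output_path, factor):
    """Assemble the argument list for a tsmod call (hypothetical helper)."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return ["tsmod", algorithm, str(input_path), str(output_path), str(factor)]


cmd = build_tsmod_command("wsola", "input.wav", "output.wav", 1.3)

# Only run the tool when it is installed and the input file exists.
if shutil.which("tsmod") is not None and Path("input.wav").exists():
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.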

Audio examples

The original audio is from the TSM Toolbox.

Stretching factor α=0.5

Name	Method	Original	OLA	WSOLA	Phase Vocoder	Phase Vocoder (phase locking)	TSM based on HPSS
CastanetsViolin	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav
DrumSolo	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav
Pop	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav
SingingVoice	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav

Stretching factor α=1.2

Name	Method	Original	OLA	WSOLA	Phase Vocoder	Phase Vocoder (phase locking)	TSM based on HPSS
CastanetsViolin	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav
DrumSolo	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav
Pop	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav
SingingVoice	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav

Stretching factor α=1.8

Name	Method	Original	OLA	WSOLA	Phase Vocoder	Phase Vocoder (phase locking)	TSM based on HPSS
CastanetsViolin	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav
DrumSolo	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav
Pop	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav
SingingVoice	TSM Toolbox	wav	wav	wav	wav	wav	wav
-	PyTSMod	-	wav	wav	wav	wav	wav

References

[1] Jonathan Driedger, Meinard Müller. "TSM Toolbox: MATLAB Implementations of Time-Scale Modification Algorithms", Proceedings of the 17th International Conference on Digital Audio Effects (DAFx-14). 2014.

[2] Jonathan Driedger, Meinard Müller. "A review of time-scale modification of music signals", Applied Sciences, 6(2), 57. 2016.

[3] Udo Zölzer. "DAFX: digital audio effects", John Wiley & Sons. 2011.
