Time-domain pitch-synchronous overlap-add (TD-PSOLA)

This module permits constant- and variable-rate pitch-shifting and time-stretching of speech. It wraps parselmouth [1], which in turn wraps the Praat [2] implementation of TD-PSOLA [3]. Pitch-shifting is performed by providing a numpy array of target pitch values equally spaced over time. Variable-rate time-stretching uses forced phoneme alignment via pypar.

If you need to extract pitch features or phoneme alignments, see torchcrepe for pitch estimation and pyfoal for forced alignment. If you only want to perform pitch-shifting, you do not need to extract forced alignments. If you want to do variable-rate time stretching, you do not need to perform pitch estimation.
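A target pitch contour is a sequence of frequencies in Hz, one per equally spaced frame. As a minimal sketch, assuming you already have a per-frame pitch estimate (for example from torchcrepe), a shift of n semitones multiplies every value by 2^(n/12). The helper name `shift_semitones` is hypothetical and is not part of psola.

```python
def shift_semitones(pitch_hz, semitones):
    """Scale each pitch value (Hz) by the equal-tempered ratio 2 ** (semitones / 12)."""
    ratio = 2.0 ** (semitones / 12.0)
    return [frequency * ratio for frequency in pitch_hz]


# A flat 220 Hz contour shifted up by one octave (12 semitones)
contour = [220.0, 220.0, 220.0]
shifted = shift_semitones(contour, 12)
```

Convert the result to a numpy array before passing it as the target pitch.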

Installation

pip install psola

Usage

If you want to perform pitch-shifting or time-stretching on audio already loaded into memory, use psola.vocode. If you want to do this with audio saved in a file, use psola.from_file. You can use psola.to_file or psola.from_file_to_file to save the results to a file. To process many files at once with multiprocessing, use psola.from_files_to_files. Each of these functions is documented below. The command-line interface wraps the arguments of psola.from_files_to_files and is described in the "Command-line interface" section.

`psola.vocode`

"""Performs pitch vocoding using Praat

Arguments
    audio : np.array(shape=(samples,))
        The speech signal to process
    sample_rate : int
        The audio sampling rate.
    source_alignment : pypar.Alignment
        The current alignment if performing time-stretching
    target_alignment : pypar.Alignment
        The target alignment if performing time-stretching
    target_pitch : np.array(shape=(frames,))
        The target pitch contour
    constant_stretch : float or None
        A constant value for time-stretching
    fmin : int
        The minimum allowable frequency in Hz.
    fmax : int
        The maximum allowable frequency in Hz.

Returns
    audio : np.array(shape=(samples,))
        The vocoded audio
"""
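A pitch-shifting call might look like the sketch below. The keyword names are taken from the docstring above. It assumes that the alignment arguments and `constant_stretch` are optional and can be omitted when only the pitch changes, which the docstring does not state explicitly. The wrapper name `pitch_shift` is hypothetical.

```python
def pitch_shift(audio, sample_rate, target_pitch):
    """Pitch-shift in-memory speech without changing its timing (sketch)."""
    # Imported lazily so this definition loads even where psola is absent
    import numpy as np
    import psola  # requires `pip install psola`

    # Assumption: arguments not passed here default to "no time-stretching"
    return psola.vocode(
        np.asarray(audio),
        sample_rate=sample_rate,
        target_pitch=np.asarray(target_pitch))
```

The return value is the vocoded audio as a numpy array, per the docstring.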

`psola.from_file`

"""Performs vocoding using Praat

Arguments
    audio_file : string
        The file containing the speech signal to process
    source_alignment_file : string or None
        The file containing the original alignment
    target_alignment_file : string or None
        The file containing the target alignment
    target_pitch_file : string or None
        The file containing the target pitch
    constant_stretch : float or None
        A constant value for time-stretching
    fmin : int
        The minimum allowable frequency in Hz.
    fmax : int
        The maximum allowable frequency in Hz.

Returns
    audio : np.array(shape=(samples,))
        The vocoded audio
    sample_rate : int
        The audio sampling rate
"""

`psola.to_file`

"""Performs pitch vocoding and saves audio to disk

Arguments
    audio : np.array(shape=(samples,))
        The speech signal to process
    sample_rate : int
        The audio sampling rate
    output_file : string
        The file to save the vocoded speech
    source_alignment : pypar.Alignment
        The current alignment if performing time-stretching
    target_alignment : pypar.Alignment
        The target alignment if performing time-stretching
    target_pitch : np.array(shape=(frames,))
        The target pitch contour
    constant_stretch : float or None
        A constant value for time-stretching
    fmin : int
        The minimum allowable frequency in Hz.
    fmax : int
        The maximum allowable frequency in Hz.
"""

`psola.from_file_to_file`

"""Performs vocoding using Praat and saves to disk

Arguments
    audio_file : string
        The file containing the speech signal to process
    output_file : string
        The file to save the vocoded speech
    source_alignment_file : string or None
        The file containing the original alignment
    target_alignment_file : string or None
        The file containing the target alignment
    target_pitch_file : string or None
        The file containing the target pitch
    constant_stretch : float or None
        A constant value for time-stretching
    fmin : int
        The minimum allowable frequency in Hz.
    fmax : int
        The maximum allowable frequency in Hz.
"""

`psola.from_files_to_files`

"""Performs vocoding using Praat and saves to disk

Arguments
    audio_files : list
        The files containing the speech signals to process
    output_files : list
        The files to save the vocoded speech
    source_alignment_files : list or None
        The files containing the original alignments
    target_alignment_files : list or None
        The files containing the target alignments
    target_pitch_files : list or None
        The files containing the target pitch
    constant_stretch : float or None
        A constant value for time-stretching
    fmin : int
        The minimum allowable frequency in Hz.
    fmax : int
        The maximum allowable frequency in Hz.
"""
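The batch function takes parallel lists, so each output path must line up with its input. One way to build the output list is sketched below. The helper `make_output_paths` and the directory names are hypothetical, and the commented call assumes the keyword names listed in the docstring above.

```python
from pathlib import Path


def make_output_paths(audio_files, output_dir):
    """Map each input file to a same-named file inside output_dir."""
    output_dir = Path(output_dir)
    return [str(output_dir / Path(audio_file).name) for audio_file in audio_files]


audio_files = ["data/a.wav", "data/b.wav"]
output_files = make_output_paths(audio_files, "out")

# The lists can then be passed on, for example:
# psola.from_files_to_files(
#     audio_files, output_files, target_pitch_files=pitch_files)
```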

Command-line interface

usage: python -m psola
    [-h]
    [--audio_files AUDIO_FILES [AUDIO_FILES ...]]
    [--source_alignment_files SOURCE_ALIGNMENT_FILES [SOURCE_ALIGNMENT_FILES ...]]
    [--target_alignment_files TARGET_ALIGNMENT_FILES [TARGET_ALIGNMENT_FILES ...]]
    [--constant_stretch CONSTANT_STRETCH]
    [--target_pitch_files TARGET_PITCH_FILES [TARGET_PITCH_FILES ...]]
    [--fmin FMIN]
    [--fmax FMAX]
    [--output_files OUTPUT_FILES [OUTPUT_FILES ...]]

optional arguments:
  -h, --help            show this help message and exit
  --audio_files AUDIO_FILES [AUDIO_FILES ...]
                        The speech signal to process
  --source_alignment_files SOURCE_ALIGNMENT_FILES [SOURCE_ALIGNMENT_FILES ...]
                        The files containing the original alignments
  --target_alignment_files TARGET_ALIGNMENT_FILES [TARGET_ALIGNMENT_FILES ...]
                        The files containing the target alignments
  --constant_stretch CONSTANT_STRETCH
                        A constant value for time-stretching
  --target_pitch_files TARGET_PITCH_FILES [TARGET_PITCH_FILES ...]
                        The target pitch contour
  --fmin FMIN           The minimum allowable frequency in Hz
  --fmax FMAX           The maximum allowable frequency in Hz
  --output_files OUTPUT_FILES [OUTPUT_FILES ...]
                        Where to save the vocoded audio
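When driving the command-line interface from a script, it can help to assemble the argument list programmatically. The sketch below uses only the flags listed above. The helper name `build_command`, the file names, and the stretch value are illustrative assumptions.

```python
import sys


def build_command(audio_files, output_files, constant_stretch=None):
    """Assemble an argument list for `python -m psola`."""
    command = [
        sys.executable, "-m", "psola",
        "--audio_files", *audio_files,
        "--output_files", *output_files]
    if constant_stretch is not None:
        command += ["--constant_stretch", str(constant_stretch)]
    return command


command = build_command(["a.wav", "b.wav"], ["a-out.wav", "b-out.wav"], 1.5)
# To execute: subprocess.run(command, check=True)
```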

References

[1] Y. Jadoul, B. Thompson, and B. De Boer, "Introducing parselmouth: A python interface to praat," Journal of Phonetics, vol. 71, pp. 1–15, 2018.

[2] P. Boersma, "Praat: doing phonetics by computer", http://www.praat.org/, 2006.

[3] E. Moulines and F. Charpentier, "Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones," Speech communication, 1990.

Release history

Version 0.0.1, released Mar 31, 2021
