Python implementation of WORLD vocoder.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

PYTHON WORLD VOCODER:

This is a line-by-line implementation of the WORLD vocoder (Matlab, C++) in Python. It supports Python 3.0 and later.

INSTALLATION

pip install worldvocoder

EXAMPLE

import worldvocoder as wv
import soundfile as sf
import librosa

# read audio; soundfile returns (frames, channels), librosa expects (channels, frames)
audio, sample_rate = sf.read("some_file.wav")
audio = librosa.to_mono(audio.T)

# initialize vocoder
vocoder = wv.World()

# encode audio
dat = vocoder.encode(sample_rate, audio, f0_method='harvest')

Here, sample_rate is the sampling frequency and audio is the speech or singing signal.

dat is a dictionary that contains the pitch, the magnitude spectrum, and the aperiodicity.

We can scale the pitch:

dat = vocoder.scale_pitch(dat, 1.5)

Be careful when you scale the pitch, because it has an upper limit and a lower limit.
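
The limits themselves are not stated here. As a rough sketch, the range of scale factors that keeps every voiced frame inside a given F0 range can be checked before calling scale_pitch. The name safe_pitch_scale_range is hypothetical, the 71 Hz and 800 Hz bounds are assumptions borrowed from the defaults of the reference WORLD implementation (this package may use others), and the F0 contour is assumed to be a plain sequence in which 0 marks unvoiced frames.

```python
def safe_pitch_scale_range(f0, f0_floor=71.0, f0_ceil=800.0):
    """Return the (lowest, highest) scale factor that keeps voiced F0 in bounds.

    f0_floor and f0_ceil are assumed limits in Hz, not values read from the package.
    """
    # Unvoiced frames (assumed to be stored as 0) are ignored.
    voiced = [float(value) for value in f0 if value > 0.0]
    if not voiced:
        raise ValueError("no voiced frames in the F0 contour")
    lowest = f0_floor / min(voiced)    # below this, the lowest frame drops under the floor
    highest = f0_ceil / max(voiced)    # above this, the highest frame exceeds the ceiling
    return lowest, highest


contour = [0.0, 100.0, 200.0, 0.0]
low, high = safe_pitch_scale_range(contour, f0_floor=50.0, f0_ceil=800.0)
print(low, high)
```

If the returned lowest factor is larger than the highest one, the contour already spans more than the allowed range and no single factor keeps every frame inside it.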

We can make speech faster or slower:

dat = vocoder.scale_duration(dat, 2)

To resynthesize the audio:

dat = vocoder.decode(dat)
output = dat["out"]
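
To listen to the result, the samples have to be written back to a file. With soundfile already imported, sf.write("output.wav", output, sample_rate) is the shortest route. The sketch below does the same with only the standard library, assuming that output is a mono sequence of floats in the range [-1, 1]; that assumption and the helper name write_wav_int16 are not taken from the package.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as a 16-bit PCM WAV file."""
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range so that out-of-range values do not wrap around.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)           # mono
        wav_file.setsampwidth(2)           # 2 bytes = 16 bits per sample
        wav_file.setframerate(int(sample_rate))
        wav_file.writeframes(bytes(frames))
```

A call such as write_wav_int16("output.wav", output, sample_rate) would then store the resynthesized signal.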

To use the d4c_requiem analysis and requiem_synthesis from WORLD version 0.2.2, pass is_requiem=True:

# requiem analysis
dat = vocoder.encode(sample_rate, audio, f0_method='harvest', is_requiem=True)

To extract log-filterbanks, MCEP-40, and VAE-12 features as described in the paper "Using a Manifold Vocoder for Spectral Voice and Style Conversion", see test/spectralFeatures.py. Extracting VAE-12 requires Keras 2.2.4 and TensorFlow 1.14.0. Check out the speech samples.

NOTE:

The vocoder uses pitch-synchronous analysis: the size of each window is determined by the fundamental frequency F0. The window centers are equally spaced, frame_period ms apart.
The Fourier transform size (fft_size) is determined automatically from the sampling frequency fs and the lowest allowed value of F0, f0_floor. To specify your own fft_size, set f0_floor = 3.0 * fs / fft_size. Decreasing fft_size raises f0_floor, and a high f0_floor may be unsuitable for analyzing male voices.
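
To make the trade-off concrete, the relation above can be evaluated for a few FFT sizes. This is only a worked calculation of f0_floor = 3.0 * fs / fft_size; the helper name f0_floor_for_fft_size is hypothetical and is not part of the package.

```python
def f0_floor_for_fft_size(fs, fft_size):
    """Return the F0 floor in Hz implied by a chosen FFT size (relation from the note above)."""
    if fft_size <= 0:
        raise ValueError("fft_size must be positive")
    return 3.0 * fs / fft_size


fs = 16000  # example sampling frequency in Hz
for fft_size in (2048, 1024, 512):
    print(fft_size, f0_floor_for_fft_size(fs, fft_size))
```

At 16 kHz, an fft_size of 512 gives a floor of 93.75 Hz, so frames with a lower F0 fall under the floor. Low-pitched voices are therefore the first to be affected.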

CITATION:

Dinh, T., Kain, A., & Tjaden, K. (2019). Using a manifold vocoder for spectral voice and style conversion. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2019-September, 1388-1392.

Release history

- 0.0.5 (this version): Jul 18, 2023
- 0.0.4: Jul 5, 2023
- 0.0.3: Jul 5, 2023
- 0.0.2: Jun 9, 2023

Download files

Source Distribution

worldvocoder-0.0.5.tar.gz (31.5 kB), uploaded Jul 18, 2023, Source

Built Distribution

worldvocoder-0.0.5-py3-none-any.whl (41.2 kB), uploaded Jul 18, 2023, Python 3

File details

Details for the file worldvocoder-0.0.5.tar.gz.

File metadata

Download URL: worldvocoder-0.0.5.tar.gz
Upload date: Jul 18, 2023
Size: 31.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.0

File hashes

Hashes for worldvocoder-0.0.5.tar.gz
Algorithm	Hash digest
SHA256	`9c2748c6bc0be1df04e4a7675805966c8981ce81b863d9b90cb8764a7ad03176`
MD5	`9044046d5fbadd8cdb6e3604c6486a0c`
BLAKE2b-256	`d4e48336dffb1a26e61d3558a8b9c8120538121089dbd978dbff7806de301d52`

File details

Details for the file worldvocoder-0.0.5-py3-none-any.whl.

File metadata

Download URL: worldvocoder-0.0.5-py3-none-any.whl
Upload date: Jul 18, 2023
Size: 41.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.0

File hashes

Hashes for worldvocoder-0.0.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`df6b147d0e2d45d26ab0c5e52a44154d65a5b2ff8d29d2b3593ebecd3e518879`
MD5	`f298c378be06bfc47b72ec7a866ebe90`
BLAKE2b-256	`c154de9ac193992ba965bed8cc611a56d0327919d09b9569bc5290886974334e`
