Praat-compatible acoustic analysis in Python (Rust implementation)

praatfan-gpl

Praat-compatible acoustic analysis in Python, powered by Rust.

This package provides exact reimplementations of Praat's acoustic analysis algorithms, designed to produce bit-accurate output matching Praat/parselmouth.

Installation

pip install praatfan-gpl

Building from source

Requires Rust and maturin:

cd python
pip install maturin
maturin develop --release

Quick Start

import praatfan_gpl as pc

# Load an audio file
sound = pc.Sound.from_file("speech.wav")

# Or create from numpy array
import numpy as np
samples = np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)
sound = pc.Sound(samples, 44100.0)

# Compute pitch (F0)
pitch = sound.to_pitch(0.01, 75.0, 600.0)
print(f"Pitch at 0.5s: {pitch.get_value_at_time(0.5, 'hertz', 'linear')}")

# Get all pitch values as numpy array
f0_values = pitch.values()  # NaN for unvoiced frames
times = pitch.times()

# Compute formants
formant = sound.to_formant_burg(
    time_step=0.01,
    max_num_formants=5,
    max_formant_hz=5500.0,  # Typical for adult female speakers; use 5000 for adult male speakers
    window_length=0.025,
    pre_emphasis_from=50.0
)

# Get F1, F2 at specific time
f1 = formant.get_value_at_time(1, 0.5, "hertz", "linear")
f2 = formant.get_value_at_time(2, 0.5, "hertz", "linear")

# Get all F1 values
f1_values = formant.formant_values(1)  # numpy array

# Compute intensity
intensity = sound.to_intensity(100.0, 0.01)
db_values = intensity.values()

# Compute spectrum (single-frame FFT)
spectrum = sound.to_spectrum(fast=True)
cog = spectrum.get_center_of_gravity(2.0)
std = spectrum.get_standard_deviation(2.0)

# Compute spectrogram
spectrogram = sound.to_spectrogram(
    effective_analysis_width=0.005,
    max_frequency=5000.0,
    time_step=0.002,
    frequency_step=20.0,
    window_shape="gaussian"
)
spec_data = spectrogram.values()  # 2D numpy array [freq, time]

# Compute harmonicity (HNR)
hnr = sound.to_harmonicity_ac(0.01, 75.0, 0.1, 1.0)
# Or cross-correlation method:
hnr_cc = sound.to_harmonicity_cc(0.01, 75.0, 0.1, 1.0)

API Reference

Sound

# Loading
Sound.from_file(path)                    # Load from WAV, MP3, FLAC, OGG
Sound(samples, sample_rate)              # From numpy array

# Properties
sound.sample_rate                        # Sample rate in Hz
sound.duration                           # Duration in seconds
sound.num_samples                        # Number of samples
sound.start_time, sound.end_time         # Time bounds

# Methods
sound.samples()                          # Get samples as numpy array
sound.get_value_at_time(time)            # Linear interpolation
sound.extract_part(start, end, ...)      # Extract segment
sound.pre_emphasis(from_freq)            # High-pass filter
sound.rms(), sound.peak()                # Amplitude measures

# Analysis
sound.to_pitch(dt, floor, ceil)          # Pitch extraction
sound.to_formant_burg(...)               # Formant analysis
sound.to_intensity(min_pitch, dt)        # Intensity contour
sound.to_spectrum(fast)                  # Single-frame FFT
sound.to_spectrogram(...)                # Time-frequency analysis
sound.to_harmonicity_ac/cc(...)          # HNR analysis

# Speech-referenced variants (robust on long recordings — see below)
sound.to_pitch_ac_referenced(dt, floor, ceil, *, reference_peak=None)
sound.to_pitch_cc_referenced(dt, floor, ceil, *, reference_peak=None)
sound.to_harmonicity_ac_referenced(dt, min_pitch, silence, ppw, *, reference_peak=None)
sound.to_harmonicity_cc_referenced(dt, min_pitch, silence, ppw, *, reference_peak=None)
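To illustrate what the `pre_emphasis(from_freq)` high-pass step does, the sketch below applies the first-order filter that Praat documents for pre-emphasis, y[n] = x[n] - alpha * x[n-1] with alpha = exp(-2 * pi * from_freq / sample_rate). The helper name `pre_emphasize` and the choice to leave the first sample untouched are assumptions made for illustration; this is not the package's code.

```python
import math

def pre_emphasize(samples, sample_rate, from_freq):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1].

    alpha = exp(-2*pi*from_freq/sample_rate); the first sample is left as-is
    (an assumption about edge handling).
    """
    alpha = math.exp(-2.0 * math.pi * from_freq / sample_rate)
    out = list(samples)
    # Walk backwards so each x[n-1] that is read is still the unfiltered value.
    for n in range(len(out) - 1, 0, -1):
        out[n] -= alpha * out[n - 1]
    return out

# A constant (0 Hz) signal is almost entirely removed ...
dc = pre_emphasize([1.0] * 5, 44100.0, 50.0)
# ... while a signal alternating at the Nyquist frequency is nearly doubled.
nyquist = pre_emphasize([1.0, -1.0, 1.0, -1.0, 1.0], 44100.0, 50.0)
print(dc[1:], nyquist[1:])
```

With `from_freq=50.0` at 44.1 kHz, alpha is close to 1, so low frequencies are strongly attenuated while high frequencies pass with a boost.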

Speech-referenced normalization (long recordings)

Praat references each frame's amplitude against the whole-file peak. On utterance-length audio that peak is the speech peak, which is the right reference. On long conversational recordings (e.g. 10-minute Buckeye files), the peak is set by whatever is loudest anywhere in the file, such as a click, a laugh, or an interviewer burst. Every quiet voiced frame then has a lower relative intensity and is misclassified as silent, which in turn NaN-gates downstream formants and HNR.

The *_referenced methods substitute a robust amplitude reference for the whole-file peak. The plain to_pitch / to_harmonicity_* methods are unchanged and remain bit-exact with Praat.

import praatfan_gpl as pg

# Estimate one reference per recording, then share it across pitch and HNR.
# The reference values are frame-level robust statistics over speech-active
# frames, so a short loud burst cannot dominate them:
ref = pg.estimate_speech_reference(samples, sample_rate)
ref.reference_peak    # 75th-pct of per-frame peak |x| — a typical-speech peak
ref.std               # median per-frame RMS (z-norm scale)
ref.mean              # median per-frame mean (DC reference)
ref.speech_mask       # per-frame bool, hop_s=0.01 grid
ref.frame_times       # frame centers (s)
ref.speech_fraction   # fraction of frames in the speech mask

pitch = sound.to_pitch_ac_referenced(0.01, 75.0, 600.0,
                                     reference_peak=ref.reference_peak)
hnr   = sound.to_harmonicity_cc_referenced(0.01, 75.0, 0.1, 1.0,
                                           reference_peak=ref.reference_peak)

# Or pass reference_peak=None (the default) to have each call estimate the
# reference internally.

The estimator definition and constants are shared across the praatfan package family (praatfan, praatfan-rust, praatfan-gpl), so a reference_peak computed by one package is interchangeable with the others.
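The failure mode described in this section can be reproduced with a toy signal. The sketch below compares the whole-file peak with a 75th-percentile per-frame peak on two seconds of a quiet tone containing a single full-scale click. It is a simplified stand-in: the non-overlapping 10 ms framing and the helpers `frame_peaks` and `percentile` are assumptions for illustration, and the package's estimator additionally restricts itself to speech-active frames.

```python
import math

def frame_peaks(samples, frame_len):
    """Peak |x| of each non-overlapping frame (hypothetical framing)."""
    return [
        max(abs(v) for v in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def percentile(values, pct):
    """Percentile with linear interpolation between neighboring ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

sample_rate = 16000
# Two seconds of quiet "speech": a 150 Hz tone with amplitude 0.1 ...
samples = [0.1 * math.sin(2 * math.pi * 150 * n / sample_rate)
           for n in range(2 * sample_rate)]
# ... plus one full-scale click one second in.
samples[sample_rate] = 1.0

peaks = frame_peaks(samples, frame_len=160)  # 10 ms frames
whole_file_peak = max(abs(v) for v in samples)
robust_peak = percentile(peaks, 75)

print(f"whole-file peak:            {whole_file_peak:.3f}")
print(f"75th-percentile frame peak: {robust_peak:.3f}")
```

The click sets the whole-file peak ten times above the tone's level, while the percentile-based reference stays at the tone's level because only one frame in 200 is affected.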

Pitch

pitch.get_value_at_time(time, unit, interp)  # Query pitch
pitch.values()                               # All values (NaN=unvoiced)
pitch.times()                                # Frame times
pitch.num_frames, pitch.time_step           # Frame info
pitch.pitch_floor, pitch.pitch_ceiling      # Analysis parameters
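Because `pitch.values()` marks unvoiced frames with NaN, summary statistics need to skip those frames. A minimal sketch, using a plain list as a stand-in for the returned array (the numbers are made up for illustration):

```python
import math
import statistics

# Stand-in for pitch.values(): NaN marks unvoiced frames (illustrative numbers).
f0_values = [float("nan"), 118.2, 121.5, 124.9, float("nan"), 131.0]

# Keep only the voiced frames before computing statistics.
voiced = [f for f in f0_values if not math.isnan(f)]
voiced_fraction = len(voiced) / len(f0_values)
median_f0 = statistics.median(voiced)

print(f"voiced frames: {len(voiced)}/{len(f0_values)}")
print(f"median F0: {median_f0:.1f} Hz")
```

On a NumPy array, `np.nanmedian(f0_values)` computes the same median without the explicit filter.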

Formant

formant.get_value_at_time(n, time, unit, interp)      # Fn frequency
formant.get_bandwidth_at_time(n, time, unit, interp)  # Fn bandwidth
formant.formant_values(n)                             # All Fn values
formant.bandwidth_values(n)                           # All Bn values
formant.times()                                       # Frame times

Intensity

intensity.get_value_at_time(time, interp)  # Query dB
intensity.values()                          # All values
intensity.times()                           # Frame times
intensity.min(), intensity.max(), intensity.mean()

Spectrum

spectrum.get_band_energy(f_min, f_max)    # Energy in band (Pa² s)
spectrum.get_center_of_gravity(power)     # Spectral centroid
spectrum.get_standard_deviation(power)    # Spectral spread
spectrum.get_skewness(power)              # Spectral skewness
spectrum.get_kurtosis(power)              # Spectral kurtosis
spectrum.num_bins, spectrum.df            # Frequency info
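The `power` argument controls how each bin's magnitude is weighted before the moments are taken. Under the usual Praat definition, the center of gravity is the |S(f)|^p-weighted mean frequency, and the standard deviation is the square root of the second central moment under the same weighting. The sketch below applies that definition to a short list of bin magnitudes; `spectral_moments` is a hypothetical helper, not part of the package.

```python
import math

def spectral_moments(magnitudes, df, power=2.0):
    """Return (center of gravity, standard deviation) in Hz.

    magnitudes[k] is |S| at frequency k * df; each bin is weighted by
    |S| ** power (power=2.0 weights by the power spectrum).
    """
    weights = [abs(m) ** power for m in magnitudes]
    total = sum(weights)
    freqs = [k * df for k in range(len(magnitudes))]
    cog = sum(f * w for f, w in zip(freqs, weights)) / total
    variance = sum((f - cog) ** 2 * w for f, w in zip(freqs, weights)) / total
    return cog, math.sqrt(variance)

# Two equally strong components at 1000 Hz and 3000 Hz (bins 1 and 3, df = 1000 Hz).
cog, sd = spectral_moments([0.0, 1.0, 0.0, 1.0], df=1000.0)
print(cog, sd)
```

Two equal components give a center of gravity midway between them and a spread equal to half their separation.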

Spectrogram

spectrogram.values()                       # 2D array [freq, time]
spectrogram.get_time_from_frame(i)        # Frame time
spectrogram.get_frequency_from_bin(i)     # Bin frequency
spectrogram.num_frames, spectrogram.num_freq_bins
spectrogram.time_step, spectrogram.freq_step

Harmonicity

harmonicity.get_value_at_time(time, interp)  # Query HNR (dB)
harmonicity.values()                          # All values
harmonicity.times()                           # Frame times
harmonicity.min(), harmonicity.max(), harmonicity.mean()
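Harmonicity values are expressed in dB. In Praat's definition, if r is the fraction of the signal's energy in the periodic part (the height of the normalized autocorrelation peak), then HNR = 10 * log10(r / (1 - r)). A short sketch of that conversion, with `hnr_db` as an illustrative helper name:

```python
import math

def hnr_db(r):
    """Harmonics-to-noise ratio in dB for a periodic energy fraction 0 < r < 1."""
    return 10.0 * math.log10(r / (1.0 - r))

for r in (0.5, 0.9, 0.99):
    print(f"r = {r:.2f} -> HNR = {hnr_db(r):.1f} dB")
```

An HNR of 0 dB therefore means equal energy in the harmonic and noise parts, and about 20 dB means that 99% of the energy is periodic.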

Unit Options

  • Pitch units: "hertz", "mel", "semitones", "erb"
  • Frequency units: "hertz", "bark", "mel", "erb"
  • Interpolation: "nearest", "linear", "cubic"
  • Window shapes: "gaussian", "hanning", "hamming", "rectangular"
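For orientation, the unit names above correspond to standard perceptual frequency scales. The sketch below writes out the conversions as they are commonly given for Praat. Whether praatfan-gpl uses exactly these constants, and whether its "semitones" unit is referenced to 100 Hz, are assumptions here; the function names are illustrative.

```python
import math

def hertz_to_mel(hz):
    # Mel scale with a 550 Hz corner frequency.
    return 550.0 * math.log(1.0 + hz / 550.0)

def hertz_to_semitones(hz, reference_hz=100.0):
    # Semitones relative to reference_hz (100 Hz assumed as the default).
    return 12.0 * math.log2(hz / reference_hz)

def hertz_to_erb(hz):
    # ERB-rate scale.
    return 11.17 * math.log((hz + 312.0) / (hz + 14680.0)) + 43.0

def hertz_to_bark(hz):
    # Bark scale via an inverse hyperbolic sine with a 650 Hz corner.
    h = hz / 650.0
    return 7.0 * math.log(h + math.sqrt(1.0 + h * h))

for hz in (100.0, 200.0, 1000.0):
    print(hz, hertz_to_mel(hz), hertz_to_semitones(hz),
          hertz_to_erb(hz), hertz_to_bark(hz))
```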

Comparison with parselmouth

praatfan-gpl aims for bit-accurate compatibility with Praat/parselmouth. The comparison below checks F1 frame by frame against parselmouth, using a 1 Hz tolerance:

import parselmouth
import praatfan_gpl as pc
import numpy as np

sound_pm = parselmouth.Sound("speech.wav")
sound_pc = pc.Sound.from_file("speech.wav")

# Formant comparison
formant_pm = sound_pm.to_formant_burg(0.01, 5, 5500, 0.025, 50)
formant_pc = sound_pc.to_formant_burg(0.01, 5, 5500.0, 0.025, 50.0)

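for t in np.arange(0, sound_pm.duration, 0.01):
    f1_pm = formant_pm.get_value_at_time(1, t)  # parselmouth returns NaN where undefined
    f1_pc = formant_pc.get_value_at_time(1, t, "hertz", "linear")
    # Skip frames where either value is undefined (None or NaN)
    if f1_pm is None or f1_pc is None or np.isnan(f1_pm) or np.isnan(f1_pc):
        continue
    assert abs(f1_pm - f1_pc) < 1.0  # Within 1 Hz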
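The loop above stops at the first mismatch. When comparing whole tracks it can be more informative to report the largest difference over the frames that both implementations define. A minimal sketch, where `max_abs_difference` is a hypothetical helper and the values are made up for illustration:

```python
import math

def max_abs_difference(track_a, track_b):
    """Largest |a - b| over frames where both tracks are defined.

    Undefined frames may be None or NaN; returns (max_diff, n_compared).
    """
    def defined(v):
        return v is not None and not math.isnan(v)

    diffs = [abs(a - b) for a, b in zip(track_a, track_b)
             if defined(a) and defined(b)]
    return (max(diffs) if diffs else 0.0), len(diffs)

# Illustrative F1 values (Hz) from two implementations.
reference = [512.3, float("nan"), 498.7, 505.1]
candidate = [512.3, 640.0, None, 505.4]
worst, n = max_abs_difference(reference, candidate)
print(f"max |diff| = {worst:.3f} Hz over {n} frames")
```

Reporting the number of compared frames alongside the difference guards against a comparison that passes only because most frames were skipped.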

License

GPL-3.0 (same as Praat).

Citing

If you use praatfan-gpl in published work, please cite Praat, Parselmouth, and — if you use FormantPath — Weenink (2015):

  • Boersma, P. & Weenink, D. (2024). Praat: doing phonetics by computer. https://www.fon.hum.uva.nl/praat/
  • Jadoul, Y., Thompson, B., & de Boer, B. (2018). "Introducing Parselmouth: A Python interface to Praat." Journal of Phonetics, 71, 1–15.
  • Weenink, D. (2015). "Improved formant frequency measurements of short segments." Proceedings of ICPhS 2015, Glasgow.

Built Distributions

  • praatfan_gpl-0.1.8-cp39-abi3-win_arm64.whl (1.3 MB): CPython 3.9+, Windows ARM64
  • praatfan_gpl-0.1.8-cp39-abi3-win_amd64.whl (1.7 MB): CPython 3.9+, Windows x86-64
  • praatfan_gpl-0.1.8-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB): CPython 3.9+, manylinux (glibc 2.17+) x86-64
  • praatfan_gpl-0.1.8-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB): CPython 3.9+, manylinux (glibc 2.17+) ARM64
  • praatfan_gpl-0.1.8-cp39-abi3-macosx_11_0_arm64.whl (1.4 MB): CPython 3.9+, macOS 11.0+ ARM64
  • praatfan_gpl-0.1.8-cp39-abi3-macosx_10_12_x86_64.whl (1.8 MB): CPython 3.9+, macOS 10.12+ x86-64
