
Phonetic Analysis ToolKit: Tools for processing phonetic data


PATKIT - Phonetic Analysis ToolKIT

PATKIT GUI

PATKIT provides tools for phonetic analysis of speech data. It includes a GUI for manual assessment/analysis/annotation (see picture above), command line tools for batch processing, and an API for programming your own tools.

PATKIT's tools currently work mainly on tongue and larynx ultrasound and on audio. In the future, the toolkit will include facilities for processing other kinds of articulatory data. The first two tools to be implemented are Optical Flow and Pixel Difference.

Optical Flow tracks local changes in ultrasound frames and estimates the flow field based on these.

Pixel Difference and Scanline Based Pixel Difference work on raw, uninterpolated data and produce measures of change over the course of a recording. How they work is explained in Chapter 3 of Pertti Palo's PhD thesis.
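
As a rough illustration of the idea, the sketch below computes the Euclidean distance between consecutive frames, once over whole frames and once per scanline. It is not PATKIT's implementation: the function names, the nested-list frame layout, and the toy data are assumptions made for this example, and the thesis remains the authoritative description.

```python
import math


def pixel_difference(frames):
    """Return the Euclidean distance between each pair of consecutive frames.

    `frames` is a list of frames; each frame is a list of scanlines, and each
    scanline is a list of raw (uninterpolated) pixel intensities.
    This layout is an assumption for the sketch, not PATKIT's data format.
    """
    distances = []
    for previous, current in zip(frames, frames[1:]):
        squared_sum = sum(
            (b - a) ** 2
            for prev_line, cur_line in zip(previous, current)
            for a, b in zip(prev_line, cur_line)
        )
        distances.append(math.sqrt(squared_sum))
    return distances


def scanline_pixel_difference(frames):
    """Return one Euclidean distance per scanline for each consecutive frame pair."""
    return [
        [
            math.sqrt(sum((b - a) ** 2 for a, b in zip(prev_line, cur_line)))
            for prev_line, cur_line in zip(previous, current)
        ]
        for previous, current in zip(frames, frames[1:])
    ]


# Three tiny 2x2 "frames" of made-up intensities (two scanlines of two pixels).
frames = [
    [[0, 0], [0, 0]],
    [[3, 0], [0, 4]],
    [[3, 0], [0, 4]],
]
print(pixel_difference(frames))           # [5.0, 0.0]
print(scanline_pixel_difference(frames))  # [[3.0, 4.0], [0.0, 0.0]]
```

A recording with N frames yields N-1 values, so a flat stretch near zero corresponds to little change between frames and a peak to rapid change.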

Getting PATKIT

Detailed instructions

Quick start guide:

  • Install the package manager uv:

    • macOS/Linux: run curl -LsSf https://astral.sh/uv/install.sh | sh
    • Windows: run powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
    • See the uv documentation for more details.
  • On the command line, run uv tool install patkit

  • Run patkit --help for instructions.

  • If you want to try the example data, get the recorded_data and scenarios folders from GitHub.

    • Run patkit scenarios/minimal in the folder where you downloaded the data and experiment from there.
  • On Debian-based Linux distributions (Ubuntu, Pop!_OS, and others), you may also need to run the following:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install -y libxcb-cursor-dev

Try this if patkit complains about a missing xcb plugin when you run it.
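
If the install steps above seem to have worked but the patkit command is not found, a quick check of the PATH can help. The following is a minimal sketch using only the Python standard library; the helper name missing_tools is made up for this example and is not part of PATKIT or uv.

```python
import shutil


def missing_tools(names=("uv", "patkit")):
    """Return the names from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("uv and patkit are both on PATH; try: patkit --help")
```

If patkit is reported missing right after installation, opening a new terminal often helps, since the shell may not have picked up the updated PATH yet.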

Documentation

Documentation

Current version and development plans

See the Changelog for what's new in the current version and what's coming up.

What's included

TODO 0.22.1: give a quick description of included data and goodies here. TODO 1.0: Move the data elsewhere to be optionally loaded.

Contributing

Please get in touch with Pertti if you would like to contribute to the project. All help is welcome regardless of your skill level. You can contribute by following the instructions and reporting back when they lead you astray, proofreading docs, commenting code, testing PATKIT on a new platform, writing new functionality, writing tests for the code, roasting the code, doing UI design, contributing use cases...

Versioning

We use SemVer for versioning, following the rules set out in PEP 440, with the additional understanding that releases before 1.0 (i.e. current releases at the time of writing) have not been tested in any way.

For the versions available, see the tags on this repository.
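
Because releases before 1.0 carry this caveat, a script that depends on PATKIT may want to check which side of 1.0 a version string falls on. The sketch below is an illustration only: the helper names are hypothetical, and it handles plain dotted release numbers such as 0.22.0, not the full PEP 440 grammar (pre-release or dev suffixes would need a proper parser).

```python
def release_tuple(version):
    """Split a plain dotted version string such as '0.22.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_pre_1_0(version):
    """Return True if the major version number is below 1."""
    return release_tuple(version)[0] < 1


print(release_tuple("0.22.0"))  # (0, 22, 0)
print(is_pre_1_0("0.22.0"))     # True
print(is_pre_1_0("1.0.0"))      # False
```

Comparing integer tuples rather than strings avoids the usual pitfall where "0.9.0" sorts after "0.22.0" lexically.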

Contributors

List of contributors

Copyright and License

The Phonetic Analysis ToolKIT (PATKIT or patkit for short), together with its examples, is a toolbox for analysing phonetic data.

PATKIT Copyright (C) 2019-2025 Pertti Palo, Scott Moisik, Matthew Faytak and Motoki Saito.

Optical Flow tools Copyright (C) 2020-2025 Scott Moisik

Pixel Difference tools Copyright (C) 2019-2025 Pertti Palo

Laryngeal example data Copyright (C) 2020 Scott Moisik

Tongue and tongue spline example data Copyright (C) 2013-2020 Pertti Palo

Program license

PATKIT is licensed under GPL 3.0.

This program (see below for data) is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/gpl-3.0.en.html

Data license

Data License

The data in directories larynx_data, tongue_data_1, tongue_data_1_2, and tongue_data_2 are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License. See link above or https://creativecommons.org/licenses/by-nc-sa/4.0/ for details.

Citing the code

When using any part of PATKIT, please cite:

  1. Palo, P., Moisik, S. R., & Faytak, M. (2023). Analysing Speech Data with SATKIT. In International Congress of Phonetic Sciences (ICPhS 2023). Prague.
  2. Faytak, M., Moisik, S., & Palo, P. (2020). The Speech Articulation Toolkit (SATKit): Ultrasound image analysis in Python. In ISSP 2020, Online (planned as Providence, Rhode Island).

When making use of the Optical Flow code, please cite:

  1. Esling, J. H., & Moisik, S. R. (2012). Laryngeal aperture in relation to larynx height change: An analysis using simultaneous laryngoscopy and laryngeal ultrasound. In D. Gibbon, D. Hirst, & N. Campbell (Eds.), Rhythm, melody and harmony in speech: Studies in honour of Wiktor Jassem: Vol. 14/15 (pp. 117–127). Polskie Towarzystwo Fonetyczne.
  2. Moisik, S. R., Lin, H., & Esling, J. H. (2014). A study of laryngeal gestures in Mandarin citation tones using simultaneous laryngoscopy and laryngeal ultrasound (SLLUS). Journal of the International Phonetic Association, 44(01), 21–58. https://doi.org/10.1017/S0025100313000327
  3. Poh, D. P. Z., & Moisik, S. R. (2019). An acoustic and articulatory investigation of citation tones in Singaporean Mandarin using laryngeal ultrasound. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of the Phonetic Sciences.

When using the Pixel Difference (PD) code, please cite:

  1. Palo, P. (2019). Measuring Pre-Speech Articulation. PhD thesis. Queen Margaret University, Scotland, UK.

Acknowledgments

  • Inspiration for PD was drawn from previous projects using Euclidean distance to measure change in articulatory speech data. For references, see Pertti Palo's PhD thesis.

  • The project uses a nifty Python tool called licenseheaders by Johann Petrak and contributors to add and update license headers for Python files.
