# Phoneme Forced Aligner


This package takes human speech data from the Makin Lab (the .txt and .wav files of the block that needs phoneme alignment) and outputs a JSON file containing the mike-on and mike-off times and the alignments (start time, transcription method, production 1, phoneme list 1, production 2, phoneme list 2).

## Getting Started

These instructions will get a copy of the project up and running on
your local machine for development and testing purposes.

### Installing

This is a pip-installable package. The distribution files are named `align_phonemes`, so the install command is presumably `pip install align_phonemes`.
____________________________________________________

## Functions

### get_forced_aligment()
input: block txt path, block wav path, output transcription JSON path, transcription method, critical error threshold, verbose

functionality:
- determine which transcript to use, based on the transcription method input (Critical Error, Wav2Vec, Original)
- run the Montreal Forced Aligner (takes the trials directory and the verbose flag; returns the TextGrid output)
- demarcate the alignments and write them to JSON
- clean up the temporary directories
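
The transcript-selection step can be sketched as follows. This is a minimal illustration, not the package's code: `word_error_rate` and `choose_transcript` are hypothetical names, and the rule used for Critical Error (switch to the Wav2Vec transcript once the word-level mismatch with the original reaches the threshold) is an assumption. The README does not define the actual criterion.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # Dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)


def choose_transcript(original: str, wav2vec: str, method: str,
                      threshold: float = 0.5) -> tuple[str, str]:
    """Return (transcript, method actually used) for one trial.

    Hypothetical helper: the threshold default and the Critical Error
    rule below are assumptions made for illustration.
    """
    key = method.lower().replace(" ", "_")
    if key == "original":
        return original, "original"
    if key == "wav2vec":
        return wav2vec, "wav2vec"
    if key == "critical_error":
        # ASSUMPTION: fall back to wav2vec when the mismatch is large.
        if word_error_rate(original, wav2vec) >= threshold:
            return wav2vec, "wav2vec"
        return original, "original"
    raise ValueError(f"unknown transcription method: {method!r}")
```

Returning the method alongside the transcript makes it straightforward to record, per trial, whether 'wav2vec' or 'original' was used.
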
____________________________________________________

### demarcate_to_json()
input: trial directory, block path, TextGrid directory, output JSON file path

functionality:
- read the TextGrid
- use the ER-demarcation algorithm to denote the phoneme split
- use ER-demarcation to denote the transcript split
- write to the phoneme JSON (see output format)
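
As an illustration of the first step only, the sketch below pulls the labelled intervals out of a Praat long-format TextGrid, grouped by tier. `read_tiers` is a hypothetical helper, the regular-expression approach is an assumption (the package may well use a dedicated TextGrid library), and the ER-demarcation step itself is not shown.

```python
import re

# One interval in Praat's long TextGrid format: xmin, xmax, then text.
INTERVAL_RE = re.compile(
    r'xmin\s*=\s*([\d.]+)\s+xmax\s*=\s*([\d.]+)\s+text\s*=\s*"([^"]*)"'
)


def read_tiers(textgrid_text: str) -> dict[str, list[tuple[float, float, str]]]:
    """Map each tier name to its non-empty (start, end, label) intervals."""
    tiers = {}
    # Every tier begins with an "item [n]:" header; the chunk before the
    # first header is the file header and is skipped.
    for chunk in re.split(r"item \[\d+\]:", textgrid_text)[1:]:
        name = re.search(r'name\s*=\s*"([^"]*)"', chunk)
        if name is None:
            continue
        tiers[name.group(1)] = [
            (float(start), float(end), label)
            for start, end, label in INTERVAL_RE.findall(chunk)
            if label.strip()  # drop silent (empty-label) intervals
        ]
    return tiers


SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"
xmin = 0
xmax = 0.9
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "phones"
        xmin = 0
        xmax = 0.9
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 0.2
            text = ""
        intervals [2]:
            xmin = 0.2
            xmax = 0.5
            text = "HH"
        intervals [3]:
            xmin = 0.5
            xmax = 0.9
            text = "AY1"
'''

phones = read_tiers(SAMPLE)["phones"]
```
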
____________________________________________________

### clean_directories()
input: list of the temporary directories that were created

functionality: remove each temporary directory and its contents
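
A minimal sketch of this kind of cleanup, using only the standard library. `remove_temp_dirs` is a hypothetical name, and skipping directories that no longer exist is an assumption about the desired behavior.

```python
import shutil
from pathlib import Path


def remove_temp_dirs(temp_dirs) -> None:
    """Delete each directory in temp_dirs together with its contents."""
    for directory in temp_dirs:
        # ignore_errors=True means an already-missing directory is skipped
        # instead of raising (an assumption, not the package's behavior).
        shutil.rmtree(Path(directory), ignore_errors=True)
```
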
____________________________________________________

### parse_block()
input: wav file path, txt file path, trials directory, verbose

functionality:
- create a trial directory (same name as the label): each trial is a .wav and a .txt

Trial directory layout:

    Trials dir:
        trial.wav
        trial.txt
        ...
        Trial transcription method .json

TranscriptionMethod.json format:

    {
        trial start-time (float): method ('wav2vec' or 'original'),
        ...
    }
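
One detail of this format is worth a short sketch: JSON object keys are always strings, so a float start time is stored as text and has to be converted back on load. The helper names below are hypothetical and the example values are made up for illustration.

```python
import json


def dump_methods(methods: dict[float, str]) -> str:
    """Serialize {start_time: method}; json.dumps turns float keys into strings."""
    return json.dumps(methods, indent=2)


def load_methods(text: str) -> dict[float, str]:
    """Parse the JSON text and convert the string keys back to floats."""
    return {float(start): method for start, method in json.loads(text).items()}


# Made-up example values, not real trial data.
example = {0.0: "original", 3.25: "wav2vec"}
round_tripped = load_methods(dump_methods(example))
```
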
____________________________________________________

## Notes
For a more in-depth explanation of the methods used in this package, and the reasoning behind them, refer to the paper.

## Authors

- **James Willian Stonebridge**
- **Herbert Alexander de Bruyn**
- **Tyler Dierckman**

## License

This project is licensed under the MIT License.

## Acknowledgments

- Varun implemented the original code for the Wav2Vec2-based transcription used in this package.
