textgrid-tools
Command-line interface (CLI) to modify TextGrids and their corresponding audio files.
Features
- grids
  - merge: merge grids together
  - plot-durations: plot durations
  - mark-durations: mark intervals with specific durations with a text
  - create-dictionary: create pronunciation dictionary out of a word and a pronunciation tier
  - plot-stats: plot statistics
  - export-vocabulary: export vocabulary out of multiple grid files
  - export-marks: exports marks of a tier to a file
  - export-durations: exports durations of grids to a file
  - export-paths: exports grid paths to a file
  - export-audio-paths: exports audio paths to a file
  - import-paths: import grids from paths written in a file
  - import-audio-paths: import audio files from paths written in a file
- grid
  - create: convert text files to grid files
  - sync: synchronize grid minTime and maxTime according to the corresponding audio file
  - split: split a grid file on intervals into multiple grid files (incl. audio files)
  - print-stats: print statistics
- tiers
  - apply-mapping: apply mapping table to marks
  - transcribe: transcribe words of tiers using a pronunciation dictionary
  - remove: remove tiers
- tier
  - rename: rename tier
  - clone: clone tier
  - map: map tier to other tiers
  - move: move tier to another position
  - export: export content of tier to a txt file
  - import: import content of tier from a txt file
- intervals
  - join: join adjacent intervals
  - join-between-marks: join intervals between marks
  - join-by-boundary: join intervals by boundaries of a tier
  - join-by-duration: join intervals by a duration
  - join-marks: join intervals containing specific marks
  - join-symbols: join intervals containing specific symbols
  - join-template: join intervals according to a template
  - split: split intervals
  - fix-boundaries: align boundaries of tiers according to a reference tier
  - remove: remove intervals
  - plot-durations: plot durations
  - replace-text: replace text using regex pattern
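The `sync` command listed under `grid` aligns a grid's minTime and maxTime with its audio file. As a rough illustration of the audio side only, the sketch below reads the duration of a PCM .wav file with Python's standard `wave` module. It is not the tool's own implementation, and the function name `wav_duration_seconds` is made up for this example.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path: Path) -> float:
    """Return the duration of a PCM .wav file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        n_frames = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
    # Duration is the number of sample frames divided by frames per second.
    return n_frames / sample_rate
```

A grid that is in sync with its recording would be expected to have a maxTime close to this value.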
Roadmap
- Performance improvement
- Adding more tests
Installation
pip install textgrid-tools --user
Usage
usage: textgrid-tools-cli [-h] [-v] {grids,grid,tiers,tier,intervals} ...
This program provides methods to modify TextGrids (.TextGrid) and their corresponding audio files (.wav).
positional arguments:
{grids,grid,tiers,tier,intervals} description
grids execute commands targeted at multiple grids at once
grid execute commands targeted at single grids
tiers execute commands targeted at multiple tiers at once
tier execute commands targeted at single tiers
intervals execute commands targeted at intervals of tiers
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
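When the tool is driven from a script, the help text of a command group is a convenient starting point for discovering its commands. The sketch below only builds and runs a help call. The helper `help_command` is hypothetical, and it assumes that each command group accepts `--help` itself, which is the default behaviour of argparse sub-parsers but is not shown in the usage text above.

```python
import shutil
import subprocess

# Command groups as listed in the usage text above.
GROUPS = ("grids", "grid", "tiers", "tier", "intervals")


def help_command(group=None):
    """Build the argv that asks the CLI (or one command group) for its help text."""
    if group is not None and group not in GROUPS:
        raise ValueError(f"unknown command group: {group!r}")
    argv = ["textgrid-tools-cli"]
    if group is not None:
        argv.append(group)
    argv.append("--help")
    return argv


if __name__ == "__main__":
    # Only call the tool if it is actually installed and on PATH.
    if shutil.which("textgrid-tools-cli") is None:
        print("textgrid-tools-cli not found; install it with pip first")
    else:
        subprocess.run(help_command("grids"), check=True)
```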
Dependencies
- numpy>=1.18.5
- scipy>=1.8.0
- tqdm>=4.63.0
- TextGrid>=1.5
- pandas>=1.4.0
- ordered_set>=4.1.0
- matplotlib>=3.5.0
- pronunciation_dictionary>=0.0.5
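To see whether an existing environment already satisfies these minimum versions, a small check along the following lines can help. It is a simplified sketch: the comparison only looks at the leading numeric parts of each version and is not a full PEP 440 implementation, the helper names are made up for this example, and the distribution names are assumed to match the ones listed above.

```python
import re
from importlib import metadata

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "numpy": "1.18.5",
    "scipy": "1.8.0",
    "tqdm": "4.63.0",
    "TextGrid": "1.5",
    "pandas": "1.4.0",
    "ordered_set": "4.1.0",
    "matplotlib": "3.5.0",
    "pronunciation_dictionary": "0.0.5",
}


def version_tuple(version, width=4):
    """Turn '1.18.5' into (1, 18, 5, 0); suffixes such as 'rc1' are ignored."""
    numbers = []
    for part in version.split(".")[:width]:
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    # Pad with zeros so that '1.5' and '1.5.0' compare as equal.
    return tuple(numbers + [0] * (width - len(numbers)))


def check(minimums):
    """Map each package name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        is_ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = "ok" if is_ok else f"too old ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check(MINIMUMS).items():
        print(f"{name}: {status}")
```

In practice `pip` resolves these requirements during installation, so a check like this is mainly useful for diagnosing an environment that was assembled by hand.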
Contributing
If you notice an error, please don't hesitate to open an issue.
Development setup
# update
sudo apt update
# install Python 3.8, 3.9, 3.10 & 3.11 for ensuring that tests can be run
sudo apt install python3-pip \
python3.8 python3.8-dev python3.8-distutils python3.8-venv \
python3.9 python3.9-dev python3.9-distutils python3.9-venv \
python3.10 python3.10-dev python3.10-distutils python3.10-venv \
python3.11 python3.11-dev python3.11-distutils python3.11-venv
# install pipenv for creation of virtual environments
python3.8 -m pip install pipenv --user
# check out repo
git clone https://github.com/stefantaubert/textgrid-ipa.git
cd textgrid-ipa
# create virtual environment
python3.8 -m pipenv install --dev
Running the tests
# first install the tool as described in "Development setup"
# then, navigate into the directory of the repo (if not already done)
cd textgrid-ipa
# activate environment
python3.8 -m pipenv shell
# run tests
tox
Final lines of test result output:
py38: commands succeeded
py39: commands succeeded
py310: commands succeeded
py311: commands succeeded
congratulations :)
Troubleshooting
If recordings/audio files are not in .wav format, they need to be converted first, e.g.:
sudo apt install ffmpeg -y
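# e.g., mp3 to wav conversion (one ffmpeg call per file, since ffmpeg does not expand *.wav as an output pattern)
for f in *.mp3; do ffmpeg -i "$f" -acodec pcm_s16le -ar 22050 "${f%.mp3}.wav"; done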
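The same conversion can be scripted in Python, which may be handy when the files sit in a folder that is processed as part of a larger pipeline. This is a minimal sketch that assumes `ffmpeg` is on the PATH; the helper names `ffmpeg_command` and `convert_folder` are made up for this example, and the codec and sample rate are taken from the command above.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(source: Path, sample_rate: int = 22050) -> list:
    """Build the ffmpeg call that converts one audio file to 16-bit PCM .wav."""
    target = source.with_suffix(".wav")
    return [
        "ffmpeg", "-i", str(source),
        "-acodec", "pcm_s16le",
        "-ar", str(sample_rate),
        str(target),
    ]


def convert_folder(folder: Path) -> None:
    """Convert every .mp3 file in `folder`, one ffmpeg call per file."""
    for source in sorted(folder.glob("*.mp3")):
        # check=True raises CalledProcessError if ffmpeg reports a failure.
        subprocess.run(ffmpeg_command(source), check=True)
```

Calling `convert_folder(Path("recordings"))` would write each .wav file next to its .mp3 source; the folder name here is only a placeholder.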
License
MIT License
Acknowledgments
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410
Citation
If you want to cite this repo, you can use the BibTeX entry generated by GitHub (see About => Cite this repository).
Changelog
- v0.0.8 (2023-05-30)
  - Fixed:
    - Bugfix `intervals remove` copying on different in/out-locations
    - Bugfix `import-paths` and `import-audio-paths`: option `--symlink` is now creating symbolic links instead of hard links
  - Changed:
    - Improved logging in `import-paths` and `import-audio-paths`
    - Improved logging of durations in `grids plot-stats`
  - Added:
    - Added option to get durations from audio files on `grids export-durations`
- v0.0.7 (2023-01-12)
  - Fixed:
    - Bugfix `grids import-paths` and `grids import-audio-paths`
  - Added:
    - Added option `--ignore` to ignore custom marks in `grids export-vocabulary`
    - Added option `--mode` to `intervals replace-text` to replace text on different interval positions
    - Added returning of an exit code
  - Removed:
    - Removed `tiers mark-silence` because `grids mark-durations` should be used
    - Removed `tiers remove-symbols` because `intervals replace-text` should be used
    - Removed `intervals join-between-pauses` because `join-between-marks` should be used
- v0.0.6 (2022-12-23)
  - improved validation for pronunciation dictionary creation
  - bugfix replace text logging
  - added intervals join-template
  - support Python 3.11
  - update pylint config
  - fix description of grid/audio import
- v0.0.5 (2022-11-25)
  - `intervals remove`: added parameter `mode` to better choose which intervals should be removed
  - Added method to plot statistics for all grids together
  - `tiers transcribe`: added option `assign-mark-to-missing` to replace missing transcriptions with a custom mark
  - Bugfix: `mark-durations` empty couldn't be assigned
  - Added `--min-count` to `mark-durations`
  - Improved sorting of phonemes in durations plotting
  - Changed marks exporting format to only contain tier marks
  - Added exporting/importing of audio paths
  - Added durations exporting
  - Added exporting/importing of grid paths
  - Added replacement of marks using regex pattern
  - Added `--dry` option to most methods
  - Make split symbol on split mandatory
  - Upper-cased metavars
- v0.0.4 (2022-06-09)
  - fixed bug while saving TextGrids
  - improved robustness against file system errors
- v0.0.3 (2022-05-31)
  - fixed invalid installation format and clarified dependencies
  - adjusted textgrid serialization equal to praat output
  - added option `include-empty` on vocabulary export
  - set default chunksize to `1`
  - added missing `__init__.py` files
  - improved logging
- v0.0.2 (2022-05-06)
  - improved logging
  - improved reading/saving speed of TextGrids
  - removed n_digits argument
  - added option to define encoding of TextGrids
  - added option to insert interval between grids which should be merged together
  - removed tier copy
  - added parser for tier export
- v0.0.1 (2022-04-29)
  - initial release