dict-from-pypinyin
Command-line interface (CLI) to create a pronunciation dictionary by looking up pinyin transcriptions using pypinyin, with options to ignore punctuation and to split words on hyphens before transcribing them.
Installation
pip install dict-from-pypinyin --user
Usage
dict-from-pypinyin-cli
Example
# Create example vocabulary
cat > /tmp/vocabulary.txt << EOF
社会语言学?
㐻,
『㑐
鲜-亮。
『占斌?
『机具-机呀?
EOF
# Create dictionary from vocabulary
dict-from-pypinyin-cli \
/tmp/vocabulary.txt \
/tmp/result.dict \
--split-on-hyphen
cat /tmp/result.dict
Output:
社会语言学? shè huì yǔ yán xué ?
社会语言学? shè huì yǔ yàn xué ?
社会语言学? shè huì yǔ yín xué ?
社会语言学? shè huì yù yán xué ?
社会语言学? shè huì yù yàn xué ?
社会语言学? shè huì yù yín xué ?
社会语言学? shè kuài yǔ yán xué ?
社会语言学? shè kuài yǔ yàn xué ?
社会语言学? shè kuài yǔ yín xué ?
社会语言学? shè kuài yù yán xué ?
社会语言学? shè kuài yù yàn xué ?
社会语言学? shè kuài yù yín xué ?
㐻, nèi ,
『㑐 『 shū
鲜-亮。 xiān - liàng 。
鲜-亮。 xiān - liáng 。
鲜-亮。 xiǎn - liàng 。
鲜-亮。 xiǎn - liáng 。
『占斌? 『 zhàn bīn ?
『占斌? 『 zhān bīn ?
『占斌? 『 tiē bīn ?
『机具-机呀? 『 jī jù - jī ya ?
『机具-机呀? 『 jī jù - jī yā ?
『机具-机呀? 『 jī jù - jī xiā ?
『机具-机呀? 『 jī jù - wèi ya ?
『机具-机呀? 『 jī jù - wèi yā ?
『机具-机呀? 『 jī jù - wèi xiā ?
『机具-机呀? 『 wèi jù - jī ya ?
『机具-机呀? 『 wèi jù - jī yā ?
『机具-机呀? 『 wèi jù - jī xiā ?
『机具-机呀? 『 wèi jù - wèi ya ?
『机具-机呀? 『 wèi jù - wèi yā ?
『机具-机呀? 『 wèi jù - wèi xiā ?
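The number of lines per word in the output above is consistent with taking every combination of the candidate readings of each character. The following Python snippet is a minimal sketch of that idea, not the tool's actual implementation: the READINGS table is hard-coded from the example output (a real run would obtain the candidates from pypinyin), and the helper name enumerate_pronunciations is hypothetical.

```python
from itertools import product
from typing import List

# Candidate readings copied from the example output above.
# Assumption: a real run would look these up with pypinyin instead.
READINGS = {
    "鲜": ["xiān", "xiǎn"],
    "亮": ["liàng", "liáng"],
    "机": ["jī", "wèi"],
    "具": ["jù"],
    "呀": ["ya", "yā", "xiā"],
}


def enumerate_pronunciations(word: str) -> List[str]:
    """Return every combination of readings for a hyphenated word.

    Hypothetical helper: splits the word on hyphens, collects the
    candidate readings of each character, and builds the Cartesian
    product of all candidates.
    """
    slots = []
    for index, part in enumerate(word.split("-")):
        if index > 0:
            # Keep the hyphen as a slot with a single candidate.
            slots.append(["-"])
        slots.extend(READINGS[char] for char in part)
    return [" ".join(combo) for combo in product(*slots)]


for line in enumerate_pronunciations("鲜-亮"):
    print(line)
# xiān - liàng
# xiān - liáng
# xiǎn - liàng
# xiǎn - liáng
```

With this table, enumerate_pronunciations("机具-机呀") yields 2 x 1 x 2 x 3 = 12 combinations, which matches the twelve entries listed for that word. Leading and trailing punctuation is left out of the sketch for brevity.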
Development setup
# update
sudo apt update
# install Python 3.8, 3.9, 3.10, 3.11 & 3.12 to ensure that the tests can be run
sudo apt install python3-pip \
python3.8 python3.8-dev python3.8-distutils python3.8-venv \
python3.9 python3.9-dev python3.9-distutils python3.9-venv \
python3.10 python3.10-dev python3.10-distutils python3.10-venv \
python3.11 python3.11-dev python3.11-distutils python3.11-venv \
python3.12 python3.12-dev python3.12-distutils python3.12-venv
# install pipenv for creation of virtual environments
python3.8 -m pip install pipenv --user
# check out repo
git clone https://github.com/stefantaubert/dict-from-pypinyin.git
cd dict-from-pypinyin
# create virtual environment
python3.8 -m pipenv install --dev
Running the tests
# first install the tool as described in "Development setup"
# then, navigate into the directory of the repo (if not already done)
cd dict-from-pypinyin
# activate environment
python3.8 -m pipenv shell
# run tests
tox
Final lines of test result output:
py38: commands succeeded
py39: commands succeeded
py310: commands succeeded
py311: commands succeeded
py312: commands succeeded
congratulations :)
License
MIT License
Acknowledgments
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410
Citation
If you want to cite this repo, you can use the BibTeX entry generated by GitHub (see About => Cite this repository) or the following reference:
Taubert, S. (2024). dict-from-pypinyin (Version 0.0.2) [Computer software]. https://doi.org/10.5281/zenodo.10554720