
g2p_fa

A Persian Grapheme to Phoneme (G2P) model using an LSTM, implemented in PyTorch

Installation

pip install g2p_fa

Usage

>>> from g2p_fa import G2P_Fa
>>> g2p = G2P_Fa()
>>> g2p('سلام')
'sælɒːm'
>>> g2p('طلا')
'tʰælɒː'
>>> g2p('تلاش')
'tʰælɒːʃ'

Training

Create a CSV file with the Persian word in the first column and its IPA transcription in the second column. For example:

ابتکار,ʔebtʰekʰɒːɾ
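Such a file can be generated with Python's standard-library csv module; the word/IPA pairs below are the examples shown earlier on this page:

```python
import csv

# (Persian word, IPA) pairs; these entries come from the examples above
pairs = [
    ("ابتکار", "ʔebtʰekʰɒːɾ"),
    ("سلام", "sælɒːm"),
    ("طلا", "tʰælɒː"),
    ("تلاش", "tʰælɒːʃ"),
]

with open("data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    # No header row: first column is Persian text, second is IPA
    writer.writerows(pairs)
```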

To change the hyperparameters, you can pass them as a dict to the model:

>>> hp = {
    'INPUT_DIM' : 41,
    'OUTPUT_DIM' : 33,
    'ENC_EMB_DIM' : 16,
    'DEC_EMB_DIM' : 16,
    'HID_DIM' : 128,
    'N_LAYERS' : 2,
    'ENC_DROPOUT' : 0.5,
    'DEC_DROPOUT' : 0.5
}
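If INPUT_DIM and OUTPUT_DIM correspond to the grapheme and phoneme vocabulary sizes (an assumption — verify against the model code; sequence-to-sequence models typically also reserve a few slots for special tokens such as padding and start/end markers), you could estimate them from your CSV with a sketch like this:

```python
import csv

def vocab_sizes(path, n_special=4):
    # Assumption: INPUT_DIM / OUTPUT_DIM are character-vocabulary sizes plus
    # n_special reserved tokens (e.g. pad/sos/eos/unk). Check the actual
    # g2p_fa model code before relying on these numbers.
    graphemes, phonemes = set(), set()
    with open(path, encoding="utf-8") as f:
        for word, ipa in csv.reader(f):
            graphemes.update(word)   # individual Persian characters
            phonemes.update(ipa)     # individual IPA symbols
    return len(graphemes) + n_special, len(phonemes) + n_special
```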

Then create an instance of G2P_Fa without loading a checkpoint:

>>> from g2p_fa import G2P_Fa
>>> g2p = G2P_Fa(checkpoint=None, hparams=hp)

Then train the model on the CSV file:

>>> g2p.train('data.csv', epoch=20)
len train: 18968, len valid: 4743
initial loss: 3.5005286693573
Epoch 1 / 20    Train Loss: 3.264, Valid loss: 2.996
Epoch 2 / 20    Train Loss: 2.937, Valid loss: 2.898
Epoch 3 / 20    Train Loss: 2.851, Valid loss: 2.828
Epoch 4 / 20    Train Loss: 2.768, Valid loss: 2.790
Epoch 5 / 20    Train Loss: 2.664, Valid loss: 2.836
Epoch 6 / 20    Train Loss: 2.579, Valid loss: 2.855
Epoch 7 / 20    Train Loss: 2.573, Valid loss: 2.820
Epoch 8 / 20    Train Loss: 2.510, Valid loss: 2.865
Epoch 9 / 20    Train Loss: 2.491, Valid loss: 2.849
Epoch 10 / 20   Train Loss: 2.417, Valid loss: 2.837
Epoch 11 / 20   Train Loss: 2.421, Valid loss: 2.817
Epoch 12 / 20   Train Loss: 2.370, Valid loss: 2.884
Epoch 13 / 20   Train Loss: 2.350, Valid loss: 2.872
Epoch 14 / 20   Train Loss: 2.318, Valid loss: 2.797
Epoch 15 / 20   Train Loss: 2.317, Valid loss: 2.653
Epoch 16 / 20   Train Loss: 2.316, Valid loss: 2.634
Epoch 17 / 20   Train Loss: 2.292, Valid loss: 2.629
Epoch 18 / 20   Train Loss: 2.215, Valid loss: 2.709
Epoch 19 / 20   Train Loss: 2.208, Valid loss: 2.581
Epoch 20 / 20   Train Loss: 2.182, Valid loss: 2.568

Then you can save the model:

>>> g2p.save('SAVE_PATH')

To use a saved model later, pass its checkpoint path (along with the same hyperparameters):

>>> g2p = G2P_Fa(checkpoint='SAVE_PATH', hparams=hp)
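When converting a large corpus, the same word often recurs; since the model is just a word-to-IPA callable, repeated lookups can be memoized with the standard library. A minimal sketch (works with any callable, including a G2P_Fa instance):

```python
from functools import lru_cache

def make_cached_g2p(g2p, maxsize=10_000):
    # Wrap any word -> IPA callable so each distinct word is only
    # run through the network once; repeats hit the cache.
    @lru_cache(maxsize=maxsize)
    def convert(word):
        return g2p(word)
    return convert
```

Usage would then be `convert = make_cached_g2p(g2p)` followed by `[convert(w) for w in words]`.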
