Phonetic transcription of Spanish

License: LGPL Version: 2.1.00 Python versions: 3.5, 3.6, 3.7, 3.8, 3.9

Fonemas

A Python phonological transcription library for Spanish

fonemas is a Python library of methods and functions for the phonological and phonetic transcription of Spanish words.

This library is part of the research project Sound and Meaning in Spanish Golden Age Literature. It was originally intended to analyse only the phonological features relevant to verse scansion. It has since expanded into a fully featured phonological and phonetic analyser with IPA and SAMPA support.

Installation

pip3 install fonemas

Use

The library provides the class Transcription(sentence, mono, epenthesis, aspiration, rehash, sampastr). The class takes the obligatory argument sentence, a string containing one or more Spanish words. It optionally takes the Boolean arguments mono, epenthesis and aspiration, which default to False, as well as rehash and sampastr.

  • mono sets whether the output shows graphic stresses for monosyllabic words

  • epenthesis sets the behaviour of S before a consonant in the onset (spiritu -> es pi ri tu | spi ri tu)

  • aspiration inserts an aspiration modifier 'ʰ' in onset. This may be useful when dealing with ambiguous verses in classic poetry to choose which synaloepha to break.

  • rehash moves the last consonant of a word's final-syllable coda to the onset of the next word's first syllable if that word begins with a vowel.

  • sampastr allows an alternative stress symbol, as '"', to prevent issues, e.g. when using the output in a CSV file.
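
The resyllabification that rehash describes can be pictured with a toy function. This is only a sketch under simplifying assumptions, not the library's implementation: the name resyllabify, the five-vowel inventory and the nested-list input format are all hypothetical, chosen for illustration.

```python
VOWELS = set("aeiou")  # assumption: a plain five-vowel inventory
STRESS = "ˈ"           # IPA primary stress mark, placed before the syllable


def resyllabify(words):
    """Move a word-final coda consonant to the next word's onset.

    `words` is a list of words, each given as a list of syllable strings.
    Hypothetical helper: it mimics the behaviour described for `rehash`,
    it is not the code fonemas uses.
    """
    result = [list(word) for word in words]  # copy, leave the input untouched
    for current, following in zip(result, result[1:]):
        last, first = current[-1], following[0]
        # Keep any stress mark in front of the relocated consonant.
        mark = STRESS if first.startswith(STRESS) else ""
        rest = first[len(mark):]
        ends_in_consonant = len(last) > 1 and last[-1] not in VOWELS
        if ends_in_consonant and rest and rest[0] in VOWELS:
            current[-1] = last[:-1]
            following[0] = mark + last[-1] + rest
    return result


# 'los ojos': the final /s/ of 'los' becomes the onset of the next syllable.
print(resyllabify([["los"], ["ˈo", "xos"]]))  # [['lo'], ['ˈso', 'xos']]
```

A following word that begins with a consonant is left unchanged, so [['la'], ['ˈka', 'sa']] comes back as it went in.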

The class Transcription has three dataclass attributes, each with two attributes of its own, words and syllables. Each of these contains a list of strings: words or syllables, respectively.

  • phonology for the phonological transcription (requires Unicode support).

  • phonetics for the phonetic transcription in IPA symbols (requires Unicode support).

  • sampa for the phonetic transcription in SAMPA transliteration.

>>> from fonemas import Transcription
>>> a = Transcription('Averigüéis')
>>> a.phonology.words
['abeɾiˈgwejs']
>>> a.phonology.syllables
['a', 'be', 'ɾi', 'ˈgwejs']
>>> a.phonetics.words
['aβeɾiˈɣwejs']
>>> a.phonetics.syllables
['a', 'be', 'ɾi', 'ˈɣwejs']
>>> a.sampa.words
['aBeri"Gwejs']
>>> a.sampa.syllables
['a', 'Be', 'ri', '"Gwejs']
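
The SAMPA stress symbol '"' in the output above is also the default quote character of CSV files, which is the kind of issue sampastr is meant to avoid. The sketch below uses only the standard csv module and the SAMPA string from the example. The replacement symbol '*' and the helper to_csv_line are arbitrary assumptions, and str.replace merely stands in for whatever the library does internally.

```python
import csv
import io


def to_csv_line(row):
    """Serialise one row with the csv module's default dialect."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue().rstrip("\r\n")


sampa_word = 'aBeri"Gwejs'  # SAMPA output copied from the example above

# With the default symbol the field has to be quoted and the '"' doubled.
default_line = to_csv_line(["averigüéis", sampa_word])
print(default_line)  # averigüéis,"aBeri""Gwejs"

# With a different stress symbol (here '*', an arbitrary choice) no
# quoting is needed. str.replace only imitates an alternative symbol.
alt_line = to_csv_line(["averigüéis", sampa_word.replace('"', "*")])
print(alt_line)  # averigüéis,aBeri*Gwejs
```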

Description

The transcription is done according to the Spanish phonology and phonotactics described by Quilis (2019).

Known issues

The phonetic transcription lacks the allophones that IPA represents with diacritics. These require two-character sequences, which need a workaround to be evaluated. For now this can be handled with hacks for 'special cases', which I will use until I figure out a general solution.

Languages other than Spanish that share its spelling but follow different prosodic rules will cause problems, e.g. Lat. 'amor', 'amabor', 'amabar', 'amer' vs Sp. 'amor', 'labor', 'acabar', 'temer'.
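
The mismatch can be illustrated with the default stress rule of Spanish spelling: a word without a written accent is stressed on the penultimate syllable if it ends in a vowel, n or s, and on the final syllable otherwise. The function below is a minimal sketch of that rule only. The name default_stress_index and the pre-syllabified input are assumptions, and this is not how fonemas itself assigns stress.

```python
def default_stress_index(syllables):
    """Index of the stressed syllable by the Spanish spelling rule.

    Hypothetical helper: it assumes the word is already syllabified
    and carries no written accent.
    """
    if len(syllables) == 1:
        return 0
    if syllables[-1][-1] in "aeiouns":
        return len(syllables) - 2  # penultimate stress
    return len(syllables) - 1      # final stress


# Spanish 'labor' is correctly stressed on the final syllable.
print(default_stress_index(["la", "bor"]))        # 1
# Latin 'amabor' gets final stress too, although Latin stresses 'ma'.
print(default_stress_index(["a", "ma", "bor"]))   # 2
```

Because the rule only looks at spelling, a Latin form such as 'amabor' is treated exactly like Spanish 'labor', even though Latin does not normally stress the final syllable.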

Contributions

Feel free to contribute using the GitHub Issue Tracker for feedback, suggestions, or bug reports.

How to cite fonemas

Authors of scientific papers including results generated using fonemas are encouraged to cite the following paper.

@article{SanzLazaroF_RHD2023, 
    author    = {Sanz-Lázaro, Fernando},
    title     = {Del fonema al verso: una caja de herramientas digitales de escansión teatral},
    volume    = {8},
    journal   = {Revista de Humanidades Digitales},
    doi       = {https://doi.org/10.5944/rhd.vol.8.2023.37830},
    pages     = {74--89},
    langid    = {Spanish},
}

Changelog

  • 2.1.0

    • ChatGPT optimisation and documentation
  • 2.0.20.1

    • Simplified coarticulations with regex
  • 2.0.20

    • Solved some issues with nasal coarticulation
  • 2.0.19

    • Solved stops/affricates alternation after space, hyphen, stress mark, or beginning of line.
  • 2.0.18

    • Solved diphthongs contradicting the perceptibility scale.
  • 2.0.17

    • hie -> ʝe
  • 2.0.16

    • Isolated consonants

Copyright

Copyright (C) 2022 Fernando Sanz-Lázaro <fsanzl@gmail.com>

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this library. If not, see <https://www.gnu.org/licenses/>.

References

Quilis, Antonio, Tratado de fonología y fonética españolas. Madrid, Gredos, 2019.
