The grammar analyzer for Turkish

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

Dizge: The grammar analyzer for Turkish

Version License

Introduction
How to Use
2.1. Installation and First Run
2.2. competence Functions
2.3. tools Functions
2.4. Data Loading
2.5. Analysis and Getting Result
Contact

Introduction

Dizge is an open-source Python project for linguistic analysis of Turkish data. The project consists of two modules:

competence: It represents the linguistic knowledge of an ideal native speaker, and it's programmed in the Object Oriented Programming (OOP) principles.
tools: It includes rule-based functions representing the linguistic processes.

The current version (v0.1.x) of the project has a few phonetic/phonological functions, as a pre-version.

For the theoretical background of Dizge, the following resources have benefitted from:

Ergenç, İ., & Uzun, İ. P. (2020). Türkçenin ses dizgesi (2nd ed.). Seçkin Yayınevi.
IPA Chart, http://www.internationalphoneticassociation.org/content/ipa-chart, available under a Creative Commons Attribution-Sharealike 3.0 Unported License. Copyright © 2018 International Phonetic Association.

You can cite our project in your publications:

Mutlu, M. U., Yetimaslan, N., & Atagün, İ. (2021). Dizge: The grammar analyzer for Turkish. https://github.com/dizge

Also, Dizge has a no-code web service on https://dizge.pythonanywhere.com.

How to Use

Installation and First Run

You can use the pip tool to install the package to your system with the following pip install dizge command. Then, it's easy to call it in any Python script:

import dizge

`competence` Functions

Dizge has all the Turkish phonemes and their linguistic features. For example, you can easily get a list of Turkish vowels and consonants:

for i in [vars(phoneme) for phoneme in dizge.vowels]:
    print(i)

for i in [vars(phoneme) for phoneme in dizge.consonants]:
    print(i)

Similarly, you can check if any phonemes have a feature with the help of functions in competence. It's enough to give the grapheme of any phonemes as string inputs to the following functions:

isVowel(): If a phoneme is a vowel, it'll return True; otherwise, it'll return False.
isUnrounded(): If a phoneme is an unrounded vowel, it'll return True; otherwise, it'll return False.
isRounded(): If a phoneme is a rounded vowel, it'll return True; otherwise, it'll return False.
isClose(): If a phoneme is a closed vowel, it'll return True; otherwise, it'll return False.
isCloseMid(): If a phoneme is a mid-closed vowel, it'll return True; otherwise, it'll return False.
isOpenMid(): If a phoneme is a mid-open vowel, it'll return True; otherwise, it'll return False.
isOpen(): If a phoneme is an open vowel, it'll return True; otherwise, it'll return False.
isFront(): If a phoneme is a front vowel, it'll return True; otherwise, it'll return False.
isCentral(): If a phoneme is a central vowel, it'll return True; otherwise, it'll return False.
isBack(): If a phoneme is a back vowel, it'll return True; otherwise, it'll return False.
isConsonant(): If a phoneme is a consonant, it'll return True; otherwise, it'll return False.
isPlosive(): If a phoneme is a plosive consonant, it'll return True; otherwise, it'll return False.
isNasal(): If a phoneme is a nasal consonant, it'll return True; otherwise, it'll return False.
isTrill(): If a phoneme is a trilled consonant, it'll return True; otherwise, it'll return False.
isTaporFlap(): If a phoneme is a tapped or flapped consonant, it'll return True; otherwise, it'll return False.
isFricative(): If a phoneme is a fricative consonant, it'll return True; otherwise, it'll return False.
isLateralFricative(): If a phoneme is a lateral fricative consonant, it'll return True; otherwise, it'll return False.
isApproximant(): If a phoneme is an approximantal consonant, it'll return True; otherwise, it'll return False.
isLateralApproximant(): If a phoneme is a lateral approximantal consonant, it'll return True; otherwise, it'll return False.
isBilabial(): If a phoneme is a bilabial consonant, it'll return True; otherwise, it'll return False.
isLabiodental(): If a phoneme is a labiodental consonant, it'll return True; otherwise, it'll return False.
isDental(): If a phoneme is a dental consonant, it'll return True; otherwise, it'll return False.
isAlveolar(): If a phoneme is an alveolar consonant, it'll return True; otherwise, it'll return False.
isPostalveolar(): If a phoneme is a postalveolar consonant, it'll return True; otherwise, it'll return False.
isRetroflex(): If a phoneme is a retroflex consonant, it'll return True; otherwise, it'll return False.
isPalatal(): If a phoneme is a palatal consonant, it'll return True; otherwise, it'll return False.
isVelar(): If a phoneme is a velar consonant, it'll return True; otherwise, it'll return False.
isUvular(): If a phoneme is an ulvular consonant, it'll return True; otherwise, it'll return False.
isPharyngeal(): If a phoneme is a pharyngeal consonant, it'll return True; otherwise, it'll return False.
isGlottal(): If a phoneme is a glottal consonant, it'll return True; otherwise, it'll return False.
isVoiced(): If a phoneme is voiced, it'll return True; otherwise, it'll return False.
isVoiceless(): If a phoneme is voiceless, it'll return True; otherwise, it'll return False.

`tools` Functions

In this module, there are some functions of linguictic processes, like transcription (Grapheme-to-Phoneme or G2P), syllablization, etc.

First, the softG() function shows the effects, such as vowel shifting or lengthening, of the <ğ> grapheme without a phoneme value:

>>> dizge.softG('dağ')
'daː'
>>> dizge.softG('göğüs')
'göːüs'
>>> dizge.softG('eğlence')
'eylence'

We provide two options for syllabilzation analysis. First of them is the orthography-based analysis, as the traditional method. syllable_o() works in this way.

>>> dizge.syllable_o('afyonkarahisarlılaştıramadıklarımızdanmışçasına')
['af', 'yon', 'ka', 'ra', 'hi', 'sar', 'lı', 'laş', 'tı', 'ra', 'ma', 'dık', 'la', 'rı', 'mız', 'dan', 'mış', 'ça', 'sı', 'na']

However, if you care about the phonetic occurences during the syllablization, you can use the syllable_p() function.

>>> dizge.syllable_p('afyonkarahisarlılaştıramadıklarımızdanmışçasına')
[('ɑf', 'VC'), ('jɔŋ', 'CVC'), ('kɑ', 'CV'), ('ɾɑ', 'CV'), ('çI', 'CV'), ('sɑɾ', 'CVC'), ('łɨ', 'CV'), ('łɑʃ', 'CVC'), ('tɨ', 'CV'), ('ɾɑ', 'CV'), ('mɑ', 'CV'), ('dɨk', 'CVC'), ('łɑ', 'CV'), ('ɾɨ', 'CV'), ('mɨz', 'CVC'), ('dɑn', 'CVC'), ('mɨʃ', 'CVC'), ('tʃɑ', 'CV'), ('sɨ', 'CV'), ('nɑ', 'CV')]

Also, you can get a comprehensive calculation of the analysis with the countSyllable() function:

>>> dizge.countSyllable('afyonkarahisarlılaştıramadıklarımızdanmışçasına')
{'VC': 1, 'CVC': 7, 'CV': 12}

You can analyze the vowel harmony of a word, as a signal of Turkish originated words, with the harmony() function. It checks e-type and i-type harmonies of the word and returns both in a tuple:

>>> dizge.harmony('ankara')
(True, True)

Last but not least, the g2p() function transcribes from graphemes to phonemes in a rule-based way:

>>> dizge.g2p('afyonkarahisarlılaştıramadıklarımızdanmışçasına')
'ɑfjɔŋkɑɾɑçIsɑɾłɨłɑʃtɨɾɑmɑdɨkłɑɾɨmɨzdɑnmɨʃtʃɑsɨnɑ'

DISCLAIMER:

Unlike our reference resources, we ignored the half-long occurences. It may be implemented in the upcoming updates.
Also, we provide more than one alternative for the examples our reference resources provide alternatives based on the spoken language. You can pick the one you need by indexing.

Data Loading

While using Dizge, you can work on tabular data in the xlsx and csv formats or plain-text data in the txt format. It's possible to use the 3rd party data processing libraries or built-in Python functions to load the data, based on your choice. Here's an example usage:

data = open('data/sample.txt', 'r', encoding="utf-8").read()

After loading the data, you can select the part you need to analyze, and you can standardize the data before starting the analysis. If you prefer to use the analyze() function we'll mention in the next section, the function already standardizes your data. Let's look at how to use the standardize() function:

words = dizge.standardize(data)

Analysis and Getting Result

You can use the analyze() function to analyze the data easily. It's enough to give two inputs, i.e.nthe data you work on, and the tools you want to use. That's all:

>>> tools = ["g2p", "syllable_o", "syllable_p", "countSyllable", "harmony"]
>>> result = dizge.analyze(words, tools)

Contact

For any requests about Dizge, you can use the following contact info:

Email: dizgenlp@gmail.com
X (formerly Twitter): @dizgenlp
Instagram: @dizgenlp

Additionally, please feel free to create issue or PRs (pull requests).

Project details

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

This version

0.1.6

Mar 23, 2024

0.1.5

Mar 10, 2024

0.1.4

Mar 5, 2022

0.1.3

Nov 10, 2021

0.1.2 yanked

Nov 10, 2021

Reason this release was yanked:

Issue

0.1.1

May 19, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dizge-0.1.6.tar.gz (15.6 kB view hashes)

Uploaded Mar 23, 2024 Source

Built Distribution

dizge-0.1.6-py3-none-any.whl (13.6 kB view hashes)

Uploaded Mar 23, 2024 Python 3

Hashes for dizge-0.1.6.tar.gz

Hashes for dizge-0.1.6.tar.gz
Algorithm	Hash digest
SHA256	`9d33ed6034718d409cd9ad955c9509b77e6b7f940352b41402bb093b56c338f6`
MD5	`e8e6b15e4721dd9ff8243ca7b3c1719c`
BLAKE2b-256	`a31fa0e3fc73bb0064ae8f8a675735a12df80be0ecd1f623b8cf0af420d77cd5`

Hashes for dizge-0.1.6-py3-none-any.whl

Hashes for dizge-0.1.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1be3a1b83d540642c1b6320d6c3d330d7edceb71579ea6fad6f8c98fb679d23f`
MD5	`0cff2519b1982b7c7b4eba6f329896a5`
BLAKE2b-256	`3e455483b7458ec80faf645e80cd91b285f4d3179c388ff75e4a1f18f7f30266`

dizge 0.1.6

Navigation

Verified details

Maintainers

Unverified details

Project links

GitHub Statistics

Meta

Classifiers

Project description

Dizge: The grammar analyzer for Turkish

Table of Contents

Introduction

How to Use

Installation and First Run

`competence` Functions

`tools` Functions

Data Loading

Analysis and Getting Result

Contact

Project details

Verified details

Maintainers

Unverified details

Project links

GitHub Statistics

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

dizge 0.1.6

Navigation

Verified details

Maintainers

Unverified details

Project links

GitHub Statistics

Meta

Classifiers

Project description

Dizge: The grammar analyzer for Turkish

Table of Contents

Introduction

How to Use

Installation and First Run

competence Functions

tools Functions

Data Loading

Analysis and Getting Result

Contact

Project details

Verified details

Maintainers

Unverified details

Project links

GitHub Statistics

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

`competence` Functions

`tools` Functions