Thai dependency parser.

Attaparse: Thai Dependency Parser

attaparse is a Thai dependency parser trained with Stanza. It uses PhayaThaiBERT as the base model during training. The released model corresponds to the Stanza*P configuration without a POS model, as described for the Thai Universal Dependency Treebank (TUD).

Contents

  1. Installation
  2. Usage

Installation

attaparse can be installed using pip:

pip install attaparse

Usage

Initialising

import attaparse
from attaparse import load_model, depparse

nlp = load_model()

Plain Text

Uses Stanza's default Thai tokeniser.

text = 'ฉันอยากกินข้าวที่แม่ทำ'

doc = depparse(text, nlp)

Pipe-Delimited Input

from attaparse import depparse_pipe_delimited

nlp = load_model(tokenize_pretokenized=True)
pipe_text = "ฉัน|รัก|เธอ"

doc = depparse_pipe_delimited(pipe_text, nlp)

Pre-tokenised List Input

from attaparse import depparse_pretokenized

nlp = load_model(tokenize_pretokenized=True)
tokens = [["ฉัน", "กิน", "ข้าว"]]

doc = depparse_pretokenized(tokens, nlp)
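
The pipe-delimited and list inputs both describe one sentence of already-segmented tokens. As a minimal sketch of how the two shapes relate, the helper below converts a pipe-delimited string into the nested-list form. The name `pipe_to_tokens` is hypothetical and not part of attaparse, and dropping empty pieces is an assumption made for this example, not documented library behaviour.

```python
def pipe_to_tokens(pipe_text: str) -> list[list[str]]:
    """Split a pipe-delimited sentence into a one-sentence nested token list.

    Hypothetical helper for illustration; not part of the attaparse API.
    """
    # Drop empty pieces caused by leading, trailing, or doubled pipes.
    tokens = [tok.strip() for tok in pipe_text.split("|") if tok.strip()]
    # One inner list per sentence, matching the shape of `tokens` above.
    return [tokens]


tokens = pipe_to_tokens("ฉัน|รัก|เธอ")
print(tokens)
```

The resulting list has the same shape as the `tokens` value passed to `depparse_pretokenized` above.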

Access the Results

print(f'\n{text}\n')
for sent in doc.sentences:
    for word in sent.words:
        # A head of 0 marks the root; otherwise it is the 1-based id of the head word.
        head = sent.words[word.head - 1].text if word.head > 0 else "root"
        print(f'id: {word.id}\tword: {word.text}\thead id: {word.head}\thead: {head}\tdeprel: {word.deprel}')
  • .id : the 1-based index of the word within its sentence.
  • .head : the id of the word's syntactic head; 0 means the word is the root.
  • .deprel : the dependency relation between the word and its head.
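
When the parse is needed as data instead of printed text, the same three attributes can be gathered into plain dictionaries. The sketch below is one possible way to do that; `dependency_rows` is a hypothetical helper, not part of attaparse. So that it runs without downloading a model, it is demonstrated on a stand-in document built from `SimpleNamespace` objects whose values are hand-written for illustration and are not parser output.

```python
from types import SimpleNamespace


def dependency_rows(doc):
    """Return one dict per word with its id, text, head id, head text, and deprel."""
    rows = []
    for sent in doc.sentences:
        for word in sent.words:
            # head == 0 marks the root; otherwise head is a 1-based word id.
            head_text = sent.words[word.head - 1].text if word.head > 0 else "root"
            rows.append({
                "id": word.id,
                "word": word.text,
                "head_id": word.head,
                "head": head_text,
                "deprel": word.deprel,
            })
    return rows


# Stand-in document with hand-written (not model-produced) values,
# mimicking only the attributes that dependency_rows reads.
words = [
    SimpleNamespace(id=1, text="ฉัน", head=2, deprel="nsubj"),
    SimpleNamespace(id=2, text="รัก", head=0, deprel="root"),
    SimpleNamespace(id=3, text="เธอ", head=2, deprel="obj"),
]
demo_doc = SimpleNamespace(sentences=[SimpleNamespace(words=words)])

for row in dependency_rows(demo_doc):
    print(row)
```

Passing a `doc` returned by `depparse` in place of `demo_doc` should work the same way, provided its words expose `.id`, `.text`, `.head`, and `.deprel` as listed above.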

Citation

If you use attaparse in your project or publication, please cite as follows:

Panyut Sriwirote, Wei Qi Leong, Charin Polpanumas, Santhawat Thanyawong, William Chandra Tjhi, Wirote Aroonmanakun, and Attapol T. Rutherford. 2025. The Thai Universal Dependency Treebank. Transactions of the Association for Computational Linguistics, 13:376–391.

BibTeX

@article{sriwirote-etal-2025-thai,
    title = "The {T}hai {U}niversal {D}ependency Treebank",
    author = "Sriwirote, Panyut  and
      Leong, Wei Qi  and
      Polpanumas, Charin  and
      Thanyawong, Santhawat  and
      Tjhi, William Chandra  and
      Aroonmanakun, Wirote  and
      Rutherford, Attapol T.",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "13",
    year = "2025",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/2025.tacl-1.18/",
    doi = "10.1162/tacl_a_00745",
    pages = "376--391"
}
