Thai person name classifier

nameseer ['name-seer'] - a Thai person name classifier

nameseer ('name-seer') is a Python library for Thai name classification. It determines whether a given name is a Thai person name or a Thai corporate name, using linguistic features within the name itself. nameseer ships with a pre-trained model trained on 700,000 company names and 700,000 person names; it reached 0.994 accuracy when tested on 400,000 unseen names.

nameseer is based on the name-ethnicity classifier originally proposed in:

Treeratpituk, Pucktada, and C. Lee Giles. "Name-ethnicity classification and ethnicity-sensitive name matching." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 26. No. 1. 2012.

Paper URL : https://ojs.aaai.org/index.php/AAAI/article/download/8324/8183
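The exact features nameseer extracts are not listed in this description. As a rough illustration of what "linguistic features within the name" could look like, the sketch below counts character n-grams with boundary markers. The function name `char_ngrams`, the `^`/`$` markers, and the n-gram range are assumptions made for this example, not nameseer's actual feature set.

```python
from collections import Counter


def char_ngrams(name, n_min=1, n_max=3):
    """Count character n-grams of a name (hypothetical helper, not part of nameseer).

    '^' and '$' mark the start and end of the name so that prefixes and
    suffixes become distinct features.
    """
    padded = "^" + name.strip() + "$"
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the padded name.
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts


# Thai strings are handled one code point at a time, so vowel and tone
# marks appear as separate characters inside the n-grams.
features = char_ngrams("แอดวานซ์อินโฟร์เซอร์วิส", n_min=2, n_max=3)
print(features.most_common(5))
```

Counts like these could be fed to any standard classifier. Whether nameseer uses n-grams, phonetic codes, or something else should be checked against the source code and the paper above.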

Requirements

  • abydos
  • scikit-learn
  • nltk
  • pythainlp
  • Python 3.7.6+

Installation

nameseer can be installed using pip:

pip install nameseer

Usage

Once installed, you can use nameseer in your Python code to classify whether a Thai name is a person name or a corporate name.

>>> from nameseer import NameClassifier

>>> nc = NameClassifier.load_pretrained_model()
>>> nc.classify_names(['ประยุทธ์ จันทร์โอชา','แอดวานซ์อินโฟร์เซอร์วิส'])
['person', 'company']
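Because `classify_names` takes a list of names and returns a list of labels in the same order, the results are easy to post-process. The following is a minimal sketch of grouping names by predicted label. The helper `group_by_label` is hypothetical and not part of nameseer; it accepts any callable with the same list-in, list-out shape, so it can be tried without loading the model.

```python
from collections import defaultdict


def group_by_label(names, classify_names):
    """Group names by the label a classifier assigns (hypothetical helper).

    `classify_names` is any callable that maps a list of names to a list of
    labels of the same length, such as the `classify_names` method of a
    loaded nameseer `NameClassifier`.
    """
    names = list(names)
    labels = classify_names(names)
    if len(labels) != len(names):
        raise ValueError("classifier returned a different number of labels than names")
    groups = defaultdict(list)
    for name, label in zip(names, labels):
        groups[label].append(name)
    return dict(groups)


# Usage with nameseer (requires the package to be installed):
#   from nameseer import NameClassifier
#   nc = NameClassifier.load_pretrained_model()
#   group_by_label(['ประยุทธ์ จันทร์โอชา', 'แอดวานซ์อินโฟร์เซอร์วิส'], nc.classify_names)
```

Passing the bound method rather than the classifier object keeps the helper independent of nameseer, which also makes it simple to test with a stub.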

Citation

Treeratpituk, Pucktada (2022). Nameseer: a Thai person name classifier. Mar 29, 2022. See https://github.com/botx/nameseer

Author

Pucktada Treeratpituk, Bank of Thailand (pucktadt@bot.or.th)

License

This project is licensed under the Apache Software License 2.0. See the LICENSE file for details.
