Skip to main content

Thai person name classifier

Project description

nameseer ['name-seer'] - a Thai person name classifier

nameseer ('name-seer') is a python library for Thai name classification. It can determine whether a given name is a Thai person name or a Thai corporate name. nameseer uses linguistics features within the name to determine whether the name is a Thai person name or not. nameseer comes with a pre-trained model that was trained on 700,000 company names and 700,000 person names, with 99.4% accuracy when tested on 400,000 unseen names.

nameseer is based on the name-ethnicity classifier, orginally proposed here:

Treeratpituk, Pucktada, and C. Lee Giles. "Name-ethnicity classification and ethnicity-sensitive name matching." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 26. No. 1. 2012.

Paper URL : https://ojs.aaai.org/index.php/AAAI/article/download/8324/8183

Requirements

  • abydos
  • scikit-learn
  • nltk
  • pythainlp
  • python >= 3.7.6

Installation

nameseer can be installed using pip

pip install nameseer

Usages

Once installed, you can use nameseer within your python code to classify whether a Thai name is a person name or a corporate name.

>>> from nameseer import NameClassifier

>>> nc = NameClassifier.load_pretrained_model()
>>> nc.classify_names(['ประยุทธ์ จันทร์โอชา','แอดวานซ์อินโฟร์ เซอร์วิส'])
['person', 'company']

Citation

Treeratpituk, Pucktada (2022). Nameseer: a Thai person name classifier. Mar 29, 2022. See https://github.com/botx/nameseer

Author

Pucktada Treeratpituk, Bank of Thailand (pucktadt@bot.or.th)

License

This project is licensed under the Apache Software License 2.0 - see the LICENSE file for details

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nameseer-0.1.4.tar.gz (4.9 MB view hashes)

Uploaded Source

Built Distribution

nameseer-0.1.4-py3-none-any.whl (5.1 MB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page