Thai person name classifier

nameseer ['name-seer'] - a Thai person name classifier

nameseer ('name-seer') is a Python library for Thai name classification. It determines whether a given name is a Thai person name or a Thai corporate name, using linguistic features within the name itself. nameseer ships with a pre-trained model that was trained on 700,000 company names and 700,000 person names, and it reached 99.4% accuracy when tested on 400,000 unseen names.

nameseer is based on the name-ethnicity classifier originally proposed in:

Treeratpituk, Pucktada, and C. Lee Giles. "Name-ethnicity classification and ethnicity-sensitive name matching." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 26. No. 1. 2012.

Paper URL: https://ojs.aaai.org/index.php/AAAI/article/download/8324/8183

Requirements

  • abydos
  • scikit-learn
  • nltk
  • pythainlp
  • python >= 3.7.6

Installation

nameseer can be installed using pip:

pip install nameseer

Usage

Once installed, you can use nameseer within your Python code to classify whether a Thai name is a person name or a corporate name.

>>> from nameseer import NameClassifier

>>> nc = NameClassifier.load_pretrained_model()
>>> nc.classify_names(['ประยุทธ์ จันทร์โอชา','แอดวานซ์อินโฟร์ เซอร์วิส'])
['person', 'company']
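
Since classify_names returns one label per input name, in the same order (as in the example above), a common next step is to group a batch of names by label. The following is a minimal sketch, not part of nameseer: group_by_label and classify_with_nameseer are hypothetical helper names, and the only nameseer calls used are the two shown above. The grouping helper accepts any callable that maps a list of names to a list of labels, so it can be tried without loading the model.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def group_by_label(
    names: Sequence[str],
    classify: Callable[[List[str]], List[str]],
) -> Dict[str, List[str]]:
    """Group names by the label that `classify` assigns to each one.

    Hypothetical helper (not part of nameseer). `classify` is assumed to
    return exactly one label per input name, in input order.
    """
    names = list(names)
    labels = list(classify(names))
    if len(labels) != len(names):
        raise ValueError("classifier returned a different number of labels than names")
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, label in zip(names, labels):
        groups[label].append(name)
    return dict(groups)


def classify_with_nameseer(names: Sequence[str]) -> Dict[str, List[str]]:
    """Hypothetical wrapper: group names using nameseer's pre-trained model.

    nameseer is imported lazily so that group_by_label stays usable
    (for example in tests) when the package is not installed.
    """
    from nameseer import NameClassifier

    nc = NameClassifier.load_pretrained_model()
    return group_by_label(names, nc.classify_names)
```

With nameseer installed, calling classify_with_nameseer on the two names from the example above should place each name under the label that classify_names assigns to it.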

Citation

Treeratpituk, Pucktada (2022). Nameseer: a Thai person name classifier. Mar 29, 2022. See https://github.com/botx/nameseer

Author

Pucktada Treeratpituk, Bank of Thailand (pucktadt@bot.or.th)

License

This project is licensed under the Apache Software License 2.0. See the LICENSE file for details.

