Thai person name classifier

nameseer ['name-seer'] - a Thai person name classifier

nameseer ('name-seer') is a Python library for Thai name classification. It uses linguistic features within a name to determine whether it is a Thai person name or a Thai corporate name. nameseer comes with a pre-trained model trained on 700,000 company names and 700,000 person names, which reaches 0.994 accuracy when tested on 400,000 unseen names.

nameseer is based on the name-ethnicity classifier originally proposed in:

Treeratpituk, Pucktada, and C. Lee Giles. "Name-ethnicity classification and ethnicity-sensitive name matching." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 26. No. 1. 2012.

Paper URL: https://ojs.aaai.org/index.php/AAAI/article/download/8324/8183
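
This README does not list the linguistic features that nameseer extracts. As a rough illustration of what a feature computed "within the name" can look like, the sketch below counts character n-grams, a common choice for name classification. The function name `char_ngrams`, its parameters, and the `^`/`$` boundary markers are assumptions made for this example; they are not part of nameseer's API and may not match its actual feature set.

```python
from collections import Counter


def char_ngrams(name: str, n_min: int = 1, n_max: int = 3) -> Counter:
    """Count character n-grams of a name (illustrative only, not nameseer's code).

    '^' and '$' are added as hypothetical start/end markers so that
    prefixes and suffixes become distinct features.
    """
    padded = f"^{name.strip()}$"
    counts: Counter = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the padded string.
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts


if __name__ == "__main__":
    # Works on Thai text as well, since Python strings are sequences of code points.
    features = char_ngrams("บริษัท", n_min=2, n_max=2)
    print(features.most_common(3))
```

Counts like these could be fed to any standard classifier; which features and which model nameseer actually uses is described in the paper cited above, not here.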

Requirements

  • abydos
  • scikit-learn
  • nltk
  • pythainlp
  • Python 3.9+

Installation

nameseer can be installed using pip:

pip install nameseer

Usage

Once installed, you can use nameseer in your Python code to classify whether a Thai name is a person name or a corporate name.

>>> from nameseer import NameClassifier

>>> nc = NameClassifier.load_pretrained_model()
>>> nc.classify_names(['ประยุทธ์ จันทร์โอชา', 'แอดวานซ์อินโฟร์เซอร์วิส'])
['person', 'company']
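
When classifying a longer list, it is often handy to group the input names by their predicted label. The sketch below is one possible way to do that. It assumes only what the example above shows: that `classify_names` takes a list of names and returns one label per name, in the same order. The helper `group_names_by_label` and the `demo` function are hypothetical names introduced here, not part of nameseer.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def group_names_by_label(
    names: Sequence[str],
    classify: Callable[[List[str]], Sequence[str]],
) -> Dict[str, List[str]]:
    """Group names under the label that `classify` assigns to each one."""
    name_list = list(names)
    labels = list(classify(name_list))
    # Assumption: the classifier returns exactly one label per input name.
    if len(labels) != len(name_list):
        raise ValueError("classifier returned a different number of labels")
    grouped: Dict[str, List[str]] = defaultdict(list)
    for name, label in zip(name_list, labels):
        grouped[label].append(name)
    return dict(grouped)


def demo() -> None:
    """Run the helper with nameseer's pre-trained model (requires nameseer)."""
    # Imported lazily so the helper above can be used and tested without nameseer.
    from nameseer import NameClassifier

    nc = NameClassifier.load_pretrained_model()
    names = ['ประยุทธ์ จันทร์โอชา', 'แอดวานซ์อินโฟร์เซอร์วิส']
    print(group_names_by_label(names, nc.classify_names))
```

Passing the classifier in as a callable keeps the helper independent of nameseer itself; call `demo()` in an environment where nameseer is installed to try it with the pre-trained model.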

Citation

Treeratpituk, Pucktada (2022). Nameseer: a Thai person name classifier. Mar 29, 2022. See https://github.com/botx/nameseer

Author

Pucktada Treeratpituk, Bank of Thailand (pucktadt@bot.or.th)

License

This project is licensed under the Apache Software License 2.0. See the LICENSE file for details.
