Thai person name classifier

nameseer ['name-seer'] - a Thai person name classifier

nameseer ('name-seer') is a Python library for Thai name classification. It determines whether a given name is a Thai person name or a Thai corporate name, using linguistic features within the name itself. nameseer ships with a pre-trained model that was trained on 700,000 company names and 700,000 person names, and it reaches 0.994 accuracy when tested on 400,000 unseen names.

nameseer is based on the name-ethnicity classifier originally proposed in:

Treeratpituk, Pucktada, and C. Lee Giles. "Name-ethnicity classification and ethnicity-sensitive name matching." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 26. No. 1. 2012.

Paper URL: https://ojs.aaai.org/index.php/AAAI/article/download/8324/8183
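
The description above does not list the specific features nameseer uses. As an illustration of what a linguistic feature within a name can look like, here is a minimal sketch of character n-gram counts, a common choice for name classification. The function char_ngrams and its parameters are hypothetical and are not part of the nameseer API; nameseer's actual feature set may differ.

```python
from collections import Counter


def char_ngrams(name, n=3, pad="_"):
    """Count the character n-grams of `name` (hypothetical helper, not nameseer's).

    The name is padded with n - 1 copies of `pad` on both sides so that
    prefixes and suffixes produce their own n-grams. N-grams are taken over
    Unicode code points, so Thai vowel signs and tone marks are counted as
    separate characters from the consonants they attach to.
    """
    padded = pad * (n - 1) + name + pad * (n - 1)
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


# Bigram counts for a short ASCII string and for a Thai given name.
ascii_bigrams = char_ngrams("abc", n=2)
thai_bigrams = char_ngrams("ประยุทธ์", n=2)
```

Counts like these could be fed to any standard classifier (for example, one from scikit-learn, which nameseer lists as a requirement), but that pairing is an assumption here, not a statement about how nameseer is built.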

Requirements

  • abydos
  • scikit-learn
  • nltk
  • pythainlp
  • Python 3.7.6+

Installation

nameseer can be installed using pip:

pip install nameseer

Usage

Once installed, you can use nameseer within your Python code to classify whether a Thai name is a person name or a corporate name.

>>> from nameseer import NameClassifier

>>> nc = NameClassifier.load_pretrained_model()
>>> nc.classify_names(['ประยุทธ์ จันทร์โอชา','แอดวานซ์อินโฟร์เซอร์วิส'])
['person', 'company']
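
Because classify_names returns one label per input name, the results can be paired back with the inputs. The following is a minimal sketch of grouping names by label. The helper group_by_label and the StubClassifier class are hypothetical and exist only so the example runs without nameseer installed; the only nameseer behavior assumed is the classify_names call shown above.

```python
from collections import defaultdict


def group_by_label(names, classifier):
    """Group `names` by the label that classifier.classify_names assigns them."""
    labels = classifier.classify_names(names)
    groups = defaultdict(list)
    for name, label in zip(names, labels):
        groups[label].append(name)
    return dict(groups)


class StubClassifier:
    """Hypothetical stand-in that returns fixed labels from a lookup table."""

    def __init__(self, known_labels):
        self.known_labels = known_labels

    def classify_names(self, names):
        return [self.known_labels[name] for name in names]


# Labels copied from the example output above, not predicted by a model.
stub = StubClassifier({
    "ประยุทธ์ จันทร์โอชา": "person",
    "แอดวานซ์อินโฟร์เซอร์วิส": "company",
})
groups = group_by_label(["ประยุทธ์ จันทร์โอชา", "แอดวานซ์อินโฟร์เซอร์วิส"], stub)
```

With nameseer installed, passing the object returned by NameClassifier.load_pretrained_model() in place of the stub should work the same way, since group_by_label relies only on classify_names.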

Citation

Treeratpituk, Pucktada (2022). Nameseer: a Thai person name classifier. Mar 29, 2022. See https://github.com/botx/nameseer

Author

Pucktada Treeratpituk, Bank of Thailand (pucktadt@bot.or.th)

License

This project is licensed under the Apache Software License 2.0. See the LICENSE file for details.


