nameutils - Identify given/family names and capitalize correctly

Description

nameutils is a Python module containing functions that split a person's full name into their given and family names and capitalize the letters appropriately. It understands complex names in Latin scripts from many different languages, and it understands Chinese, Japanese, and Korean names, both in their own characters and romanized.

This module is useful when you receive names that might be all uppercase or otherwise in the wrong case, or that combine the given names and the family name in a single string (e.g., a single spreadsheet column). It splits the full name into its parts and sets the capitalization correctly, so that you show each person a little respect by taking the trouble to at least try to get their name right.

Getting the case right for people's names is difficult, and many software systems address the problem by not even trying and using uppercase exclusively. It's ugly, but it's easy and consistent. We can do better. The default behaviour can't be perfect, but with ongoing adjustments to suit your evolving dataset, you can improve it to meet your needs.
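To see why naive capitalization falls short, here is a small illustration that uses only Python's built-in str.title(), not nameutils. The sample names are the mixed-case examples mentioned in this README; the variable names are made up for the illustration.

```python
# Naive approach: str.title() uppercases the first letter of each word
# and lowercases the rest, so internal capitals are lost.
samples = ["MCADAM", "MACARTHUR", "FITZSIMMONS", "DEVITO"]

naive = {name: name.title() for name in samples}

for original, result in naive.items():
    print(f"{original} -> {result}")
```

Each result loses the internal capital (McAdam, MacArthur, FitzSimmons, DeVito), which is the kind of case a name-aware function has to handle.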

People with complex grammatical aristocratic, topographic, or patronymic family names often don't know how their own names should be capitalized. Or at least, they don't know how their ancestors capitalized the name, or they know but disagree with it. Some people insist on having it their own way, and that's fine. By default, this module prefers how their ancestors would have capitalized their names. But people can do whatever they want with their own names, and it matters to them, so the module supports two kinds of exception: general exceptions that apply to everyone with a particular family name, for when the default behaviour is definitely wrong, and individual exceptions that apply only to people who report that the default is wrong for them.
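The two levels of exception described above could be layered roughly as follows. This is a conceptual sketch only: the dictionaries, the function names, and the str.title() stand-in for the default rule are all hypothetical and are not the nameutils API. See the manual entry for the real functions.

```python
# Hypothetical sketch; none of these names come from nameutils.

# General exceptions: apply to everyone with this family name.
FAMILY_CASE_EXCEPTIONS = {"devito": "DeVito", "mcadam": "McAdam"}

# Individual exceptions: apply only to one person's full name.
INDIVIDUAL_CASE_EXCEPTIONS = {"alex devito": "Alex Devito"}


def default_case(name: str) -> str:
    """Stand-in for a real default rule (assumption: plain title case)."""
    return name.title()


def apply_case(full_name: str) -> str:
    """Individual exceptions win, then family exceptions, then the default."""
    key = " ".join(full_name.split()).casefold()  # normalize spacing and case
    if key in INDIVIDUAL_CASE_EXCEPTIONS:
        return INDIVIDUAL_CASE_EXCEPTIONS[key]
    words = default_case(key).split()
    # Replace any word that matches a family-level exception.
    words = [FAMILY_CASE_EXCEPTIONS.get(w.casefold(), w) for w in words]
    return " ".join(words)


print(apply_case("SAM DEVITO"))    # family-level exception applies
print(apply_case("ALEX  DEVITO"))  # individual exception overrides it
```

Checking the individual level first means one person's preference never changes the result for others who share the family name.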

Note: This module doesn't handle every name on Earth. Apart from Chinese, Japanese, and Korean family names, it only understands names written in Latin scripts; anything else works only by lucky accident (names in Cyrillic, for example, happen to work). It doesn't handle honorifics, titles, joined initials, or postnominals. It only handles names. But it does handle complex names from a variety of places (e.g., British Isles, Europe, Middle East, Africa, East Asia, Pasifika, Americas). By default, it doesn't correctly identify unhyphenated multi-name family names, such as Spanish and Catalan names, unless the formal "y" or "i" is present. Such names need to be handled with split exceptions. It handles some mixed-case names like McAdam, MacArthur, FitzSimmons, DeVito, and VanZandt, but there will be false negatives (and arguably false positives), which can be corrected with case exceptions. Over time, you will build up a set of case exceptions and split exceptions that meets the needs of your dataset.
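The role of the formal "y" or "i" and of split exceptions can be illustrated with a deliberately simplified sketch. It assumes given names come first and the last word is the family name unless a connective or an exception says otherwise. The names split_name, CONNECTIVES, and SPLIT_EXCEPTIONS are hypothetical and are not the nameutils API, and the real module's rules are far more extensive.

```python
# Hypothetical sketch; not the nameutils implementation.

# Split exceptions: full name (casefolded) -> (given names, family name).
SPLIT_EXCEPTIONS = {"maria garcia lopez": ("Maria", "Garcia Lopez")}

# Formal connectives that join two family names (Spanish "y", Catalan "i").
CONNECTIVES = {"y", "i"}


def split_name(full_name: str) -> tuple[str, str]:
    """Return (given names, family name) using simplified rules."""
    words = full_name.split()
    key = " ".join(words).casefold()
    if key in SPLIT_EXCEPTIONS:
        return SPLIT_EXCEPTIONS[key]
    # A connective between two words marks a two-part family name.
    for i in range(1, len(words) - 1):
        if words[i].casefold() in CONNECTIVES:
            return (" ".join(words[: i - 1]), " ".join(words[i - 1:]))
    # Default assumption: the last word is the family name.
    return (" ".join(words[:-1]), words[-1] if words else "")


print(split_name("Ana Perez y Gomez"))
print(split_name("Pau Puig i Ferrer"))
print(split_name("Maria Garcia Lopez"))  # found only via the split exception
```

Without the exception entry, the default rule in this sketch would put "Garcia" among the given names, which is why unhyphenated multi-name family names need split exceptions.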

This is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 3 or later.

Documentation

There is a manual entry:

Download

nameutils is on PyPI and can be installed using pip:

    python3 -m pip install nameutils

Requirements

nameutils is a Python module that should work on systems with any version of Python 3. It depends on the third-party regex module, which is not part of the standard library.


URL: https://raf.org/nameutils
GIT: https://github.com/rafmod/nameutils
GIT: https://codeberg.org/rafmod/nameutils
Date: 20250708
Author: raf <raf@raf.org>
