Person Name Normalisation

Unifying person names in different notations

Different sources write person names in different notations, for example:

  • Firstname Secondname Lastname
  • Lastname, Firstname Secondname
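To make the two notations concrete, here is a minimal sketch that brings both into the 'Lastname, Firstname Secondname' form. It does not use personnamenorm; the helper `to_lastname_first` is hypothetical, and it assumes that the final token of a comma-free name is the last name, which ignores titles and prefixes.

```python
def to_lastname_first(name: str) -> str:
    """Return 'Lastname, Firstname Secondname' for either input notation."""
    name = " ".join(name.split())  # collapse repeated whitespace
    if "," in name:
        # Already 'Lastname, Firstname ...': only tidy the spacing.
        last, _, first = name.partition(",")
        return f"{last.strip()}, {first.strip()}"
    parts = name.split(" ")
    if len(parts) == 1:
        return parts[0]  # a single token cannot be split any further
    # Assumption: the final token is the last name.
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


print(to_lastname_first("Firstname Secondname Lastname"))
print(to_lastname_first("Lastname, Firstname Secondname"))
```

Both calls print the same string, which is the point of unifying the notations.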

The following are also extracted:

  • academic degrees (e.g. 'Dr.', 'Ph.D.')
  • name prefixes (e.g. 'van ter', 'von', 'De')
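The idea of pulling degrees and prefixes out of a name can be sketched with plain lookup sets. This is an illustration only, not the package's implementation: `TITLES`, `PREFIX_WORDS` and `split_tokens` are hypothetical names, and the sets contain just the examples mentioned in this description.

```python
# Hypothetical, deliberately tiny lookup sets built from the examples above.
TITLES = {"dr.", "ph.d.", "dipl."}
PREFIX_WORDS = {"van", "ter", "von", "und", "zu", "de"}


def split_tokens(name: str) -> dict:
    """Sort whitespace-separated tokens into titles, prefix words and the rest."""
    result = {"title": [], "prefix": [], "rest": []}
    for token in name.split():
        key = token.lower()  # compare case-insensitively, keep original spelling
        if key in TITLES:
            result["title"].append(token)
        elif key in PREFIX_WORDS:
            result["prefix"].append(token)
        else:
            result["rest"].append(token)
    return result


parsed = split_tokens("Dr. Dipl. Firstname Secondname von und zu Lastname")
print(parsed)
```

A lookup like this is only a starting point: a word such as 'De' can be a prefix in one language and part of a given name in another, which is presumably why language-specific annotation is needed.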

Supported languages: German, French, Italian, Dutch

Not yet supported: Spanish, Portuguese

Not yet supported: double last names in Spanish

Installation

pip install personnamenorm

Usage

import personnamenorm as pnn
nameobj = pnn.namenorm('Dr. Dipl. Firstname Secondname von und zu Lastname')
results in
nameobj.name <dict>
{
    'raw': 'Dr. Dipl. Firstname Secondname von und zu Lastname',
    'Firstname': ['Firstname','Secondname'],
    'Lastname': ['Lastname'],
    'title': ['Dr.','Dipl.'],
    'prefix': ['von und zu']
}

nameobj.fullname <str>
'von und zu Lastname, Firstname Secondname'

nameobj.fullname_abbrev <str>
'von und zu Lastname, F S'
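The relation between the `name` dict and the two string forms can be illustrated without the package. The sketch below assumes the dict layout shown above; `build_fullname` is a hypothetical helper and not part of personnamenorm.

```python
# Dict in the layout shown above (the 'raw' entry is not needed here).
name = {
    "Firstname": ["Firstname", "Secondname"],
    "Lastname": ["Lastname"],
    "title": ["Dr.", "Dipl."],
    "prefix": ["von und zu"],
}


def build_fullname(parts: dict, abbreviate: bool = False) -> str:
    """Join prefix and last name, then append the (optionally abbreviated) first names."""
    first = [n for n in parts["Firstname"] if n]
    if abbreviate:
        first = [n[0] for n in first]  # keep only the initial of each first name
    last = " ".join(parts["prefix"] + parts["Lastname"])
    if not first:
        return last  # avoid a dangling comma when no first name is known
    return f"{last}, {' '.join(first)}"


print(build_fullname(name))
print(build_fullname(name, abbreviate=True))
```

Titles are left out of both strings, matching the `fullname` and `fullname_abbrev` values listed above.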

More examples can be found in this file on GitHub.

Debug-mode

Debug mode is off by default.

To activate it, pass True as the second argument:

nameobj = pnn.namenorm(<str>, True)

This emits additional information as a logging message:

  • used annotation dictionary
  • annotated input string as list of tuples

Logging

Logging is implemented as follows:

  • writes to stdout if logging has NOT been enabled beforehand
  • writes to the existing logging handler if other logging HAS been enabled beforehand
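The two cases above correspond to a common pattern with the standard `logging` module. The following is a sketch of that pattern and not the package's actual code; the function `get_logger` and the logger name `personnamenorm_sketch` are hypothetical.

```python
import logging
import sys


def get_logger(name: str = "personnamenorm_sketch") -> logging.Logger:
    """Return a logger that reuses existing handlers or falls back to stdout."""
    logger = logging.getLogger(name)
    # hasHandlers() also checks parent loggers, including the root logger,
    # so logging configured by the application is detected here.
    if not logger.hasHandlers():
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
    return logger


log = get_logger()
log.debug("annotated input: %s", [("Dr.", "title"), ("Lastname", "Lastname")])
```

If the application has already configured logging, the message goes through its handlers and is subject to its level settings; otherwise it is written to stdout.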

Test

See the 'tests' folder on GitHub.

python test_personnamenorm.py

Project details

Version: 0.2

Built distribution: personnamenorm-0.2-py3-none-any.whl (6.2 kB, Python 3)

No source distribution files are available for this release.