
Hand-curated dataset of English given names and nicknames

Nicknames

A hand-curated CSV file containing English given names (first names) and their associated nicknames.

There are Python, SQL, Java, Perl, and R parsers provided for convenience.

This is a relatively large list with roughly 1100 canonical names. Help cleaning up and extending the list is greatly appreciated. In the CSV, the first name on each line is the canonical name, and the rest are nicknames for that name.
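
For readers who want to use the CSV without one of the bundled parsers, here is a minimal sketch of reading the layout described above (canonical name first, nicknames after it) with only the Python standard library. The sample rows and the `load_names` helper are assumptions made for illustration; they are not part of the package, and the real file may differ in details such as headers or quoting.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows in the layout described above: canonical name first,
# then its nicknames. This is sample data, not a copy of the real file.
SAMPLE = """\
alexander,al,alex
gregory,greg
geoffrey,geoff
"""


def load_names(text):
    """Hypothetical helper: build canonical->nicknames and nickname->canonicals maps."""
    nicknames_by_canonical = defaultdict(set)
    canonicals_by_nickname = defaultdict(set)
    for row in csv.reader(io.StringIO(text)):
        # Normalize case and whitespace, and drop empty cells.
        names = [cell.strip().lower() for cell in row if cell.strip()]
        if not names:
            continue  # skip blank lines
        canonical, *nicks = names
        for nick in nicks:
            nicknames_by_canonical[canonical].add(nick)
            canonicals_by_nickname[nick].add(canonical)
    return dict(nicknames_by_canonical), dict(canonicals_by_nickname)


forward, reverse = load_names(SAMPLE)
print(sorted(forward["alexander"]))
print(sorted(reverse["greg"]))
```

Keeping two maps makes both directions a single dictionary lookup, at the cost of storing each pair twice.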

This lookup file was initially created by mining this genealogy page from the Center for African American Research, Inc. Because the lookup originates from a dataset used for genealogy purposes, it contains old names that aren't commonly used these days, but there are recent ones as well. Examples are "gregory" with "greg", or "geoffrey" with "geoff". There was also a significant effort to make it machine readable: separating names with commas and removing human conventions, so that an entry like "rickie(y)" became two different names, "rickie" and "ricky". Due to the source of the original data, the dataset is heavily biased towards traditionally African American names. Names from other groups may or may not be present.

This project was created by the Old Dominion University Web Science and Digital Libraries Research Group. More information about the creation of this lookup can be found in this blog post.

Python API

The Python parser is available on PyPI. Install it with:

pip install nicknames

Then use it like this:

from nicknames import NickNamer

nn = NickNamer()

# Get the nicknames for a given name as a set of strings
nicks = nn.nicknames_of("Alexander")
assert isinstance(nicks, set)
assert "al" in nicks
assert "alex" in nicks

# Note that the relationship isn't symmetric: al is a nickname for alexander,
# but alexander is not a nickname for al.
assert "alexander" not in nn.nicknames_of("al")

# Capitalization is ignored and leading and trailing whitespace is ignored
assert nn.nicknames_of("alexander") == nn.nicknames_of(" ALEXANDER ")

# Queries that aren't found return an empty set
assert nn.nicknames_of("not a name") == set()

# The other useful thing is to go the other way, nickname to canonical:
# It acts very similarly to nicknames_of.
can = nn.canonicals_of("al")
assert isinstance(can, set)
assert "alexander" in can
assert "alex" in can

assert "al" not in nn.canonicals_of("alexander")

# You can combine these to see if two names are interchangeable:
union = nn.nicknames_of("al") | nn.canonicals_of("al")
are_interchangeable = "alexander" in union
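
The interchangeability check above can be wrapped in a small helper. The sketch below is an assumption, not part of the package: `are_interchangeable` is a hypothetical function that accepts any object exposing `nicknames_of` and `canonicals_of`, and `StubNamer` is a tiny stand-in with made-up data so the example runs without the library installed. Treating a name as interchangeable with itself is a design choice of this sketch.

```python
class StubNamer:
    """Illustrative stand-in exposing the same two methods as NickNamer."""

    def __init__(self, nicknames_by_canonical):
        self._nicks = {k: set(v) for k, v in nicknames_by_canonical.items()}
        # Build the reverse map: nickname -> set of canonical names.
        self._canon = {}
        for canonical, nicks in self._nicks.items():
            for nick in nicks:
                self._canon.setdefault(nick, set()).add(canonical)

    def nicknames_of(self, name):
        return set(self._nicks.get(name.strip().lower(), set()))

    def canonicals_of(self, name):
        return set(self._canon.get(name.strip().lower(), set()))


def are_interchangeable(namer, a, b):
    """Hypothetical helper: True if b is a nickname or a canonical form of a."""
    a, b = a.strip().lower(), b.strip().lower()
    if a == b:
        return True  # assumption of this sketch: a name matches itself
    return b in (namer.nicknames_of(a) | namer.canonicals_of(a))


# Made-up sample data for the stand-in.
namer = StubNamer({"alexander": {"al", "alex"}, "alex": {"al"}})
print(are_interchangeable(namer, "al", "alexander"))
print(are_interchangeable(namer, "al", "gregory"))
```

Because the union covers both directions, the result does not depend on argument order. Passing a real `NickNamer()` instead of `StubNamer` should work the same way, assuming it behaves as in the examples above.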

For more advanced usage, such as loading your own data, read the source code.
