Race and Ethnicity Prediction from names

Project description

RaceBERT -- A transformer-based model to predict race and ethnicity from names

Installation

pip install racebert

Using a virtual environment is highly recommended! You may need to install PyTorch separately, as instructed here: https://pytorch.org/get-started/locally/

Paper

Todo

Usage

RaceBERT predicts race (U.S. Census race categories) and ethnicity from names.

from racebert import RaceBERT

model = RaceBERT()

# Predict race
model.predict_race("Barack Obama")
# Returns: {"label": "nh_black", "score": 0.5196923613548279}
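
The `score` field can be used to set aside low-confidence predictions; the example above scores only about 0.52. A minimal sketch, assuming predictions are dicts shaped like the output shown above. The helper name `confident_label` and the 0.8 cutoff are hypothetical and not part of racebert:

```python
from typing import Optional


def confident_label(prediction: dict, min_score: float = 0.8) -> Optional[str]:
    """Return the predicted label, or None when the score is below min_score."""
    if prediction["score"] >= min_score:
        return prediction["label"]
    return None


# Prediction dict copied from the example above.
obama = {"label": "nh_black", "score": 0.5196923613548279}
print(confident_label(obama))                 # None: 0.52 is below the 0.8 cutoff
print(confident_label(obama, min_score=0.5))  # nh_black
```

The right cutoff depends on the application; 0.8 is only a placeholder.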

The race categories are:

| Race | Label |
| --- | --- |
| Non-hispanic White | nh_white |
| Hispanic | hispanic |
| Non-hispanic Black | nh_black |
| Asian & Pacific Islander | api |
| American Indian & Alaskan Native | aian |
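
For reporting, the short labels can be mapped back to their full category names. A minimal sketch: the mapping is copied from the race categories listed above, while `RACE_LABELS` and `describe_race` are hypothetical names that racebert does not provide:

```python
# Label-to-description mapping, copied from the race categories above.
RACE_LABELS = {
    "nh_white": "Non-hispanic White",
    "hispanic": "Hispanic",
    "nh_black": "Non-hispanic Black",
    "api": "Asian & Pacific Islander",
    "aian": "American Indian & Alaskan Native",
}


def describe_race(prediction: dict) -> str:
    """Format a prediction dict with its human-readable race category."""
    description = RACE_LABELS[prediction["label"]]
    return f"{description} ({prediction['score']:.2f})"


print(describe_race({"label": "nh_white", "score": 0.8365859389305115}))
# Non-hispanic White (0.84)
```
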

# Predict ethnicity
model.predict_ethnicity("Arjun Gupta")
# Returns: {"label": "Asian,IndianSubContinent", "score": 0.9612812399864197}

The ethnicity categories are:

Ethnicity
GreaterEuropean,British
GreaterEuropean,WestEuropean,French
GreaterEuropean,WestEuropean,Italian
GreaterEuropean,WestEuropean,Hispanic
GreaterEuropean,Jewish
GreaterEuropean,EastEuropean
Asian,IndianSubContinent
Asian,GreaterEastAsian,Japanese
GreaterAfrican,Muslim
Asian,GreaterEastAsian,EastAsian
GreaterEuropean,WestEuropean,Nordic
GreaterEuropean,WestEuropean,Germanic
GreaterAfrican,Africans
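
The ethnicity labels are comma-separated and appear to run from the broadest group to the most specific one. A minimal sketch of splitting a label into its levels, assuming that reading of the format; `split_ethnicity` is a hypothetical helper and not part of racebert:

```python
def split_ethnicity(label: str) -> list:
    """Split a comma-separated ethnicity label into its levels, broadest first."""
    return label.split(",")


levels = split_ethnicity("GreaterEuropean,WestEuropean,French")
print(levels)      # ['GreaterEuropean', 'WestEuropean', 'French']
print(levels[0])   # GreaterEuropean (broadest group)
print(levels[-1])  # French (most specific group)
```

Grouping predictions by `levels[0]` gives a coarser breakdown when the fine-grained categories are too sparse.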

GPU

If you have a GPU, you can speed up the computation by specifying the CUDA device when you instantiate the model.

from racebert import RaceBERT

model = RaceBERT(device=0)

# Predict race for a batch of names
model.predict_race(["Barack Obama", "George Bush"])
# Returns:
# [
#     {"label": "nh_black", "score": 0.5196923613548279},
#     {"label": "nh_white", "score": 0.8365859389305115}
# ]

# Predict ethnicity for a batch of names
model.predict_ethnicity(["Barack Obama", "George Bush"])
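
For long name lists, it can help to send names to the model in fixed-size chunks so that memory use stays bounded. A minimal sketch, assuming the batch call returns one dict per name in input order, as in the example above. `predict_in_chunks` and the default chunk size of 256 are hypothetical and not part of racebert:

```python
from typing import Callable, Iterator, List, Tuple


def predict_in_chunks(
    predict: Callable[[List[str]], List[dict]],
    names: List[str],
    chunk_size: int = 256,
) -> Iterator[Tuple[str, dict]]:
    """Call predict on fixed-size chunks of names and yield (name, prediction) pairs."""
    for start in range(0, len(names), chunk_size):
        chunk = names[start:start + chunk_size]
        # Assumes predict returns one dict per name, in input order.
        for name, prediction in zip(chunk, predict(chunk)):
            yield name, prediction


# Usage (assumes `model` was created as shown above):
# for name, pred in predict_in_chunks(model.predict_race, names):
#     print(name, pred["label"], pred["score"])
```

Passing the prediction method as an argument lets the same helper serve both `predict_race` and `predict_ethnicity`.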

HuggingFace

Alternatively, you can work directly with the transformers models hosted on the Hugging Face Hub.

Please refer to the transformers documentation.
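
As a rough illustration of that route, the sketch below builds a transformers `text-classification` pipeline for a hub model. The helper name `load_classifier` is hypothetical, and the model ID in the usage comment is an assumption that should be checked against the hub. racebert may also preprocess names before passing them to the model, so results from a raw pipeline could differ:

```python
def load_classifier(model_id: str, device: int = -1):
    """Build a transformers text-classification pipeline for a hub model.

    device=-1 runs on CPU; device=0 selects the first CUDA device.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id, device=device)


# Usage; the model ID below is an assumption, check the hub for the exact name:
# classifier = load_classifier("pparasurama/raceBERT")
# print(classifier("Barack Obama"))
```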

Download files

Download the file for your platform.

Source Distribution

racebert-1.1.0.tar.gz (3.1 kB)

Uploaded Source

Built Distribution

racebert-1.1.0-py3-none-any.whl (3.1 kB)

Uploaded Python 3

File details

Details for the file racebert-1.1.0.tar.gz.

File metadata

  • Download URL: racebert-1.1.0.tar.gz
  • Upload date:
  • Size: 3.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.5.0 importlib_metadata/4.8.2 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.0

File hashes

Hashes for racebert-1.1.0.tar.gz
Algorithm Hash digest
SHA256 2078e1b813368f2a23c48df448bf9aa49bcfd772092c5405791bd829637ffbf5
MD5 50d4af6d59e7ec11605c81f8ec789097
BLAKE2b-256 8fc4f15c3e3fe9929323a628e1c08d82e278bf533e9e4df173e9891d5df93718
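
A downloaded archive can be checked against the SHA256 digest listed above. A minimal sketch using Python's standard `hashlib`; the helper name `sha256_of` and the local file path in the usage comment are assumptions:

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest listed above for racebert-1.1.0.tar.gz.
EXPECTED = "2078e1b813368f2a23c48df448bf9aa49bcfd772092c5405791bd829637ffbf5"

# Usage, assuming the archive was downloaded to the current directory:
# assert sha256_of("racebert-1.1.0.tar.gz") == EXPECTED
```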

File details

Details for the file racebert-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: racebert-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 3.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.5.0 importlib_metadata/4.8.2 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.0

File hashes

Hashes for racebert-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 cb2c78a0ef9bb282f1980c29950edea054d91ae1175ebbd4996ff23e1a29e8e6
MD5 e18de9ca6c8ce682ec3d4f26e9e1f683
BLAKE2b-256 62b3ce1665a9c5c1cae6592d79c3b83d782982999a830776e1ebf2439e0de180
