Race and Ethnicity Prediction from names
RaceBERT -- A transformer-based model to predict race and ethnicity from names
Installation
pip install racebert
Using a virtual environment is highly recommended! You may need to install PyTorch separately, as described here: https://pytorch.org/get-started/locally/
Paper
Todo
Usage
RaceBERT predicts race (U.S. Census race categories) and ethnicity from names.
from racebert import RaceBERT
model = RaceBERT()
# To predict race
model.predict_race("Barack Obama")
>>> {"label": "nh_black", "score": 0.5196923613548279}
The race categories are:
Race | Label |
---|---|
Non-Hispanic White | nh_white |
Hispanic | hispanic |
Non-Hispanic Black | nh_black |
Asian & Pacific Islander | api |
American Indian & Alaskan Native | aian |
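The label codes in the table can be mapped back to readable category names when presenting results. The following is a minimal sketch, assuming predictions are dicts with `label` and `score` keys as in the example output above. `RACE_LABELS` and `describe_race` are hypothetical names used for illustration and are not part of racebert.

```python
# Hypothetical helper: not part of the racebert package.
RACE_LABELS = {
    "nh_white": "Non-Hispanic White",
    "hispanic": "Hispanic",
    "nh_black": "Non-Hispanic Black",
    "api": "Asian & Pacific Islander",
    "aian": "American Indian & Alaskan Native",
}


def describe_race(prediction):
    """Format a prediction dict such as {"label": "nh_black", "score": 0.52}."""
    # Fall back to the raw code if an unknown label appears.
    name = RACE_LABELS.get(prediction["label"], prediction["label"])
    return f"{name} ({prediction['score']:.1%})"


print(describe_race({"label": "nh_black", "score": 0.5196923613548279}))
```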
# Predict ethnicity
model.predict_ethnicity("Arjun Gupta")
>>> {"label": "Asian,IndianSubContinent", "score": 0.9612812399864197}
The ethnicity categories are:
Ethnicity |
---|
GreaterEuropean,British |
GreaterEuropean,WestEuropean,French |
GreaterEuropean,WestEuropean,Italian |
GreaterEuropean,WestEuropean,Hispanic |
GreaterEuropean,Jewish |
GreaterEuropean,EastEuropean |
Asian,IndianSubContinent |
Asian,GreaterEastAsian,Japanese |
GreaterAfrican,Muslim |
Asian,GreaterEastAsian,EastAsian |
GreaterEuropean,WestEuropean,Nordic |
GreaterEuropean,WestEuropean,Germanic |
GreaterAfrican,Africans |
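Each ethnicity label appears to be a comma-separated path from a broad group to a narrower one. The sketch below splits a label into its levels. Reading the commas as hierarchy levels is an assumption based on the label names, and `split_ethnicity` and `broad_group` are hypothetical helpers, not part of racebert.

```python
def split_ethnicity(label):
    """Split a label such as "GreaterEuropean,WestEuropean,French" into levels."""
    return label.split(",")


def broad_group(label):
    """Return the first (broadest) level of an ethnicity label."""
    return split_ethnicity(label)[0]


print(split_ethnicity("GreaterEuropean,WestEuropean,French"))
print(broad_group("Asian,IndianSubContinent"))
```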
GPU
If you have a GPU, you can speed up the computation by specifying the CUDA device when you instantiate the model.
from racebert import RaceBERT
model = RaceBERT(device=0)
# predict race in batch
model.predict_race(["Barack Obama", "George Bush"])
>>>
[
{"label": "nh_black", "score": 0.5196923613548279},
{"label": "nh_white", "score": 0.8365859389305115}
]
# predict ethnicity in batch
model.predict_ethnicity(["Barack Obama", "George Bush"])
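The batch example suggests that results come back in the same order as the input names, so the two lists can be paired with `zip`. The sketch below relies on that assumption and keeps only predictions above a confidence threshold. `confident_predictions` and the 0.6 threshold are hypothetical, and the predictions are copied from the example output rather than produced by the model.

```python
def confident_predictions(names, predictions, threshold=0.6):
    """Pair each name with its predicted label, keeping scores >= threshold."""
    return {
        name: pred["label"]
        for name, pred in zip(names, predictions)
        if pred["score"] >= threshold
    }


names = ["Barack Obama", "George Bush"]
# Values copied from the example output; in practice this would be
# predictions = model.predict_race(names)
predictions = [
    {"label": "nh_black", "score": 0.5196923613548279},
    {"label": "nh_white", "score": 0.8365859389305115},
]
print(confident_predictions(names, predictions))
```

Low-scoring predictions, such as the first one here, are dropped rather than reported as a confident label.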
HuggingFace
Alternatively, you can work directly with the transformers models hosted on the Hugging Face Hub.
- Race Model: https://huggingface.co/pparasurama/raceBERT
- Ethnicity Model: https://huggingface.co/pparasurama/raceBERT-ethnicity
Please refer to the transformers documentation.
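As a rough illustration, the race model could be loaded through the transformers `pipeline` API. This is a sketch under assumptions: the model ID is taken from the link above, but whether the hosted model expects names to be preprocessed in some way, and which label names its configuration exposes, is not stated here, so check the model card before relying on the output. `load_classifier` and `classify_name` are hypothetical helper names.

```python
def load_classifier(model_id="pparasurama/raceBERT", device=-1):
    """Build a text-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily; requires transformers

    # device=-1 runs on CPU; a non-negative integer selects a CUDA device.
    return pipeline("text-classification", model=model_id, device=device)


def classify_name(classifier, name):
    """Return the top {"label": ..., "score": ...} dict for a single name."""
    # Given a list of inputs, the pipeline returns one dict per input.
    return classifier([name])[0]
```

Calling `classify_name(load_classifier(), "George Bush")` would then return a single dict with `label` and `score` keys. Passing the classifier in as an argument keeps the model loaded once and reused across calls.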