Predict gender, country, and region from names using a byte-level transformer.

nameprediction

nameprediction is a Python package for predicting gender, country, and region from personal names using a byte-level transformer.

The repository contains both:

  • the reusable inference package under src/nameprediction
  • the original training and local inference scripts used to produce checkpoints

Install

For local development:

pip install -e .

To install the published release from PyPI:

pip install nameprediction
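
To confirm that either install command worked, the installed version can be read back from the package metadata. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and is not part of nameprediction.

```python
from importlib import metadata


def installed_version(dist_name: str = "nameprediction"):
    """Return the installed version string, or None if the distribution is absent.

    Hypothetical helper for illustration; it only inspects package metadata
    and does not import the package itself.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print("not installed" if version is None else f"nameprediction {version}")
```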

Quick Start

1. Download a model from Hugging Face

from nameprediction import download_model

model_path = download_model(
    repo_id="romor/nameprediction",
    filename="name_gender_country_model_v14.pth",
)

The same flow is available from the CLI:

nameprediction-download-model --repo-id romor/nameprediction --filename name_gender_country_model_v14.pth

Direct model URLs still work for backward compatibility:

model_path = download_model("https://example.com/name_gender_country_model_v14.pth")
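
The package's internals are not shown here, but one plausible way to accept both a Hugging Face repo ID and a direct URL is to check whether the argument parses as an http(s) URL. The sketch below illustrates that idea only; looks_like_url is a hypothetical helper, not a function exported by nameprediction.

```python
from urllib.parse import urlparse


def looks_like_url(source: str) -> bool:
    """Return True when source is an http(s) URL rather than a Hub repo ID.

    Hypothetical helper: a repo ID such as "user/model" has no scheme and
    no network location, so it falls through to the repo-ID path.
    """
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    for source in (
        "https://example.com/name_gender_country_model_v14.pth",
        "romor/nameprediction",
    ):
        kind = "direct URL" if looks_like_url(source) else "repo ID"
        print(f"{source} -> {kind}")
```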

2. Load the predictor

from nameprediction import NamePredictor

predictor = NamePredictor.from_pretrained(model_path=model_path)

To have from_pretrained download the checkpoint from Hugging Face while loading, pass repo_id and filename instead of model_path:

predictor = NamePredictor.from_pretrained(
    repo_id="YOUR_HF_USER_OR_ORG/nameprediction-model",
    filename="name_gender_country_model_v14.pth",
)

Predict Single Names

result = predictor.predict_name("Ada Lovelace")
print(result)
print(result.to_dict())

predict_name returns a NamePrediction dataclass with these fields:

  • name
  • predicted_gender
  • f_prob
  • predicted_country
  • predicted_country_confidence
  • predicted_region
  • predicted_region_confidence
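
For readers who want to see the shape of the result without loading a model, the fields above can be mirrored in a stand-in dataclass. This is a sketch under stated assumptions: the class name NamePredictionSketch, the field types, and the asdict-based to_dict are guesses for illustration, and the values are placeholders rather than model output.

```python
from dataclasses import asdict, dataclass


@dataclass
class NamePredictionSketch:
    """Stand-in that mirrors the documented field names (types are assumed)."""

    name: str
    predicted_gender: str
    f_prob: float
    predicted_country: str
    predicted_country_confidence: float
    predicted_region: str
    predicted_region_confidence: float

    def to_dict(self) -> dict:
        # Assumed behaviour: a plain dict keyed by field name.
        return asdict(self)


# Placeholder values only; these are not predictions from the real model.
example = NamePredictionSketch(
    name="Example Name",
    predicted_gender="<gender label>",
    f_prob=0.5,
    predicted_country="<country label>",
    predicted_country_confidence=0.0,
    predicted_region="<region label>",
    predicted_region_confidence=0.0,
)

if __name__ == "__main__":
    print(example.to_dict())
```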

Predict Lists Of Names

results = predictor.predict_names([
    "Ada Lovelace",
    "Alan Turing",
    "Grace Hopper",
])

You can also use the convenience function:

from nameprediction import predict_names

results = predict_names(
    ["Ada Lovelace", "Alan Turing"],
    repo_id="YOUR_HF_USER_OR_ORG/nameprediction-model",
    filename="name_gender_country_model_v14.pth",
)

Predict DataFrames

import pandas as pd
from nameprediction import predict_dataframe

df = pd.DataFrame({"name": ["Ada Lovelace", "Alan Turing"]})

predicted = predict_dataframe(
    df,
    repo_id="YOUR_HF_USER_OR_ORG/nameprediction-model",
    filename="name_gender_country_model_v14.pth",
    name_col="name",
    column_suffix="_v15",
)

With column_suffix="_v15", this appends the following columns:

  • predicted_gender_v15
  • f_prob_v15
  • predicted_country_v15
  • predicted_country_confidence_v15
  • predicted_region_v15
  • predicted_region_confidence_v15
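
The column names above follow a simple pattern: each prediction field name with column_suffix appended. The sketch below reproduces that naming rule in plain Python so downstream code can compute the expected columns for any suffix. The names PREDICTION_FIELDS and suffixed_columns are hypothetical, and treating an empty suffix as the default is an assumption.

```python
# Field names taken from the list above; the constant itself is hypothetical.
PREDICTION_FIELDS = [
    "predicted_gender",
    "f_prob",
    "predicted_country",
    "predicted_country_confidence",
    "predicted_region",
    "predicted_region_confidence",
]


def suffixed_columns(column_suffix: str = "") -> list:
    """Return the prediction column names with column_suffix appended."""
    return [f"{field}{column_suffix}" for field in PREDICTION_FIELDS]


if __name__ == "__main__":
    for column in suffixed_columns("_v15"):
        print(column)
```

Computing the names this way makes it easy to keep predictions from several checkpoints side by side in one DataFrame, each under its own suffix.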

Package Notes

  • The package derives model architecture settings such as embed_dim, num_heads, num_layers, and max_len from the checkpoint when available.
  • Label encoders are bundled with the package and validated against the checkpoint class counts at load time.
  • If no default Hugging Face repo is configured, callers must pass repo_id explicitly or set the environment variable NAMEPREDICTION_HF_REPO_ID.
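
The notes above describe two load-time checks that can be sketched in a few lines. The code below is an illustration, not the package's implementation: the function names, the dictionary shapes, and the lookup order (explicit argument, then NAMEPREDICTION_HF_REPO_ID, then a configured default) are all assumptions.

```python
import os

ENV_VAR = "NAMEPREDICTION_HF_REPO_ID"


def resolve_repo_id(repo_id=None, default_repo_id=None, environ=None):
    """Pick a repo ID from the argument, the environment, or a default.

    The precedence shown here is assumed; the source only says that one of
    these must be provided.
    """
    env = os.environ if environ is None else environ
    candidate = repo_id or env.get(ENV_VAR) or default_repo_id
    if not candidate:
        raise ValueError(f"Pass repo_id explicitly or set {ENV_VAR}.")
    return candidate


def check_class_counts(encoder_sizes: dict, checkpoint_sizes: dict) -> None:
    """Raise if a bundled label encoder disagrees with the checkpoint.

    Both arguments map a head name (for example "gender") to a class count;
    this layout is hypothetical.
    """
    for head, expected in checkpoint_sizes.items():
        actual = encoder_sizes.get(head)
        if actual != expected:
            raise ValueError(
                f"{head}: encoder has {actual} classes, checkpoint expects {expected}"
            )


if __name__ == "__main__":
    print(resolve_repo_id(environ={ENV_VAR: "YOUR_HF_USER_OR_ORG/nameprediction-model"}))
```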

Repository Layout

  • src/nameprediction: package code intended for PyPI users
  • train.py: original training script
  • inference.py: legacy local inference helper script
  • config.yaml: local training and inference configuration

Publishing

Maintainer instructions for PyPI and Hugging Face are in PUBLISHING.md.

Download files

Source Distribution

nameprediction-0.1.0.tar.gz (9.9 kB)

Built Distribution

nameprediction-0.1.0-py3-none-any.whl (11.6 kB)

File details

Details for the file nameprediction-0.1.0.tar.gz.

File metadata

  • Download URL: nameprediction-0.1.0.tar.gz
  • Upload date:
  • Size: 9.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for nameprediction-0.1.0.tar.gz
Algorithm Hash digest
SHA256 3e1541a81ce3e92ae3f3cbbfbbce2a37345d5f44e237a146b4f8a95af9c9567d
MD5 32b68626f69d286d1e7ad21710d64668
BLAKE2b-256 f3c9cc19bda43f8a57650fa73accefa0b15e506fb73abb1ac2aaefbdf3845148
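
A downloaded file can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch; the helper names sha256_of and matches_sha256 are hypothetical, and the local file path in the usage section is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was downloaded.
    archive = Path("nameprediction-0.1.0.tar.gz")
    published = "3e1541a81ce3e92ae3f3cbbfbbce2a37345d5f44e237a146b4f8a95af9c9567d"
    if archive.exists():
        print("digest matches" if matches_sha256(archive, published) else "digest mismatch")
    else:
        print(f"{archive} not found")
```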

File details

Details for the file nameprediction-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: nameprediction-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 11.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for nameprediction-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b1e0262beed53089ef1cdc7a69b72f6111f332382e92b470279f8ddbe64fab63
MD5 e68ec0eefa590ae49d73b1f050684755
BLAKE2b-256 b82b96a953856bd8928553f77d603e7d7d7f272b39ac9d0d6bd86c1f684d7f59
