Predict gender, country, and region from names using a byte-level transformer.

nameprediction

nameprediction is a Python package for predicting gender, country, and region from personal names using a byte-level transformer.

The repository contains both:

  • the reusable inference package under src/nameprediction
  • the original training and local inference scripts used to produce checkpoints

Install

For local development:

pip install -e .

To install the published package from PyPI:

pip install nameprediction

Quick Start

1. Download a model from Hugging Face

from nameprediction import download_model

model_path = download_model()

The same flow is available from the CLI:

nameprediction-download-model

By default, this fetches name_gender_country_model_v14.pth from the romor/nameprediction repository on Hugging Face.

You can override the source explicitly if needed:

model_path = download_model(repo_id="romor/nameprediction")

Direct model URLs also work:

model_path = download_model("https://example.com/name_gender_country_model_v14.pth")
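Because download_model accepts either a Hugging Face repository or a direct URL, calling code that reads the model source from configuration may want to tell the two apart before passing it on. The following is a minimal sketch of such a check. The helper is_direct_url is hypothetical, is not part of the package, and says nothing about how download_model itself decides.

```python
from urllib.parse import urlparse


def is_direct_url(source: str) -> bool:
    """Hypothetical helper: True if the string looks like an http(s) URL."""
    parsed = urlparse(source)
    # A direct URL has an http/https scheme and a host; a repo id has neither.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


url_check = is_direct_url("https://example.com/name_gender_country_model_v14.pth")
repo_check = is_direct_url("romor/nameprediction")
print(url_check, repo_check)
```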

2. Load the predictor

from nameprediction import NamePredictor

predictor = NamePredictor.from_pretrained()

If you already downloaded the model and want to point at a specific local path, use:

predictor = NamePredictor.from_pretrained(
    model_path=model_path,
)

Predict Single Names

result = predictor.predict_name("Ada Lovelace")
print(result)
print(result.to_dict())

predict_name returns a NamePrediction dataclass with these fields:

  • name
  • predicted_gender
  • f_prob
  • predicted_country
  • predicted_country_confidence
  • predicted_region
  • predicted_region_confidence
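The field list above can be pictured as a plain dataclass. The following is a minimal sketch, not the package's actual definition: the class name NamePredictionSketch, the field types, and the sample values are assumptions made for illustration. Only the field names and the existence of to_dict come from the description above.

```python
from dataclasses import asdict, dataclass


@dataclass
class NamePredictionSketch:
    """Hypothetical stand-in that mirrors the documented field names."""

    name: str
    predicted_gender: str  # assumed to be a label string
    f_prob: float  # assumed to be a probability in [0, 1]
    predicted_country: str
    predicted_country_confidence: float
    predicted_region: str
    predicted_region_confidence: float

    def to_dict(self) -> dict:
        # asdict() turns the dataclass into a plain dict keyed by field name.
        return asdict(self)


# Placeholder values only; these are not real model outputs.
example = NamePredictionSketch(
    name="Example Name",
    predicted_gender="F",
    f_prob=0.5,
    predicted_country="XX",
    predicted_country_confidence=0.5,
    predicted_region="Example Region",
    predicted_region_confidence=0.5,
)
row = example.to_dict()
print(list(row))
```

A dict in this shape is convenient for serializing to JSON or for building one table row per name.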

Predict Lists Of Names

results = predictor.predict_names([
    "Ada Lovelace",
    "Alan Turing",
    "Grace Hopper",
])

You can also use the convenience function:

from nameprediction import predict_names

results = predict_names(["Ada Lovelace", "Alan Turing"])

Predict DataFrames

import pandas as pd
from nameprediction import predict_dataframe

df = pd.DataFrame({"name": ["Ada Lovelace", "Alan Turing"]})

predicted = predict_dataframe(
    df,
    name_col="name",
    column_suffix="_v15",
)

This appends:

  • predicted_gender_v15
  • f_prob_v15
  • predicted_country_v15
  • predicted_country_confidence_v15
  • predicted_region_v15
  • predicted_region_confidence_v15
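The appended column names are the prediction field names with column_suffix attached. As a rough illustration, the naming can be reproduced as follows. The helper suffixed_columns is hypothetical and not part of the package; it is useful mainly for selecting the new columns after the call.

```python
from typing import List

# Field names as listed for NamePrediction, excluding the input name itself.
PREDICTION_FIELDS = [
    "predicted_gender",
    "f_prob",
    "predicted_country",
    "predicted_country_confidence",
    "predicted_region",
    "predicted_region_confidence",
]


def suffixed_columns(column_suffix: str = "") -> List[str]:
    """Hypothetical helper: output column names for a given suffix."""
    return [f"{field}{column_suffix}" for field in PREDICTION_FIELDS]


columns = suffixed_columns("_v15")
print(columns)
```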

Package Notes

  • The package derives model architecture settings such as embed_dim, num_heads, num_layers, and max_len from the checkpoint when available.
  • Label encoders are bundled with the package and validated against the checkpoint class counts at load time.
  • The default published model source is romor/nameprediction with filename name_gender_country_model_v14.pth.
  • You can override the default model source with repo_id=..., filename=..., or the environment variables NAMEPREDICTION_HF_REPO_ID, NAMEPREDICTION_MODEL_FILENAME, and NAMEPREDICTION_MODEL_REVISION.
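One plausible way for these overrides to combine is: an explicit argument wins, then the environment variable, then the built-in default. The sketch below illustrates that order. The precedence itself, the helper resolve_model_source, the filename custom_model.pth, and the revision default of None are assumptions; only the environment variable names and the default repository and filename come from the notes above.

```python
import os
from typing import Mapping, Optional

DEFAULT_REPO_ID = "romor/nameprediction"
DEFAULT_FILENAME = "name_gender_country_model_v14.pth"


def resolve_model_source(
    repo_id: Optional[str] = None,
    filename: Optional[str] = None,
    revision: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Hypothetical helper: explicit argument, then env var, then default."""
    env = os.environ if env is None else env
    return {
        "repo_id": repo_id or env.get("NAMEPREDICTION_HF_REPO_ID") or DEFAULT_REPO_ID,
        "filename": filename or env.get("NAMEPREDICTION_MODEL_FILENAME") or DEFAULT_FILENAME,
        # No default revision is documented, so None is an assumed placeholder.
        "revision": revision or env.get("NAMEPREDICTION_MODEL_REVISION"),
    }


# Passing an explicit mapping keeps the example independent of the real environment.
source = resolve_model_source(env={"NAMEPREDICTION_MODEL_FILENAME": "custom_model.pth"})
print(source)
```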

Repository Layout

  • src/nameprediction: package code intended for PyPI users
  • train.py: original training script
  • inference.py: legacy local inference helper script
  • config.yaml: local training and inference configuration

Publishing

Maintainer instructions for PyPI and Hugging Face are in PUBLISHING.md.

Download files

Source Distribution

nameprediction-0.1.1.tar.gz (10.0 kB)

Uploaded Source

Built Distribution

nameprediction-0.1.1-py3-none-any.whl (11.7 kB)

Uploaded Python 3

File details

Details for the file nameprediction-0.1.1.tar.gz.

File metadata

  • Download URL: nameprediction-0.1.1.tar.gz
  • Upload date:
  • Size: 10.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for nameprediction-0.1.1.tar.gz
Algorithm Hash digest
SHA256 817677311fc1750e90c57d965ba61181623f9ae490eb0b9e40c26cf70f747dab
MD5 4253d9f1dd4c8e3fe4b529bd18db6e42
BLAKE2b-256 816813f415a4debf3e16666cdc7b948f4e5b277812921b24558353beab3078c4

File details

Details for the file nameprediction-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: nameprediction-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 11.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for nameprediction-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 5df2650daeaf7ae1ea2fed05a04f4c67b87bcbe77fe6d711782c69270711e434
MD5 1e06e1ac3387a49ff7985ed5cf6668f1
BLAKE2b-256 2e03666f793c989d658f68d091da5cb07ab769b883865c5cf5313b5d54586005
