Nationality prediction from name using ONNX Runtime

name2nat-onnx

name2nat-onnx is a lightweight inference-focused fork of name2nat.

The original project by Kyubyong Park predicts nationality from a name written in Roman letters. This fork keeps that same goal, but changes the deployment path so inference does not need Torch or Flair at runtime.

Why this project exists

The original package is built around a Flair model checkpoint. That works well for experimentation and training, but it pulls in a heavy runtime stack for simple prediction.

This fork exists to make the model easier to deploy when you care about:

  • smaller runtime dependencies
  • faster startup
  • CPU-friendly batch inference
  • processing very large name lists without shipping a Torch stack

What changed

This project converts the original trained bidirectional GRU classifier into an ONNX model and serves it through ONNX Runtime.

At a high level, inference now works like this:

  1. Normalize a name into a character sequence.
  2. Convert each character into an index from the original vocabulary.
  3. Run the ONNX model on a batch of encoded names.
  4. Return the top predicted nationalities.
  5. If the name exists in the bundled lookup table, return the exact dictionary hit with score 1.0.
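The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's actual code: the toy `VOCAB`, `LABELS`, and `LOOKUP` data, the `PAD` and `UNK` indices, the normalization rule, the result shape, and the `fake_model` stand-in for the ONNX Runtime call are all hypothetical.

```python
import math

# Hypothetical toy data; the real package ships its own vocabulary,
# label list, and lookup table.
PAD, UNK = 0, 1
VOCAB = {ch: i + 2 for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz ")}
LABELS = ["French", "Japanese", "Korean"]
LOOKUP = {"kyubyong park": "Korean"}


def normalize(name):
    """Step 1: lowercase and collapse whitespace (an assumed rule)."""
    return " ".join(name.lower().split())


def encode(name):
    """Step 2: map each character to its vocabulary index, UNK if unseen."""
    return [VOCAB.get(ch, UNK) for ch in normalize(name)]


def pad_batch(encoded):
    """Right-pad every sequence to the longest one so the batch is rectangular."""
    width = max((len(seq) for seq in encoded), default=0)
    return [seq + [PAD] * (width - len(seq)) for seq in encoded]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fake_model(batch):
    """Stand-in for step 3; a real run would call the ONNX Runtime session."""
    return [[float(sum(row) % 3), 1.0, 0.5] for row in batch]


def predict(names, top_n=1):
    batch = pad_batch([encode(n) for n in names])
    results = []
    for name, logits in zip(names, fake_model(batch)):
        hit = LOOKUP.get(normalize(name))
        if hit is not None:
            # Step 5: an exact dictionary hit is returned with score 1.0.
            results.append((name, [(hit, 1.0)]))
            continue
        # Step 4: rank labels by probability and keep the top_n.
        ranked = sorted(zip(LABELS, softmax(logits)),
                        key=lambda pair: pair[1], reverse=True)
        results.append((name, ranked[:top_n]))
    return results
```

Padding to the longest name in the batch keeps the input rectangular, which is what lets a single model call handle many names at once.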

The shipped runtime path is optimized for prediction only; the model itself was trained by the original project.

How it was created

The ONNX model in this repository was produced from the original name2nat checkpoint.

The conversion flow is:

  1. Load the original Flair checkpoint.
  2. Rebuild the equivalent model in plain PyTorch.
  3. Copy the learned weights into the rebuilt model.
  4. Export that model to ONNX.
  5. Save the original vocabulary, labels, and dictionary lookup data in runtime-friendly formats.
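As one illustration of the last step, the vocabulary, labels, and lookup data could be written as JSON so the runtime needs nothing beyond the standard library to load them. The file names and the choice of JSON are assumptions made for this sketch; the formats actually used by convert_to_onnx.py may differ.

```python
import json
from pathlib import Path

# Hypothetical file names, chosen only for this sketch.
ASSET_FILES = ("vocab.json", "labels.json", "lookup.json")


def save_assets(out_dir, vocab, labels, lookup):
    """Write the three runtime assets as UTF-8 JSON files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for filename, payload in zip(ASSET_FILES, (vocab, labels, lookup)):
        text = json.dumps(payload, ensure_ascii=False)
        (out / filename).write_text(text, encoding="utf-8")


def load_assets(out_dir):
    """Read the assets back in the same order: vocab, labels, lookup."""
    out = Path(out_dir)
    return tuple(
        json.loads((out / filename).read_text(encoding="utf-8"))
        for filename in ASSET_FILES
    )
```

Keeping `ensure_ascii=False` stores any non-ASCII characters in the data as readable text rather than escape sequences.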

The conversion script is in convert_to_onnx.py.

Disclaimer

The original author's disclaimer still applies: this project is not intended as a political statement. It is a statistical name-classification model, not a definitive statement of identity.

Installation

pip install name2nat-onnx

With uv:

uv init
uv add name2nat-onnx

Usage

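from name2nat import Name2nat

# Create the predictor once and reuse it for every call.
predictor = Name2nat()

# Request the three most likely nationalities for each name.
results = predictor(
    ["Kyubyong Park", "Takeshi Yamamoto", "Francois Dupont"],
    top_n=3,
)

# Print each result item.
for item in results:
    print(item)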

For large input sets:

results = predictor(names, top_n=1, batch_size=4096)
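If the full name list is too large to hold in memory comfortably, it can also be streamed in chunks before it reaches the predictor. The `iter_batches` helper below is hypothetical and not part of the package; it is a minimal sketch of chunking any iterable of names.

```python
from itertools import islice


def iter_batches(names, batch_size=4096):
    """Yield successive lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    iterator = iter(names)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk
```

Each chunk could then be passed to the predictor in turn, for example `predictor(chunk, top_n=1, batch_size=4096)`, so only one chunk of results is held at a time.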

Project scope

This fork is mainly about runtime packaging and deployment.

If you want the full background on:

  • dataset construction
  • training details
  • the NaNa dataset
  • the original research motivation
  • the original Flair-based implementation

see the original name2nat repository by Kyubyong Park.

Credit

Model idea, dataset creation, training pipeline, and original package design are from Kyubyong Park's name2nat project.

This fork focuses on converting that trained model into a faster, lighter ONNX Runtime package for inference.
