Infer Gender from Indian Names

These details have not been verified by PyPI

Project description

naampy: Infer Sociodemographic Characteristics from Indian Names

The ability to programmatically and reliably infer a person's social attributes from their name is useful for a broad set of tasks, from estimating bias in media coverage of women to estimating bias in lending against certain social groups. The U.S. Census Bureau publishes lists of first and last names that can be (and are) used to infer gender, race, ethnicity, etc. The Indian government produces no comparable datasets. Hence the relationship between names and gender, ethnicity, language group, etc., has generally been inferred from small datasets constructed in an ad hoc manner.

We fill this yawning gap. Using data from the Indian Electoral Rolls (parsed data here), we estimate the proportion of people who are female, male, and third sex (see here) for a particular first name, year, and state.

Please also check out pranaam, which uses land record data from Bihar to infer religion from names. The package uses indicate to transliterate Hindi to English.

Try it Online

Check out our interactive Streamlit App to test naampy with your own names!

Features

🚀 Easy to use: Simple API with just two main functions
📊 Data-driven: Based on millions of names from Indian Electoral Rolls
🎯 Accurate: Provides confidence scores with predictions
🗺️ State-specific: Get region-specific predictions for better accuracy
🤖 ML-powered: Neural network fallback for names not in database
📈 Comprehensive: Covers 31 states and union territories

Installation

Requirements

Python 3.11
pip or uv package manager

Install from PyPI

We strongly recommend installing naampy inside a Python virtual environment (see venv documentation):

pip install naampy

Or if you're using uv:

uv pip install naampy

Install from Source

To install the latest development version:

git clone https://github.com/appeler/naampy.git
cd naampy
pip install -e .

Quick Start

Basic Usage

import pandas as pd
from naampy import in_rolls_fn_gender, predict_fn_gender

# Create a DataFrame with names
names_df = pd.DataFrame({'name': ['Priyanka', 'Rahul', 'Anjali']})

# Get gender predictions from electoral roll data
result = in_rolls_fn_gender(names_df, 'name')
print(result[['name', 'prop_female', 'prop_male']])

Using the ML Model

For names not in the electoral roll database:

from naampy import predict_fn_gender

# Use the neural network model for predictions
names = ['Aadhya', 'Reyansh', 'Kiara']
predictions = predict_fn_gender(names)
print(predictions)

Detailed Usage Examples

Electoral Roll Data

import pandas as pd
from naampy import in_rolls_fn_gender

# Sample data
names = [{'name': 'gaurav'}, {'name': 'yasmin'}, {'name': 'deepti'}]
df = pd.DataFrame(names)

result = in_rolls_fn_gender(df, 'name')
print(result[['name', 'n_male', 'n_female', 'prop_female', 'prop_male']])

Output:

     name    n_male  n_female  prop_female  prop_male
0  gaurav   25625.0      47.0     0.001831   0.998169
1  yasmin      58.0    6079.0     0.990549   0.009451
2  deepti      35.0    5784.0     0.993985   0.006015
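The proportions in the output above are consistent with dividing each count by the male plus female total (for example, 47 / (25625 + 47) ≈ 0.001831). The sketch below rebuilds that arithmetic from the sample rows and then assigns a label with a cutoff. The `THRESHOLD` value and the `label` helper are illustrative assumptions, not part of naampy, and the snippet works on hard-coded values rather than calling the package.

```python
import pandas as pd

# Counts copied from the example output above (no call to naampy here).
result = pd.DataFrame({
    "name": ["gaurav", "yasmin", "deepti"],
    "n_male": [25625.0, 58.0, 35.0],
    "n_female": [47.0, 6079.0, 5784.0],
})

# Assumption: each proportion is the count divided by the male + female total.
total = result["n_male"] + result["n_female"]
result["prop_female"] = result["n_female"] / total
result["prop_male"] = result["n_male"] / total

THRESHOLD = 0.9  # hypothetical cutoff; choose one that suits your application


def label(prop_female: float) -> str:
    """Map a proportion female to a label, leaving ambiguous names unlabeled."""
    if prop_female >= THRESHOLD:
        return "female"
    if prop_female <= 1 - THRESHOLD:
        return "male"
    return "uncertain"


result["label"] = result["prop_female"].apply(label)
print(result[["name", "prop_female", "prop_male", "label"]])
```

Keeping an explicit "uncertain" bucket avoids forcing a label onto names that are common among both men and women.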

Machine Learning Predictions

from naampy import predict_fn_gender

# Names not in electoral roll database
names = ["nabha", "hrithik", "kiara", "reyansh"]
predictions = predict_fn_gender(names)
print(predictions)

Output:

      name pred_gender  pred_prob
0    nabha      female   0.755028
1  hrithik        male   0.922181
2    kiara      female   0.614125
3  reyansh        male   0.891234
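Because the model reports a probability alongside each prediction, one possible next step is to separate confident predictions from those that deserve a manual check. The sketch below does this on the sample output above. The `MIN_PROB` cutoff is an assumption chosen for illustration, and the DataFrame is hard-coded rather than produced by `predict_fn_gender`.

```python
import pandas as pd

# Rows copied from the example output above.
predictions = pd.DataFrame({
    "name": ["nabha", "hrithik", "kiara", "reyansh"],
    "pred_gender": ["female", "male", "female", "male"],
    "pred_prob": [0.755028, 0.922181, 0.614125, 0.891234],
})

MIN_PROB = 0.8  # hypothetical cutoff, not a naampy default

# Split on the reported probability of the predicted class.
confident = predictions[predictions["pred_prob"] >= MIN_PROB]
needs_review = predictions[predictions["pred_prob"] < MIN_PROB]

print("Confident:", confident["name"].tolist())
print("Needs review:", needs_review["name"].tolist())
```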

How it Works

When you first run in_rolls_fn_gender, it downloads data from Harvard Dataverse to a local cache folder. Subsequent runs use the cached data for faster performance.
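The download-once behavior described above follows a common caching pattern. The sketch below illustrates that general pattern only. The `load_with_cache` helper, the file name, and the `fake_download` function are hypothetical and do not reflect naampy's internal code or its actual cache location.

```python
import tempfile
from pathlib import Path
from typing import Callable


def load_with_cache(cache_file: Path, fetch: Callable[[], bytes]) -> bytes:
    """Return cached bytes if present; otherwise fetch once and store them."""
    if cache_file.exists():
        return cache_file.read_bytes()
    data = fetch()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_bytes(data)
    return data


download_calls = 0


def fake_download() -> bytes:
    """Hypothetical stand-in for a network download."""
    global download_calls
    download_calls += 1
    return b"name,n_male,n_female\n"


with tempfile.TemporaryDirectory() as tmp:
    cache_path = Path(tmp) / "cache" / "rolls.csv"  # hypothetical file name
    first = load_with_cache(cache_path, fake_download)   # triggers the fetch
    second = load_with_cache(cache_path, fake_download)  # served from disk

print(download_calls)
```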

The package provides two complementary approaches:

Electoral Roll Data: Statistical data from millions of Indian voters
Machine Learning Model: Neural network trained on name patterns

For names not found in the electoral roll database, the package automatically falls back to the ML model.
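The lookup-then-fallback idea can be sketched in a few lines. Everything below is a simplified illustration under stated assumptions: `ROLL_PROP_FEMALE` is a toy stand-in for the electoral roll table (values taken from the example output above), `toy_model` is a hypothetical placeholder for the neural network, and lowercasing the name before lookup is assumed for the sketch rather than taken from naampy.

```python
from typing import Callable

# Toy stand-in for the electoral roll table: name -> proportion female.
ROLL_PROP_FEMALE = {"gaurav": 0.001831, "yasmin": 0.990549, "deepti": 0.993985}


def toy_model(name: str) -> float:
    """Hypothetical placeholder for the ML model; returns a made-up P(female)."""
    return 0.75 if name.endswith("a") else 0.25


def prop_female(
    name: str, model: Callable[[str], float] = toy_model
) -> tuple[float, str]:
    """Look the name up first; fall back to the model only when it is missing."""
    key = name.strip().lower()  # assumption: lookups are case-insensitive
    if key in ROLL_PROP_FEMALE:
        return ROLL_PROP_FEMALE[key], "electoral_roll"
    return model(key), "model"


for candidate in ["Yasmin", "nabha"]:
    print(candidate, prop_female(candidate))
```

Returning the source alongside the estimate lets downstream code treat observed proportions and model predictions differently.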

Documentation

For comprehensive documentation, examples, and API reference, visit: https://appeler.github.io/naampy/

Authors

Suriyan Laohaprapanon, Gaurav Sood, and Rajashekar Chintalapati

Related Projects

appeler/pranaam: Predict religion based on names
appeler/outkast: Map last names to caste (SC, ST, Other) using data on more than 140M Indians from the SECC 2011
appeler/parsernaam: AI name parsing. Predict whether a name is a first or last name using a DL model.
appeler/namesexdata: Data on international first names and the sex of people with that name
appeler/graphic_names: Infer the gender of a person with a particular first name using Google image search and Clarifai

Project details

Release history

0.8.0 (this version): Jun 18, 2026
0.7.0: Dec 1, 2025
0.6.1: Oct 7, 2025
0.6.0: Apr 17, 2023
0.5.0: Sep 19, 2022
0.4.2: Jul 22, 2022
0.4.1: Jun 30, 2022
0.4.0: May 14, 2022
0.3.0: Aug 14, 2021
0.2.0: Feb 4, 2020
0.1.0: Jan 30, 2020

Download files

Source Distribution

naampy-0.8.0.tar.gz (8.3 MB)

Uploaded Jun 18, 2026 Source

Built Distribution

naampy-0.8.0-py3-none-any.whl (8.3 MB)

Uploaded Jun 18, 2026 Python 3

File details

Details for the file naampy-0.8.0.tar.gz.

File metadata

Download URL: naampy-0.8.0.tar.gz
Upload date: Jun 18, 2026
Size: 8.3 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for naampy-0.8.0.tar.gz
Algorithm	Hash digest
SHA256	`75f6a736f1ef5ce4e564136b17752864b78b58e6ebd5368d33f2b8c889929006`
MD5	`dd587f079c3ba9b8387839552e137a78`
BLAKE2b-256	`d5fc9119f5da9775a1a9dcac025e19780f8fe5aaa427d64c00823e563cc714ef`

Provenance

The following attestation bundles were made for naampy-0.8.0.tar.gz:

Publisher: python-publish.yml on appeler/naampy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: naampy-0.8.0.tar.gz
- Subject digest: 75f6a736f1ef5ce4e564136b17752864b78b58e6ebd5368d33f2b8c889929006
- Sigstore transparency entry: 1856984380
- Sigstore integration time: Jun 18, 2026
Source repository:
- Permalink: appeler/naampy@1031f0ea49beb138f88dabcfcd6f69cb15a66be4
- Branch / Tag: refs/heads/master
- Owner: https://github.com/appeler
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@1031f0ea49beb138f88dabcfcd6f69cb15a66be4
- Trigger Event: workflow_dispatch

File details

Details for the file naampy-0.8.0-py3-none-any.whl.

File metadata

Download URL: naampy-0.8.0-py3-none-any.whl
Upload date: Jun 18, 2026
Size: 8.3 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for naampy-0.8.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3fa084e5050598639ebbd0517803c842cbff984a57f9f054e8a2eb7f4f355a6d`
MD5	`c8a553d1e5460cdb76d516ad9f15b0a3`
BLAKE2b-256	`2afe28888c0cc82899780c280c924ad713e4118604d263d32b8564d4703ba252`

Provenance

The following attestation bundles were made for naampy-0.8.0-py3-none-any.whl:

Publisher: python-publish.yml on appeler/naampy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: naampy-0.8.0-py3-none-any.whl
- Subject digest: 3fa084e5050598639ebbd0517803c842cbff984a57f9f054e8a2eb7f4f355a6d
- Sigstore transparency entry: 1856984462
- Sigstore integration time: Jun 18, 2026
Source repository:
- Permalink: appeler/naampy@1031f0ea49beb138f88dabcfcd6f69cb15a66be4
- Branch / Tag: refs/heads/master
- Owner: https://github.com/appeler
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@1031f0ea49beb138f88dabcfcd6f69cb15a66be4
- Trigger Event: workflow_dispatch
