⚡ Ultra-fast name-to-gender prediction engine. Uses mmap and binary search for sub-millisecond lookups against a pre-compiled 4.6 MB binary database covering 700k+ global entries.

gender-detect

A high-performance gender detection library and CLI tool based on binary search. It uses a pre-compiled binary database, accessed through memory mapping (mmap), to predict gender and country of origin from first names with sub-millisecond latency.

Features

Extreme Speed: Uses binary search (O(log n)) on a packed binary database with mmap for zero-copy lookups.
Zero-Dependency: Built entirely using Python standard libraries.
CLI Ready: Includes a built-in table-formatted command line interface.
Privacy Focused: 100% local; no external API calls or data tracking.

Installation

pip install gender-detect

CLI Usage

After installation, you can use the gender-detect command directly from your terminal:

gender-detect John

For automation, you can output the result in raw JSON:

gender-detect John --json
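
As an illustration of such automation, a script could call the CLI and parse its JSON output. This is a minimal sketch, assuming the gender-detect command is on PATH and that the --json output has the fields shown under Response Format; run_cli and parse_prediction are hypothetical helper names, not part of the package.

```python
import json
import subprocess


def parse_prediction(json_text: str) -> tuple[str, float]:
    """Extract (likely_gender, gender_probability) from the CLI's JSON output."""
    data = json.loads(json_text)
    return data["likely_gender"], float(data["gender_probability"])


def run_cli(name: str) -> tuple[str, float]:
    """Run `gender-detect <name> --json` and parse the result.

    Assumes the gender-detect command is installed and on PATH.
    """
    completed = subprocess.run(
        ["gender-detect", name, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_prediction(completed.stdout)
```

Keeping the parsing separate from the subprocess call makes it easy to test the parser against a saved sample of the output.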

Library Usage

Simple Prediction

Pass a name to predict() to get the likely gender, its probability, and the most frequently reported country of origin.

from gender_detect import GenderDetector

gd = GenderDetector()
result = gd.predict("John")

print(result)

Response Format

gender_probability is the share of all samples, summed across countries, that match likely_gender. In the example below, 5 of the 6 samples are male, which gives 0.83.

{
  "name": "john",
  "likely_gender": "male",
  "gender_probability": 0.83,
  "top_reported_country": "US",
  "data_breakdown": [
    {
      "country": "US",
      "male_samples": 4,
      "female_samples": 1
    },
    {
      "country": "GB",
      "male_samples": 1,
      "female_samples": 0
    }
  ]
}
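
The sample response above is consistent with dividing the likely gender's sample count by the total sample count across all countries. The sketch below reproduces that arithmetic from data_breakdown. The function name summarize, the rounding to two decimals, and the tie-breaking rule are assumptions; the library's actual computation may differ.

```python
def summarize(breakdown: list[dict]) -> tuple[str, float]:
    """Return (likely_gender, probability) from per-country sample counts."""
    male = sum(row["male_samples"] for row in breakdown)
    female = sum(row["female_samples"] for row in breakdown)
    total = male + female
    if total == 0:
        raise ValueError("no samples")
    # Assumption: ties resolve to "male"; the source does not specify this.
    if male >= female:
        return "male", round(male / total, 2)
    return "female", round(female / total, 2)


breakdown = [
    {"country": "US", "male_samples": 4, "female_samples": 1},
    {"country": "GB", "male_samples": 1, "female_samples": 0},
]
print(summarize(breakdown))  # ('male', 0.83), since 5 / 6 rounds to 0.83
```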

How it Works

The library uses a custom packed binary format (4sHBB), 8 bytes per record:

4 bytes: BLAKE2b hash prefix of the name.
2 bytes: ISO-3166-1 numeric country code.
1 byte: Male sample count.
1 byte: Female sample count.
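
The record layout above maps directly onto Python's struct module. The following is a minimal sketch of packing and unpacking one record. The little-endian byte order, the lowercasing of the name, and taking the first 4 bytes of the full BLAKE2b digest are assumptions; name_key is a hypothetical helper, and the source only specifies the 4sHBB layout.

```python
import hashlib
import struct

# Assumption: little-endian with no padding. The source only gives "4sHBB".
RECORD = struct.Struct("<4sHBB")


def name_key(name: str) -> bytes:
    """First 4 bytes of the BLAKE2b digest of the lowercased name (assumed normalization)."""
    return hashlib.blake2b(name.lower().encode("utf-8")).digest()[:4]


# 840 is the ISO 3166-1 numeric code for the United States.
record = RECORD.pack(name_key("John"), 840, 4, 1)
assert RECORD.size == 8 and len(record) == 8

key, country, male, female = RECORD.unpack(record)
print(key.hex(), country, male, female)
```

Because each sample count occupies a single byte, a record in this layout cannot store a count above 255.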

Because these 8-byte entries are sorted by hash, the library can run a binary search directly on the memory-mapped file without loading it into memory, which keeps the memory footprint small regardless of database size.
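
The lookup described above can be sketched as a lower-bound binary search over fixed-size records, followed by a short scan, since one name may have a record per country. This is an illustration rather than the library's code: the byte order, the lexicographic sort order of the keys, and the function names lookup and lookup_file are assumptions.

```python
import mmap
import struct

RECORD = struct.Struct("<4sHBB")  # assumed byte order; 8 bytes per record


def lookup(buf, key: bytes) -> list[tuple[int, int, int]]:
    """Return (country, male, female) for every record whose 4-byte key matches.

    `buf` is any bytes-like object (an mmap or bytes) holding records sorted by key.
    """
    count = len(buf) // RECORD.size
    lo, hi = 0, count
    # Lower-bound binary search: find the first record with key >= target.
    while lo < hi:
        mid = (lo + hi) // 2
        offset = mid * RECORD.size
        if buf[offset:offset + 4] < key:
            lo = mid + 1
        else:
            hi = mid
    # A name can have one record per country, so scan the run of equal keys.
    matches = []
    while lo < count:
        rec_key, country, male, female = RECORD.unpack_from(buf, lo * RECORD.size)
        if rec_key != key:
            break
        matches.append((country, male, female))
        lo += 1
    return matches


def lookup_file(path: str, key: bytes) -> list[tuple[int, int, int]]:
    """Memory-map the database read-only and search it without reading it whole."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        return lookup(mm, key)
```

Only the pages touched by the search are read from disk, which is why the footprint stays small. Note that with a 4-byte key two distinct names can share the same key, and this sketch cannot tell them apart.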

Contribution

Data contributions are managed through contribute.json in the main repository.

Add your name data to the JSON list.
Ensure country_code is the numeric ISO-3166-1 value.
Submit a Pull Request.

The CI/CD pipeline automatically validates the JSON and recompiles the names.bin database upon merging.
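
Before opening a Pull Request, contributors may want to sanity-check their entries locally. The sketch below is a hypothetical validator, not the project's CI script. It assumes contribute.json is a list of objects, and the name field is a guess; only country_code is named in this document. The range check confirms the code looks like a numeric ISO-3166-1 value, not that the code is actually assigned.

```python
import json


def validate_entries(json_text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the data passed."""
    entries = json.loads(json_text)
    if not isinstance(entries, list):
        return ["top-level value must be a JSON list"]
    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected an object")
            continue
        code = entry.get("country_code")
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(code, int) or isinstance(code, bool) or not 1 <= code <= 999:
            problems.append(f"entry {i}: country_code must be a numeric ISO-3166-1 code (1-999)")
        name = entry.get("name")  # hypothetical field name
        if not isinstance(name, str) or not name.strip():
            problems.append(f"entry {i}: name must be a non-empty string")
    return problems
```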

License

MIT - See LICENSE file for details.

This version: 0.1.0 (Apr 9, 2026)
