Clean, functional data processing for human-centric applications. Normalize and standardize names, emails, phones, departments, and job titles with a single unified API.

HumanMint v2

HumanMint cleans and normalizes messy contact data with one call. It standardizes names, emails, phones, addresses, departments, titles, and organizations. It's built for both public-sector data and B2B (CEOs, VPs, directors, managers) and ships with curated public-sector mappings.

from humanmint import mint

result = mint(
    name="Dr. John Q. Smith, PhD",
    email="JOHN.SMITH@CITY.GOV",
    phone="(202) 555-0173 ext 456",
    department="001 - Public Works Dept",
    title="Chief of Police",
    address="123 N. Main St Apt 4B, Madison, WI 53703",
    organization="City of Madison Police Department",
)

result.name_standardized          # "John Q Smith"
result.email_standardized         # "john.smith@city.gov"
result.phone_pretty               # "+1 202-555-0173"
result.department_canonical       # "Public Works"
result.title_canonical            # "police chief"
result.address_canonical          # "123 N. Main St Apt 4B Madison WI 53703 US"
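To make the phone fields above more concrete, here is a minimal standalone sketch of how an extension could be separated from a US number and the remainder formatted like the phone_pretty value shown. It is not HumanMint's implementation; split_extension and pretty_us_phone are hypothetical helpers, and the sketch assumes 10-digit US numbers only.

```python
import re
from typing import Optional, Tuple

def split_extension(raw: str) -> Tuple[str, Optional[str]]:
    """Separate a trailing 'ext 456' / 'x456' style extension from a phone string."""
    match = re.search(r"\s*(?:ext\.?|x)\s*(\d+)\s*$", raw, flags=re.IGNORECASE)
    if not match:
        return raw.strip(), None
    return raw[: match.start()].strip(), match.group(1)

def pretty_us_phone(raw: str) -> Optional[str]:
    """Format a US number as '+1 AAA-BBB-CCCC'; return None if it is not 10 digits."""
    number, _extension = split_extension(raw)
    digits = re.sub(r"\D", "", number)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading country code
    if len(digits) != 10:
        return None
    return f"+1 {digits[:3]}-{digits[3:6]}-{digits[6:]}"

print(split_extension("(202) 555-0173 ext 456"))
print(pretty_us_phone("(202) 555-0173 ext 456"))
```

A real normalizer also has to handle international formats and validity checks, which this sketch deliberately leaves out.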

Multi-person splitting:

mint(name="John and Jane Smith", split_multi=True)
# -> [MintResult(John Smith), MintResult(Jane Smith)]
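As an illustration of what splitting a shared-surname name involves, the following standalone sketch handles only the simple "First and First Last" pattern. It is not how HumanMint does it internally; split_shared_surname is a hypothetical helper written for this example.

```python
import re
from typing import List

def split_shared_surname(raw: str) -> List[str]:
    """Split 'John and Jane Smith' into ['John Smith', 'Jane Smith'].

    Illustration only: covers 'A and B Last' and 'A Last & B Last'.
    """
    parts = re.split(r"\s+(?:and|&)\s+", raw.strip(), maxsplit=1, flags=re.IGNORECASE)
    if len(parts) != 2:
        return [raw.strip()]  # no conjunction found: treat as one person
    first, rest = parts
    rest_tokens = rest.split()
    if len(first.split()) == 1 and len(rest_tokens) >= 2:
        # The first person has no surname of their own, so borrow the last token.
        return [f"{first} {rest_tokens[-1]}", rest]
    return [first, rest]

print(split_shared_surname("John and Jane Smith"))
```

Because the pattern requires whitespace around the conjunction, a name such as "Sandy Anderson" is left untouched.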

Why HumanMint

General-purpose: works for government and B2B without swapping libraries.
Real-world chaos: handles titles inside names, departments with codes/phones, smashed addresses, anti-scraper emails, casing quirks.
Unique data: 23K+ department variants mapped to 64 categories; 73K+ titles with curated canonical forms plus BLS data; context-aware title mapping.
Safe defaults: length guards, optional aggressive cleaning, semantic conflict checks, bulk dedupe, multi-person name splitting.
Fast: lazy imports for quick startup, process-based bulk for CPU-bound speed, built-in dedupe to avoid redundant work.
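The "dedupe to avoid redundant work" idea in the last point can be sketched in a few lines: process each distinct record once, then fan the results back out in input order. This is a generic pattern under stated assumptions, not HumanMint's bulk implementation; process_unique is a hypothetical helper, and it assumes flat dicts with string keys and hashable values.

```python
from typing import Callable, Dict, Hashable, List

def process_unique(records: List[dict], fn: Callable[[dict], object]) -> List[object]:
    """Call fn once per distinct record and return results in input order."""
    cache: Dict[Hashable, object] = {}
    results = []
    for rec in records:
        key = tuple(sorted(rec.items()))  # order-independent fingerprint of the record
        if key not in cache:
            cache[key] = fn(rec)  # expensive work happens only for new records
        results.append(cache[key])
    return results

rows = [
    {"name": "Alice", "email": "ALICE@EXAMPLE.COM"},
    {"email": "ALICE@EXAMPLE.COM", "name": "Alice"},  # same record, different key order
    {"name": "Bob", "email": "bob@example.com"},
]
print(process_unique(rows, lambda r: r["email"].lower()))
```

The same cache could sit in front of a process pool so that only the distinct records are sent to worker processes.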

AI extraction (optional)

Install the ML extra (pip install humanmint[ml]) and pass text= with use_gliner=True to extract fields from unstructured text and then normalize them. Structured fields you pass always take precedence over extracted ones. GLiNER extraction is experimental; prefer structured inputs when available.

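from humanmint import mint
from humanmint.gliner import GlinerConfig
# signature_block is any unstructured text you supply, such as an email signature
result = mint(text=signature_block, use_gliner=True, gliner_cfg=GlinerConfig(threshold=0.85))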
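The rule that structured fields always win amounts to a precedence merge. A minimal sketch of that rule follows; merge_fields is a hypothetical helper, not part of HumanMint, and treating None or empty strings as "not provided" is an assumption made for this example.

```python
def merge_fields(extracted: dict, structured: dict) -> dict:
    """Combine model-extracted fields with caller-supplied ones.

    Caller-supplied values override extracted ones; None and "" are
    treated as "not provided" (an assumption of this sketch).
    """
    merged = dict(extracted)
    merged.update({k: v for k, v in structured.items() if v not in (None, "")})
    return merged

extracted = {"name": "Jon Smith", "email": "jon.smith@city.gov"}
structured = {"name": "John Smith", "phone": None}
print(merge_fields(extracted, structured))
```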

Installation

pip install humanmint
# Optional extras:
#   pip install humanmint[address]  # usaddress parsing
#   pip install humanmint[pandas]   # DataFrame helpers
#   pip install humanmint[ml]       # GLiNER2 extraction

Quickstart

from humanmint import mint, compare, bulk

r1 = mint(name="Jane Doe", email="jane.doe@city.gov", department="Public Works", title="Engineer")
r2 = mint(name="J. Doe",  email="JANE.DOE@CITY.GOV", department="PW Dept",       title="Public Works Engineer")

score, why = compare(r1, r2, explain=True)

records = [
    {"name": "Alice", "email": "alice@example.com"},
    {"name": "Bob",   "email": "bob@example.com"},
]
results = bulk(records, workers=4)

Access Patterns

Dicts: result.title["canonical"], result.department["canonical"], result.department["category"]
Properties: name_standardized, title_canonical, department_canonical, email_standardized, phone_standardized, address_canonical, organization_canonical
Full dicts: result.title, result.department, result.email, etc.

Recommended Properties

Names: name_standardized, name_first, name_last, name_middle, name_suffix, name_gender, name_nickname
Name extras: name_salutation (Mr./Ms./Mx.)
Emails: email_standardized, email_domain, email_is_valid, email_is_generic_inbox, email_is_free_provider
Phones: phone_standardized, phone_e164, phone_pretty, phone_extension, phone_is_valid, phone_type, phone_location, phone_time_zones
Departments: department_canonical, department_category, department_normalized, department_override
Titles: title_canonical, title_raw, title_normalized, title_is_valid, title_confidence, title_seniority
Addresses: address_canonical, address_raw, address_street, address_unit, address_city, address_state, address_zip, address_country
Organizations: organization_raw, organization_normalized, organization_canonical, organization_confidence

Use result.get("email.is_valid") to fetch nested dict values via dot paths.
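For readers unfamiliar with dot-path lookups, this is roughly what such an accessor does on a nested dict. The sketch is standalone and only illustrates the technique; get_path is a hypothetical function, not HumanMint's result.get.

```python
from typing import Any

def get_path(data: dict, path: str, default: Any = None) -> Any:
    """Walk a nested dict along a dot-separated path such as 'email.is_valid'."""
    current: Any = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default  # path breaks off: fall back to the default
        current = current[key]
    return current

record = {"email": {"normalized": "jane.doe@city.gov", "is_valid": True}}
print(get_path(record, "email.is_valid"))
print(get_path(record, "phone.e164", default="n/a"))
```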

Comparing Records

from humanmint import compare
score, reasons = compare(r1, r2, explain=True)  # score ranges from 0 to 100

Batch & Export

from humanmint import bulk, export_json, export_csv, export_parquet, export_sql

# Process records in parallel
results = bulk(records, workers=4, progress=True)

# Export results to various formats
export_json(results, "out.json")
export_csv(results, "out.csv", flatten=True)

# Note: For per-record overrides (dept_overrides, title_overrides), include them in each record dict
records_with_overrides = [
    {**rec, "dept_overrides": {"IT": "Information Technology"}}
    for rec in records
]
results = bulk(records_with_overrides, workers=4)

CLI

humanmint clean input.csv output.csv --name-col name --email-col email --phone-col phone --dept-col department --title-col title

Performance (current)

Cold import: ~0.5 s (with pandas installed).
First call warm-up: ~0.5 s (loads caches).
Bulk: process-based parallelism; throughput scales with cores and workload size.
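To check the import figure on your own machine, a small timing helper along these lines may be useful. It is a generic sketch using only the standard library; time_import is a hypothetical name, and the measurement only reflects a cold import when run in a fresh interpreter.

```python
import importlib
import sys
import time
from typing import Tuple

def time_import(module_name: str) -> Tuple[float, bool]:
    """Return (seconds taken to import module_name, whether the import was cold)."""
    was_cold = module_name not in sys.modules  # already-loaded modules import almost instantly
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start, was_cold

elapsed, was_cold = time_import("json")  # substitute "humanmint" once it is installed
print(f"import took {elapsed:.4f} s (cold: {was_cold})")
```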

Notes

US-focused address parsing; usaddress used when available, otherwise heuristics.
Optional deps (pandas, pyarrow, sqlalchemy, rich, tqdm) enhance exports and progress bars.
Department and title datasets are curated and updated regularly for best accuracy.

Release history

Version	Released
2.0.1 (this version)	Dec 3, 2025
2.0.1b0 (pre-release)	Dec 3, 2025
2.0.0	Dec 1, 2025
2.0.0b3 (pre-release)	Dec 1, 2025
2.0.0b2 (pre-release)	Dec 1, 2025
2.0.0b1 (pre-release)	Dec 1, 2025
0.1.17	Dec 1, 2025
0.1.14	Nov 28, 2025
0.1.13	Nov 28, 2025
0.1.12	Nov 28, 2025
0.1.11	Nov 28, 2025
0.1.10	Nov 28, 2025
0.1.8	Nov 28, 2025
0.1.7	Nov 28, 2025
0.1.6	Nov 28, 2025
0.1.5	Nov 28, 2025
0.1.4	Nov 28, 2025
0.1.3	Nov 28, 2025
0.1.2	Nov 28, 2025
0.1.1	Nov 28, 2025

Download files

Source Distribution

humanmint-2.0.1.tar.gz (1.9 MB)

Uploaded Dec 3, 2025 Source

Built Distribution

humanmint-2.0.1-py3-none-any.whl (2.0 MB)

Uploaded Dec 3, 2025 Python 3

File details

Details for the file humanmint-2.0.1.tar.gz.

File metadata

Download URL: humanmint-2.0.1.tar.gz
Upload date: Dec 3, 2025
Size: 1.9 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.9

File hashes

Hashes for humanmint-2.0.1.tar.gz
Algorithm	Hash digest
SHA256	`51931e970d722ed8c278e346f4b07827397fdfac5d0b31cc5bcf18d1533a21ef`
MD5	`c9fa26c8b26398ea8db01a862d3c3c65`
BLAKE2b-256	`dcac3236f0622a55f0c53067befd30e69b6bee940b196a9e015547706a2c0fa9`

File details

Details for the file humanmint-2.0.1-py3-none-any.whl.

File metadata

Download URL: humanmint-2.0.1-py3-none-any.whl
Upload date: Dec 3, 2025
Size: 2.0 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.9

File hashes

Hashes for humanmint-2.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`05504f2567671fb3624eb4df8d4c0d16adf791829949cb60ec4ab9890d8d8aa8`
MD5	`b0492afc1ffdce7ba9d4ce8cdd00209b`
BLAKE2b-256	`876e9ea2129224cfa497c5616c308f46b7c9bf327084eed431937524baa7d194`
