The universal contact field mapper: route messy, inconsistent contact data to a clean canonical schema.
Rolodexter
The universal contact field mapper.
Route messy, inconsistent contact data from any source to a clean, canonical schema.
The Problem
Every CRM, email platform, and CSV export uses different field names for the same data:
| Service | First Name | Phone | Company |
|---|---|---|---|
| HubSpot | `firstname` | `mobilephone` | `company` |
| Salesforce | `FirstName` | `MobilePhone` | `Company` |
| Mailchimp | `FNAME` | `PHONE` | `COMPANY` |
| Google CSV | `Given Name` | `Phone 1 - Value` | `Organization 1 - Name` |
| Random CSV | `Column A` | `Column B` | `Column C` |
The Solution
from rolodexter import ContactMapper
mapper = ContactMapper()
result = mapper.map_payload({
    "fname": "jane",
    "surname": "doe",
    "mobile": "+1-555-019-9876",
    "employer": "Tech Corp",
    "Column 1": "jane.doe@example.com",  # auto-detected by shape
})
print(result.normalized)
# {
#     "first_name": "Jane",
#     "last_name": "Doe",
#     "phone": "+15550199876",
#     "company": "Tech Corp",
#     "email": "jane.doe@example.com"
# }
Installation
# Core (zero dependencies)
pip install rolodexter
# With fuzzy matching for typo recovery (quotes keep shells like zsh from expanding the brackets)
pip install "rolodexter[fuzzy]"
# With on-demand i18n translation (40 languages)
pip install "rolodexter[i18n]"
# Everything
pip install "rolodexter[all]"
# Development
pip install "rolodexter[dev]"
Features
Four-Layer Matching Pipeline
Every field runs through the strategy chain in priority order:
- Service Match: instant lookup against 20+ platform-specific dictionaries
- Exact Match: O(1) hit against 300+ known aliases
- Fuzzy Match: `rapidfuzz` catches typos like `"phne_nmbr"` → `phone`
- Heuristic Match: regex detects emails, phones, URLs, postal codes by data shape
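The priority order above can be pictured as a chain of small checks tried in turn, where the first layer that produces a match wins. The sketch below is not Rolodexter's implementation: the miniature dictionaries, the `identify_sketch` function, the 0.8 cutoff, and the use of the standard library's `difflib` in place of `rapidfuzz` are all assumptions made for illustration.

```python
import difflib
import re

# Hypothetical miniature tables; the real library ships far larger ones.
SERVICE_FIELDS = {"mailchimp": {"FNAME": "first_name", "PHONE": "phone"}}
ALIASES = {"fname": "first_name", "phone": "phone", "mobile": "phone", "email": "email"}
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def identify_sketch(header, value=None, service=None):
    """Return (canonical, strategy) from the first layer that matches, else None."""
    # 1. Service match: platform-specific dictionary, tried first.
    if service and header in SERVICE_FIELDS.get(service, {}):
        return SERVICE_FIELDS[service][header], "service"

    # 2. Exact match: dictionary lookup on a lightly normalized key.
    key = header.strip().lower().replace(" ", "_")
    if key in ALIASES:
        return ALIASES[key], "exact"

    # 3. Fuzzy match: closest known alias above a similarity cutoff.
    #    difflib stands in for rapidfuzz so the sketch has no dependencies.
    close = difflib.get_close_matches(key, list(ALIASES), n=1, cutoff=0.8)
    if close:
        return ALIASES[close[0]], "fuzzy"

    # 4. Heuristic match: fall back to the shape of the value itself.
    if isinstance(value, str) and EMAIL_SHAPE.match(value.strip()):
        return "email", "heuristic"

    return None


print(identify_sketch("FNAME", service="mailchimp"))
print(identify_sketch("fname"))
print(identify_sketch("phne"))
print(identify_sketch("Column X", value="jane@test.com"))
```

Ordering the layers from most to least specific means a cheap dictionary hit short-circuits the more expensive fuzzy and regex checks.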
Confidence Scoring
Every match comes with a confidence score (0.0 to 1.0):
match = mapper.identify("fname")
# FieldMatch(original='fname', canonical='first_name', confidence=1.0, strategy='exact')
match = mapper.identify("phne")
# FieldMatch(original='phne', canonical='phone', confidence=0.85, strategy='fuzzy')
match = mapper.identify("Column X", value="jane@test.com")
# FieldMatch(original='Column X', canonical='email', confidence=0.6, strategy='heuristic')
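One way to use these scores is to accept confident matches automatically and set the rest aside for review. The sketch below assumes nothing about the library beyond the four fields visible in the reprs above: `Match` is a local stand-in dataclass, and `split_by_confidence` and its 0.8 threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """Local stand-in mirroring the fields shown in the FieldMatch reprs."""
    original: str
    canonical: str
    confidence: float
    strategy: str


def split_by_confidence(matches, threshold=0.8):
    """Partition matches into (accepted, needs_review) around a threshold."""
    accepted = [m for m in matches if m.confidence >= threshold]
    needs_review = [m for m in matches if m.confidence < threshold]
    return accepted, needs_review


matches = [
    Match("fname", "first_name", 1.0, "exact"),
    Match("phne", "phone", 0.85, "fuzzy"),
    Match("Column X", "email", 0.6, "heuristic"),
]
accepted, needs_review = split_by_confidence(matches)
print([m.original for m in accepted])
print([m.original for m in needs_review])
```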
20+ Service Profiles
Built-in mappings for:
| CRM / Sales | Email / Marketing | Productivity | Other |
|---|---|---|---|
| HubSpot | Mailchimp | Google Contacts | Stripe |
| Salesforce | SendGrid | Apple Contacts | Notion |
| Pipedrive | Brevo (Sendinblue) | Outlook | Airtable |
| Zoho | ConvertKit (Kit) | LinkedIn Export | - |
| Close CRM | ActiveCampaign | - | - |
| Freshsales | Omnisend | - | - |
| - | Beehiiv | - | - |
| - | Resend | - | - |
| - | Intercom | - | - |
On-Demand i18n (40 Languages)
English ships by default. Request any of 40 supported languages and aliases are generated on the fly via Google Translate, then cached so translation only happens once:
from rolodexter import ContactMapper
# Load Spanish aliases on demand
mapper = ContactMapper(languages=["es"])
result = mapper.map_payload({"correo_electronico": "juan@example.com"})
print(result.normalized["email"]) # juan@example.com
# CLI: generate and cache all 40 languages
python -m rolodexter.i18n
# Or specific languages
python -m rolodexter.i18n --languages es,fr,de
# List supported languages
python -m rolodexter.i18n --list
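The "translate once, then cache" behavior described above follows a familiar pattern: check for a cached file, and only call the translator on a miss. The sketch below illustrates that pattern only. `load_aliases`, the stub `fake_translate` (which makes no network call), and the one-JSON-file-per-language layout are assumptions, not the library's actual cache format.

```python
import json
import tempfile
from pathlib import Path

calls = []  # records each time the stub translator runs


def fake_translate(lang):
    """Stub standing in for a real translation call; no network involved."""
    calls.append(lang)
    return {"es": {"correo_electronico": "email"}}.get(lang, {})


def load_aliases(lang, cache_dir, translate=fake_translate):
    """Return aliases for `lang`, translating only when no cache file exists."""
    cache_file = Path(cache_dir) / f"{lang}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    aliases = translate(lang)
    cache_file.write_text(json.dumps(aliases, ensure_ascii=False), encoding="utf-8")
    return aliases


with tempfile.TemporaryDirectory() as cache_dir:
    first = load_aliases("es", cache_dir)   # cache miss: translator runs
    second = load_aliases("es", cache_dir)  # cache hit: read from disk

print(first == second, calls)
```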
Cross-Service Translation
# Translate HubSpot data directly to Salesforce schema
salesforce_data = mapper.translate(
    hubspot_payload,
    from_service="hubspot",
    to_service="salesforce",
)
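A plausible way to think about cross-service translation is as a pivot through the canonical schema: source field names map to canonical names, which then map to the target service's names. The sketch below shows that idea with field names taken from the table in "The Problem". `PROFILES` and `translate_sketch` are hypothetical, and dropping unknown fields is a choice made for this sketch, not documented library behavior.

```python
# Hypothetical service profiles: service field name -> canonical field name.
PROFILES = {
    "hubspot": {"firstname": "first_name", "mobilephone": "phone", "company": "company"},
    "salesforce": {"FirstName": "first_name", "MobilePhone": "phone", "Company": "company"},
}


def translate_sketch(payload, from_service, to_service):
    """Rename keys by pivoting through canonical names; unknown fields are dropped."""
    to_canonical = PROFILES[from_service]
    # Invert the target profile: canonical name -> target field name.
    from_canonical = {canon: name for name, canon in PROFILES[to_service].items()}
    translated = {}
    for key, value in payload.items():
        canonical = to_canonical.get(key)
        if canonical in from_canonical:
            translated[from_canonical[canonical]] = value
    return translated


hubspot_payload = {"firstname": "Jane", "mobilephone": "+15550199876", "unknown": "x"}
print(translate_sketch(hubspot_payload, "hubspot", "salesforce"))
```

Pivoting through one canonical schema keeps the number of mappings linear in the number of services, rather than requiring a table for every pair.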
Value Normalization
Automatic cleanup on matched fields:
- Phone: strips formatting, adds `+` for international
- Email: lowercase, trimmed
- Names: title case with particle awareness (`"jane van der berg"` → `"Jane van der Berg"`)
- Addresses: excess whitespace collapsed, title-cased
Batch Processing
results = mapper.map_batch([contact1, contact2, contact3, ...])
Rich Diagnostics
result = mapper.map_payload(data)
print(result.match_rate) # 0.857
print(result.matched_count) # 6
print(result.unmatched_count) # 1
print(result.to_dict()) # Full JSON-serializable report
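The sample numbers above are consistent with the match rate being the matched count divided by the total number of fields: 6 / (6 + 1) ≈ 0.857. The helper below reproduces that arithmetic. The formula is inferred from the example values rather than confirmed by the library, and `match_rate` here is a standalone hypothetical function.

```python
def match_rate(matched_count, unmatched_count):
    """Fraction of fields that were matched; 0.0 for an empty payload."""
    total = matched_count + unmatched_count
    return matched_count / total if total else 0.0


print(round(match_rate(6, 1), 3))
```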
API Reference
ContactMapper
ContactMapper(
    *,
    patterns=None,         # Custom pattern dict
    patterns_path=None,    # Path to custom patterns.json
    default_service=None,  # Default service profile
    normalize=True,        # Apply value normalization
    strategies=None,       # Override strategy pipeline
    languages=None,        # i18n: None=English only, "es", ["es","fr"], "all"
)
Methods:
| Method | Description |
|---|---|
| `identify(header, *, value, service)` | Resolve a single field header |
| `map_payload(payload, *, service)` | Normalize an entire dict |
| `map_batch(payloads, *, service)` | Process multiple payloads |
| `translate(payload, *, from_service, to_service)` | Cross-service translation |
CanonicalField
Enum of all 50+ canonical fields. Inherits from str for JSON compatibility:
from rolodexter import CanonicalField
assert CanonicalField.EMAIL == "email"
assert CanonicalField.PHONE.value == "phone"
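The practical effect of inheriting from `str` can be seen with a two-member stand-in. `Field` and `PlainField` below are hypothetical classes defined only for this comparison; they show that a `str`-based enum member serializes like a plain string, while a plain `Enum` member is rejected by `json.dumps`.

```python
import json
from enum import Enum


class Field(str, Enum):
    """Hypothetical two-member stand-in for CanonicalField."""
    EMAIL = "email"
    PHONE = "phone"


class PlainField(Enum):
    """Same value, but without the str mixin."""
    EMAIL = "email"


# str-based members compare equal to, and serialize as, plain strings.
print(Field.EMAIL == "email")
print(json.dumps({"field": Field.EMAIL}))

# A plain Enum member is not JSON-serializable.
try:
    json.dumps({"field": PlainField.EMAIL})
except TypeError as exc:
    print("plain Enum:", type(exc).__name__)
```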
Custom Patterns
custom = {
    "fields": {
        "first_name": ["fname", "given", "nombre"],
        "loyalty_tier": ["tier", "vip_level", "membership"],
    },
    "services": {
        "my_crm": {
            "contact_first": "first_name",
            "loyalty": "loyalty_tier",
        }
    }
}
mapper = ContactMapper(patterns=custom)
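The `"fields"` section lists aliases per canonical name, which is convenient to write but not to query. A mapper would plausibly invert it into an alias-to-canonical lookup before matching. The sketch below shows that inversion. `build_alias_index` is a hypothetical helper, and how the library actually merges custom patterns with its built-in ones is not shown here.

```python
def build_alias_index(patterns):
    """Invert {"canonical": [aliases]} into {"alias": "canonical"}."""
    index = {}
    for canonical, aliases in patterns.get("fields", {}).items():
        index[canonical] = canonical  # a canonical name matches itself
        for alias in aliases:
            index[alias.strip().lower()] = canonical
    return index


custom = {
    "fields": {
        "first_name": ["fname", "given", "nombre"],
        "loyalty_tier": ["tier", "vip_level", "membership"],
    }
}
index = build_alias_index(custom)
print(index["vip_level"], index["nombre"])
```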
Architecture
rolodexter/
├── __init__.py        # Public API
├── core.py            # ContactMapper, PatternRegistry, strategies, normalizers
├── _phone.py          # Built-in E.164 phone parser (zero deps)
├── i18n.py            # On-demand i18n generator (40 languages, cached)
└── _data/
    ├── patterns.json  # Master truth table (550+ aliases, 20+ services)
    └── i18n/          # Cached language files (generated on demand)
Contributing
git clone https://github.com/rolodexter/rolodexter.git
cd rolodexter
pip install -e ".[dev]"
pytest
License
MIT. See LICENSE.