Address validation helpers for Google's i18n address database

Google i18n address

This package bundles a copy of Google's i18n address metadata repository. The upstream repository contains great data but comes with no uptime guarantees.

The contents of this package allow you to programmatically build address forms that adhere to the rules of a particular region or country, validate local addresses, and format them to produce a valid address label for delivery.

The package also contains a Python interface for address validation.

Address validation

The normalize_address function checks the address and either returns its canonical form (suitable for storage and use in addressing envelopes) or raises an InvalidAddressError exception that contains a list of errors.

Address fields

Here is the list of recognized fields:

  • country_code is a two-letter ISO 3166-1 country code
  • country_area is a designation of a region, province, or state. Recognized values include official names, designated abbreviations, official translations, and Latin transliterations
  • city is a city or town name. Recognized values include official names, official translations, and Latin transliterations
  • city_area is a sublocality like a district. Recognized values include official names, official translations, and Latin transliterations
  • street_address is the (possibly multiline) street address
  • postal_code is a postal code or zip code
  • sorting_code is a sorting code
  • name is a person's name
  • company_name is a name of a company or organization

Errors

Address validation with only country code:

from i18naddress import InvalidAddressError, normalize_address

try:
    address = normalize_address({'country_code': 'US'})
except InvalidAddressError as e:
    print(e.errors)

Output:

{'city': 'required',
 'country_area': 'required',
 'postal_code': 'required',
 'street_address': 'required'}

With correct address:

from i18naddress import normalize_address

address = normalize_address({
    'country_code': 'US',
    'country_area': 'California',
    'city': 'Mountain View',
    'postal_code': '94043',
    'street_address': '1600 Amphitheatre Pkwy'
})
print(address)

Output:

{'city': 'MOUNTAIN VIEW',
 'city_area': '',
 'country_area': 'CA',
 'country_code': 'US',
 'postal_code': '94043',
 'sorting_code': '',
 'street_address': '1600 Amphitheatre Pkwy'}

Postal code/zip code validation example:

from i18naddress import InvalidAddressError, normalize_address

try:
    address = normalize_address({
        'country_code': 'US',
        'country_area': 'California',
        'city': 'Mountain View',
        'postal_code': '74043',
        'street_address': '1600 Amphitheatre Pkwy'
    })
except InvalidAddressError as e:
    print(e.errors)

Output:

{'postal_code': 'invalid'}
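
The 'invalid' result can be reproduced with the two postal_code_matchers patterns that appear in the ValidationRules output in the Validation rules section. The following is a minimal sketch, assuming a postal code is accepted only when every pattern matches. The matches_all helper is hypothetical and is not part of i18naddress.

```python
import re

# Patterns copied from the ValidationRules output for US / CA.
postal_code_matchers = [
    re.compile(r'^(\d{5})(?:[ \-](\d{4}))?$'),  # general US ZIP or ZIP+4 shape
    re.compile(r'^9[0-5]|96[01]'),              # prefix pattern listed for CA
]


def matches_all(postal_code, matchers):
    """Return True if the code satisfies every pattern (assumed semantics)."""
    return all(m.match(postal_code) for m in matchers)


print(matches_all('94043', postal_code_matchers))
print(matches_all('74043', postal_code_matchers))
```

The code 74043 has the right shape for a US ZIP code but fails the California prefix pattern, which is consistent with the error shown above.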

Address latinization

In some cases, it may be useful to display foreign addresses in a more accessible format. You can use the latinize_address function to obtain a more verbose, Latinized version of an address.

This version is suitable for display and useful for full-text search indexing, but the normalized form is what should be stored in the database and used when printing address labels.

from i18naddress import latinize_address

address = {
    'country_code': 'CN',
    'country_area': '云南省',
    'postal_code': '677400',
    'city': '临沧市',
    'city_area': '凤庆县',
    'street_address': '中关村东路1号'
}
latinize_address(address)

Output:

{'country_code': 'CN',
 'country_area': 'Yunnan Sheng',
 'city': 'Lincang Shi',
 'city_area': 'Lincang Shi',
 'sorting_code': '',
 'postal_code': '677400',
 'street_address': '中关村东路1号'}

It also returns expanded names for area types that normally use codes and abbreviations, such as state names in the US:

from i18naddress import latinize_address

address = {
    'country_code': 'US',
    'country_area': 'CA',
    'postal_code': '94037',
    'city': 'Mountain View',
    'street_address': '1600 Charleston Rd.'
}
latinize_address(address)

Output:

{'country_code': 'US',
 'country_area': 'California',
 'city': 'Mountain View',
 'city_area': '',
 'sorting_code': '',
 'postal_code': '94037',
 'street_address': '1600 Charleston Rd.'}

Address formatting

You can use the format_address function to format the address following the destination country's post office regulations:

from i18naddress import format_address

address = {
    'country_code': 'CN',
    'country_area': '云南省',
    'postal_code': '677400',
    'city': '临沧市',
    'city_area': '凤庆县',
    'street_address': '中关村东路1号'
}
print(format_address(address))

Output:

677400
云南省临沧市凤庆县
中关村东路1号
CHINA

You can also ask for a Latin-friendly version:

from i18naddress import format_address

address = {
    'country_code': 'CN',
    'country_area': '云南省',
    'postal_code': '677400',
    'city': '临沧市',
    'city_area': '凤庆县',
    'street_address': '中关村东路1号'
}
print(format_address(address, latin=True))

Output:

中关村东路1号
凤庆县
临沧市
云南省, 677400
CHINA

Validation rules

You can use the get_validation_rules function to obtain validation data useful for constructing address forms specific to a particular country:

from i18naddress import get_validation_rules

get_validation_rules({'country_code': 'US', 'country_area': 'CA'})

Output:

ValidationRules(
    country_code='US',
    country_name='UNITED STATES',
    address_format='%N%n%O%n%A%n%C, %S %Z',
    address_latin_format='%N%n%O%n%A%n%C, %S %Z',
    allowed_fields={'street_address', 'company_name', 'city', 'name', 'country_area', 'postal_code'},
    required_fields={'street_address', 'city', 'country_area', 'postal_code'},
    upper_fields={'city', 'country_area'},
    country_area_type='state',
    country_area_choices=[('AL', 'Alabama'), ..., ('WY', 'Wyoming')],
    city_type='city',
    city_choices=[],
    city_area_type='suburb',
    city_area_choices=[],
    postal_code_type='zip',
    postal_code_matchers=[re.compile('^(\\d{5})(?:[ \\-](\\d{4}))?$'), re.compile('^9[0-5]|96[01]')],
    postal_code_examples=['90000', '96199'],
    postal_code_prefix=''
)
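
The address_format value can be read as a small template: %n separates lines and each other %-code stands for one field. The sketch below only illustrates that reading and is not the library's implementation. The letter-to-field mapping is an assumption inferred from the outputs in this document, and expand_format is a hypothetical helper. Unlike format_address, it does not append the country name.

```python
import re

# Assumed mapping of format codes to field names (inferred, not taken from the API).
FIELD_CODES = {
    'N': 'name',
    'O': 'company_name',
    'A': 'street_address',
    'C': 'city',
    'S': 'country_area',
    'Z': 'postal_code',
}


def expand_format(fmt, address):
    """Fill a format string with address values, skipping lines that end up empty."""
    def substitute(match):
        field = FIELD_CODES.get(match.group(1), '')
        return address.get(field, '')

    lines = []
    for line_fmt in fmt.split('%n'):
        # Trim separators left behind by fields that were empty.
        line = re.sub(r'%([A-Z])', substitute, line_fmt).strip(' ,')
        if line:
            lines.append(line)
    return '\n'.join(lines)


address = {
    'street_address': '1600 Amphitheatre Pkwy',
    'city': 'MOUNTAIN VIEW',
    'country_area': 'CA',
    'postal_code': '94043',
}
print(expand_format('%N%n%O%n%A%n%C, %S %Z', address))
```

Because name and company_name are missing here, the first two template lines collapse and only the street line and the city line are printed.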

All known fields

You can use the KNOWN_FIELDS set to find the fields that a country does not use and render them as hidden elements of your form:

from i18naddress import get_validation_rules, KNOWN_FIELDS

rules = get_validation_rules({'country_code': 'US'})
KNOWN_FIELDS - rules.allowed_fields

Output:

{'city_area', 'sorting_code'}
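
One way to use that result is to emit a hidden input for each unused field, so the form always submits the same set of keys. This is a minimal sketch using only the standard library. The hidden_inputs helper is hypothetical, and the set is copied from the output above rather than computed with i18naddress.

```python
from html import escape

# Copied from the output above: fields the US rules do not allow.
unused_fields = {'city_area', 'sorting_code'}


def hidden_inputs(field_names, values=None):
    """Build one hidden <input> tag per field, sorted for stable output."""
    values = values or {}
    return [
        '<input type="hidden" name="{}" value="{}">'.format(
            escape(name, quote=True),
            escape(values.get(name, ''), quote=True),
        )
        for name in sorted(field_names)
    ]


for tag in hidden_inputs(unused_fields):
    print(tag)
```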

Raw i18n data

Raw data is stored in a dict:

from i18naddress import load_validation_data

i18n_country_data = load_validation_data()
i18n_country_data['US']

Output:

{'fmt': '%N%n%O%n%A%n%C, %S %Z',
 'id': 'data/US',
 'key': 'US',
 'lang': 'en',
 'languages': 'en',
 'name': 'UNITED STATES',
 'posturl': 'https://tools.usps.com/go/ZipLookupAction!input.action',
 'require': 'ACSZ',
 'state_name_type': 'state',
 'sub_keys': 'AL~AK~AS~AZ~AR~AA~AE~AP~CA~CO~CT~DE~DC~FL~GA~GU~HI~ID~IL~IN~IA~KS~KY~LA~ME~MH~MD~MA~MI~FM~MN~MS~MO~MT~NE~NV~NH~NJ~NM~NY~NC~ND~MP~OH~OK~OR~PW~PA~PR~RI~SC~SD~TN~TX~UT~VT~VI~VA~WA~WV~WI~WY',
 'sub_names': 'Alabama~Alaska~American Samoa~Arizona~Arkansas~Armed Forces (AA)~Armed Forces (AE)~Armed Forces (AP)~California~Colorado~Connecticut~Delaware~District of Columbia~Florida~Georgia~Guam~Hawaii~Idaho~Illinois~Indiana~Iowa~Kansas~Kentucky~Louisiana~Maine~Marshall Islands~Maryland~Massachusetts~Michigan~Micronesia~Minnesota~Mississippi~Missouri~Montana~Nebraska~Nevada~New Hampshire~New Jersey~New Mexico~New York~North Carolina~North Dakota~Northern Mariana Islands~Ohio~Oklahoma~Oregon~Palau~Pennsylvania~Puerto Rico~Rhode Island~South Carolina~South Dakota~Tennessee~Texas~Utah~Vermont~Virgin Islands~Virginia~Washington~West Virginia~Wisconsin~Wyoming',
 'sub_zipexs': '35000,36999~99500,99999~96799~85000,86999~71600,72999~34000,34099~09000,09999~96200,96699~90000,96199~80000,81999~06000,06999~19700,19999~20000,20099:20200,20599:56900,56999~32000,33999:34100,34999~30000,31999:39800,39899:39901~96910,96932~96700,96798:96800,96899~83200,83999~60000,62999~46000,47999~50000,52999~66000,67999~40000,42799~70000,71599~03900,04999~96960,96979~20600,21999~01000,02799:05501:05544~48000,49999~96941,96944~55000,56799~38600,39799~63000,65999~59000,59999~68000,69999~88900,89999~03000,03899~07000,08999~87000,88499~10000,14999:06390:00501:00544~27000,28999~58000,58999~96950,96952~43000,45999~73000,74999~97000,97999~96940~15000,19699~00600,00799:00900,00999~02800,02999~29000,29999~57000,57999~37000,38599~75000,79999:88500,88599:73301:73344~84000,84999~05000,05999~00800,00899~20100,20199:22000,24699~98000,99499~24700,26999~53000,54999~82000,83199:83414',
 'sub_zips': '3[56]~99[5-9]~96799~8[56]~71[6-9]|72~340~09~96[2-6]~9[0-5]|96[01]~8[01]~06~19[7-9]~20[02-5]|569~3[23]|34[1-9]~3[01]|398|39901~969([1-2]\\d|3[12])~967[0-8]|9679[0-8]|968~83[2-9]~6[0-2]~4[67]~5[0-2]~6[67]~4[01]|42[0-7]~70|71[0-5]~039|04~969[67]~20[6-9]|21~01|02[0-7]|05501|05544~4[89]~9694[1-4]~55|56[0-7]~38[6-9]|39[0-7]~6[3-5]~59~6[89]~889|89~03[0-8]~0[78]~87|88[0-4]~1[0-4]|06390|00501|00544~2[78]~58~9695[0-2]~4[3-5]~7[34]~97~969(39|40)~1[5-8]|19[0-6]~00[679]~02[89]~29~57~37|38[0-5]~7[5-9]|885|73301|73344~84~05~008~201|2[23]|24[0-6]~98|99[0-4]~24[7-9]|2[56]~5[34]~82|83[01]|83414',
 'upper': 'CS',
 'zip': '(\\d{5})(?:[ \\-](\\d{4}))?',
 'zip_name_type': 'zip',
 'zipex': '95014,22162-1010'}
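
The raw values are compact strings: sub_keys and sub_names are parallel lists separated by ~, while require and upper are sequences of one-letter field codes. The sketch below shows one way to decode a short excerpt of the US entry using only the standard library. The FIELD_CODES mapping and both helpers are assumptions for illustration, not part of the i18naddress API. The codes A, C, S and Z can be checked against required_fields in the Validation rules section; the others are assumed.

```python
# Assumed mapping of one-letter codes to field names.
FIELD_CODES = {
    'A': 'street_address',
    'C': 'city',
    'D': 'city_area',
    'N': 'name',
    'O': 'company_name',
    'S': 'country_area',
    'X': 'sorting_code',
    'Z': 'postal_code',
}

# Shortened excerpt of the US entry shown above.
raw = {
    'require': 'ACSZ',
    'upper': 'CS',
    'sub_keys': 'AL~AK~AS',
    'sub_names': 'Alabama~Alaska~American Samoa',
}


def decode_fields(codes):
    """Translate a string of field codes into a set of field names."""
    return {FIELD_CODES[code] for code in codes}


def area_choices(data):
    """Pair each sub_keys entry with the sub_names entry at the same position."""
    return list(zip(data['sub_keys'].split('~'), data['sub_names'].split('~')))


print(sorted(decode_fields(raw['require'])))
print(sorted(decode_fields(raw['upper'])))
print(area_choices(raw))
```

The decoded values line up with the required_fields, upper_fields and country_area_choices entries in the ValidationRules output.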

Usage with Django forms

The form below returns the normalized address from its clean method, so form.cleaned_data contains only the normalized address fields and the addresses saved to the database are normalized.

from django import forms

from i18naddress import InvalidAddressError, normalize_address, get_validation_rules

class AddressForm(forms.Form):

    COUNTRY_CHOICES = [
        ('PL', 'Poland'),
        ('AE', 'United Arab Emirates'),
        ('US', 'United States of America')
    ]

    ERROR_MESSAGES = {
        'required': 'This field is required',
        'invalid': 'Enter a valid name'
    }

    name = forms.CharField(required=True)
    company_name = forms.CharField(required=False)
    street_address = forms.CharField(required=False)
    city = forms.CharField(required=False)
    city_area = forms.CharField(required=False)
    country_code = forms.ChoiceField(required=True, choices=COUNTRY_CHOICES)
    country_area = forms.CharField(required=False)
    postal_code = forms.CharField(required=False)

    def clean(self):
        clean_data = super(AddressForm, self).clean()
        validation_rules = get_validation_rules(clean_data)
        try:
            valid_address = normalize_address(clean_data)
        except InvalidAddressError as e:
            errors = e.errors
            valid_address = None
            for field, error_code in errors.items():
                if field == 'postal_code':
                    examples = validation_rules.postal_code_examples
                    msg = 'Invalid value, use format like %s' % examples
                else:
                    msg = self.ERROR_MESSAGES[error_code]
                self.add_error(field, msg)
        return valid_address or clean_data
