Validate tax identifiers and resolve their metadata with Pydantic

tax-identifiers

Country-aware tax identifier validation, normalization, and metadata resolution for Pydantic models.

Installation

pip install tax-identifiers

Quick Start

Construct a TaxValidator for a country and validate an identifier: the validator normalizes the value, applies that country's structural rules, and resolves any metadata. Currently, only the US has dedicated validation rules. Validating against any other named country raises NotImplementedError, while Country.UNKNOWN accepts any non-empty identifier.

from tax_identifiers import TaxValidator, Country, TaxIdentifierType

validator = TaxValidator(Country.US)
result = validator.validate("123-45-6789", TaxIdentifierType.SSN)

result.valid                   # True — passes the SSN reserved-range checks
result.country                 # Country.US
result.tax_id_type             # TaxIdentifierType.SSN
result.metadata.issued_state   # a USState enum — e.g. USState.NEW_YORK ("NY")
result.metadata.issued_years   # e.g. "1936-1950"

TaxValidationResult omits the raw identifier, so it is safe to log or return from an API.

Resolving Countries

Country.from_string normalizes codes and names — "US", "us", "United States", and "USA" all resolve to Country.US — so a validator can be built straight from a stored country string:

validator = TaxValidator(Country.from_string(row.country))   # ISO code or full name

A named country without dedicated rules can't assert validity — its validator raises NotImplementedError:

TaxValidator(Country.from_string("France")).validate(
    "FR1234567", TaxIdentifierType.FOREIGN_TIN
)   # raises NotImplementedError — no validation rules for France

Country.UNKNOWN is the country-agnostic exception: it accepts any non-empty identifier, so foreign identifiers of any shape validate against it.

An unrecognized country string raises UnknownCountryError:

Country.from_string("Atlantis")   # raises UnknownCountryError
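
The lookup described above can be approximated with a case-folded alias table. This is a minimal standalone sketch, not the package's implementation: the Country and UnknownCountryError classes below are hypothetical stand-ins defined locally, resolve_country is an illustrative name, and the alias entries are assumptions.

```python
from enum import Enum


class Country(Enum):
    """Hypothetical stand-in for the package's Country enum."""

    US = "US"
    FR = "FR"


class UnknownCountryError(ValueError):
    """Hypothetical stand-in: raised when a string matches no known country."""


# Alias table keyed by case-folded input; the entries are illustrative only.
_ALIASES = {
    "us": Country.US,
    "usa": Country.US,
    "united states": Country.US,
    "fr": Country.FR,
    "france": Country.FR,
}


def resolve_country(raw: str) -> Country:
    """Map an ISO code or country name to a Country member."""
    # Trim the edges, collapse inner whitespace, and ignore case.
    key = " ".join(raw.split()).casefold()
    try:
        return _ALIASES[key]
    except KeyError:
        raise UnknownCountryError(f"unrecognized country: {raw!r}") from None
```

Raising on an unknown string, rather than silently falling back to a catch-all member, keeps a typo in stored data from being treated as a deliberately country-agnostic identifier.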

Error Handling

validate raises on malformed or unsupported input. A parseable-but-reserved identifier is not an error — it comes back with valid=False:

from tax_identifiers import InvalidTaxIdError, UnsupportedTaxIdTypeError

validator.validate("666-12-3456", TaxIdentifierType.SSN).valid          # False — 666 is a reserved area
validator.validate("123-45-67890", TaxIdentifierType.SSN)               # raises InvalidTaxIdError — 10 digits
TaxValidator(Country.US).validate("X1", TaxIdentifierType.FOREIGN_TIN)  # raises UnsupportedTaxIdTypeError
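
The split between raising and returning valid=False can be illustrated without the package. The following is a minimal standalone sketch, not the package's code: check_ssn is a hypothetical name, ValueError stands in for InvalidTaxIdError, and the reserved ranges are the publicly documented SSN rules (area 000, 666, or 900-999; group 00; serial 0000), which are assumed to match what the validator checks.

```python
import re


def check_ssn(raw: str) -> bool:
    """Return True or False for a well-formed SSN; raise ValueError if malformed."""
    digits = re.sub(r"[\s-]", "", raw)  # drop spaces and hyphens only
    if not re.fullmatch(r"[0-9]{9}", digits):
        # Malformed input cannot be judged at all, so it is an error.
        raise ValueError(f"expected 9 digits, got {raw!r}")

    area, group, serial = digits[:3], digits[3:5], digits[5:]
    # Reserved values are well formed but never issued, so they report False.
    if area in {"000", "666"} or area >= "900":
        return False
    if group == "00" or serial == "0000":
        return False
    return True
```

Under these assumptions, check_ssn("666-12-3456") returns False, while check_ssn("123-45-67890") raises because ten digits cannot be parsed as an SSN.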

TaxValidationResult.from_tax_identifier returns None for missing or malformed input instead of raising:

from tax_identifiers import TaxValidationResult

summary = TaxValidationResult.from_tax_identifier(
    country=Country.US, tax_id="12-3456789", tax_id_type=TaxIdentifierType.EIN
)
summary.valid   # True
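
A non-raising entry point of this kind is typically a thin wrapper around the raising one. The sketch below shows that pattern in isolation, assuming nothing about how from_tax_identifier is actually written: none_on_error and parse_ein are hypothetical names, and ValueError stands in for the package's exceptions.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def none_on_error(check: Callable[[str], T], raw: Optional[str]) -> Optional[T]:
    """Run `check` on `raw`; return None for missing or malformed input."""
    if raw is None or not raw.strip():
        return None  # missing input
    try:
        return check(raw)
    except ValueError:
        return None  # malformed input


def parse_ein(raw: str) -> str:
    """Hypothetical checker: return the 9 EIN digits or raise ValueError."""
    digits = raw.strip().replace("-", "")
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected 9 digits, got {raw!r}")
    return digits
```

Because the result may be None, callers of such a wrapper need to check for None before reading attributes such as valid.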

Normalization Utilities

The package also exports reusable normalization helpers and annotated Pydantic field types that you can drop into your own models:

from tax_identifiers import clean_us_tax_identifier, format_us_ssn, format_us_ein, ComparableUsTaxIdentifier

clean_us_tax_identifier(" 123-45-6789 ")                  # "123456789"
format_us_ssn("123456789")                                # "123-45-6789"
format_us_ein("123456789")                                # "12-3456789"
ComparableUsTaxIdentifier("123-45-6789") == "123456789"   # True — equality ignores formatting
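
For readers who want to see what these helpers amount to, here is a minimal standalone sketch of the same behavior. It is not the package's source: clean_digits, as_ssn, as_ein, and ComparableId are hypothetical names, and the assumption is that cleaning means dropping every non-digit character.

```python
import re


def clean_digits(raw: str) -> str:
    """Strip everything except ASCII digits."""
    return re.sub(r"[^0-9]", "", raw)


def as_ssn(digits: str) -> str:
    """Format 9 digits as AAA-GG-SSSS."""
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def as_ein(digits: str) -> str:
    """Format 9 digits as PP-SSSSSSS."""
    return f"{digits[:2]}-{digits[2:]}"


class ComparableId:
    """Wrapper whose equality and hash ignore formatting characters."""

    def __init__(self, raw: str) -> None:
        self.raw = raw
        self.digits = clean_digits(raw)

    def __eq__(self, other: object) -> bool:
        if isinstance(other, ComparableId):
            return self.digits == other.digits
        if isinstance(other, str):
            return self.digits == clean_digits(other)
        return NotImplemented

    def __hash__(self) -> int:
        # Hash the cleaned digits so equal identifiers hash alike.
        return hash(self.digits)
```

Hashing the cleaned digits keeps differently formatted copies of the same identifier from appearing as separate set members or dictionary keys.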

NormalizedString and StringBool are configurable annotated field types:

from tax_identifiers import BaseModel, NormalizedString, StringBool


def is_yes(value: str) -> bool:
    return value.strip().upper() in {"YES", "Y"}


class IntakeForm(BaseModel):
    business_name: NormalizedString(normalize_to_uppercase=True)
    consented: StringBool(predicate=is_yes)


form = IntakeForm(business_name="  acme llc ", consented="yes")
form.business_name   # "ACME LLC"
form.consented       # True

NormalizedString collapses internal and edge whitespace first, then applies any of:

Option                      Effect
normalize_to_uppercase      Uppercase the value
normalize_to_lowercase      Lowercase the value
normalize_to_titlecase      Title-case the value
strip_non_digits            Remove every non-digit character
strip_trailing_punctuation  Drop trailing . and , from each token
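
The pipeline in the table can be sketched as a plain function. This is an illustrative sketch only: normalize_string and its keyword names are hypothetical, and the order in which the options are applied (punctuation, then digits, then case) and the precedence among the case options are assumptions, since the README specifies only that whitespace is collapsed first.

```python
import re


def normalize_string(
    value: str,
    *,
    uppercase: bool = False,
    lowercase: bool = False,
    titlecase: bool = False,
    strip_non_digits: bool = False,
    strip_trailing_punctuation: bool = False,
) -> str:
    """Collapse whitespace, then apply the selected transformations."""
    # split() with no argument trims the edges and collapses inner runs.
    tokens = value.split()
    if strip_trailing_punctuation:
        # Drop trailing "." and "," per token; discard tokens left empty.
        tokens = [t for t in (tok.rstrip(".,") for tok in tokens) if t]
    result = " ".join(tokens)
    if strip_non_digits:
        result = re.sub(r"[^0-9]", "", result)
    # Assumed precedence when several case options are set: upper, lower, title.
    if uppercase:
        result = result.upper()
    elif lowercase:
        result = result.lower()
    elif titlecase:
        result = result.title()
    return result
```

Collapsing whitespace before anything else means the later steps always see single-space-separated tokens, which keeps per-token rules such as trailing-punctuation stripping simple.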

Masking Tax Identifiers

A TaxIdField carries a country and identifier type and normalizes on construction. Mix in TaxIdentifierPairMixin to mask the value while keeping the original recoverable:

from tax_identifiers import BaseModel, Country, TaxIdentifierPairMixin, TaxIdentifierType, TaxIdField


class ContractorTaxInfo(TaxIdentifierPairMixin, BaseModel):
    name: str
    tax_id: TaxIdField(country=Country.US, tax_id_type=TaxIdentifierType.SSN)


record = ContractorTaxInfo(name="Jane Doe", tax_id="123-45-6789")
record.tax_id == "123456789"   # True: normalized on construction, and equality ignores formatting

masked = record.to_masked()
masked.tax_id                  # "*******6789"
masked.to_unmask().tax_id      # "123-45-6789" — original recovered

TaxIdField defaults to Country.UNKNOWN, a country-agnostic field that normalizes (uppercases) the value but applies no country-specific validation. Pass country=Country.US to apply a country's rules.
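
The masked output shown above is consistent with replacing every character except the last four. A minimal standalone sketch of that rule follows, with the caveat that mask_keep_last and MaskedPair are hypothetical names and the package's mixin may work differently.

```python
from dataclasses import dataclass


def mask_keep_last(value: str, visible: int = 4, fill: str = "*") -> str:
    """Replace all but the last `visible` characters with `fill`."""
    visible = max(visible, 0)
    hidden = max(len(value) - visible, 0)
    # Values no longer than `visible` are returned unchanged.
    return fill * hidden + value[hidden:]


@dataclass(frozen=True)
class MaskedPair:
    """Hypothetical holder that pairs a masked view with its original."""

    original: str

    @property
    def masked(self) -> str:
        return mask_keep_last(self.original)
```

Note that any object able to recover the original still holds the sensitive value, so in a design like this one only the masked string, not the pair, would be suitable for logs or API responses.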

Run Tests

pip install -e ".[dev]"
pytest -v

Source Distribution

tax_identifiers-0.0.1.tar.gz (86.5 kB)

Uploaded Source

Built Distribution

tax_identifiers-0.0.1-py3-none-any.whl (96.8 kB)

Uploaded Python 3

File details

Details for the file tax_identifiers-0.0.1.tar.gz.

File metadata

  • Download URL: tax_identifiers-0.0.1.tar.gz
  • Upload date:
  • Size: 86.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tax_identifiers-0.0.1.tar.gz
Algorithm Hash digest
SHA256 28b9f434fa5a65b89084bbb5676b424f1730007a26e920358f146495e0d86901
MD5 c7ebb6ca29aebe27118994d93ec540e8
BLAKE2b-256 8e378be74cd8985e1f9e13fa51083cabf9cb2f1324038cc5217f2900f9e08f4d

Provenance

The following attestation bundles were made for tax_identifiers-0.0.1.tar.gz:

Publisher: publish-to-pypi.yml on julien777z/tax-identifiers

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file tax_identifiers-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: tax_identifiers-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 96.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tax_identifiers-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 5274bac60727862bf8c6093edb28333d4ba7fd7138e2c5413148ffc421d9e1fc
MD5 39d2f97018785d48d8093435c6ff7b93
BLAKE2b-256 b7fa4705bf5fcd8c6db2254510d68c225d2ca1e17809b368a2c71e025b48927e

Provenance

The following attestation bundles were made for tax_identifiers-0.0.1-py3-none-any.whl:

Publisher: publish-to-pypi.yml on julien777z/tax-identifiers
