Validate tax identifiers and resolve their metadata with Pydantic

tax-identifiers

Country-aware tax identifier validation, normalization, and metadata resolution for Pydantic models.

Installation

pip install tax-identifiers

Quick Start

Construct a TaxValidator for a country and validate an identifier: the validator normalizes the value, applies that country's structural rules, and resolves any metadata. Currently, only the US has dedicated validation rules. Validators for other named countries raise NotImplementedError, and Country.UNKNOWN accepts any non-empty identifier.

from tax_identifiers import TaxValidator, Country, TaxIdentifierType

validator = TaxValidator(Country.US)
result = validator.validate("123-45-6789", TaxIdentifierType.SSN)

result.valid                   # True — passes the SSN reserved-range checks
result.country                 # Country.US
result.tax_id_type             # TaxIdentifierType.SSN
result.metadata.issued_state   # a USState enum — e.g. USState.NEW_YORK ("NY")
result.metadata.issued_years   # e.g. "1936-1950"

TaxValidationResult omits the raw identifier, so it is safe to log or return from an API.

Resolving Countries

Country.from_string normalizes codes and names — "US", "us", "United States", and "USA" all resolve to Country.US — so a validator can be built straight from a stored country string:

validator = TaxValidator(Country.from_string(row.country))   # ISO code or full name

A named country without dedicated rules can't assert validity — its validator raises NotImplementedError:

TaxValidator(Country.from_string("France")).validate(
    "FR1234567", TaxIdentifierType.FOREIGN_TIN
)   # raises NotImplementedError — no validation rules for France

Country.UNKNOWN is the country-agnostic exception: it accepts any non-empty identifier, so foreign identifiers of any shape validate against it.

An unrecognized country string raises UnknownCountryError:

Country.from_string("Atlantis")   # raises UnknownCountryError
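
The lookup behavior described above can be pictured as a case-insensitive alias table. The following is a minimal standalone sketch, not the library's implementation: the Country and UnknownCountryError classes here are local stand-ins, and the alias table and the country_from_string name are assumptions made for illustration.

```python
from enum import Enum


class Country(Enum):
    """Stand-in enum for this sketch; not the library's Country."""

    US = "US"
    FR = "FR"


class UnknownCountryError(ValueError):
    """Stand-in error raised when no alias matches."""


# Hypothetical alias table: keys are case-folded codes and names.
_ALIASES = {
    "us": Country.US,
    "usa": Country.US,
    "united states": Country.US,
    "fr": Country.FR,
    "france": Country.FR,
}


def country_from_string(value: str) -> Country:
    """Resolve a code or name, ignoring case and extra whitespace."""
    key = " ".join(value.split()).casefold()
    try:
        return _ALIASES[key]
    except KeyError:
        raise UnknownCountryError(f"unrecognized country: {value!r}") from None
```

Raising on an unrecognized string, rather than silently mapping it to a catch-all value, keeps typos in stored country data from being treated as valid input.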

Error Handling

validate raises on malformed or unsupported input. A parseable-but-reserved identifier is not an error — it comes back with valid=False:

from tax_identifiers import (
    Country,
    InvalidTaxIdError,
    TaxIdentifierType,
    TaxValidator,
    UnsupportedTaxIdTypeError,
)

validator.validate("666-12-3456", TaxIdentifierType.SSN).valid          # False — 666 is a reserved area
validator.validate("123-45-67890", TaxIdentifierType.SSN)               # raises InvalidTaxIdError — 10 digits
TaxValidator(Country.US).validate("X1", TaxIdentifierType.FOREIGN_TIN)  # raises UnsupportedTaxIdTypeError
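
To make the difference between "malformed" and "reserved" concrete, here is a standalone sketch of an SSN check that raises when the input cannot be parsed and returns valid=False when it parses but falls in a range that is never assigned (area 000, 666, or 900-999, group 00, serial 0000). It does not use the library, and the names check_ssn, SsnCheck, and MalformedSsnError are hypothetical. The library's actual rules may be stricter or differ in detail.

```python
import re
from dataclasses import dataclass


class MalformedSsnError(ValueError):
    """Stand-in for a 'cannot be parsed' error; hypothetical name."""


@dataclass(frozen=True)
class SsnCheck:
    valid: bool
    reason: str = ""


def check_ssn(raw: str) -> SsnCheck:
    """Raise on unparseable input; return valid=False for reserved values."""
    digits = re.sub(r"[\s-]", "", raw)  # drop whitespace and hyphens only
    if not re.fullmatch(r"[0-9]{9}", digits):
        raise MalformedSsnError("an SSN must contain exactly 9 digits")

    area, group, serial = digits[:3], digits[3:5], digits[5:]
    # Fixed-width digit strings compare correctly as text.
    if area in ("000", "666") or area >= "900":
        return SsnCheck(False, f"area {area} is never assigned")
    if group == "00":
        return SsnCheck(False, "group 00 is never assigned")
    if serial == "0000":
        return SsnCheck(False, "serial 0000 is never assigned")
    return SsnCheck(True)
```

Keeping the two outcomes separate lets callers treat a reserved number as a normal business result while still surfacing bad data entry as an exception.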

TaxValidationResult.from_tax_identifier returns None for missing or malformed input instead of raising:

from tax_identifiers import Country, TaxIdentifierType, TaxValidationResult

summary = TaxValidationResult.from_tax_identifier(
    country=Country.US, tax_id="12-3456789", tax_id_type=TaxIdentifierType.EIN
)
summary.valid   # True
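
The "return None instead of raising" pattern can also be applied around any raising call. The helper below is a generic standalone sketch: none_on_error is a hypothetical name, not part of the library, and it defaults to catching ValueError only.

```python
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def none_on_error(
    func: Callable[[], T],
    exceptions: Tuple[Type[BaseException], ...] = (ValueError,),
) -> Optional[T]:
    """Call func() and return its result, or None if it raises one of `exceptions`."""
    try:
        return func()
    except exceptions:
        return None


parsed = none_on_error(lambda: int("12"))       # 12
missing = none_on_error(lambda: int("twelve"))  # None
```

Exceptions that are not listed still propagate, so only the failure modes you name are turned into None. Callers must then check the result for None before reading attributes such as valid.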

Normalization Utilities

Reusable normalization helpers and annotated Pydantic field types you can drop into your own models:

from tax_identifiers import clean_us_tax_identifier, format_us_ssn, format_us_ein, ComparableUsTaxIdentifier

clean_us_tax_identifier(" 123-45-6789 ")                  # "123456789"
format_us_ssn("123456789")                                # "123-45-6789"
format_us_ein("123456789")                                # "12-3456789"
ComparableUsTaxIdentifier("123-45-6789") == "123456789"   # True — equality ignores formatting
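
For readers who want to see what this kind of normalization involves, here is a standalone sketch that reproduces the outputs shown above using only the standard library. The names clean_digits, as_ssn, as_ein, and FormatInsensitiveId are hypothetical stand-ins, and the sketch assumes a digits-only cleaning rule; the library's helpers may handle more cases.

```python
import re


def clean_digits(value: str) -> str:
    """Keep only ASCII digits, dropping spaces, hyphens, and anything else."""
    return re.sub(r"[^0-9]", "", value)


def _nine_digits(value: str) -> str:
    digits = clean_digits(value)
    if len(digits) != 9:
        raise ValueError("expected exactly 9 digits")
    return digits


def as_ssn(value: str) -> str:
    """Format nine digits as AAA-GG-SSSS."""
    d = _nine_digits(value)
    return f"{d[:3]}-{d[3:5]}-{d[5:]}"


def as_ein(value: str) -> str:
    """Format nine digits as PP-SSSSSSS."""
    d = _nine_digits(value)
    return f"{d[:2]}-{d[2:]}"


class FormatInsensitiveId:
    """Compares equal to any string or instance with the same digits."""

    def __init__(self, value: str) -> None:
        self.raw = value
        self._digits = clean_digits(value)

    def __eq__(self, other: object) -> bool:
        if isinstance(other, FormatInsensitiveId):
            return self._digits == other._digits
        if isinstance(other, str):
            return self._digits == clean_digits(other)
        return NotImplemented

    def __hash__(self) -> int:
        return hash(self._digits)
```

Hashing on the cleaned digits keeps equality and hashing consistent, so differently formatted copies of the same identifier collapse to one entry in a set or dict.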

NormalizedString and StringBool are configurable annotated field types:

from tax_identifiers import BaseModel, NormalizedString, StringBool


def is_yes(value: str) -> bool:
    return value.strip().upper() in {"YES", "Y"}


class IntakeForm(BaseModel):
    business_name: NormalizedString(normalize_to_uppercase=True)
    consented: StringBool(predicate=is_yes)


form = IntakeForm(business_name="  acme llc ", consented="yes")
form.business_name   # "ACME LLC"
form.consented       # True

NormalizedString collapses internal and edge whitespace first, then applies any of:

Option                       Effect
normalize_to_uppercase       Uppercase the value
normalize_to_lowercase       Lowercase the value
normalize_to_titlecase       Title-case the value
strip_non_digits             Remove every non-digit character
strip_trailing_punctuation   Drop trailing . and , from each token
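
The options above can be read as a small pipeline: collapse whitespace, then apply each enabled step. The function below is a standalone sketch of that idea, not the library's code. The normalize_string name, its shortened keyword names, and the order in which the optional steps run are all assumptions.

```python
import re


def normalize_string(
    value: str,
    *,
    uppercase: bool = False,
    lowercase: bool = False,
    titlecase: bool = False,
    strip_non_digits: bool = False,
    strip_trailing_punctuation: bool = False,
) -> str:
    """Collapse whitespace first, then apply the enabled options."""
    # split() with no arguments trims the edges and collapses internal runs.
    tokens = value.split()
    if strip_trailing_punctuation:
        tokens = [token.rstrip(".,") for token in tokens]
        tokens = [token for token in tokens if token]  # drop emptied tokens
    text = " ".join(tokens)

    if strip_non_digits:
        text = re.sub(r"[^0-9]", "", text)

    # Case options are treated as mutually exclusive in this sketch.
    if uppercase:
        text = text.upper()
    elif lowercase:
        text = text.lower()
    elif titlecase:
        text = text.title()
    return text
```

For example, normalize_string("  acme   llc ", uppercase=True) gives "ACME LLC", matching the IntakeForm example above.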

Masking Tax Identifiers

A TaxIdField carries a country and identifier type and normalizes on construction. Mix in TaxIdentifierPairMixin to mask the value while keeping the original recoverable:

from tax_identifiers import BaseModel, Country, TaxIdentifierPairMixin, TaxIdentifierType, TaxIdField


class ContractorTaxInfo(TaxIdentifierPairMixin, BaseModel):
    name: str
    tax_id: TaxIdField(country=Country.US, tax_id_type=TaxIdentifierType.SSN)


record = ContractorTaxInfo(name="Jane Doe", tax_id="123-45-6789")
record.tax_id == "123456789"   # normalized on construction — equality ignores formatting

masked = record.to_masked()
masked.tax_id                  # "*******6789"
masked.to_unmask().tax_id      # "123-45-6789" — original recovered
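
The masked output above keeps the last four characters and replaces everything before them, hyphens included, with asterisks. Here is a standalone sketch of that behavior with a recoverable original. The names mask_keep_last, MaskedValue, and mask are hypothetical, and how the library stores the original value is not shown in this document.

```python
from dataclasses import dataclass, field


def mask_keep_last(value: str, visible: int = 4, fill: str = "*") -> str:
    """Replace all but the last `visible` characters with `fill`.

    Values no longer than `visible` are returned unchanged in this sketch.
    """
    hidden = max(len(value) - max(visible, 0), 0)
    return fill * hidden + value[hidden:]


@dataclass(frozen=True)
class MaskedValue:
    """Pairs a masked string with its original, kept out of repr()."""

    masked: str
    _original: str = field(repr=False)

    def unmask(self) -> str:
        return self._original


def mask(value: str) -> MaskedValue:
    return MaskedValue(masked=mask_keep_last(value), _original=value)


pair = mask("123-45-6789")
pair.masked    # "*******6789"
pair.unmask()  # "123-45-6789"
```

Excluding the original from repr() means that printing or logging the pair does not expose the full identifier by accident.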

TaxIdField defaults to Country.UNKNOWN, a country-agnostic field that normalizes (uppercases) the value but never validates it. Pass a specific country, such as country=Country.US, to apply that country's rules.

Run Tests

pip install -e ".[dev]"
pytest -v

Download files

Source Distribution

tax_identifiers-0.0.2.tar.gz (86.6 kB)

Uploaded Source

Built Distribution

tax_identifiers-0.0.2-py3-none-any.whl (97.0 kB)

Uploaded Python 3

File details

Details for the file tax_identifiers-0.0.2.tar.gz.

File metadata

  • Download URL: tax_identifiers-0.0.2.tar.gz
  • Size: 86.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tax_identifiers-0.0.2.tar.gz
Algorithm Hash digest
SHA256 d34de37bcd52396c3d091a853cbc596c9c036c96d94a5a7ce354049f05e10bee
MD5 6be9144934c8413dc26acb15f71dba23
BLAKE2b-256 51f9a97259671019c40a95113d9678620f119ec74a14fd62d2f2a355b70fdac2

Provenance

The following attestation bundles were made for tax_identifiers-0.0.2.tar.gz:

Publisher: publish-to-pypi.yml on julien777z/tax-identifiers

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file tax_identifiers-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: tax_identifiers-0.0.2-py3-none-any.whl
  • Size: 97.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tax_identifiers-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 3740fbecba8405fcef0d2a071678e2d4356073c86e7a53e1cd48bba5596324be
MD5 9911088ddeafbb3e0bf2670f9a35ab60
BLAKE2b-256 01db3b521fa7ecf08b476ada7759f983e5712d0fc7b40741c0b6f660bdc37043

Provenance

The following attestation bundles were made for tax_identifiers-0.0.2-py3-none-any.whl:

Publisher: publish-to-pypi.yml on julien777z/tax-identifiers

