A microkernel-based library for validating country-specific URN formats

International URNs

A microkernel-based Python library for validating, generating, and extracting metadata from country-specific URN (Uniform Resource Name) formats.

Overview

International URNs provides a pluggable architecture for validating, generating, and extracting metadata from country-specific URNs, where each country is identified by its ISO 3166-1 Alpha-2 code. The library uses a microkernel design: the core supplies the framework, while country-specific validators, generators, and extractors come from separate plugin packages.

URN Format: urn:country_code:document_type:document_value

Example: urn:es:dni:12345678X
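
As a rough illustration of the format above, the four parts can be separated with the standard library alone. split_urn is a hypothetical helper written for this example, not part of the library's API, and it assumes that everything after the third colon belongs to the document value.

```python
def split_urn(urn: str) -> tuple[str, str, str, str]:
    """Split a URN into (scheme, country_code, document_type, document_value).

    Hypothetical helper for illustration only.
    """
    # maxsplit=3 keeps any further colons inside the document value.
    parts = urn.split(":", 3)
    if len(parts) != 4 or parts[0].lower() != "urn":
        raise ValueError(f"Not a country URN: {urn!r}")
    scheme, country_code, document_type, document_value = parts
    return scheme, country_code, document_type, document_value


print(split_urn("urn:es:dni:12345678X"))
# ('urn', 'es', 'dni', '12345678X')
```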

Features

Microkernel Architecture: Core library provides the framework, plugins provide country-specific validation, generation, and extraction
Auto-registration: Validators, generators, and extractors automatically register themselves using Python's __init_subclass__
Entry Point Discovery: Plugins are discovered and loaded via Python entry points
ISO 3166-1 Alpha-2 Enforcement: Country codes are validated to be exactly 2 letters (or "--" for wildcard)
URN Generation: Generate random valid URNs for testing and fixtures
Metadata Extraction: Extract structured metadata from URNs with automatic parser selection
Faker Integration: Generators are compatible with Faker providers for easy test data generation
Pydantic Integration: Seamless integration with Pydantic's BeforeValidator and AfterValidator
Case-Insensitive: URN scheme, country codes, and document types are case-insensitive (NSS remainder preserves case)
Type-Safe: Full type hints with mypy support
Extensible: Easy to add new country and document type validators, generators, and extractors
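
The auto-registration and country-code enforcement described above can be sketched with plain Python. This is a minimal illustration of the __init_subclass__ technique, not the library's actual implementation: SketchValidator, REGISTRY, and the regular expression are all assumptions made for the example.

```python
import re
from typing import ClassVar

# Hypothetical registry: (country_code, document_type) -> validator instance.
REGISTRY: dict[tuple[str, str], "SketchValidator"] = {}

# Two ASCII letters, or "--" for the wildcard.
_COUNTRY_RE = re.compile(r"^(?:[a-z]{2}|--)$")


class SketchValidator:
    """Stand-in base class; not the library's URNValidator."""

    country_code: ClassVar[str]
    document_types: ClassVar[list[str]]

    def __init_subclass__(cls, **kwargs: object) -> None:
        # Runs once for every subclass, at class-definition time.
        super().__init_subclass__(**kwargs)
        code = cls.country_code.lower()
        if not _COUNTRY_RE.match(code):
            raise ValueError(f"Country code must be 2 letters or '--': {code!r}")
        for doc_type in cls.document_types:
            REGISTRY[(code, doc_type.lower())] = cls()

    def validate(self, urn: str) -> str:
        return urn


class SpanishSketch(SketchValidator):
    country_code = "ES"
    document_types = ["dni", "nie"]


print(sorted(REGISTRY))  # [('es', 'dni'), ('es', 'nie')]
```

Defining the subclass is enough to register it; no explicit registration call is needed.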

Installation

pip install international-urns

For development:

# Create virtual environment and install with test dependencies
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -e ".[test]"

Usage

Note: Examples use iurns as an abbreviated import alias for convenience.

Basic Validation with Pydantic

from pydantic import BaseModel, AfterValidator, BeforeValidator
from typing import Annotated
import international_urns as iurns

class Document(BaseModel):
    urn: Annotated[
        str,
        BeforeValidator(iurns.create_normalizer()),
        AfterValidator(iurns.get_validator('es', 'dni'))
    ]

# Validates and normalizes the URN
doc = Document(urn="URN:ES:DNI:12345678X")
print(doc.urn)  # Output: "urn:es:dni:12345678X"

Normalization

URN normalization converts the scheme, country code, and document type to lowercase while preserving the case of the document value:

import international_urns as iurns

normalized = iurns.normalize_urn("URN:ES:DNI:12345678X")
print(normalized)  # Output: "urn:es:dni:12345678X"
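
To make the rule concrete, here is a standard-library sketch of the same behavior. normalize_sketch is a hypothetical function written for this example; the library's normalize_urn() may handle edge cases differently.

```python
def normalize_sketch(urn: str) -> str:
    """Lowercase scheme, country code, and document type; keep the value as-is."""
    # maxsplit=3 leaves the document value (and any colons in it) untouched.
    parts = urn.split(":", 3)
    if len(parts) != 4:
        raise ValueError(f"Expected 4 colon-separated parts: {urn!r}")
    scheme, country, doc_type, value = parts
    return ":".join([scheme.lower(), country.lower(), doc_type.lower(), value])


print(normalize_sketch("URN:ES:DNI:12345678X"))    # urn:es:dni:12345678X
print(normalize_sketch("URN:FR:Passport:AbC123"))  # urn:fr:passport:AbC123
```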

Wildcard Validator

The library includes a built-in wildcard validator that accepts any URN with a 2-letter country code (or the "--" wildcard):

import international_urns as iurns

validator = iurns.get_validator('--', '--')
result = validator('urn:es:dni:12345678X')  # Valid
result = validator('urn:us:ssn:123-45-6789')  # Valid
result = validator('urn:--:--:anything')  # Also valid
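
A structural check of this kind could look roughly like the sketch below. The regular expression and the wildcard_validate name are assumptions made for illustration; the built-in validator's exact rules are not spelled out here.

```python
import re

# Assumed shape: "urn", a 2-letter country code or "--", a non-empty
# document type without colons, and a non-empty document value.
_WILDCARD_RE = re.compile(r"^urn:(?:[a-z]{2}|--):[^:]+:.+$", re.IGNORECASE)


def wildcard_validate(urn: str) -> str:
    """Return the URN unchanged if it has the expected shape, else raise."""
    if not _WILDCARD_RE.match(urn):
        raise ValueError(f"Invalid URN: {urn!r}")
    return urn


for candidate in ("urn:es:dni:12345678X", "urn:--:--:anything", "urn:usa:ssn:1"):
    try:
        wildcard_validate(candidate)
        print(candidate, "accepted")
    except ValueError:
        print(candidate, "rejected")
```

Only the last candidate is rejected, because "usa" is not a 2-letter code.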

Registry Introspection

import international_urns as iurns

# List all available validators
validators = iurns.list_validators()
print(validators)  # [('--', '--'), ('es', 'dni'), ...]

# Check if a validator exists
if iurns.has_validator('es', 'dni'):
    validator = iurns.get_validator('es', 'dni')
    result = validator('urn:es:dni:12345678X')

URN Generation

The library provides generators for creating random valid URNs, useful for testing and fixtures.

Basic Generation

import international_urns as iurns

# Get a generator for a specific country and document type
dni_generator = iurns.get_generator('es', 'dni')

# Generate a random URN
urn = dni_generator()
print(urn)  # Output: "urn:es:dni:12345678Z" (random valid DNI)

Generator Registry Introspection

import international_urns as iurns

# List all available generators
generators = iurns.list_generators()
print(generators)  # [('es', 'dni'), ('es', 'nie'), ...]

# Check if a generator exists
if iurns.has_generator('es', 'dni'):
    gen = iurns.get_generator('es', 'dni')
    urn = gen()

Faker Integration

Generators are designed to be compatible with Faker providers:

from faker import Faker
from faker.providers import BaseProvider
import international_urns as iurns

class SpanishURNProvider(BaseProvider):
    def spanish_dni(self):
        return iurns.get_generator('es', 'dni')()

    def spanish_nie(self):
        return iurns.get_generator('es', 'nie')()

fake = Faker()
fake.add_provider(SpanishURNProvider)

# Generate random URNs
dni = fake.spanish_dni()
nie = fake.spanish_nie()

URN Metadata Extraction

The library provides extractors for parsing URNs and extracting structured metadata.

Convenience Method

The simplest way to extract metadata is using the extract_urn() convenience function, which automatically selects the appropriate extractor:

import international_urns as iurns

# Extract metadata from any URN
metadata = iurns.extract_urn('urn:es:dni:12345678X')

print(metadata['country_code'])    # Output: 'es'
print(metadata['document_type'])   # Output: 'dni'
print(metadata['document_value'])  # Output: '12345678X'

The extract_urn() function:

Automatically parses the URN to determine country and document type
Uses a specific extractor if one is registered for that country/type combination
Falls back to the wildcard extractor if no specific extractor is available
Returns a dictionary with at minimum: country_code, document_type, and document_value
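
The selection logic in the list above can be sketched as a dictionary lookup with a wildcard fallback. All names here (EXTRACTORS, generic_extractor, dni_extractor, extract_sketch) are hypothetical; the library's own registry is likely organized differently.

```python
from typing import Callable

Extractor = Callable[[str], dict[str, str]]
WILDCARD = ("--", "--")


def generic_extractor(urn: str) -> dict[str, str]:
    """Wildcard stand-in: returns only the three basic fields."""
    _scheme, country, doc_type, value = urn.split(":", 3)
    return {
        "country_code": country.lower(),
        "document_type": doc_type.lower(),
        "document_value": value,
    }


def dni_extractor(urn: str) -> dict[str, str]:
    """Specific stand-in: adds fields on top of the basic ones."""
    metadata = generic_extractor(urn)
    metadata["number"] = metadata["document_value"][:8]
    metadata["letter"] = metadata["document_value"][8:].upper()
    return metadata


EXTRACTORS: dict[tuple[str, str], Extractor] = {
    WILDCARD: generic_extractor,
    ("es", "dni"): dni_extractor,
}


def extract_sketch(urn: str) -> dict[str, str]:
    _scheme, country, doc_type, _value = urn.split(":", 3)
    key = (country.lower(), doc_type.lower())
    # Prefer a specific extractor; otherwise fall back to the wildcard.
    extractor = EXTRACTORS.get(key, EXTRACTORS[WILDCARD])
    return extractor(urn)


print(extract_sketch("urn:fr:passport:ABC123"))
print(extract_sketch("urn:es:dni:12345678X")["letter"])  # X
```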

Using Specific Extractors

You can also get extractors directly from the registry:

import international_urns as iurns

# Get the wildcard extractor
extractor = iurns.get_extractor('--', '--')
metadata = extractor('urn:fr:passport:ABC123')

print(metadata)
# Output: {'country_code': 'fr', 'document_type': 'passport', 'document_value': 'ABC123'}

Extractor Registry Introspection

import international_urns as iurns

# List all available extractors
extractors = iurns.list_extractors()
print(extractors)  # [('--', '--'), ('es', 'dni'), ...]

# Check if an extractor exists
if iurns.has_extractor('es', 'dni'):
    extractor = iurns.get_extractor('es', 'dni')
    metadata = extractor('urn:es:dni:12345678X')

Custom Extractors with Additional Metadata

Plugins can provide extractors that return additional metadata fields specific to the document type. The base URNExtractor class uses the template method pattern: it handles extracting basic fields (country_code, document_type, document_value) automatically, and calls _extract_metadata() for document-specific extraction:

# Example custom extractor (in a plugin)
import re

import international_urns as iurns
from international_urns import URNExtractor

class SpanishDNIExtractor(URNExtractor):
    country_code = "es"
    document_types = ["dni"]

    def _extract_metadata(self, country_code: str, document_type: str,
                         document_value: str, nss_parts: list[str]) -> dict:
        """Extract DNI-specific metadata.

        The base class already provides country_code, document_type, and
        document_value. This method adds document-specific fields.

        :param country_code: The country code (e.g., 'es')
        :param document_type: The document type (e.g., 'dni')
        :param document_value: The document value (e.g., '12345678X')
        :param nss_parts: Tokenized NSS parts from urnparse
        :return: Dictionary with additional metadata fields
        """
        # Extract number and letter from DNI format
        match = re.match(r'^(\d{8})([A-Z])$', document_value.upper())

        if match:
            return {
                "number": match.group(1),
                "letter": match.group(2),
            }

        return {}

# Using the custom extractor
metadata = iurns.extract_urn('urn:es:dni:12345678X')
print(metadata['country_code'])  # Output: 'es' (from base class)
print(metadata['document_type'])  # Output: 'dni' (from base class)
print(metadata['document_value'])  # Output: '12345678X' (from base class)
print(metadata['number'])  # Output: '12345678' (from _extract_metadata)
print(metadata['letter'])  # Output: 'X' (from _extract_metadata)

Creating Plugins

To create a plugin for a new country or document type:

1. Create a new package

Example: international-urns-es for Spanish documents

2. Define validators

Subclass URNValidator and specify the country code (ISO 3166-1 Alpha-2) and document types:

from international_urns import URNValidator

class SpanishDNIValidator(URNValidator):
    country_code = "es"  # Must be 2 letters (or "--" for wildcard)
    document_types = ["dni", "nie"]

    def validate(self, urn: str) -> str:
        # Implement validation logic
        # Raise ValueError if invalid
        # Return the URN (possibly normalized) if valid

        if not self._check_dni_format(urn):
            raise ValueError(f"Invalid DNI format: {urn}")

        return urn

    def _check_dni_format(self, urn: str) -> bool:
        # Custom validation logic here
        return True
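
The _check_dni_format placeholder above always returns True. One way to fill it in is sketched below, using the standard DNI rule that the control letter is selected by the number modulo 23. The function names and the regular expression are assumptions for this example, and the check is written as a standalone function so it runs without the library.

```python
import re

# DNI control letters: the letter sits at index (number mod 23).
DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"
_DNI_URN_RE = re.compile(r"^urn:es:dni:(\d{8})([A-Z])$", re.IGNORECASE)


def dni_check_letter(number: int) -> str:
    """Return the control letter for an 8-digit DNI number."""
    return DNI_LETTERS[number % 23]


def check_dni_urn(urn: str) -> bool:
    """True if the URN has the DNI shape and a matching control letter."""
    match = _DNI_URN_RE.match(urn)
    if match is None:
        return False
    number, letter = match.groups()
    return dni_check_letter(int(number)) == letter.upper()


print(dni_check_letter(12345678))             # Z
print(check_dni_urn("urn:es:dni:12345678Z"))  # True
```

Inside a validator, validate() would raise ValueError whenever such a check returns False.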

3. Define generators

Subclass URNGenerator to create random URNs. Note that wildcard ("--") is not supported for generators:

from international_urns import URNGenerator
import random

# DNI control letters: the letter sits at index (number mod 23)
DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"

class SpanishDNIGenerator(URNGenerator):
    country_code = "es"  # Must be 2 letters (no wildcard for generators)
    document_types = ["dni", "nie"]

    def generate(self) -> str:
        # Generate a random valid URN
        # self.document_type contains the specific document type for this instance

        # Generate a random DNI number (8 digits) and derive its control letter
        # Simplified: a real NIE uses a different layout (X/Y/Z + 7 digits + letter)
        number = random.randint(10000000, 99999999)
        letter = DNI_LETTERS[number % 23]

        return f"urn:{self.country_code}:{self.document_type}:{number}{letter}"

Important: When a generator class supports multiple document types, each registration creates a separate instance with self.document_type set to the appropriate value. Use self.document_type in your generate() method to create the correct URN format.
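
The per-document-type instances can be pictured with the following sketch. SketchGenerator and GENERATORS are hypothetical stand-ins for the library's base class and registry, and the generated value is a fixed placeholder so the output is deterministic.

```python
from typing import ClassVar

# Hypothetical registry: (country_code, document_type) -> generator instance.
GENERATORS: dict[tuple[str, str], "SketchGenerator"] = {}


class SketchGenerator:
    """Stand-in base class; not the library's URNGenerator."""

    country_code: ClassVar[str]
    document_types: ClassVar[list[str]]

    def __init__(self, document_type: str) -> None:
        self.document_type = document_type

    def __init_subclass__(cls, **kwargs: object) -> None:
        super().__init_subclass__(**kwargs)
        # One instance per document type, each knowing its own type.
        for doc_type in cls.document_types:
            GENERATORS[(cls.country_code, doc_type)] = cls(doc_type)

    def generate(self) -> str:
        raise NotImplementedError

    def __call__(self) -> str:
        return self.generate()


class SpanishSketchGenerator(SketchGenerator):
    country_code = "es"
    document_types = ["dni", "nie"]

    def generate(self) -> str:
        # Placeholder value instead of a random document number.
        return f"urn:{self.country_code}:{self.document_type}:PLACEHOLDER"


print(GENERATORS[("es", "dni")]())  # urn:es:dni:PLACEHOLDER
print(GENERATORS[("es", "nie")]())  # urn:es:nie:PLACEHOLDER
```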

4. Define extractors

Subclass URNExtractor to parse URNs and extract structured metadata. The base class uses the template method pattern and automatically extracts the basic fields (country_code, document_type, document_value). Override _extract_metadata() to add document-specific fields:

from international_urns import URNExtractor
import re

class SpanishDNIExtractor(URNExtractor):
    country_code = "es"  # Can be 2 letters or "--" for wildcard
    document_types = ["dni", "nie"]

    def _extract_metadata(self, country_code: str, document_type: str,
                         document_value: str, nss_parts: list[str]) -> dict:
        """Extract Spanish document-specific metadata.

        The base class already provides country_code, document_type, and
        document_value. This method adds additional fields specific to
        Spanish identity documents.

        :param country_code: The country code (e.g., 'es')
        :param document_type: The document type (e.g., 'dni', 'nie')
        :param document_value: The document value extracted by base class
        :param nss_parts: Tokenized NSS parts from urnparse
        :return: Dictionary with additional metadata fields
        """
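        # Extract number and letter from the DNI format (8 digits + letter).
        # NIE values (X/Y/Z + 7 digits + letter) do not match this pattern
        # and yield no additional fields.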
        match = re.match(r'^(\d{8})([A-Z])$', document_value.upper())

        if match:
            return {
                "number": match.group(1),
                "letter": match.group(2),
            }

        return {}

Important: When an extractor class supports multiple document types, each registration creates a separate instance with self.document_type set to the appropriate value. You typically won't need it, because the base extract() method already provides country_code, document_type, and document_value; you only need to implement _extract_metadata() to add further fields.
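
For readers unfamiliar with the template method pattern, the division of work can be sketched as follows. SketchExtractor is a simplified stand-in, not the library's URNExtractor: it splits the URN with str.split and passes a plain list where the real base class passes tokenized NSS parts from urnparse. Letting the base fields win over hook fields is also a choice made for this sketch.

```python
import re


class SketchExtractor:
    """Stand-in base class illustrating the template method pattern."""

    def extract(self, urn: str) -> dict[str, str]:
        # Template method: the fixed steps live in the base class.
        _scheme, country, doc_type, value = urn.split(":", 3)
        base = {
            "country_code": country.lower(),
            "document_type": doc_type.lower(),
            "document_value": value,
        }
        nss_parts = [country, doc_type, value]  # stand-in for tokenized NSS
        extra = self._extract_metadata(
            base["country_code"], base["document_type"], value, nss_parts
        )
        # Base fields are applied last so the hook cannot overwrite them.
        return {**extra, **base}

    def _extract_metadata(
        self,
        country_code: str,
        document_type: str,
        document_value: str,
        nss_parts: list[str],
    ) -> dict[str, str]:
        # Hook: subclasses override this to add document-specific fields.
        return {}


class DniSketchExtractor(SketchExtractor):
    def _extract_metadata(
        self,
        country_code: str,
        document_type: str,
        document_value: str,
        nss_parts: list[str],
    ) -> dict[str, str]:
        match = re.match(r"^(\d{8})([A-Z])$", document_value.upper())
        if match is None:
            return {}
        return {"number": match.group(1), "letter": match.group(2)}


metadata = DniSketchExtractor().extract("urn:es:dni:12345678X")
print(metadata["number"], metadata["letter"])  # 12345678 X
```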

5. Register via entry points

In your plugin's pyproject.toml:

[project.entry-points.'international_urns.plugins']
es = 'international_urns_es'

Validators, generators, and extractors will automatically register themselves when the plugin is imported.

Development

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=international_urns --cov-report=html

# Run specific test file
pytest tests/test_registry.py

Linting and Type Checking

# Lint and format
ruff check .
ruff format .

# Type checking
mypy international_urns

Requirements

Python 3.11+
urnparse

License

MIT License - see LICENSE file for details

Release history

1.0.1: Jan 30, 2026
1.0.0: Nov 12, 2025
1.0.0rc4 (pre-release, this version): Nov 7, 2025
1.0.0rc3 (pre-release): Nov 4, 2025
1.0.0rc2 (pre-release): Nov 3, 2025

