Encryption, hashing, and blind indexing for Pydantic

pydantic-encryption

Field-level encryption, hashing, and blind indexing for Pydantic models with SQLAlchemy integration.

Installation

pip install pydantic-encryption

Optional extras

pip install "pydantic-encryption[sqlalchemy]"  # SQLAlchemy integration
pip install "pydantic-encryption[aws]"         # AWS KMS encryption
pip install "pydantic-encryption[all]"         # All optional dependencies

Quick Start

from typing import Annotated
from pydantic_encryption import BaseModel, Encrypted, Hashed

class User(BaseModel):
    name: str
    address: Annotated[bytes, Encrypted]
    password: Annotated[str, Hashed]

user = User(name="John Doe", address="123 Main St", password="secret123")

print(user.name)      # "John Doe"
print(user.address)   # encrypted bytes
print(user.password)  # argon2 hash bytes

Fields marked with Encrypted are encrypted and fields marked with Hashed are hashed during model initialization.

Decrypting

Call decrypt_fields() on the model instance to decrypt all Encrypted fields in-place:

user = User(name="John", address="123 Main St", password="secret")

user.decrypt_fields()
print(user.address)  # "123 Main St"

decrypt_fields() returns self, so it can be chained.

Async Support

Use async_init() to construct models with async encryption, hashing, and blind indexing:

user = await User.async_init(name="John", address="123 Main St", password="secret")

Use async_decrypt_fields() for async decryption:

await user.async_decrypt_fields()

All phases (encrypt, hash, blind-index) run concurrently via asyncio.gather, and nested BaseModel instances — including those inside list, tuple, dict, and set containers — are processed recursively.
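
As a rough illustration of the concurrency pattern described above (not the library's actual implementation), the sketch below runs three stand-in phases with asyncio.gather. The function names encrypt_phase, hash_phase, blind_index_phase, and process_fields are hypothetical, and the "work" is a placeholder string transform rather than real cryptography.

```python
import asyncio


async def encrypt_phase(values: dict[str, str]) -> dict[str, str]:
    # Stand-in for real encryption: yield to the event loop, then tag each value.
    await asyncio.sleep(0)
    return {name: f"enc({value})" for name, value in values.items()}


async def hash_phase(values: dict[str, str]) -> dict[str, str]:
    # Stand-in for real hashing.
    await asyncio.sleep(0)
    return {name: f"hash({value})" for name, value in values.items()}


async def blind_index_phase(values: dict[str, str]) -> dict[str, str]:
    # Stand-in for real blind-index computation.
    await asyncio.sleep(0)
    return {name: f"idx({value})" for name, value in values.items()}


async def process_fields(
    encrypted: dict[str, str],
    hashed: dict[str, str],
    indexed: dict[str, str],
) -> dict[str, str]:
    # gather() schedules all three coroutines together and returns
    # their results in argument order.
    enc, hsh, idx = await asyncio.gather(
        encrypt_phase(encrypted),
        hash_phase(hashed),
        blind_index_phase(indexed),
    )
    return {**enc, **hsh, **idx}
```

Calling asyncio.run(process_fields(...)) from synchronous code drives all three phases on one event loop.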

Encryption Methods

Set the encryption method via environment variable:

ENCRYPTION_METHOD=fernet   # Fernet symmetric encryption (requires ENCRYPTION_KEY)
ENCRYPTION_METHOD=aws      # AWS KMS (requires AWS_KMS_KEY_ARN, AWS_KMS_REGION, etc.)

There is no default — you must explicitly set ENCRYPTION_METHOD if using Encrypted fields.

Fernet Setup

# Generate a key
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"

# Set environment variables
ENCRYPTION_METHOD=fernet
ENCRYPTION_KEY=your_generated_key
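
A Fernet key is 32 random bytes encoded as URL-safe base64, which gives a 44-character string. The sketch below uses only the standard library to produce a key in that format and to sanity-check a value before exporting it as ENCRYPTION_KEY. Both helper names are hypothetical and are not part of pydantic-encryption.

```python
import base64
import os


def generate_fernet_style_key() -> str:
    # 32 random bytes, URL-safe base64-encoded (44 characters).
    return base64.urlsafe_b64encode(os.urandom(32)).decode()


def looks_like_fernet_key(key: str) -> bool:
    # Hypothetical check: the key must decode to exactly 32 bytes.
    try:
        raw = base64.urlsafe_b64decode(key.encode())
    except (ValueError, TypeError):
        return False
    return len(raw) == 32
```

A check like this catches a copy-paste mistake, such as leaving the placeholder your_generated_key in place, before the first encryption call fails.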

AWS KMS Setup

ENCRYPTION_METHOD=aws
AWS_KMS_KEY_ARN=arn:aws:kms:us-east-1:123456789012:key/your-key-id
AWS_KMS_REGION=us-east-1
AWS_KMS_ACCESS_KEY_ID=your_access_key
AWS_KMS_SECRET_ACCESS_KEY=your_secret_key

As an alternative to AWS_KMS_KEY_ARN, separate encrypt/decrypt keys are supported for key rotation or read-only scenarios:

AWS_KMS_ENCRYPT_KEY_ARN=arn:aws:kms:...encrypt-key
AWS_KMS_DECRYPT_KEY_ARN=arn:aws:kms:...decrypt-key

Use one mode or the other — combining AWS_KMS_KEY_ARN with either split variant raises a validation error. A decrypt-only key alone is allowed (read-only workloads).

Model-Level Config

Override encryption settings per model instead of relying on environment variables:

from typing import Annotated

from pydantic_encryption import BaseModel, Encrypted, EncryptionMethod

class SpecialUser(BaseModel, encryption_method=EncryptionMethod.FERNET, encryption_key="my-key"):
    email: Annotated[bytes, Encrypted]

Supported keyword arguments: encryption_method, encryption_key, and blind_index_key. Any argument left unset falls back to the corresponding environment variable.

Blind Indexes

Blind indexes enable equality searches on encrypted data by storing a deterministic keyed hash alongside the ciphertext.

Configuration: Set BLIND_INDEX_SECRET_KEY via environment variable.
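
To make the idea concrete, here is a minimal standard-library sketch of a deterministic keyed hash used as a lookup key. It shows the general technique only and is not the library's code. The blind_index helper, the placeholder key, and the in-memory rows dictionary are all hypothetical.

```python
import hashlib
import hmac


def blind_index(value: str, key: bytes) -> bytes:
    # HMAC-SHA256 is deterministic: the same key and value always
    # give the same 32-byte digest, so it can be compared for equality.
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()


# Placeholder key; in practice it would come from BLIND_INDEX_SECRET_KEY.
KEY = b"example-secret-key"

# Store the index next to the ciphertext, then look up by recomputing it.
rows = {blind_index("john@example.com", KEY): b"<ciphertext for john>"}
match = rows.get(blind_index("john@example.com", KEY))
miss = rows.get(blind_index("jane@example.com", KEY))
```

Because the digest depends on the secret key, someone who sees only the stored index cannot recompute it for guessed values without that key.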

Pydantic Models

from typing import Annotated
from pydantic_encryption import BaseModel, BlindIndex, BlindIndexMethod

class User(BaseModel):
    email_index: Annotated[bytes, BlindIndex(BlindIndexMethod.HMAC_SHA256)]

Normalization

Normalize values before hashing to ensure consistent lookups:

email_index: Annotated[bytes, BlindIndex(
    BlindIndexMethod.HMAC_SHA256,
    normalize_to_lowercase=True,
    strip_whitespace=True,
)]

Available options:

Option                  Effect
strip_whitespace        Strip leading/trailing whitespace, collapse internal whitespace
strip_non_characters    Remove all non-letter characters (keep only a-zA-Z)
strip_non_digits        Remove all non-digit characters (keep only 0-9)
normalize_to_lowercase  Convert to lowercase
normalize_to_uppercase  Convert to uppercase
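
The options listed above could be approximated in plain Python as follows. This is a sketch under assumptions: the normalize function is hypothetical, and the order in which the library applies the options is not stated in this document, so the order used here is a guess.

```python
import re


def normalize(
    value: str,
    *,
    strip_whitespace: bool = False,
    strip_non_characters: bool = False,
    strip_non_digits: bool = False,
    normalize_to_lowercase: bool = False,
    normalize_to_uppercase: bool = False,
) -> str:
    if strip_whitespace:
        # split() drops leading/trailing whitespace and splits on any run
        # of whitespace; joining with one space collapses internal runs.
        value = " ".join(value.split())
    if strip_non_characters:
        value = re.sub(r"[^a-zA-Z]", "", value)
    if strip_non_digits:
        value = re.sub(r"[^0-9]", "", value)
    if normalize_to_lowercase:
        value = value.lower()
    if normalize_to_uppercase:
        value = value.upper()
    return value
```

Normalizing before hashing means that "  John@Example.com " and "john@example.com" map to the same blind index, so an equality search finds the row regardless of how the user typed the value.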

Methods

Method                        Description
BlindIndexMethod.HMAC_SHA256  Fast HMAC-SHA256 keyed hash. Standard choice.
BlindIndexMethod.ARGON2       Memory-hard Argon2 hash with deterministic salt. Better brute-force resistance.
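
The trade-off between the two methods can be illustrated with the standard library alone. Python's hashlib has no Argon2, so the sketch below swaps in scrypt, a different memory-hard function, purely to show the "deterministic salt" idea. How the library actually derives its salt is not documented here, so deriving it from the key and value with HMAC is an assumption, as are the function name and the cost parameters.

```python
import hashlib
import hmac


def memory_hard_blind_index(value: str, key: bytes) -> bytes:
    data = value.encode("utf-8")
    # Assumption: derive the salt from the key and value so that equal
    # inputs always get the same salt and therefore the same index.
    salt = hmac.new(key, data, hashlib.sha256).digest()[:16]
    # scrypt stands in for Argon2 here; n, r, p are illustrative costs.
    return hashlib.scrypt(data, salt=salt, n=2**12, r=8, p=1, dklen=32)
```

A random salt would make every hash unique and break equality lookups, which is why a blind index needs the salt to be reproducible. The extra cost per guess is what slows down brute-force attempts on low-entropy values such as phone numbers.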

SQLAlchemy Integration

Install with pip install "pydantic-encryption[sqlalchemy]".

from sqlalchemy import create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, Session

from pydantic_encryption import (
    SQLAlchemyEncryptedValue,
    SQLAlchemyHashedValue,
    SQLAlchemyBlindIndexValue,
    BlindIndexMethod,
)


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(primary_key=True)
    username: Mapped[str]
    email: Mapped[bytes] = mapped_column(SQLAlchemyEncryptedValue())
    password: Mapped[bytes] = mapped_column(SQLAlchemyHashedValue())
    blind_index_email: Mapped[bytes] = mapped_column(
        SQLAlchemyBlindIndexValue(BlindIndexMethod.HMAC_SHA256)
    )


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    user = User(
        username="john",
        email="john@example.com",
        password="secret123",
        blind_index_email="john@example.com",
    )
    session.add(user)
    session.commit()

    # Query by blind index — automatically hashed
    found = session.query(User).filter(
        User.blind_index_email == "john@example.com"
    ).first()
    print(found.email)  # decrypted

Supported Types

SQLAlchemyEncryptedValue preserves the Python type of your data:

str, bytes, bool, int, float, Decimal, UUID, date, datetime, time, timedelta
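
One common way to preserve a value's type across an encrypt/decrypt roundtrip is to serialize it with a type tag before encryption and use the tag to rebuild the value afterwards. The sketch below shows that general idea for a handful of the listed types. It is an assumption about the technique, not a description of how SQLAlchemyEncryptedValue is implemented, and the serialize and deserialize helpers are hypothetical.

```python
from datetime import date
from decimal import Decimal
from uuid import UUID

# Maps a type tag to a callable that rebuilds the value from text.
_DECODERS = {
    "str": str,
    "int": int,
    "float": float,
    "Decimal": Decimal,
    "UUID": UUID,
    "date": date.fromisoformat,
}


def serialize(value: object) -> bytes:
    type_name = type(value).__name__
    if type_name not in _DECODERS:
        raise TypeError(f"Unsupported type: {type_name}")
    text = value.isoformat() if isinstance(value, date) else str(value)
    # The tag travels with the payload, so it would be encrypted with it.
    return f"{type_name}:{text}".encode("utf-8")


def deserialize(payload: bytes) -> object:
    type_name, _, text = payload.decode("utf-8").partition(":")
    return _DECODERS[type_name](text)
```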

Array Support (PostgreSQL)

from pydantic_encryption import SQLAlchemyPGEncryptedArray

tags: Mapped[list[str] | None] = mapped_column(SQLAlchemyPGEncryptedArray(), nullable=True)

Each element is individually encrypted. Requires PostgreSQL.

Async SQLAlchemy Decryption

SQLAlchemy's TypeDecorator is sync by contract — even under AsyncSession the result-processing pipeline runs inline. For fast backends (Fernet) this is fine, but a network-bound backend like AWS KMS can spend tens of milliseconds per call, blocking the event loop.

pydantic-encryption handles this with a two-tier strategy:

Tier 1 — automatic, zero code change. Under AsyncSession, decryption transparently uses SQLAlchemy's greenlet bridge (sqlalchemy.util.await_) so each decrypt yields the event loop during its network roundtrip. Other tasks on the loop keep progressing. The same bridge also wraps Argon2 hashing (SQLAlchemyHashedValue) and Argon2 blind-index computation (SQLAlchemyBlindIndexValue) so write-side commits don't block either.

Tier 2 — opt-in, real parallelism. For single fetches with many encrypted cells, pass defer_decrypt=True on the column and bulk-decrypt after the fetch. Every cell is decrypted concurrently via asyncio.gather, turning N sequential roundtrips into one concurrent burst.

from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
from pydantic_encryption import async_decrypt_rows, SQLAlchemyEncryptedValue

class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[bytes] = mapped_column(SQLAlchemyEncryptedValue(defer_decrypt=True))
    secret: Mapped[bytes] = mapped_column(SQLAlchemyEncryptedValue(defer_decrypt=True))


async with AsyncSession(engine) as session:
    users = (await session.execute(select(User).limit(1000))).scalars().all()

    # Before this call, users[i].email is still an EncryptedValue.
    await async_decrypt_rows(users, User.email, User.secret)

    for u in users:
        print(u.email)  # decrypted plaintext

async_decrypt_rows accepts InstrumentedAttribute (e.g. User.email) or string column names. Pass concurrency=N to cap in-flight decrypts with an asyncio.Semaphore.

Custom Encryption or Hashing

Subclass BaseModel and override any of encrypt_data, hash_data, blind_index_data (or their async variants) to plug in your own logic. The post-init hook runs automatically:

from pydantic_encryption import BaseModel

class MyModel(BaseModel):
    def encrypt_data(self) -> None:
        # your encryption logic (mutate self in-place)
        ...

To implement a new backend instead of replacing the per-model path, subclass one of the adapter ABCs (EncryptionAdapter, HashingAdapter, BlindIndexAdapter) and register it via register_encryption_backend / register_blind_index_backend. Async variants are inherited by default — override async_encrypt / async_decrypt only for natively-async backends.
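
The adapter pattern described above might look roughly like the following. Everything in this sketch is hypothetical: the class names, the method signatures, the use of asyncio.to_thread for the inherited async variants, and the registry function are illustrative assumptions, not the library's actual EncryptionAdapter or register_encryption_backend.

```python
import asyncio
from abc import ABC, abstractmethod


class SketchEncryptionAdapter(ABC):
    """Hypothetical adapter shape; the library's real ABC may differ."""

    @abstractmethod
    def encrypt(self, plaintext: bytes) -> bytes: ...

    @abstractmethod
    def decrypt(self, ciphertext: bytes) -> bytes: ...

    async def async_encrypt(self, plaintext: bytes) -> bytes:
        # Inherited default: run the sync method in a worker thread.
        return await asyncio.to_thread(self.encrypt, plaintext)

    async def async_decrypt(self, ciphertext: bytes) -> bytes:
        return await asyncio.to_thread(self.decrypt, ciphertext)


class XorAdapter(SketchEncryptionAdapter):
    """Toy backend for demonstration only; XOR is not secure encryption."""

    def __init__(self, key: bytes) -> None:
        self._key = key

    def _xor(self, data: bytes) -> bytes:
        return bytes(b ^ self._key[i % len(self._key)] for i, b in enumerate(data))

    def encrypt(self, plaintext: bytes) -> bytes:
        return self._xor(plaintext)

    def decrypt(self, ciphertext: bytes) -> bytes:
        return self._xor(ciphertext)


# Hypothetical registry mapping a backend name to its adapter class.
BACKENDS: dict[str, type[SketchEncryptionAdapter]] = {}


def register_backend(name: str, adapter_cls: type[SketchEncryptionAdapter]) -> None:
    BACKENDS[name] = adapter_cls


register_backend("xor-demo", XorAdapter)
```

A subclass only needs the two sync methods. A backend with a native async client would override async_encrypt and async_decrypt to await that client directly instead of going through a thread.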

Run Tests

pip install -e ".[dev]"
pytest -v
