
Project description

pydanttention


Transformer model attention in Pydantic.

Adapted from the source by Theia Vogel (MIT licensed, included here as vogel_manual_transformer.py), in turn using model ops from picoGPT (MIT licensed).

Motivation

Rewriting AI model source code as Pydantic data models is an interesting exercise. I'd note the following benefits:

  • All operations are subclassed from a common Operation base model (see .models.ops.base), i.e. they all expect their first argument to be a numpy array x. This naturally lets you factor your code around a single category of 'operations' (see the sketch after this list).

  • Since every function is turned into a class (a Pydantic data model with type-annotated fields for its input state, rather than a funcdef with keyword arguments), and classes are conventionally named in PascalCase while functions, like all other Python variables, are conventionally snake_case, you can tell from case alone where significant operations are called. References to the data model (via self.{field}) are likewise visually distinct from intermediate variables. Together these give a better sense at a glance of the data flow through your program.

  • State can be configured at runtime, but also given defaults at import time, through fields on the data model. The original source code hardcoded values as module globals (much like class variables), so it was not possible to configure component parts at runtime. That was appropriate for an expository demo, but made the code harder for a reader to modify and experiment with (likewise, code is easier to test when it is easier to configure at runtime).

  • Clear and consolidated declarations of input data (i.e. not scattered across many sites of declaration), without losing the ability to decompose it into structured components. The original code used primitive types (lists of dictionaries) for the attention blocks; these became model field defaults in a self-contained module (see .models.config). Since Pydantic can load ("validate") typed data models from such primitives, you could supply the original dictionary primitive to AttentionBlock.model_validate and it would still work, though doing so is actually more verbose than constructing the model class directly (see the second sketch below).
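
To illustrate the first and third points, here is a minimal sketch of what such an operation base model might look like. The class names, fields, and method shape are illustrative assumptions, not the package's actual API; only the ideas themselves (a common base taking a numpy array x first, and state held in overridable typed fields) are taken from the points above:

```python
import numpy as np
from pydantic import BaseModel


class Operation(BaseModel):
    """Hypothetical base: every op takes a numpy array `x` as its first argument."""

    def __call__(self, x: np.ndarray) -> np.ndarray:
        raise NotImplementedError


class Gelu(Operation):
    # PascalCase marks this as a significant operation at the call site.
    def __call__(self, x: np.ndarray) -> np.ndarray:
        return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))


class LayerNorm(Operation):
    # State lives in a typed field with an import-time default,
    # but can be overridden at runtime: LayerNorm(eps=1e-6).
    eps: float = 1e-5

    def __call__(self, x: np.ndarray) -> np.ndarray:
        mean = x.mean(axis=-1, keepdims=True)
        var = x.var(axis=-1, keepdims=True)
        return (x - mean) / np.sqrt(var + self.eps)
```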

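And a sketch of the last point, contrasting validation from primitives with direct construction. The field structure of AttentionBlock here is invented for illustration; only the model_validate pattern itself is standard Pydantic:

```python
from pydantic import BaseModel


class AttentionHead(BaseModel):  # hypothetical fields, for illustration only
    q: list[float]
    k: list[float]
    v: list[float]


class AttentionBlock(BaseModel):
    heads: list[AttentionHead]


# Validating the original primitive type (a dictionary) still works...
block = AttentionBlock.model_validate(
    {"heads": [{"q": [1.0], "k": [0.5], "v": [0.2]}]}
)

# ...but is more verbose than constructing the model class directly:
block = AttentionBlock(heads=[AttentionHead(q=[1.0], k=[0.5], v=[0.2])])
```
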
Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pydanttention-0.1.2.tar.gz (10.2 kB)

Uploaded Source

Built Distribution

pydanttention-0.1.2-py3-none-any.whl (12.6 kB)

Uploaded Python 3

File details

Details for the file pydanttention-0.1.2.tar.gz.

File metadata

  • Download URL: pydanttention-0.1.2.tar.gz
  • Upload date:
  • Size: 10.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.9.1 CPython/3.10.6

File hashes

Hashes for pydanttention-0.1.2.tar.gz

  • SHA256: 1275c31aaa5a912c2d489a8077f8832787013cc27751e841db90864107cb56a7
  • MD5: d177b64b69f529a5200872fd284b6a46
  • BLAKE2b-256: 52c48b0d5efa6089770f22ef4613ec26d6ddfaca1a77571e0279e0bf30c334ae

See more details on using hashes here.
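
For example, a downloaded archive can be checked against the SHA256 digest listed above using Python's standard hashlib module:

```python
import hashlib

# Compute the SHA256 digest of the downloaded sdist and compare it
# to the value published on this page.
with open("pydanttention-0.1.2.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

expected = "1275c31aaa5a912c2d489a8077f8832787013cc27751e841db90864107cb56a7"
assert digest == expected, "hash mismatch: file may be corrupt or tampered with"
```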

File details

Details for the file pydanttention-0.1.2-py3-none-any.whl.

File metadata

File hashes

Hashes for pydanttention-0.1.2-py3-none-any.whl

  • SHA256: 8668d6ca0ba4cdb7ea71744b81ff15b1c054054f5f862f7a3a30557226c9714d
  • MD5: 91b2b350cd777d7c41b7528c6be55c2d
  • BLAKE2b-256: 7ffa62986ff64578eca0bb9957cff487efc348baebefcd3bb1f867e3d2a1670f

See more details on using hashes here.
