
Reason this release was yanked:

Misnamed package - see `attention-smithy` instead

Project description

AttentionSmithy

The Attention Is All You Need paper revolutionized the AI industry. After it inspired models such as GPT and BERT, much of deep learning research came to focus on the attention mechanism behind transformers. This has produced a great deal of work on the topic, spawning hundreds of variations on the original architecture meant to enhance it or tailor it to new applications. Most of these developments happen in isolation, disconnected from the broader community and incompatible with tools made by other developers. For developers who want to experiment with combining these ideas to fit a new problem, such a disjointed state is frustrating.

AttentionSmithy was designed as a platform for flexible experimentation with the attention mechanism across a variety of applications. This includes the ability to swap in a multitude of positional embeddings, variations on the attention mechanism, and other interchangeable components.
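
To illustrate the mix-and-match design described above, here is a minimal PyTorch sketch of an encoder block that receives its positional-embedding strategy as an injected module. All names here (EncoderBlock, position_embedding) are hypothetical stand-ins for illustration, not the actual AttentionSmithy API.

    import torch
    import torch.nn as nn

    class EncoderBlock(nn.Module):
        """A transformer encoder block whose positional strategy is swappable."""
        def __init__(self, embed_dim, num_heads, position_embedding: nn.Module):
            super().__init__()
            self.position_embedding = position_embedding  # sinusoidal, learned, ...
            self.attention = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
            self.norm1 = nn.LayerNorm(embed_dim)
            self.feed_forward = nn.Sequential(
                nn.Linear(embed_dim, 4 * embed_dim),
                nn.ReLU(),
                nn.Linear(4 * embed_dim, embed_dim),
            )
            self.norm2 = nn.LayerNorm(embed_dim)

        def forward(self, x):  # x: (batch, seq_len, embed_dim)
            x = self.position_embedding(x)               # inject position information
            attn_out, _ = self.attention(x, x, x)
            x = self.norm1(x + attn_out)                 # residual + layer norm
            return self.norm2(x + self.feed_forward(x))  # residual + layer norm

Under this kind of design, changing the positional strategy means passing in a different module, with no change to the block itself.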

The baseline code was originally inspired by the code accompanying The Annotated Transformer blog post. We have created examples of transformer models in the following repositories:

Future Directions


🤝 Join the conversation! 🤝

As you read and have ideas, please go to the Discussions tab of this repository and share them with us. We have ideas for future extensions and applications, and would love your input.


AttentionSmithy Components

Here is a visual depiction of the different components of a transformer model, using Figure 1 from Attention Is All You Need as reference.

[Figure: components of a transformer model, annotated after Figure 1 of Attention Is All You Need]

AttentionSmithy Numeric Embedding

Here is a visual depiction of where each positional or numeric embedding fits into the original model. We have implemented four popular strategies (sinusoidal, learned, rotary, ALiBi) and would like to expand to more in the future.

[Figure: where each positional/numeric embedding strategy fits in the original transformer architecture]
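
To make the first of the four strategies concrete, here is a minimal, self-contained PyTorch sketch of the sinusoidal encoding defined in Attention Is All You Need (the class name is ours for illustration; it is not necessarily the AttentionSmithy interface):

    import math
    import torch
    import torch.nn as nn

    class SinusoidalEmbedding(nn.Module):
        """Fixed sinusoidal position encoding (assumes an even d_model):
        PE(pos, 2i) = sin(pos / 10000^(2i/d_model)),
        PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))."""
        def __init__(self, d_model, max_len=5000):
            super().__init__()
            position = torch.arange(max_len).unsqueeze(1)  # (max_len, 1)
            div_term = torch.exp(
                torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model)
            )
            pe = torch.zeros(max_len, d_model)
            pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions
            pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions
            self.register_buffer("pe", pe)                 # fixed, not a learned parameter

        def forward(self, x):  # x: (batch, seq_len, d_model)
            return x + self.pe[: x.size(1)]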

AttentionSmithy Attention Methods

Here is a basic visual of attention mechanisms that AttentionSmithy has been designed to incorporate in future development efforts. The examples shown include Longformer attention and Big Bird attention.

[Figure: candidate attention mechanism variants]
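
As a sketch of the core idea behind Longformer-style local attention, the snippet below builds a Boolean sliding-window mask in PyTorch; the function name is hypothetical, and Big Bird's pattern would additionally mix in random and global connections:

    import torch

    def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
        """Local attention mask: position i may attend to position j only
        when |i - j| <= window. True marks allowed pairs."""
        idx = torch.arange(seq_len)
        return (idx.unsqueeze(0) - idx.unsqueeze(1)).abs() <= window

    # With window=1, each token sees only itself and its immediate neighbors:
    # sliding_window_mask(5, 1)[0] -> tensor([ True,  True, False, False, False])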



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

attention_smithyr-1.0.0.tar.gz (54.5 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

attention_smithyr-1.0.0-py3-none-any.whl (65.1 kB)

Uploaded Python 3

File details

Details for the file attention_smithyr-1.0.0.tar.gz.

File metadata

  • Download URL: attention_smithyr-1.0.0.tar.gz
  • Upload date:
  • Size: 54.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.10

File hashes

Hashes for attention_smithyr-1.0.0.tar.gz

  • SHA256: 94b8057d99c2adec23778d0252eeec8314d546c07266be46fc1edd54538a12a4
  • MD5: 0243cbe2be600666f88f1d559e619072
  • BLAKE2b-256: 67933985b68ac8e45ccd3e78e598d05d4e30436a7723ceb194d316bed7b17d6a

See more details on using hashes here.
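
For example, assuming the sdist was saved in the current directory under its published name, the SHA256 digest above can be checked with Python's standard hashlib:

    import hashlib

    expected = "94b8057d99c2adec23778d0252eeec8314d546c07266be46fc1edd54538a12a4"
    with open("attention_smithyr-1.0.0.tar.gz", "rb") as f:  # local path is an assumption
        digest = hashlib.sha256(f.read()).hexdigest()
    assert digest == expected, "hash mismatch: file may be corrupted or tampered with"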

File details

Details for the file attention_smithyr-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: attention_smithyr-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 65.1 kB
  • Tags: Python 3

File hashes

Hashes for attention_smithyr-1.0.0-py3-none-any.whl

  • SHA256: f367da6eb802f6c73ef7e5b6e3987279fcc3e3bbbefc845a4e61a1ad814c0a0d
  • MD5: 1f41842f5902e0f07cb762d463391ed5
  • BLAKE2b-256: c4233ad4c086cfd91bd7b0b1712df0a60b63eeeb6b1c733d06775da963db37a6

See more details on using hashes here.
