Memodo: A linear attention solution

Memodo is a linear attention solution that combines the advantages of both RWKV and DeltaNet.

Usage

Just use memodo.MemodoLayer, which is a subclass of torch.nn.Module.

Mechanism

Memodo uses the General Delta Rule directly:

S -> S * diag(i) + S * a^T * b + c^T * d
return r * S
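
The update above can be sketched as a plain NumPy recurrence. This is a reference sketch, not the package's implementation. It assumes that every symbol is a row vector, that `*` in the formula means a matrix product, that the state starts at zero, and that the readout `r * S` uses the state after the update. The function names and shapes are hypothetical.

```python
import numpy as np

def general_delta_step(S, i, a, b, c, d, r):
    """One recurrence step (assumed shapes).

    S: (m, n) state; i, a, b, d: (n,) vectors; c, r: (m,) vectors.
    """
    S_new = (
        S * i[None, :]          # S * diag(i): scale column j of S by i[j]
        + np.outer(S @ a, b)    # S * a^T * b: rank-1 correction read from the state
        + np.outer(c, d)        # c^T * d: rank-1 write of new content
    )
    out = r @ S_new             # r * S, using the updated state (assumption)
    return S_new, out

def general_delta_scan(i, a, b, c, d, r):
    """Run the step over a sequence; every argument has a leading time axis."""
    T, n = a.shape
    m = c.shape[1]
    S = np.zeros((m, n))        # zero initial state (assumption)
    outs = np.empty((T, n))
    for t in range(T):
        S, outs[t] = general_delta_step(S, i[t], a[t], b[t], c[t], d[t], r[t])
    return outs, S

rng = np.random.default_rng(0)
T, m, n = 5, 3, 4
i, a, b, d = rng.normal(size=(4, T, n))
c, r = rng.normal(size=(2, T, m))
outs, S_final = general_delta_scan(i, a, b, c, d, r)
print(outs.shape)  # (5, 4)
```

The state has a fixed size of m x n, so the cost per token does not grow with sequence length, which is what makes the attention linear.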

With Dynamic Token Shift:

d[t] = sigmoid(silu(lerp(x[t], x[t - 1], w1) * w2) * w3)
x[t] = lerp(x[t], x[t - 1], d[t])
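
A minimal NumPy sketch of the token shift follows, under several assumptions that the formulas do not state: `lerp(start, end, w)` follows the `torch.lerp` convention `start + w * (end - start)`, `w1` is a per-channel vector, `w2` and `w3` are projection matrices applied by matrix product, the token before the first one is all zeros, and `x[t - 1]` always refers to the unshifted input. The function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def silu(z):
    return z * sigmoid(z)

def lerp(start, end, weight):
    # Same convention as torch.lerp: weight 0 gives start, weight 1 gives end.
    return start + weight * (end - start)

def dynamic_token_shift(x, w1, w2, w3):
    """x: (T, C); w1: (C,); w2: (C, H); w3: (H, C) (assumed shapes)."""
    x_prev = np.zeros_like(x)
    x_prev[1:] = x[:-1]                   # x[t - 1], zero-padded at t = 0 (assumption)
    # d[t]: per-token, per-channel mixing weight in (0, 1)
    gate = sigmoid(silu(lerp(x, x_prev, w1) @ w2) @ w3)
    return lerp(x, x_prev, gate)          # mix each token with its predecessor

rng = np.random.default_rng(0)
T, C, H = 6, 8, 4
x = rng.normal(size=(T, C))
shifted = dynamic_token_shift(
    x, rng.uniform(size=C), rng.normal(size=(C, H)), rng.normal(size=(H, C))
)
print(shifted.shape)  # (6, 8)
```

Because the mixing weight is computed from the data, each token decides per channel how much of the previous token to blend in, rather than using one fixed learned ratio.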

And gated residual:

R -> R + Block(x) * sigmoid(silu(LayerNorm(R) * w1) * w2)
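
The gated residual can be sketched the same way. The sketch assumes a LayerNorm without learnable scale or bias, treats `w1` and `w2` as projection matrices applied by matrix product, and takes the gate product with `Block(x)` as elementwise. The block output is passed in as an argument because the formula does not specify how `x` is derived from `R`. The function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def silu(z):
    return z * sigmoid(z)

def layer_norm(x, eps=1e-5):
    # Normalize over the channel axis; no learnable affine (assumption).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def gated_residual(R, block_out, w1, w2):
    """R, block_out: (T, C); w1: (C, H); w2: (H, C) (assumed shapes)."""
    gate = sigmoid(silu(layer_norm(R) @ w1) @ w2)   # values in (0, 1)
    return R + block_out * gate                     # elementwise gate on the branch

rng = np.random.default_rng(0)
T, C, H = 6, 8, 4
R = rng.normal(size=(T, C))
block_out = rng.normal(size=(T, C))
R_next = gated_residual(R, block_out, rng.normal(size=(C, H)), rng.normal(size=(H, C)))
print(R_next.shape)  # (6, 8)
```

The gate is computed from the residual stream itself, so the stream controls how much of each block's output it accepts.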
