Sinkhorn Router - Pytorch (wip)

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise. It will contain both a causal and a non-causal variant. The causal variant will follow the example used in Megatron.
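
For intuition, the non-causal idea can be sketched in a few lines of plain PyTorch. This is an illustrative sketch of Sinkhorn normalization under stated assumptions, not this package's actual implementation: the function name `sinkhorn`, the `num_iters` default, and the (tokens, experts) score layout are all hypothetical. It alternately rescales the columns and rows of a score matrix so that every expert ends up with roughly the same share of tokens.

```python
import math
import torch

def sinkhorn(logits: torch.Tensor, num_iters: int = 20) -> torch.Tensor:
    """Approximately balance a (tokens, experts) score matrix (hypothetical helper).

    Works in log space: each iteration rescales the columns so every expert
    receives a total mass of tokens / experts, then rescales the rows so
    every token distributes a total mass of 1.
    """
    num_tokens, num_experts = logits.shape
    log_col_target = math.log(num_tokens / num_experts)
    log_p = logits
    for _ in range(num_iters):
        # column step: every expert gets the same total mass
        log_p = log_p - torch.logsumexp(log_p, dim=0, keepdim=True) + log_col_target
        # row step: every token's probabilities sum to 1
        log_p = log_p - torch.logsumexp(log_p, dim=1, keepdim=True)
    return log_p.exp()

torch.manual_seed(0)
scores = torch.randn(16, 4)          # 16 tokens, 4 experts
probs = sinkhorn(scores)             # balanced routing probabilities
assignment = probs.argmax(dim=-1)    # hard expert choice per token
```

Because each iteration normalizes over all tokens at once, this sketch only covers the non-causal case; a causal variant cannot let a token's routing depend on later tokens.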

Install

$ pip install sinkhorn-router-pytorch

Usage

import torch
from torch import nn
from sinkhorn_router_pytorch import SinkhornRouter

experts = nn.Parameter(torch.randn(8, 8, 512, 256)) # (experts, heads, dim [in], dim [out])

router = SinkhornRouter(
    dim = 512,
    experts = experts,
    competitive = True,
    causal = False,
)

x = torch.randn(1, 8, 1017, 512)
out = router(x) # (1, 8, 1017, 256)
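
To make the shape contract above concrete, here is a rough sketch of how expert weights shaped (experts, heads, dim [in], dim [out]) could be applied to an input shaped (1, 8, 1017, 512). Everything in it is an assumption for illustration: the routing decision `expert_idx` is random rather than Sinkhorn-based, the input layout is assumed to be (batch, heads, seq, dim), and the dense loop over experts is chosen for readability, not because the library works this way.

```python
import torch

batch, heads, seq, dim_in, dim_out, num_experts = 1, 8, 1017, 512, 256, 8

torch.manual_seed(0)
experts = torch.randn(num_experts, heads, dim_in, dim_out)  # (experts, heads, dim [in], dim [out])
x = torch.randn(batch, heads, seq, dim_in)

# placeholder routing decision (assumption): one expert index per (batch, head, token)
expert_idx = torch.randint(0, num_experts, (batch, heads, seq))

out = x.new_zeros(batch, heads, seq, dim_out)
for e in range(num_experts):
    # 1.0 where a token is routed to expert e, 0.0 elsewhere
    mask = (expert_idx == e).unsqueeze(-1).to(x.dtype)
    # per-head projection with expert e's weights: (b, h, n, d) x (h, d, o) -> (b, h, n, o)
    projected = torch.einsum('bhnd,hdo->bhno', x, experts[e])
    out = out + mask * projected

print(out.shape)
```

Each token is projected only by the weights of the expert it was routed to within its head, which is why the last dimension changes from 512 to 256 while the leading dimensions are preserved.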

Citations

@article{Shoeybi2019MegatronLMTM,
    title   = {Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
    author  = {Mohammad Shoeybi and Mostofa Patwary and Raul Puri and Patrick LeGresley and Jared Casper and Bryan Catanzaro},
    journal = {ArXiv},
    year    = {2019},
    volume  = {abs/1909.08053},
    url     = {https://api.semanticscholar.org/CorpusID:202660670}
}
@article{Anthony2024BlackMambaMO,
    title   = {BlackMamba: Mixture of Experts for State-Space Models},
    author  = {Quentin Anthony and Yury Tokpanov and Paolo Glorioso and Beren Millidge},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2402.01771},
    url     = {https://api.semanticscholar.org/CorpusID:267413070}
}
@article{Csordas2023SwitchHeadAT,
    title   = {SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention},
    author  = {R{\'o}bert Csord{\'a}s and Piotr Piekos and Kazuki Irie and J{\"u}rgen Schmidhuber},
    journal = {ArXiv},
    year    = {2023},
    volume  = {abs/2312.07987},
    url     = {https://api.semanticscholar.org/CorpusID:266191825}
}
