Sinkhorn Router - Pytorch (wip)

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise. Will contain both a causal and a non-causal variant. The causal variant will follow the example used in Megatron.
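
To give a feel for what Sinkhorn-based routing does, here is a minimal, dependency-free sketch of plain Sinkhorn normalization over a token-by-expert score matrix. It is not this package's implementation and not the Megatron variant. The function name `sinkhorn`, the iteration count, and the example scores are all assumptions made for illustration.

```python
import math

def sinkhorn(logits, num_iters=50):
    """Alternately normalize columns and rows of exp(logits).

    logits: list of rows (one per token), each a list of per-expert scores.
    Returns a matrix whose rows each sum to 1 and whose columns each sum to
    roughly num_tokens / num_experts, i.e. a balanced soft assignment.
    """
    n_tokens, n_experts = len(logits), len(logits[0])
    # Subtract the global max before exponentiating for numerical stability.
    peak = max(max(row) for row in logits)
    plan = [[math.exp(v - peak) for v in row] for row in logits]
    col_target = n_tokens / n_experts

    for _ in range(num_iters):
        # Column step: every expert receives the same total mass.
        col_sums = [sum(plan[i][j] for i in range(n_tokens)) for j in range(n_experts)]
        plan = [[plan[i][j] * col_target / col_sums[j] for j in range(n_experts)]
                for i in range(n_tokens)]
        # Row step: every token distributes a total mass of 1.
        plan = [[v / sum(row) for v in row] for row in plan]
    return plan

# Hypothetical scores: 4 tokens, 2 experts, every token prefers expert 0.
logits = [[2.0, 0.0], [1.5, 0.0], [1.0, 0.0], [0.5, 0.0]]
plan = sinkhorn(logits)

# Pick the expert with the largest balanced weight for each token.
assignment = [max(range(len(row)), key=row.__getitem__) for row in plan]
```

A greedy argmax over the raw scores would send all four tokens to expert 0. After balancing, the two tokens with the weakest preference for expert 0 are assigned to expert 1, which is the load-balancing effect a router of this kind is after.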

Install

$ pip install sinkhorn-router-pytorch

Usage

import torch
from torch import nn
from sinkhorn_router_pytorch import SinkhornRouter

experts = nn.Parameter(torch.randn(8, 8, 512, 256)) # (experts, heads, dim [in], dim [out])

router = SinkhornRouter(
    dim = 512,
    experts = experts,
    competitive = True,
    causal = False,
)

x = torch.randn(1, 8, 1017, 512)
out = router(x) # (1, 8, 1017, 256)

Citations

@article{Shoeybi2019MegatronLMTM,
    title   = {Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
    author  = {Mohammad Shoeybi and Mostofa Patwary and Raul Puri and Patrick LeGresley and Jared Casper and Bryan Catanzaro},
    journal = {ArXiv},
    year    = {2019},
    volume  = {abs/1909.08053},
    url     = {https://api.semanticscholar.org/CorpusID:202660670}
}
@article{Anthony2024BlackMambaMO,
    title   = {BlackMamba: Mixture of Experts for State-Space Models},
    author  = {Quentin Anthony and Yury Tokpanov and Paolo Glorioso and Beren Millidge},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2402.01771},
    url     = {https://api.semanticscholar.org/CorpusID:267413070}
}
@article{Csordas2023SwitchHeadAT,
    title   = {SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention},
    author  = {R{\'o}bert Csord{\'a}s and Piotr Piekos and Kazuki Irie and J{\"u}rgen Schmidhuber},
    journal = {ArXiv},
    year    = {2023},
    volume  = {abs/2312.07987},
    url     = {https://api.semanticscholar.org/CorpusID:266191825}
}
