Sinkhorn Router - Pytorch (wip)

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise. It will contain both a causal and a non-causal variant. The causal variant will follow the example used in Megatron.
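
For readers new to the idea, Sinkhorn routing can be sketched roughly as below. This is a minimal illustration of generic Sinkhorn normalization, not this library's internal code: the function name `sinkhorn`, the fixed iteration count, and the final argmax assignment are all assumptions made for the example.

```python
import torch

def sinkhorn(logits: torch.Tensor, n_iters: int = 8, eps: float = 1e-9) -> torch.Tensor:
    """Hypothetical helper: alternately normalize rows and columns of exp(logits)."""
    # subtract the max before exponentiating for numerical stability
    scores = torch.exp(logits - logits.max())
    for _ in range(n_iters):
        # row step: each token spreads a total mass of 1 across experts
        scores = scores / (scores.sum(dim=-1, keepdim=True) + eps)
        # column step: each expert receives a total mass of 1 across tokens
        scores = scores / (scores.sum(dim=-2, keepdim=True) + eps)
    return scores

torch.manual_seed(0)
num_tokens, num_experts = 16, 4

logits = torch.randn(num_tokens, num_experts)  # token-to-expert affinities
plan = sinkhorn(logits)                        # (num_tokens, num_experts)

# assumed assignment rule: send each token to its highest-scoring expert
expert_index = plan.argmax(dim=-1)             # (num_tokens,)
```

Because the column step runs last, every expert ends up with the same total mass, which is what pushes the assignment toward balanced expert load without an auxiliary loss.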

Install

$ pip install sinkhorn-router-pytorch

Usage

import torch
from torch import nn
from sinkhorn_router_pytorch import SinkhornRouter

experts = nn.Parameter(torch.randn(8, 8, 512, 256)) # (experts, heads, dim [in], dim [out])

router = SinkhornRouter(
    dim = 512,
    experts = experts,
    competitive = True,
    causal = False,
)

x = torch.randn(1, 8, 1017, 512) # (batch, heads, seq, dim [in])
out = router(x) # (1, 8, 1017, 256)
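
To make the shapes above concrete, here is a dense reference sketch of what routing each token to one expert per head could look like in plain PyTorch. It does not use `SinkhornRouter` and is not taken from the library: the random `expert_index` is a hypothetical stand-in for the router's decision, and the shorter sequence length is chosen only to keep the example light.

```python
import torch

torch.manual_seed(0)
batch, heads, seq, dim_in, dim_out, num_experts = 1, 8, 10, 512, 256, 8

experts = torch.randn(num_experts, heads, dim_in, dim_out)  # (experts, heads, dim [in], dim [out])
x = torch.randn(batch, heads, seq, dim_in)

# hypothetical routing decision: one expert index per token per head
expert_index = torch.randint(0, num_experts, (batch, heads, seq))

# dense reference: apply every expert to every token, then keep the routed one
all_out = torch.einsum('bhnd,ehdo->bhneo', x, experts)      # (batch, heads, seq, experts, dim [out])
index = expert_index[..., None, None].expand(-1, -1, -1, 1, dim_out)
out = all_out.gather(3, index).squeeze(3)                   # (batch, heads, seq, dim [out])
```

Computing every expert's output is wasteful and is shown only for clarity; a practical router would dispatch each token to its selected expert alone.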

Citations

@article{Shoeybi2019MegatronLMTM,
    title   = {Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
    author  = {Mohammad Shoeybi and Mostofa Patwary and Raul Puri and Patrick LeGresley and Jared Casper and Bryan Catanzaro},
    journal = {ArXiv},
    year    = {2019},
    volume  = {abs/1909.08053},
    url     = {https://api.semanticscholar.org/CorpusID:202660670}
}
@article{Anthony2024BlackMambaMO,
    title   = {BlackMamba: Mixture of Experts for State-Space Models},
    author  = {Quentin Anthony and Yury Tokpanov and Paolo Glorioso and Beren Millidge},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2402.01771},
    url     = {https://api.semanticscholar.org/CorpusID:267413070}
}
@article{Csordas2023SwitchHeadAT,
    title   = {SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention},
    author  = {R{\'o}bert Csord{\'a}s and Piotr Piekos and Kazuki Irie and J{\"u}rgen Schmidhuber},
    journal = {ArXiv},
    year    = {2023},
    volume  = {abs/2312.07987},
    url     = {https://api.semanticscholar.org/CorpusID:266191825}
}
