Sinkhorn Router - Pytorch (wip)

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise. It will contain both a causal and a non-causal variant. The causal variant will follow the example used in Megatron.
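
For readers new to the idea, here is a minimal sketch of Sinkhorn normalization applied to token-to-expert routing scores. It is not this package's implementation: the function name `sinkhorn_balance`, the iteration count, and the target of an equal share of tokens per expert are all assumptions made for illustration.

```python
import math
import torch

def sinkhorn_balance(logits: torch.Tensor, n_iters: int = 20) -> torch.Tensor:
    """Hypothetical helper: balance a (tokens, experts) score matrix.

    Alternates column and row normalization in log space so that each
    token's probabilities sum to 1 while each expert receives roughly
    tokens / experts total probability mass.
    """
    num_tokens, num_experts = logits.shape
    log_target = math.log(num_tokens / num_experts)
    log_p = logits
    for _ in range(n_iters):
        # Column step: every expert gets the same total mass.
        log_p = log_p - torch.logsumexp(log_p, dim=0, keepdim=True) + log_target
        # Row step: every token's distribution over experts sums to 1.
        log_p = log_p - torch.logsumexp(log_p, dim=1, keepdim=True)
    return log_p.exp()

torch.manual_seed(0)
logits = torch.randn(16, 4)          # 16 tokens, 4 experts
probs = sinkhorn_balance(logits)
assignment = probs.argmax(dim=-1)    # expert index chosen for each token
```

Because the loop ends on the row step, each row of `probs` sums to 1 up to floating-point error, while the column sums only approach `tokens / experts` as the number of iterations grows.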

Install

$ pip install sinkhorn-router-pytorch

Usage

import torch
from torch import nn
from sinkhorn_router_pytorch import SinkhornRouter

experts = nn.Parameter(torch.randn(8, 8, 512, 512)) # (experts, heads, dim [in], dim [out])

router = SinkhornRouter(
    dim = 512,
    experts = experts,
    competitive = True,
    causal = False,
)

x = torch.randn(1, 8, 1017, 512)
out = router(x)

assert x.shape == out.shape

Citations

@article{Shoeybi2019MegatronLMTM,
    title   = {Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
    author  = {Mohammad Shoeybi and Mostofa Patwary and Raul Puri and Patrick LeGresley and Jared Casper and Bryan Catanzaro},
    journal = {ArXiv},
    year    = {2019},
    volume  = {abs/1909.08053},
    url     = {https://api.semanticscholar.org/CorpusID:202660670}
}
@article{Anthony2024BlackMambaMO,
    title   = {BlackMamba: Mixture of Experts for State-Space Models},
    author  = {Quentin Anthony and Yury Tokpanov and Paolo Glorioso and Beren Millidge},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2402.01771},
    url     = {https://api.semanticscholar.org/CorpusID:267413070}
}
