Sinkhorn Router - Pytorch (wip)

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise. It will contain both a causal and a non-causal variant. The causal variant will follow the example used in Megatron.
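For readers new to the idea, here is a minimal sketch of Sinkhorn normalization applied to token-to-expert logits. It is not this library's implementation. The function name `sinkhorn`, the `n_iters` parameter, and the tensor sizes are assumptions made for illustration only. The sketch alternately normalizes over experts and over tokens in log space. After the final step each expert's column sums to 1, so probability mass is spread evenly across experts instead of collapsing onto a few of them.

```python
import torch

def sinkhorn(logits: torch.Tensor, n_iters: int = 8) -> torch.Tensor:
    """Alternately normalize rows and columns of exp(logits), in log space.

    logits: (tokens, experts). Hypothetical helper, not the library's API.
    """
    log_p = logits
    for _ in range(n_iters):
        # normalize over experts: each token's row sums to 1
        log_p = log_p - torch.logsumexp(log_p, dim=-1, keepdim=True)
        # normalize over tokens: each expert's column sums to 1
        log_p = log_p - torch.logsumexp(log_p, dim=-2, keepdim=True)
    return log_p.exp()

torch.manual_seed(0)
logits = torch.randn(16, 4)          # 16 tokens, 4 experts (illustrative sizes)
probs = sinkhorn(logits)             # balanced routing scores
expert_index = probs.argmax(dim=-1)  # one expert per token
```

Working in log space with `torch.logsumexp` avoids overflow when the logits are large, which is why the sketch never exponentiates until the end.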

Install

$ pip install sinkhorn-router-pytorch

Usage

import torch
from torch import nn
from sinkhorn_router_pytorch import SinkhornRouter

experts = nn.Parameter(torch.randn(8, 8, 512, 512)) # (experts, heads, dim [in], dim [out])

router = SinkhornRouter(
    dim = 512,
    experts = experts,
    competitive = True,
    causal = False,
)

x = torch.randn(1, 8, 1017, 512) # presumably (batch, heads, seq len, dim)
out = router(x)

assert x.shape == out.shape
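To see why the output shape can match the input shape, here is a rough sketch of applying per-head expert weights once each token has been assigned an expert. This is an assumption about what such a router computes, not the library's code. The helper `apply_experts`, the `expert_index` argument, and the small tensor sizes are all hypothetical. It assumes the input layout is (batch, heads, seq len, dim), which is inferred from the experts shape above.

```python
import torch
import torch.nn.functional as F

def apply_experts(x, experts, expert_index):
    """Multiply each token by the weight matrix of its assigned expert.

    x:            (batch, heads, seq, dim_in)
    experts:      (experts, heads, dim_in, dim_out)
    expert_index: (batch, heads, seq) integer expert id per token
    """
    num_experts = experts.shape[0]
    one_hot = F.one_hot(expert_index, num_experts).to(x.dtype)  # (b, h, n, e)
    # compute every expert's output, then keep the selected one;
    # simple to read, but wasteful compared with a real dispatch
    all_out = torch.einsum('bhnd,ehdo->bhneo', x, experts)
    return torch.einsum('bhneo,bhne->bhno', all_out, one_hot)

torch.manual_seed(0)
x_small = torch.randn(1, 2, 5, 4)                 # illustrative sizes
experts_small = torch.randn(3, 2, 4, 4)           # 3 experts, 2 heads
index_small = torch.randint(0, 3, (1, 2, 5))      # stand-in for a router's choice
out_small = apply_experts(x_small, experts_small, index_small)
```

Because dim [in] equals dim [out] in the experts tensor, the result has the same shape as the input, consistent with the assertion in the usage example.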

Citations

@article{Shoeybi2019MegatronLMTM,
    title   = {Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
    author  = {Mohammad Shoeybi and Mostofa Patwary and Raul Puri and Patrick LeGresley and Jared Casper and Bryan Catanzaro},
    journal = {ArXiv},
    year    = {2019},
    volume  = {abs/1909.08053},
    url     = {https://api.semanticscholar.org/CorpusID:202660670}
}
@article{Anthony2024BlackMambaMO,
    title   = {BlackMamba: Mixture of Experts for State-Space Models},
    author  = {Quentin Anthony and Yury Tokpanov and Paolo Glorioso and Beren Millidge},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2402.01771},
    url     = {https://api.semanticscholar.org/CorpusID:267413070}
}
