Sinkhorn Router - Pytorch (wip)

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise. It will contain both a causal and a non-causal variant. The causal variant will follow the example used in Megatron.
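For readers new to the idea, here is a minimal sketch of Sinkhorn normalization applied to a token-by-expert score matrix. It is not taken from this package; the function name `sinkhorn`, the iteration count, and the choice to balance columns to `tokens / experts` are assumptions made for illustration only.

```python
import math
import torch

def sinkhorn(logits: torch.Tensor, n_iters: int = 8) -> torch.Tensor:
    """Illustrative Sinkhorn normalization (hypothetical helper, not the package API).

    logits: (tokens, experts) routing scores.
    Returns a non-negative matrix whose columns each sum to tokens / experts
    and whose rows approach a sum of 1 as n_iters grows.
    """
    n_tokens, n_experts = logits.shape
    col_target = math.log(n_tokens / n_experts)
    log_p = logits
    for _ in range(n_iters):
        # Row step: each token distributes a total mass of 1 over the experts.
        log_p = log_p - torch.logsumexp(log_p, dim=-1, keepdim=True)
        # Column step: each expert receives an equal share of the total mass.
        log_p = log_p - torch.logsumexp(log_p, dim=-2, keepdim=True) + col_target
    return log_p.exp()

torch.manual_seed(0)
scores = torch.randn(16, 4)          # 16 tokens, 4 experts
plan = sinkhorn(scores)              # balanced soft assignment
assignment = plan.argmax(dim=-1)     # one possible hard routing decision per token
```

Working in log space avoids overflow when the scores are large. Because the loop ends on the column step, the load per expert is balanced exactly, while the per-token sums are only approximately 1.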

Install

$ pip install sinkhorn-router-pytorch

Usage

import torch
from torch import nn
from sinkhorn_router_pytorch import SinkhornRouter

experts = nn.Parameter(torch.randn(8, 8, 512, 512)) # (experts, heads, dim [in], dim [out])

router = SinkhornRouter(
    dim = 512,
    experts = experts,
    competitive = True,
    causal = False,
)

x = torch.randn(1, 8, 1017, 512)
out = router(x)

assert x.shape == out.shape
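To illustrate how an expert weight tensor of shape (experts, heads, dim [in], dim [out]) can produce an output with the same shape as the input, here is a hedged sketch. It assumes the input is laid out as (batch, heads, sequence, dim) and uses a hypothetical helper `apply_experts` with externally supplied gates; the actual routing inside `SinkhornRouter` may differ.

```python
import torch

def apply_experts(x: torch.Tensor, experts: torch.Tensor, gates: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper, not part of sinkhorn_router_pytorch.

    x:       (batch, heads, seq, dim_in)       assumed input layout
    experts: (experts, heads, dim_in, dim_out) per-expert, per-head weights
    gates:   (batch, heads, seq, experts)      routing weight of each token per expert
    """
    # Project every token through every expert, separately for each head.
    per_expert = torch.einsum('bhnd,ehdo->bhneo', x, experts)
    # Combine the expert outputs using the routing weights.
    return torch.einsum('bhneo,bhne->bhno', per_expert, gates)

torch.manual_seed(0)
x_small = torch.randn(1, 2, 5, 4)          # small sizes keep the dense sketch cheap
experts_small = torch.randn(3, 2, 4, 4)    # 3 experts, 2 heads, 4 -> 4

# One-hot gates that send every token to expert 0.
gates = torch.zeros(1, 2, 5, 3)
gates[..., 0] = 1.0

out_small = apply_experts(x_small, experts_small, gates)
```

This dense version evaluates every expert for every token, which is wasteful at scale. A practical router would dispatch each token only to its selected experts.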

Citations

@article{Shoeybi2019MegatronLMTM,
    title   = {Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
    author  = {Mohammad Shoeybi and Mostofa Patwary and Raul Puri and Patrick LeGresley and Jared Casper and Bryan Catanzaro},
    journal = {ArXiv},
    year    = {2019},
    volume  = {abs/1909.08053},
    url     = {https://api.semanticscholar.org/CorpusID:202660670}
}
@article{Anthony2024BlackMambaMO,
    title   = {BlackMamba: Mixture of Experts for State-Space Models},
    author  = {Quentin Anthony and Yury Tokpanov and Paolo Glorioso and Beren Millidge},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2402.01771},
    url     = {https://api.semanticscholar.org/CorpusID:267413070}
}
