Sinkhorn Router - Pytorch (wip)

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise. It will contain both a causal and a non-causal variant. The causal variant will follow the example used in Megatron.
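
For intuition, the normalization that gives a Sinkhorn router its name can be sketched in plain Python. This is a minimal illustration and not this package's implementation: the function name `sinkhorn`, the parameters `num_iters` and `eps`, and the uniform per-expert target are all assumptions made for the example. It alternately rescales columns (experts) and rows (tokens) of the exponentiated routing logits so that every token spreads a total weight of about 1 while every expert receives roughly the same total weight.

```python
import math

def sinkhorn(logits, num_iters=8, eps=1e-9):
    """Hypothetical helper: alternately normalize columns and rows of exp(logits).

    logits: list of rows, one row per token and one column per expert.
    Returns positive scores where each row sums to about 1 and, once the
    iteration has converged, each column sums to about num_tokens / num_experts.
    """
    num_tokens, num_experts = len(logits), len(logits[0])

    # Subtract the global max before exponentiating for numerical stability.
    max_logit = max(max(row) for row in logits)
    scores = [[math.exp(v - max_logit) for v in row] for row in logits]

    # Assumption: every expert should receive the same share of the tokens.
    col_target = num_tokens / num_experts

    for _ in range(num_iters):
        # Column step: rescale so each expert's total matches the target.
        col_sums = [sum(row[j] for row in scores) for j in range(num_experts)]
        scores = [
            [row[j] * col_target / (col_sums[j] + eps) for j in range(num_experts)]
            for row in scores
        ]
        # Row step: rescale so each token distributes a total weight of 1.
        scores = [[v / (sum(row) + eps) for v in row] for row in scores]

    return scores

# Six tokens that all prefer expert 0 before balancing.
logits = [
    [2.0, 0.1, 0.0],
    [1.8, 0.2, 0.1],
    [2.2, 0.0, 0.3],
    [1.9, 0.1, 0.2],
    [2.1, 0.3, 0.0],
    [2.0, 0.2, 0.1],
]
balanced = sinkhorn(logits, num_iters=50)
expert_load = [sum(row[j] for row in balanced) for j in range(3)]
print(expert_load)  # per-expert totals after balancing
```

Without the balancing step, a plain softmax over these logits would send most of the weight to expert 0. The alternating normalization is what pushes the load toward an even split across experts.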

Install

$ pip install sinkhorn-router-pytorch

Usage

import torch
from torch import nn
from sinkhorn_router_pytorch import SinkhornRouter

experts = nn.Parameter(torch.randn(8, 8, 512, 256)) # (experts, heads, dim [in], dim [out])

router = SinkhornRouter(
    dim = 512,
    experts = experts,
    competitive = True,
    causal = False,
)

x = torch.randn(1, 8, 1017, 512)
out = router(x)

assert x.shape[:-1] == out.shape[:-1]

Citations

@article{Shoeybi2019MegatronLMTM,
    title   = {Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
    author  = {Mohammad Shoeybi and Mostofa Patwary and Raul Puri and Patrick LeGresley and Jared Casper and Bryan Catanzaro},
    journal = {ArXiv},
    year    = {2019},
    volume  = {abs/1909.08053},
    url     = {https://api.semanticscholar.org/CorpusID:202660670}
}
@article{Anthony2024BlackMambaMO,
    title   = {BlackMamba: Mixture of Experts for State-Space Models},
    author  = {Quentin Anthony and Yury Tokpanov and Paolo Glorioso and Beren Millidge},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2402.01771},
    url     = {https://api.semanticscholar.org/CorpusID:267413070}
}
