Sinkhorn Router - Pytorch (wip)

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise. It will contain both a causal and a non-causal variant. The causal variant will follow the example used in Megatron.

Install

$ pip install sinkhorn-router-pytorch

Usage

import torch
from torch import nn
from sinkhorn_router_pytorch import SinkhornRouter

experts = nn.Parameter(torch.randn(8, 8, 512, 256)) # (experts, heads, dim [in], dim [out])

router = SinkhornRouter(
    dim = 512,
    experts = experts,
    competitive = True,
    causal = False,
)

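x = torch.randn(1, 8, 1017, 512) # (batch, heads, seq, dim [in])
out = router(x)

# all leading dimensions are preserved; only the last (feature) dimension may differ
assert x.shape[:-1] == out.shape[:-1]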
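
For readers unfamiliar with the idea behind a Sinkhorn-based router, the following is a minimal sketch of Sinkhorn normalization applied to token-to-expert scores. It is not this library's implementation: the function name `sinkhorn`, the `iters` parameter, and the toy `logits` are assumptions made for illustration, and it is written in plain Python so it runs without any dependencies. The same alternating row and column normalization translates directly to tensor operations.

```python
import math

def sinkhorn(logits, iters=50):
    """Illustrative Sinkhorn normalization (hypothetical helper, not the library API).

    logits: list of rows, one per token, each holding one score per expert.
    Returns a matrix whose rows sum to 1 and whose columns each sum to
    approximately num_tokens / num_experts, i.e. a balanced soft assignment.
    """
    n_tokens, n_experts = len(logits), len(logits[0])
    col_target = n_tokens / n_experts

    # subtract the global max before exponentiating, for numerical stability
    m = max(max(row) for row in logits)
    p = [[math.exp(v - m) for v in row] for row in logits]

    for _ in range(iters):
        # column step: rescale so every expert receives the same total mass
        col_sums = [sum(p[i][j] for i in range(n_tokens)) for j in range(n_experts)]
        p = [[p[i][j] * col_target / col_sums[j] for j in range(n_experts)]
             for i in range(n_tokens)]
        # row step: rescale so every token distributes a total mass of 1
        row_sums = [sum(row) for row in p]
        p = [[v / s for v in row] for row, s in zip(p, row_sums)]
    return p

# toy scores: 4 tokens, 2 experts; most tokens initially prefer expert 0
logits = [[2.0, 0.1], [1.5, 0.2], [1.8, 0.0], [0.3, 0.4]]
plan = sinkhorn(logits)

# pick the expert with the largest balanced score for each token
assignment = [max(range(len(row)), key=row.__getitem__) for row in plan]
```

Compared with a plain softmax over experts, the column step is what pushes the routing toward equal expert load; how the actual router turns the balanced scores into a dispatch (and how the causal variant constrains it) is left to the library.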

Citations

@article{Shoeybi2019MegatronLMTM,
    title   = {Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
    author  = {Mohammad Shoeybi and Mostofa Patwary and Raul Puri and Patrick LeGresley and Jared Casper and Bryan Catanzaro},
    journal = {ArXiv},
    year    = {2019},
    volume  = {abs/1909.08053},
    url     = {https://api.semanticscholar.org/CorpusID:202660670}
}
@article{Anthony2024BlackMambaMO,
    title   = {BlackMamba: Mixture of Experts for State-Space Models},
    author  = {Quentin Anthony and Yury Tokpanov and Paolo Glorioso and Beren Millidge},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2402.01771},
    url     = {https://api.semanticscholar.org/CorpusID:267413070}
}
