Sinkhorn Router - Pytorch (wip)

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise. It will contain both a causal and a non-causal variant. The causal variant will follow the example used in Megatron.
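For context, Sinkhorn routing balances token-to-expert assignments by alternately normalizing the rows and columns of a score matrix. The following is a minimal sketch of that normalization in plain PyTorch, not this package's implementation. The function name `sinkhorn`, the `n_iters` parameter, and the (tokens, experts) layout are assumptions made for illustration.

```python
import math
import torch

def sinkhorn(logits: torch.Tensor, n_iters: int = 20) -> torch.Tensor:
    """Hypothetical helper: balance a (tokens, experts) score matrix.

    Works in log space for numerical stability. Each iteration first
    rescales the columns so every expert receives the same total mass,
    then rescales the rows so every token's weights sum to 1.
    """
    n_tokens, n_experts = logits.shape
    # Target mass per expert so that total mass equals the number of tokens.
    log_col_target = math.log(n_tokens / n_experts)

    log_p = logits
    for _ in range(n_iters):
        # Column step: equal load across experts.
        log_p = log_p - torch.logsumexp(log_p, dim=0, keepdim=True) + log_col_target
        # Row step: each token distributes exactly one unit of mass.
        log_p = log_p - torch.logsumexp(log_p, dim=1, keepdim=True)
    return log_p.exp()

logits = torch.randn(16, 4)   # 16 tokens, 4 experts
probs = sinkhorn(logits)
print(probs.sum(dim=1))       # per-token totals, each 1 after the final row step
print(probs.sum(dim=0))       # per-expert totals, roughly 16 / 4 once converged
```

Because the loop ends on the row step, the per-token sums are exact up to floating point error, while the per-expert sums only approach their target as `n_iters` grows.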

Install

$ pip install sinkhorn-router-pytorch

Usage

import torch
from torch import nn
from sinkhorn_router_pytorch import SinkhornRouter

experts = nn.Parameter(torch.randn(8, 8, 512, 512)) # (experts, heads, dim [in], dim [out])

router = SinkhornRouter(
    dim = 512,
    experts = experts,
    competitive = True,
    causal = False,
)

x = torch.randn(1, 8, 1017, 512)
out = router(x)

# the routed output has the same shape as the input
assert x.shape == out.shape
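To illustrate why the output can keep the input's shape, here is a rough sketch of applying per-head expert weights of shape (experts, heads, dim [in], dim [out]) to tokens after a hard top-1 choice. It is not the package's internal code. The helper `apply_top1_experts`, the random `scores` tensor standing in for router scores, and the (batch, heads, tokens, dim) input layout are all assumptions.

```python
import torch
import torch.nn.functional as F

def apply_top1_experts(x, experts, scores):
    """Hypothetical helper, for illustration only.

    x:       (batch, heads, tokens, dim_in)
    experts: (experts, heads, dim_in, dim_out)
    scores:  (batch, heads, tokens, experts), stand-in routing scores
    """
    num_experts = experts.shape[0]
    choice = scores.argmax(dim=-1)                      # (b, h, n), chosen expert per token
    mask = F.one_hot(choice, num_experts).to(x.dtype)   # (b, h, n, e), 1 at the chosen expert
    # Dense formulation: every expert is evaluated and the mask keeps one.
    # Simple to read, but wasteful compared with gathering tokens per expert.
    return torch.einsum('bhnd,ehdo,bhne->bhno', x, experts, mask)

x = torch.randn(2, 2, 5, 4)         # small sizes so the dense einsum stays cheap
experts = torch.randn(3, 2, 4, 4)   # dim_in == dim_out, so the shape is preserved
scores = torch.randn(2, 2, 5, 3)

out = apply_top1_experts(x, experts, scores)
assert out.shape == x.shape
```

With dim [in] equal to dim [out], as in the usage above, each token is mapped by one square expert matrix, so the result has the same shape as `x`.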

Citations

@article{Shoeybi2019MegatronLMTM,
    title   = {Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
    author  = {Mohammad Shoeybi and Mostofa Patwary and Raul Puri and Patrick LeGresley and Jared Casper and Bryan Catanzaro},
    journal = {ArXiv},
    year    = {2019},
    volume  = {abs/1909.08053},
    url     = {https://api.semanticscholar.org/CorpusID:202660670}
}
@article{Anthony2024BlackMambaMO,
    title   = {BlackMamba: Mixture of Experts for State-Space Models},
    author  = {Quentin Anthony and Yury Tokpanov and Paolo Glorioso and Beren Millidge},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2402.01771},
    url     = {https://api.semanticscholar.org/CorpusID:267413070}
}
