Sinkhorn Router - Pytorch (wip)

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise. It will contain both a causal and a non-causal variant. The causal variant will follow the example used in Megatron.
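
For readers new to the idea, Sinkhorn routing balances a token-to-expert score matrix by alternately normalizing its columns and rows. The following is a minimal, dependency-free sketch in plain Python, not this library's implementation. The function name `sinkhorn`, the `num_iters` parameter, and the equal-load column target are assumptions chosen for illustration.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) for a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def sinkhorn(logits, num_iters=8):
    """Balance a (tokens x experts) score matrix in log space.

    Illustrative only: after the loop each row (token) sums to 1 and
    each column (expert) sums to approximately tokens / experts.
    """
    n, e = len(logits), len(logits[0])
    log_p = [row[:] for row in logits]   # copy so the input is untouched
    col_target = math.log(n / e)         # assumed equal load per expert
    for _ in range(num_iters):
        # Column step: every expert receives the same total mass.
        for j in range(e):
            lse = logsumexp([log_p[i][j] for i in range(n)])
            for i in range(n):
                log_p[i][j] += col_target - lse
        # Row step: every token distributes a total mass of 1.
        for i in range(n):
            lse = logsumexp(log_p[i])
            log_p[i] = [v - lse for v in log_p[i]]
    return [[math.exp(v) for v in row] for row in log_p]

# Raw scores in which every token prefers expert 0.
scores = [[2.0, 0.0], [1.5, 0.1], [1.0, 0.3], [0.2, 0.0]]
balanced = sinkhorn(scores)
# Hard assignment: the expert with the highest balanced score per token.
assignment = [max(range(len(row)), key=row.__getitem__) for row in balanced]
```

Choosing the highest balanced score per token tends to spread tokens across experts more evenly than an argmax over the raw scores would.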

Install

$ pip install sinkhorn-router-pytorch

Usage

import torch
from torch import nn
from sinkhorn_router_pytorch import SinkhornRouter

experts = nn.Parameter(torch.randn(8, 8, 512, 256)) # (experts, heads, dim [in], dim [out])

router = SinkhornRouter(
    dim = 512,
    experts = experts,
    competitive = True,
    causal = False,
)

x = torch.randn(1, 8, 1017, 512)
out = router(x) # (1, 8, 1017, 256)
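
As a rough mental model for the shapes above, each expert holds a (dim in, dim out) matrix, and a routed token is projected by the matrix of the expert it was assigned to. The sketch below shows hard top-1 dispatch in plain Python for a single head. It is an assumption-laden illustration, not the library's code path: `route_top1` is a hypothetical helper, and the real router may also weight outputs by a gate value and handle heads and batching.

```python
def route_top1(tokens, scores, experts):
    """Project each token with the weights of its highest-scoring expert.

    tokens:  list of vectors, each of length dim_in
    scores:  list of per-token score rows, each of length num_experts
    experts: list of matrices, each dim_in x dim_out (nested lists)
    """
    outputs = []
    for x, row in zip(tokens, scores):
        # Index of the expert with the highest score for this token.
        k = max(range(len(row)), key=row.__getitem__)
        w = experts[k]
        dim_out = len(w[0])
        # Vector-matrix product: (dim_in) @ (dim_in x dim_out) -> (dim_out)
        outputs.append(
            [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(dim_out)]
        )
    return outputs
```

Every output vector has length dim out regardless of which expert was chosen, which matches the last dimension changing from 512 to 256 in the usage example.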

Citations

@article{Shoeybi2019MegatronLMTM,
    title   = {Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism},
    author  = {Mohammad Shoeybi and Mostofa Patwary and Raul Puri and Patrick LeGresley and Jared Casper and Bryan Catanzaro},
    journal = {ArXiv},
    year    = {2019},
    volume  = {abs/1909.08053},
    url     = {https://api.semanticscholar.org/CorpusID:202660670}
}
@article{Anthony2024BlackMambaMO,
    title   = {BlackMamba: Mixture of Experts for State-Space Models},
    author  = {Quentin Anthony and Yury Tokpanov and Paolo Glorioso and Beren Millidge},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2402.01771},
    url     = {https://api.semanticscholar.org/CorpusID:267413070}
}
