Hyper-Connections

An attempt to make multiple residual streams, as proposed in the Hyper-Connections paper from ByteDance's AI lab, accessible as an easy-to-use library, and to follow any new research in this direction.

Write-up on mHC by Subhadip Mitra

Install

$ pip install hyper-connections

Usage

import torch
from torch import nn

# a single branch layer

branch = nn.Linear(512, 512)

# before

residual = torch.randn(2, 1024, 512)

residual = branch(residual) + residual

# after, with e.g. 4 residual streams as in the paper

from hyper_connections import get_init_and_expand_reduce_stream_functions

init_hyper_conn, expand_stream, reduce_stream = get_init_and_expand_reduce_stream_functions(4)

# 1. wrap your branch function

hyper_conn_branch = init_hyper_conn(dim = 512, branch = branch)

# 2. expand to 4 streams; this must be done before your trunk (typically a for-loop over many branch functions)

residual = expand_stream(residual)

# 3. forward your residual as usual into the wrapped branch function(s)

residual = hyper_conn_branch(residual)

# 4. reduce the 4 streams with a summation; this has to be done after your for-loop trunk. for a transformer, it is unclear whether to do this before or after the final norm

residual = reduce_stream(residual)

Or do it manually, as in the paper:

import torch
from torch import nn

# a single branch layer

branch = nn.Linear(512, 512)

# before

residual = torch.randn(2, 1024, 512)

residual = branch(residual) + residual

# after, with e.g. 4 residual streams as in the paper

from hyper_connections import get_init_and_expand_reduce_stream_functions

init_hyper_conn, expand_stream, reduce_stream = get_init_and_expand_reduce_stream_functions(4)

# 1. instantiate hyper connection with correct number of streams (4 in this case) - or use the init function above

hyper_conn = init_hyper_conn(dim = 512)

# 2. expand to 4 streams

residual = expand_stream(residual)

# 3. forward your residual into the hyper connection to get the branch input and an add-residual function (learned betas)

branch_input, add_residual = hyper_conn(residual)

branch_output = branch(branch_input)

residual = add_residual(branch_output)

# or do it in one line -> residual = hyper_conn.decorate_branch(branch)(residual)

# 4. reduce 4 streams with a summation, this has to be done after your for loop trunk

residual = reduce_stream(residual)
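
To make the roles of `branch_input` and `add_residual` more concrete, here is a rough plain-PyTorch sketch of one static hyper connection step. It is not the library's implementation. The names `alpha_branch`, `alpha_residual` and `beta`, the separate stream axis, and the fixed (non-learned, input-independent) weights are all assumptions made for illustration.

```python
import torch
from torch import nn

num_streams, dim = 4, 512
branch = nn.Linear(dim, dim)

# illustrative fixed weights (hypothetical names; in practice these would be learned)
alpha_branch = torch.zeros(num_streams)
alpha_branch[0] = 1.                        # the branch reads from stream 0 only
alpha_residual = torch.eye(num_streams)     # no mixing between streams
beta = torch.ones(num_streams)              # branch output is written to every stream

residual = torch.randn(2, 1024, dim)

# expand: copy the residual into a separate stream axis -> (batch, streams, seq, dim)
streams = residual.unsqueeze(1).repeat(1, num_streams, 1, 1)

# "width" step: weighted sum of streams gives the branch input, plus stream mixing
branch_input = torch.einsum('s,bsnd->bnd', alpha_branch, streams)
mixed = torch.einsum('st,bsnd->btnd', alpha_residual, streams)

branch_output = branch(branch_input)

# "depth" step: add the branch output back onto each stream, scaled by beta
streams = mixed + beta.view(1, -1, 1, 1) * branch_output.unsqueeze(1)

# reduce: sum over the stream axis
out = streams.sum(dim = 1)
```

With these particular weights every stream ends up holding `residual + branch(residual)`, so the summed output is 4 times the plain residual result. Learned weights let each layer choose which streams it reads from and writes to.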

To compare hyper connections against a plain residual without changing your code, pass disable = True when fetching the functions:

get_init_and_expand_reduce_stream_functions(4, disable = True)

To use the fractionated feature dimensions proposed in a follow-up paper by the same authors, instantiate with num_fracs greater than 1:

get_init_and_expand_reduce_stream_functions(1, num_fracs = 4) # also allows you to mix streams and fractions of feature dimension

Citation

@article{Zhu2024HyperConnections,
    title   = {Hyper-Connections},
    author  = {Defa Zhu and Hongzhi Huang and Zihao Huang and Yutao Zeng and Yunyao Mao and Banggu Wu and Qiyang Min and Xun Zhou},
    journal = {ArXiv},
    year    = {2024},
    volume  = {abs/2409.19606},
    url     = {https://api.semanticscholar.org/CorpusID:272987528}
}
@misc{Rubin2024,
    author  = {Ohad Rubin},
    url     = {https://medium.com/@ohadrubin/exploring-weight-decay-in-layer-normalization-challenges-and-a-reparameterization-solution-ad4d12c24950}
}
@article{Zhu2025FracConnectionsFE,
    title   = {Frac-Connections: Fractional Extension of Hyper-Connections},
    author  = {Defa Zhu and Hongzhi Huang and Jundong Zhou and Zihao Huang and Yutao Zeng and Banggu Wu and Qiyang Min and Xun Zhou},
    journal = {ArXiv},
    year    = {2025},
    volume  = {abs/2503.14125},
    url     = {https://api.semanticscholar.org/CorpusID:277104144}
}
@misc{xie2025mhcmanifoldconstrainedhyperconnections,
    title   = {mHC: Manifold-Constrained Hyper-Connections},
    author  = {Zhenda Xie and Yixuan Wei and Huanqi Cao and Chenggang Zhao and Chengqi Deng and Jiashi Li and Damai Dai and Huazuo Gao and Jiang Chang and Liang Zhao and Shangyan Zhou and Zhean Xu and Zhengyan Zhang and Wangding Zeng and Shengding Hu and Yuqing Wang and Jingyang Yuan and Lean Wang and Wenfeng Liang},
    year    = {2025},
    eprint  = {2512.24880},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL},
    url     = {https://arxiv.org/abs/2512.24880},
}
@misc{oh2026revisitingresidualconnectionsorthogonal,
    title   = {Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks},
    author  = {Giyeong Oh and Woohyun Cho and Siyeol Kim and Suhwan Choi and Youngjae Yu},
    year    = {2026},
    eprint  = {2505.11881},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV},
    url     = {https://arxiv.org/abs/2505.11881},
}
@misc{lu2026meanmodescreamingmeanvariance,
    title   = {Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers},
    author  = {Pengqi Lu},
    year    = {2026},
    eprint  = {2605.06169},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG},
    url     = {https://arxiv.org/abs/2605.06169},
}
