2 projects
dog-optimizer
implementation of the algorithms in the paper DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
py-sled
SLED models use pretrained, short-range encoder-decoder models, and apply them over long-text inputs by splitting the input into multiple overlapping chunks, encoding each independently and perform fusion-in-decoder