Automatic Cross-Replica Sharding of Weight Update in Data-Parallel Training – PaperGrep

https://papergrep.dev/paper/automatic-cross-replica-sharding-of-weight-update-2747b7