ZeRO: Memory Optimizations Toward Training Trillion Parameter Models – PaperGrep

