Reducing Activation Recomputation in Large Transformer Models – PaperGrep

https://papergrep.dev/paper/reducing-activation-recomputation-in-large-d6f0c7