GLaM: Efficient Scaling of Language Models with Mixture-of-Experts – PaperGrep

https://papergrep.dev/paper/glam-efficient-scaling-of-language-models-with-9eb061