mixture-of-experts
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models