Memory Layers at Scale is an implementation of memory layers that adds extra parameters to a model through a trainable key-value lookup mechanism without increasing its floating-point operations. The approach is significant for large-scale language models because it expands what the model can store and retrieve while keeping compute roughly constant: capacity grows with the number of memory slots, but only a small number of slots are looked up per token. Its main advantages are effective model capacity expansion, low additional computational cost, and improved flexibility and scalability. Developed by the Meta Lingua team, the project is suited to scenarios involving large datasets and complex models.
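
To make the key-value lookup idea concrete, here is a minimal sketch of a trainable memory layer in PyTorch. The class name, dimensions, and the simple dense key scoring are illustrative assumptions, not the project's actual implementation (which uses product-key lookup and further optimizations to avoid scoring every key); the sketch only shows how a token retrieves a sparse, weighted mixture of learned values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KeyValueMemoryLayer(nn.Module):
    """Illustrative trainable key-value memory layer (hypothetical names/shapes)."""

    def __init__(self, dim: int, num_keys: int = 1024, top_k: int = 4):
        super().__init__()
        self.query_proj = nn.Linear(dim, dim)  # maps hidden states to queries
        # Trainable keys and values: extra parameters that grow with num_keys
        self.keys = nn.Parameter(torch.randn(num_keys, dim) * dim ** -0.5)
        self.values = nn.Parameter(torch.randn(num_keys, dim) * dim ** -0.5)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim)
        q = self.query_proj(x)                           # (batch, seq, dim)
        scores = q @ self.keys.t()                       # (batch, seq, num_keys)
        topk_scores, topk_idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(topk_scores, dim=-1)         # weights over selected keys only
        selected = self.values[topk_idx]                 # (batch, seq, top_k, dim)
        out = (weights.unsqueeze(-1) * selected).sum(dim=-2)
        return x + out                                   # residual connection

# Example usage: apply the memory layer to a batch of hidden states
layer = KeyValueMemoryLayer(dim=64)
hidden = torch.randn(2, 10, 64)
print(layer(hidden).shape)  # torch.Size([2, 10, 64])
```

The key point the sketch illustrates is that only `top_k` value vectors are gathered and mixed per token, so the parameter count can be scaled up (more keys and values) while the per-token work of the value aggregation stays small; the real implementation also avoids the dense key scoring shown here.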