Transformer-based large language models can learn new tasks from just a few examples supplied as context (in-context learning). However, researchers at DeepMind have found that Transformers fail to generalize beyond the scope of their pre-training data. Through empirical studies, the researchers examined the generalization behavior of Transformer models and found that in-context learning amounts to a model selection capability, which imposes limits on how well the model generalizes.
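To make the notion of "learning from a few examples by providing contextual samples" concrete, here is a minimal sketch of how a few-shot prompt is typically assembled. This is an illustrative assumption, not code from the DeepMind study: the `build_few_shot_prompt` helper and the word-length toy task are hypothetical.

```python
def build_few_shot_prompt(examples, query):
    """Concatenate (input, output) demonstrations followed by the final query.

    The model sees these demonstrations in its context window and must infer
    the task from them alone -- no weight updates occur (in-context learning).
    """
    lines = [f"Input: {x} -> Output: {y}" for x, y in examples]
    lines.append(f"Input: {query} -> Output:")
    return "\n".join(lines)

# Toy task (hypothetical): map a word to its length.
demos = [("cat", 3), ("house", 5), ("sky", 3)]
prompt = build_few_shot_prompt(demos, "train")
print(prompt)
```

The point of the DeepMind finding is that a Transformer completing such a prompt succeeds mainly when the demonstrated task resembles function families present in its pre-training data; on tasks outside that distribution, performance degrades.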