A research team from Yale University recently published a study offering a key insight into AI model training: models do not learn best from the simplest or the most complex data, but from data at an intermediate level of complexity, a state known as the "edge of chaos."

The team ran experiments with elementary cellular automata (ECAs), simple systems in which each cell's next state depends only on its own state and the states of its two neighboring cells. Despite the simplicity of the rules, ECAs can generate patterns ranging from trivially simple to highly complex. The researchers trained language models on sequences generated by these automata and then evaluated the models on downstream tasks such as reasoning and chess move prediction.
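
For intuition, here is a minimal sketch of an ECA in Python. The function names, the random initial row, the wrap-around boundary, and the choice of rule 110 are illustrative assumptions, not the study's exact setup.

```python
import numpy as np

def eca_step(row: np.ndarray, rule: int) -> np.ndarray:
    """Apply one update of the given ECA rule (0-255) to a row of 0/1 cells."""
    # Each cell's next state is looked up from the rule's 8-bit table using the
    # 3-cell neighborhood (left, center, right); wrap-around edges are assumed.
    left, right = np.roll(row, 1), np.roll(row, -1)
    neighborhood = (left << 2) | (row << 1) | right  # values 0..7
    table = np.array([(rule >> i) & 1 for i in range(8)], dtype=np.uint8)
    return table[neighborhood]

def run_eca(rule: int, width: int = 64, steps: int = 32) -> np.ndarray:
    """Run an ECA from a random initial row; returns one row per timestep."""
    row = np.random.default_rng(0).integers(0, 2, width, dtype=np.uint8)
    history = [row]
    for _ in range(steps):
        row = eca_step(row, rule)
        history.append(row)
    return np.stack(history)

if __name__ == "__main__":
    # Rule 110 is a commonly cited Class IV rule: structured but non-repeating.
    for r in run_eca(rule=110)[:10]:
        print("".join("#" if c else "." for c in r))
```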

The results indicate that models trained on the output of more complex ECA rules performed better on downstream tasks. The best results came from models trained on Class IV ECAs in Wolfram's classification, rules that produce patterns which are neither entirely ordered nor completely chaotic but exhibit structured complexity.
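
To make the classification concrete, the sketch below lists commonly cited example rules for each Wolfram class and shows one possible way ECA output could be serialized into text for next-token training. The rule choices, helper names, and data format are assumptions for illustration, not the study's actual pipeline.

```python
import numpy as np

# Commonly cited example rules for each of Wolfram's four classes; these are
# illustrative picks, not necessarily the rules used in the study.
CLASS_EXAMPLES = {
    "I (uniform)":   [0, 32],
    "II (periodic)": [4, 108],
    "III (chaotic)": [30, 90],
    "IV (complex)":  [54, 110],
}

def spacetime(rule: int, width: int = 32, steps: int = 16, seed: int = 0) -> np.ndarray:
    """Run an ECA from a random row and return the stacked spacetime diagram."""
    table = np.array([(rule >> i) & 1 for i in range(8)], dtype=np.uint8)
    row = np.random.default_rng(seed).integers(0, 2, width, dtype=np.uint8)
    rows = [row]
    for _ in range(steps):
        row = table[(np.roll(row, 1) << 2) | (row << 1) | np.roll(row, -1)]
        rows.append(row)
    return np.stack(rows)

def to_text(diagram: np.ndarray) -> str:
    """Serialize a 0/1 diagram into a newline-separated string, usable as raw
    text for a character-level next-token training objective."""
    return "\n".join("".join(map(str, r)) for r in diagram)

# Build a tiny corpus per class; a model trained on the Class IV corpus sees
# sequences that are structured but never settle into simple repetition.
corpus = {name: [to_text(spacetime(r)) for r in rules]
          for name, rules in CLASS_EXAMPLES.items()}
print({name: sum(len(t) for t in texts) for name, texts in corpus.items()})
```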

The researchers found that models exposed to overly simple patterns tend to learn only simple solutions. In contrast, models trained on more complex patterns can develop more sophisticated processing abilities, even when simpler solutions would suffice. The team hypothesizes that the complexity of the learned representations is a key factor in enabling models to transfer knowledge to other tasks.

This finding may help explain why large language models such as GPT-3 and GPT-4 are so capable. The researchers suggest that the vast and diverse data used to train these models may produce an effect similar to that of the complex ECA patterns in their study.