A recent study suggests that, with targeted training, language models can partly internalize multi-step reasoning and carry it out more efficiently. This deliberate mode of processing resembles the "System 2" thinking described by psychologist Daniel Kahneman: a slow, conscious way of processing information, in contrast to fast, intuitive "System 1" responses.
Researchers at Meta have developed a method that "distills" the computationally expensive multi-step reasoning process into the parameters of a language model. Their results show that, in some cases, models trained this way match the performance of the original multi-step process at a far lower computational cost.
The distillation method works in three steps: first, apply a multi-step reasoning method to a large set of example inputs; next, filter the outputs and keep only those that are highly self-consistent; finally, use the surviving input-answer pairs to fine-tune the language model. In essence, the method generates synthetic training data that teaches the model to produce the final answer directly, without the intermediate reasoning steps.
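To make the pipeline concrete, here is a minimal sketch of what the filtering and data-generation stage might look like. It is illustrative only: the `sample_system2_answers` callable, the majority-vote consistency check, and the `min_agreement` threshold are assumptions made for this sketch, not details taken from the study.

```python
from collections import Counter
from typing import Callable, Iterable

def build_distillation_data(
    prompts: Iterable[str],
    sample_system2_answers: Callable[[str, int], list[str]],
    n_samples: int = 8,
    min_agreement: float = 0.75,
) -> list[dict]:
    """Turn multi-step ("System 2") outputs into direct-answer training data.

    `sample_system2_answers(prompt, n)` is assumed to run some multi-step
    reasoning method n times and return only the final answers; the
    intermediate reasoning traces are deliberately discarded.
    """
    dataset = []
    for prompt in prompts:
        answers = sample_system2_answers(prompt, n_samples)
        top_answer, count = Counter(answers).most_common(1)[0]
        # Self-consistency filter: keep an example only when most of the
        # sampled reasoning runs converge on the same final answer.
        if count / len(answers) >= min_agreement:
            dataset.append({"input": prompt, "output": top_answer})
    return dataset
```

The resulting input-answer pairs would then feed an ordinary supervised fine-tuning run, so the model learns to emit the final answer in a single step.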
The researchers applied this approach to four different multi-step reasoning techniques across five types of tasks. In many cases it improved model performance effectively, but it did not work in every scenario.
For tasks such as reducing bias or improving response quality, for example, the distilled models performed as well as the multi-step methods while consuming far fewer computational resources. On complex mathematical reasoning tasks, however, the approach failed; the researchers speculate that some tasks are simply too complex to be collapsed into single-step reasoning.
Nevertheless, the researchers see the method as a promising direction for building more capable language processing systems. In the future, it could be combined with other techniques, freeing expensive multi-step reasoning for the problems that genuinely require it.
This study opens a new path toward improving the reasoning capabilities of language models and could lead to advances across a range of applications.