Recently, Kunlun Technology, in collaboration with Nanyang Technological University in Singapore, developed an algorithm named Q* that significantly enhances the reasoning capabilities of existing large models. Q* enables smaller models to reach reasoning performance comparable to models with tens or even hundreds of times more parameters, substantially improving model capability while markedly reducing the demand for computational resources. This breakthrough opens up new possibilities for the widespread application of artificial intelligence and ushers in a new era of efficient intelligence.


In the paper titled "Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning," the researchers introduce the Q* framework. It decomposes the reasoning trajectory of a large language model into a sequence of states and applies A*-style heuristic search to plan over them, improving the performance of open-source models on reasoning tasks.
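
To make the state decomposition concrete, the sketch below shows one way a partial reasoning trace could be represented as a search state. This is an illustrative assumption, not the paper's code: the class, field names, and stopping rule are placeholders chosen for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ReasoningState:
    """A node in the search tree: the question plus the reasoning steps so far.

    Illustrative representation only. In Q*-style planning, each new reasoning
    step produced by the LLM expands the current state into a child state.
    """
    question: str
    steps: Tuple[str, ...] = ()  # partial chain of thought, one entry per step

    def expand(self, next_step: str) -> "ReasoningState":
        # Appending one LLM-generated step yields a successor state.
        return ReasoningState(self.question, self.steps + (next_step,))

    @property
    def is_terminal(self) -> bool:
        # Simple illustrative stopping rule: the model emits a final-answer marker.
        return bool(self.steps) and self.steps[-1].startswith("Final answer:")
```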

Specifically, the framework defines a path-cost function over the steps already taken and an accumulated-reward function that estimates future returns, so that the search accounts for both the benefit of the historical trajectory and the expected future reward of continuing it. In experiments, Q* delivered significant accuracy improvements for a range of models across different datasets, outperforming some well-known models.
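
As an illustration of how these two terms can drive the search, the following sketch runs a generic best-first (A*-style) loop in which each frontier state is scored by a weighted sum of an aggregated path utility g and an estimated future reward h. The weighting scheme, the `propose_steps` generator, and the scoring callables are assumptions made for this example and stand in for an LLM and a value model; this is not the paper's exact formulation.

```python
import heapq
import itertools
from typing import Callable, Iterable, List, Optional


def q_star_style_search(
    start: ReasoningState,
    propose_steps: Callable[[ReasoningState], Iterable[str]],  # e.g. top-k LLM continuations
    path_utility: Callable[[ReasoningState], float],           # g: benefit of steps taken so far
    future_reward: Callable[[ReasoningState], float],          # h: estimated reward-to-go
    lam: float = 1.0,
    max_expansions: int = 200,
) -> Optional[ReasoningState]:
    """Best-first search over reasoning states scored by f(s) = g(s) + lam * h(s).

    A sketch of A*-style deliberative planning: repeatedly pop the most promising
    partial trace, expand it with candidate next steps, and return the first
    terminal state reached. All callables are placeholders.
    """
    counter = itertools.count()  # unique tie-breaker so states are never compared directly

    def score(s: ReasoningState) -> float:
        return path_utility(s) + lam * future_reward(s)

    # Max-heap behaviour via negated scores.
    frontier: List[tuple] = [(-score(start), next(counter), start)]
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, state = heapq.heappop(frontier)
        if state.is_terminal:
            return state
        for step in propose_steps(state):
            child = state.expand(step)
            heapq.heappush(frontier, (-score(child), next(counter), child))
    return None
```

The key design point this sketch tries to convey is that, unlike greedy step-by-step decoding, the frontier keeps many partial traces alive at once and always expands the one whose combined past-plus-future score is highest, which is what lets the planner recover from locally attractive but ultimately poor reasoning steps.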

Research on Q* is still at an early stage and leaves room for improvement. Going forward, Kunlun Technology will continue to deepen this line of work, further enhancing the reasoning capabilities of domestic open-source models and bringing more possibilities to the development of artificial intelligence technology.

Paper link: https://arxiv.org/abs/2406.14283