rStar
Enhances problem-solving capabilities of small language models through self-play mutual reasoning.
Common Product, Programming, Machine Learning, Natural Language Processing
rStar is a self-play mutual reasoning method that significantly improves the reasoning capabilities of small language models (SLMs) without fine-tuning or access to more advanced models. It splits the reasoning process into two stages: solution generation and mutual verification. For generation, rStar combines Monte Carlo Tree Search (MCTS) with a set of human-like reasoning actions to construct higher-quality reasoning trajectories. For verification, a second SLM of similar capability acts as a discriminator and validates the accuracy of those trajectories. Extensive experiments on multiple SLMs have demonstrated its effectiveness across diverse reasoning problems.
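The verification stage can be illustrated with a small sketch. The description above does not say how the discriminator validates a trajectory, so the code assumes one plausible scheme: the discriminator sees only the first part of a trajectory, completes it independently, and the trajectory counts as verified when both models reach the same final answer. Every name here (`Trajectory`, `mutually_consistent`, `select_trajectory`, `toy_discriminator`) is hypothetical, the candidate trajectories are hard-coded instead of being produced by MCTS, and the discriminator is a stub rather than a real SLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Trajectory:
    steps: List[str]   # intermediate reasoning steps from the generator
    answer: str        # final answer the generator reached
    score: float       # generator-side score (e.g. a search reward); assumed


# Hypothetical discriminator interface: (question, prefix steps) -> final answer
Discriminator = Callable[[str, List[str]], str]


def mutually_consistent(question: str, traj: Trajectory,
                        discriminator: Discriminator,
                        split: float = 0.5) -> bool:
    """Show the discriminator a prefix of the steps and compare final answers."""
    cut = max(1, int(len(traj.steps) * split))
    prefix = traj.steps[:cut]
    return discriminator(question, prefix) == traj.answer


def select_trajectory(question: str, candidates: List[Trajectory],
                      discriminator: Discriminator) -> Optional[Trajectory]:
    """Prefer verified trajectories; fall back to all candidates if none pass."""
    verified = [t for t in candidates
                if mutually_consistent(question, t, discriminator)]
    pool = verified or candidates
    return max(pool, key=lambda t: t.score) if pool else None


def toy_discriminator(question: str, prefix: List[str]) -> str:
    """Stub standing in for a second SLM: finishes '... + 2' from the last result."""
    last_value = int(prefix[-1].split("=")[-1].strip())
    return str(last_value + 2)


question = "What is 3 * 4 + 2?"
candidates = [
    Trajectory(["3 * 4 = 12", "12 + 2 = 15"], answer="15", score=0.9),  # wrong
    Trajectory(["3 * 4 = 12", "12 + 2 = 14"], answer="14", score=0.6),  # right
]
best = select_trajectory(question, candidates, toy_discriminator)
print(best.answer)
```

In this toy run the higher-scored trajectory is rejected because the discriminator's independent completion disagrees with it, which is the point of adding a second model as a check on the generator's own scoring.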
rStar Visit Over Time
Monthly Visits: 515,580,771
Bounce Rate: 37.20%
Pages per Visit: 5.8
Visit Duration: 00:06:42