Recently, researchers from Stanford University and other institutions fed thousands of papers from top venues such as Nature-family journals and ICLR into GPT-4 to generate review comments, which they then compared with reviews written by human reviewers. The results showed that more than 50% of GPT-4's review comments overlapped with those of human reviewers, and 82% of authors found the feedback helpful. The study also found that, unlike human reviewers, GPT-4 tended to focus more on the overall impact of the papers than on details such as additional ablation experiments. Users generally felt that GPT-4-generated review feedback can improve the accuracy of reviews and reduce labor costs. The research suggests that using large language models (LLMs) to assist in academic paper review is feasible.
GPT-4 Becomes a Reviewer for Nature? A Test by Stanford and Tsinghua Alumni Shows Over Half of Its Review Comments Match Human Reviewers
新智元
© Copyright AIbase 2024. Source: https://www.aibase.com/news/1802