Peer review is the cornerstone of scientific advancement, but with the surge in submissions, this system is under immense pressure. To alleviate this issue, there have been attempts to utilize large language models (LLMs) for assisting in the review process.

However, a recent study has revealed serious risks associated with LLM-assisted reviews, indicating that we may not be ready for the widespread adoption of LLMs in this context.

A research team from Shanghai Jiao Tong University found through experiments that authors can influence LLM review outcomes by embedding subtle manipulative content in their papers. This manipulation can be overt, such as adding barely noticeable white text at the end of the paper, instructing the LLM to emphasize the paper's strengths and downplay its weaknesses.

image.png

The experiments showed that such overt manipulation can significantly boost the scores given by LLMs, with all papers receiving positive evaluations, raising the average score from 5.34 to 7.99. More concerning is that the manipulated LLM review results showed a significant decrease in alignment with human review results, indicating a substantial reduction in reliability.

Additionally, the research uncovered a more covert form of manipulation: implicit manipulation. Authors can guide LLMs to repeat subtle flaws by actively disclosing minor defects within their papers.

image.png

Compared to human reviewers, LLMs are more susceptible to this method, being 4.5 times more likely to reiterate the limitations stated by the authors. This practice allows authors to more easily address review comments during the defense stage, gaining an unfair advantage.

The research also highlighted inherent flaws in LLM reviews:

Illusion problem: Even in the absence of content, LLMs can generate fluent review comments. For example, when presented with a blank paper, an LLM may still claim, "This paper presents a novel approach." Even with just the title provided, the LLM might give a score similar to that of a complete paper.

image.png

Preference for longer papers: The LLM review system tends to give higher scores to longer papers, suggesting a potential bias based on paper length.

Author bias: In single-blind reviews, if the author is from a well-known institution or is a prominent scholar, the LLM review system is more likely to give positive evaluations, which may exacerbate unfairness in the review process.

To further validate these risks, researchers conducted experiments using different LLMs, including Llama-3.1-70B-Instruct, DeepSeek-V2.5, and Qwen-2.5-72B-Instruct. The results indicated that these LLMs are all at risk of implicit manipulation and face similar illusion problems. The researchers found a positive correlation between LLM performance and its consistency with human reviews, but even the strongest model, GPT-4o, did not completely avoid these issues.

The researchers conducted extensive experiments using publicly available review data from ICLR2024. The results showed that overt manipulation could make LLM review opinions almost entirely controlled by the manipulative content, achieving up to 90% consistency, leading to all papers receiving positive feedback. Moreover, manipulating just 5% of the review opinions could result in 12% of papers losing their position in the top 30% ranking.

The researchers emphasized that the current robustness of LLMs is insufficient for them to replace human reviewers in academic evaluations. They recommend pausing the use of LLMs for peer review until a more comprehensive understanding of these risks is achieved and effective safety measures are established. Furthermore, journal and conference organizers should implement detection tools and accountability measures to identify and address malicious manipulation by authors and the use of LLMs in place of human judgment by reviewers.

The researchers believe that LLMs can serve as auxiliary tools, providing additional feedback and insights for reviewers, but they must never replace human judgment. They urge the academic community to continue exploring ways to make LLM-assisted review systems more robust and secure, maximizing the potential of LLMs while mitigating risks.

Paper link: https://arxiv.org/pdf/2412.01708