A recent large-scale empirical analysis (posted on a preprint server) suggested that AI can be a valuable tool for providing feedback on the quality of published articles, after finding overlap between the points raised by GPT-4 and those raised by human reviewers. For Nature journals, for example, the average overlap between GPT-4 and a human reviewer was 30.85%, while the overlap between two human reviewers was 28.58%. However, others have warned of the need to critically assess the validity of AI-generated reviews.
Currently, AI tools are useful for prescreening manuscripts: checking fit with the journal's scope, validating the authenticity of authors and their affiliations, identifying missing items such as figures and tables, and suggesting potential reviewers. The question, however, is how effective these AI bots would be if we had to rely on their reviews without any human input:
- LLMs such as ChatGPT rely on a fixed training dataset, with limited ability to crawl the Internet for all available information. Consequently, LLMs are unlikely to gather complete background information.
- LLMs are effective at summarizing a manuscript and suggesting additional work. However, they lack critical analysis of the results in the context of earlier published work and the ability to identify novelty.
- Interestingly, ChatGPT and Bing AI Chat tailor the requested review of a scientific article to the user prompt; that is, the resulting review depends on how one phrases the review request. If the request is to write a negative review, the bot generates a review with critical assessments that highlight deficiencies and recommends against publication. Examples of two such contrasting reviews created by ChatGPT and Bing AI Chat for the same hypothetical paper title are presented here.
- AI bots are susceptible to bias and discrimination.
- AI bots such as ChatGPT are known to cite non-existent sources. Given this deficiency, they are ill-equipped to validate the references cited in a manuscript or to make reliable new suggestions when discussing previously published work.
- AI bots are unlikely to recognize new inventions for which there is minimal previous scientific basis or background information.
More: https://pubs.acs.org/doi/10.1021/acsenergylett.3c02586#
