Abstract
Attitudes and Perceptions Toward the Use of Artificial Intelligence Chatbots in Medical Journal Peer Review: A Large-Scale, International Cross-Sectional Survey
Jeremy Y. Ng,1,2,3 Daivat Bhavsar,1,2 Neha Dhanvanthry,1,2 Lex Bouter,4,5 Teresa M. Chan,6 Holger Cramer,1,2 Annette Flanagin,7 Alfonso Iorio,8,9 Cynthia Lokker,8 Hervé Maisonneuve,10,11 Ana Marušić,12 David Moher3,13
Objective
Peer review is a cornerstone of scientific publishing, ensuring the rigor and credibility of research. However, it is increasingly strained by persistent challenges, such as variability in review quality, reviewer fatigue, and bias. Artificial intelligence chatbots (AICs) are emerging tools with the potential to address these challenges by automating tasks, such as identifying methodological flaws and improving language clarity. This study aimed to explore peer reviewers’ attitudes and perceptions toward the use of AICs in medical journal peer review, focusing on their benefits, challenges, and ethical implications.
Design
A large-scale, international, cross-sectional survey of peer reviewers for medical journals was conducted. Eligible participants had completed at least 1 peer review report for a MEDLINE-indexed journal within the 24 months preceding invitation. The names and email addresses of a complete sample of 72,847 corresponding authors who published in MEDLINE-indexed journals from September 1, 2024, to October 15, 2024, were collected. The survey was administered and data were collected via the SurveyMonkey platform. The survey included questions about participant demographics, familiarity with AICs, experiences with their use, and perceptions of AICs’ roles, benefits, challenges, and ethical implications in peer review; multiple-choice, yes/no, Likert-scale, and open-ended question formats were used to ensure comprehensive data collection. The survey was first piloted with a group of 7 peer reviewers to optimize question clarity and relevance. It was launched in April 2025 and remained open for 7 weeks in total. This study’s protocol was registered,1 and the CHERRIES2 and STROBE3 reporting guidelines were used to inform the reporting of this survey study.
Results
Of 33,388 opened email invitations, 1194 individuals responded to the survey in the first 4 weeks, 1018 of whom completed it (response rate: 3.0%). The majority (578/1018 [56.8%]) were senior-career researchers (ie, >10 years of career experience). Most reported never having used an AIC for peer review purposes (707/999 [70.8%]), and notably, many (460/1007 [45.7%]) did not anticipate using AICs in the future. However, most respondents indicated interest in receiving training on AIC use for peer review tasks (587/980 [59.9%]). The most-recognized benefit of AIC use in peer review processes (ie, “agree” or “strongly agree”) was “reduc[ing] the workload” (601/958 [62.7%]). Conversely, the most-recognized challenge was “risk producing errors or inaccuracies” (738/934 [79.0%]).
Conclusions
This study provides critical insights into peer reviewers’ attitudes and perceptions toward the use of AICs in the peer review process. Findings may inform the development of evidence-based guidelines and policies to ensure the ethical, transparent, and effective use of AICs, contributing to improvements in efficiency, quality, and equity in scholarly publishing.
References
1. Ng JY, Bhavsar D, Dhanvanthry N, et al. Attitudes and perceptions toward the use of artificial intelligence chatbots in medical journal peer review: a large-scale, international cross-sectional survey. OSF. May 21, 2025. https://osf.io/fhc2m
2. Eysenbach G. Improving the quality of Web surveys: the Checklist for Reporting Results of Internet E-Surveys (CHERRIES). J Med Internet Res. 2004;6(3):e34. doi:10.2196/jmir.6.3.e34
3. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP; STROBE Initiative. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet. 2007;370(9596):1453-1457. doi:10.1016/S0140-6736(07)61602-X
1Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany, jeremyyng.phd@gmail.com; 2Robert Bosch Center for Integrative Medicine and Health, Bosch Health Campus, Stuttgart, Germany; 3Centre for Journalology, Ottawa Methods Centre, Ottawa Hospital Research Institute, Ottawa, Ontario, Canada; 4Department of Epidemiology and Data Science, Amsterdam University Medical Centers, Amsterdam, the Netherlands; 5Department of Philosophy, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands; 6School of Medicine, Toronto Metropolitan University, Toronto, Ontario, Canada; 7JAMA and the JAMA Network, Chicago, IL, US; 8Department of Health Research Methods, Evidence, and Impact, Faculty of Health Sciences, McMaster University, Hamilton, Ontario, Canada; 9Department of Medicine, McMaster University, Hamilton, Ontario, Canada; 10Consultant, Lyon, France; 11Scientific Committee, Institute of Research and Action on Fraud and Plagiarism in Academia (IRAFPA), Geneva, Switzerland; 12Department of Research in Biomedicine and Health and Center for Evidence-based Medicine, University of Split School of Medicine, Split, Croatia; 13School of Epidemiology, Public Health and Preventive Medicine, Faculty of Medicine, University of Ottawa, Ottawa, Canada.
Conflict of Interest Disclosures
Lex Bouter, Annette Flanagin, Ana Marušić, and David Moher are members of the Peer Review Congress Advisory Board but were not involved in the review or decision for this abstract.