Abstract
Unravelling the Spin and Selective Reporting in Medical AI Research
Vincent Yuan,1 Aidan Christopher Tan2
Objective
Our study aimed to determine the prevalence, types, and facilitators of “spin” and other selective reporting practices in recently published medical artificial intelligence (AI) research.
Design
This was a cross-sectional meta-research study of interventional and observational studies related to medical AI. Medical AI was defined as computational models (eg, machine learning, deep learning, natural language processing) that amplify or augment health care–related tasks (eg, diagnosis, prognosis, risk stratification).1 Studies were identified by searching MEDLINE, Embase, and the Cochrane Library. Studies were included if they were observational or interventional studies related to medical AI, published in 2023, written in English, and available in full text. Spin was defined as the distortion of findings to present the performance of a medical AI tool as more beneficial than the findings support. Examples of spin include misrepresentation or omission of significant or nonsignificant results and selective focus of abstract conclusions on significant results. Other study characteristics investigated included the journal, academic qualifications of first authors, funding, and competing interests. The features of spin were derived from the coding manuals of Boutron et al2 and Andaur Navarro et al.3 Spin was categorized as either misleading interpretation, referring to an overemphasis of results, or misleading transportability, referring to an exaggeration of a model’s clinical applicability. Instances of spin were abstracted by manually reviewing the Results and Discussion sections of the abstract and body of each study and extracting the data into a prepiloted data collection form. This form was adapted from the SPIN-Prediction Models (SPIN-PM) framework by including or excluding certain indicators and adding custom items based on previous spin studies. Data were collected by 1 reviewer, and a second reviewer independently recollected the data in instances of uncertainty.
Results
A total of 423 studies were included (Figure 25-0950). Slightly more than one-third (35% [n = 149]) of studies described AI as superior to the comparator. Nine percent of studies (n = 38) contained leading words in the title, and more than half (56% [n = 236]) selectively focused abstract conclusions on significant results. In almost half (48% [n = 203]) of study abstracts, qualifiers (eg, might, very) were present to subtly suggest stronger findings than were likely warranted. The majority of studies (71% [n = 300]) contained strong statements (eg, clearly shows) in the body to emphasize the AI model’s performance, accuracy, or effectiveness. Twenty-four percent (n = 101) of studies made recommendations for clinical practice in the body without external validation in the same study.
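As an illustrative check of the arithmetic, the reported percentages can be recomputed from the reported counts and the total of 423 included studies. The sketch below is not part of the study's analysis: the dictionary keys are hypothetical shorthand labels, and the Wilson 95% interval is an added assumption, shown only to indicate the precision of each proportion.

```python
from math import sqrt

TOTAL_STUDIES = 423  # number of included studies, as reported

# Counts as reported in the Results; the keys are hypothetical shorthand labels.
reported_counts = {
    "ai_described_as_superior": 149,
    "leading_words_in_title": 38,
    "abstract_conclusions_focus_on_significant": 236,
    "qualifiers_in_abstract": 203,
    "strong_statements_in_body": 300,
    "clinical_recommendation_without_external_validation": 101,
}


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumption: not used in the study)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def summarize(counts, total):
    """Map each label to (rounded percentage, CI lower %, CI upper %)."""
    rows = {}
    for label, n in counts.items():
        low, high = wilson_interval(n, total)
        rows[label] = (round(100 * n / total), 100 * low, 100 * high)
    return rows


if __name__ == "__main__":
    for label, (pct, low, high) in summarize(reported_counts, TOTAL_STUDIES).items():
        print(f"{label}: {pct}% (95% CI {low:.1f}% to {high:.1f}%)")
```

The rounded percentages produced this way match those reported (35%, 9%, 56%, 48%, 71%, and 24%).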
Conclusions
Our findings suggest that selective reporting practices, including spin, are prevalent in medical AI research. These practices can misinform readers about the technical performance and real-world applicability of AI tools. Accurate reporting is especially crucial in titles and abstracts because these sections often shape first impressions and inform clinical decision-making.
References
1. Bajwa J, Munir U, Nori A, Williams B. Artificial intelligence in healthcare: transforming the practice of medicine. Future Healthc J. 2021;8(2):e188-e194. doi:10.7861/fhj.2021-0095
2. Boutron I, Dutton S, Ravaud P, Altman DG. Reporting and interpretation of randomized controlled trials with statistically nonsignificant results for primary outcomes. JAMA. 2010;303(20):2058-2064. doi:10.1001/jama.2010.651
3. Andaur Navarro CL, Damen JAA, Ghannad M, et al. SPIN-PM: a consensus framework to evaluate the presence of spin in studies on prediction models. J Clin Epidemiol. 2024;170:111364. doi:10.1016/j.jclinepi.2024.111364
1School of Clinical Medicine, Faculty of Medicine and Health, University of New South Wales, Sydney, NSW, Australia, vincent.yuan1@student.unsw.edu.au; 2Sydney School of Public Health, Faculty of Medicine and Health, University of Sydney, Sydney, NSW, Australia.
Conflict of Interest Disclosures
None reported.