Abstract

Scale and Resilience in Organizations Enabling Systematic Scientific Fraud

Reese A. K. Richardson,1,2 Spencer S. Hong,1,3 Jennifer A. Byrne,4,5 Thomas Stoeger,6,7,8 Luís A. Nunes Amaral1,2,9,10,11

Objective

Some suggest that the ease of communication provided by the internet and open access publishing has created the conditions for the emergence of entities (including paper mills, brokers, and predatory publishers) that facilitate systematic scientific fraud. However, little is understood about how these entities are organized or how they react to and evade science integrity measures. Here, we sought to demonstrate that large networks that produce scientific fraud at scale can be identified by the footprints they have left in the published scientific literature, and that these footprints can be analyzed to glean insights into their organizational structure and operations.

Design

Our work consisted of observational case studies making use of article metadata from OpenAlex and the complete corpora of PLOS One and Hindawi articles; all PubPeer comments made before January 2024; historical indexing data from MEDLINE, Scopus, and Web of Science; and archived webpages obtained with the Internet Archive’s Wayback Machine. Our case studies addressed networks of compromised editors at large mega-journals, analysis of article provenance through observations of interarticle image duplication, a longitudinal study of the operations of a large broker organization, a comparison of revision and retraction rates in closely related biomedical subfields, and an overall assessment of growth rates of systematic scientific fraud. This analysis was performed from June to September 2024.

Results

We identified 45 editors at PLOS One, 53 editors at 10 Hindawi journals, and 205 Institute of Electrical and Electronics Engineers (IEEE) conferences that handled articles that were eventually retracted, articles with PubPeer comments, or articles featuring tortured phrases far more often than expected by chance (1-sided Poisson binomial test; Benjamini-Hochberg false discovery rate <0.05). We characterized a network of image duplication spanning 2213 articles. We catalogued the involvement of a broker organization with an evolving portfolio spanning 188 journals over 7 years. We found similar rates of revision across subfields of RNA biology but retraction rates that differed by orders of magnitude. We found that while the publication rate of scientific articles doubled approximately every 15 years and that of eventually retracted articles every 3 years, the publication rate of articles of likely paper mill provenance doubled in less than 2 years.
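The statistical screen named above (a 1-sided Poisson binomial test followed by Benjamini-Hochberg correction) can be illustrated with a minimal sketch. The abstract does not state how per-article baseline probabilities were set, so the values below, the function names, and the three example editors are hypothetical and serve only to show the mechanics of the test.

```python
from typing import List, Sequence


def poisson_binomial_pmf(probs: Sequence[float]) -> List[float]:
    """PMF of the number of successes among independent Bernoulli trials
    with possibly different success probabilities (dynamic programming)."""
    pmf = [1.0]  # with zero trials, P(X = 0) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


def upper_tail_p_value(probs: Sequence[float], observed: int) -> float:
    """One-sided p-value P(X >= observed) under the Poisson binomial null."""
    pmf = poisson_binomial_pmf(probs)
    return min(1.0, sum(pmf[observed:]))


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up procedure; True marks a rejected null."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            max_rank = rank  # largest rank that meets its threshold
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical example: each editor handled articles whose assumed baseline
# probability of being problematic is given per article.
baseline = [0.05] * 6
p_values = [
    upper_tail_p_value(baseline, 4),    # editor A: 4 of 6 articles flagged
    upper_tail_p_value(baseline, 0),    # editor B: none flagged
    upper_tail_p_value([0.5, 0.5], 1),  # editor C: 1 of 2 in a high-baseline venue
]
flagged = benjamini_hochberg(p_values, alpha=0.05)
print(flagged)
```

In this sketch, only the editor whose flagged-article count is very unlikely under the assumed baseline survives the false discovery rate correction. Allowing a different probability for each article is what distinguishes the Poisson binomial test from an ordinary binomial test.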

Conclusions

Here, we demonstrated through case studies that (1) individuals have colluded to publish problematic papers in a number of journals; (2) brokers can ensure publication in infiltrated journals at scale; and (3) within a field of science, not all subfields are equally targeted for scientific fraud. Our results revealed some of the strategies that enable entities promoting scientific fraud to evade interventions. Our final analysis suggested that this ability to evade interventions is enabling the number of fraudulent publications to grow at a rate far outpacing that of legitimate science.
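To make the gap between these growth rates concrete, the doubling times reported in the Results can be converted to annual growth rates, assuming steady exponential growth. The sketch below is illustrative arithmetic only. The function names and the 30-year horizon are assumptions, and the 2-year figure is an upper bound on the paper mill doubling time, so the corresponding growth values are lower bounds.

```python
def annual_growth_rate(doubling_time_years: float) -> float:
    """Annual fractional growth implied by a doubling time, assuming
    steady exponential growth: 2 ** (1 / T) - 1."""
    return 2.0 ** (1.0 / doubling_time_years) - 1.0


def growth_factor(doubling_time_years: float, span_years: float) -> float:
    """Multiplicative growth over span_years for a given doubling time."""
    return 2.0 ** (span_years / doubling_time_years)


# Doubling times taken from the Results; 2 years is an upper bound there.
doubling_times = {
    "all scientific articles": 15.0,
    "eventually retracted articles": 3.0,
    "likely paper mill articles (upper bound)": 2.0,
}

for label, years in doubling_times.items():
    rate = annual_growth_rate(years)
    factor = growth_factor(years, 30.0)  # hypothetical 30-year horizon
    print(f"{label}: {rate:.1%} per year, x{factor:,.0f} over 30 years")
```

Under these assumptions, a 15-year doubling time corresponds to growth of roughly 5% per year, a 3-year doubling time to roughly 26% per year, and a doubling time of 2 years or less to at least 41% per year.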

Author Affiliations

1Department of Engineering Sciences & Applied Mathematics, Northwestern University, Evanston, IL, US, richardsonr43@gmail.com; 2Department of Molecular Biosciences, Northwestern University, Evanston, IL, US; 3Department of Chemical & Biological Engineering, Northwestern University, Evanston, IL, US; 4School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, NSW, Australia; 5NSW Health Statewide Biobank, NSW Health Pathology, Camperdown, NSW, Australia; 6The Potocsnak Longevity Institute, Northwestern University, Chicago, IL, US; 7Simpson Querrey Lung Institute for Translational Science, Northwestern University, Chicago, IL, US; 8Division of Pulmonary and Critical Care, Northwestern University Feinberg School of Medicine, Chicago, IL, US; 9Department of Physics and Astronomy, Northwestern University, Evanston, IL, US; 10Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, US; 11NSF-Simons National Institute for Theory and Mathematics in Biology, Chicago, IL, US.

Conflict of Interest Disclosures

Luís A. Nunes Amaral is a member of the Peer Review Congress Advisory Board but was not involved in the review or decision for this abstract.

Funding/Support

Reese A. K. Richardson was supported in part by a National Institutes of Health Training Grant (T32GM008449) through Northwestern University’s Biotechnology Training Program and receives funding from the Dr John N. Nicholson Fellowship from Northwestern University and Moderna. Spencer S. Hong receives support from the Ryan Fellowship and the International Institute for Nanotechnology at Northwestern University. Jennifer A. Byrne receives funding from a National Health and Medical Research Council of Australia Ideas grant (APP1184263) and Rewarding Research Success funding from the Faculty of Medicine and Health at The University of Sydney. Thomas Stoeger receives funding from the National Institute on Aging, Integrative Multi-Scale Systems Analysis of Gene-Expression-Driven Aging Morbidity (R00AG068544), National Institute of Allergy and Infectious Diseases (AI135964), and Successful Clinical Response In Pneumonia Therapy (SCRIPT) Systems Biology Center. Luís A. Nunes Amaral and Thomas Stoeger receive funding from SCISIPBIO: a data-science approach to evaluating the likelihood of fraud and error in published studies (1956338).

Role of the Funder/Sponsor

The funders had no role in the design or execution of this study.