Adrian G. Barnett,1 Nicholas Graves,1 Pauline Zardo2
The “publish or perish” incentive drives researchers to focus on volume and novelty, which crowds the literature with unreliable findings. Different incentives that prioritize quality are needed. We examined whether random audits of researchers could improve quality.
We reproduced a published Darwinian simulation model of research laboratories in which the most successful laboratories produce “child” laboratories that inherit their parents’ characteristics (mutation probability, 0.01). Success is determined by the number of positive findings, which drives laboratories to publish many positive findings regardless of their veracity. Two key features of a laboratory are its power to detect positive findings (fixed at 0.8) and the effort it exerts, which controls whether it starts a new hypothesis. We used the assumptions of the original simulation and extended the model to include random audits that examine a laboratory’s entire publication history and calculate its false-positive rate with knowledge of which hypotheses are true. Laboratories with at least 50 publications whose false-positive rates fell in the worst (highest) third were removed. Audited laboratories increased their effort, as did their parents and existing children. We estimated the costs of auditing by assuming that 1 scientist per month was needed to review 10 publications. We used 500 simulations per scenario.
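The audit rule described above can be sketched in code. This is a minimal illustration, not the authors’ R implementation: the laboratory representation, the `history`/`effort` field names, the 0.1 effort increase, and the tie-handling at the cutoff are our assumptions; the published code at the repository linked below is authoritative.

```python
def false_positive_rate(history):
    """Fraction of a lab's positive findings that are false.
    `history` is a list of (claimed_positive, hypothesis_true) tuples,
    mirroring the audit's assumed knowledge of which hypotheses are true."""
    positives = [(claimed, true) for claimed, true in history if claimed]
    if not positives:
        return 0.0
    return sum(1 for _, true in positives if not true) / len(positives)

def audit(labs, min_pubs=50):
    """One audit round: among labs with at least `min_pubs` publications,
    remove those whose false-positive rate is in the worst (highest) third,
    and raise the effort of the audited labs that remain.
    (The full model also raises effort for parents and children.)"""
    eligible = [lab for lab in labs if len(lab["history"]) >= min_pubs]
    if not eligible:
        return labs
    rates = sorted(false_positive_rate(lab["history"]) for lab in eligible)
    cutoff = rates[(2 * len(rates)) // 3]  # boundary of the worst third
    survivors = []
    for lab in labs:
        audited = len(lab["history"]) >= min_pubs
        if audited and false_positive_rate(lab["history"]) >= cutoff:
            continue  # lab removed by the audit
        if audited:
            # illustrative effort increase, capped at 1
            lab["effort"] = min(1.0, lab["effort"] + 0.1)
        survivors.append(lab)
    return survivors
```

For example, auditing four hypothetical labs where one has published 60 entirely false positives removes that lab and nudges up the effort of the other audited labs, while a lab with only 10 publications is left unaudited.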
Without audits, effort declined, and the false-positive rate reached two-thirds in almost every simulation, meaning that most positive findings were wrong (Figure). Auditing 1.94% of all publications avoided the explosive increase in false positives in 95.0% of simulations. Audits decreased the volume of publications because laboratories worked longer on each question. Auditing 1.94% of publications would cost an estimated $169 per publication in US dollars (95% CI, $152 to $171). Adding measurement error of ±20% did not affect the efficacy of the audits.
Figure. False-positive probabilities by time (No. of publishing cycles ×100,000) for 4 auditing levels.
The gray lines are 20 randomly selected simulations; the dark red line in each graph is the average of 500 simulations.
Our simulation is a gross simplification of the research world, but the results provide food for thought. A random audit of a small proportion of the evidence base allows a detailed assessment of performance and may provide an incentive to researchers to raise standards. Audits are difficult to manipulate, but substantial funding and time would be needed from researchers to perform the audits. Audits could shift the maxim from “publish or perish” to “quality or perish.”
1School of Public Health and Social Work, Queensland University of Technology, Brisbane, Australia, firstname.lastname@example.org; 2Law School, Faculty of Law, Queensland University of Technology, Brisbane, Australia
Conflict of Interest Disclosures:
Adrian G. Barnett is funded by a National Health and Medical Research Council (NHMRC) Senior Research Fellowship (APP1117784). Nicholas Graves is also funded by an NHMRC Senior Research Fellowship (APP1059565).
Computational resources and services used in this work were provided by the High Performance Computer and Research Support Group, Queensland University of Technology, Brisbane, Australia.
All R code for the statistical analysis is available at https://github.com/agbarnett/taxinspect.