Comparison of Changes in High-Quality vs Low-Quality Evidence in Original and Updated Systematic Reviews

Benjamin Djulbegovic,1 Muhammad Muneeb Ahmed,2 Iztok Hozo,3 Despina Koletsi,4 Lars Hemkens,5,6,7 Amy Price,8 Rachel Riera,9 Paulo Nadanovsky,10 Ana Paula Pires dos Santos,11 Daniela Melo,12 Ranjan Pathak,13 Rafael Leite Pacheco,14,15 Luis Eduardo Fontes,15,16 Enderson Miranda,17 David Nunan,17,18


It is generally believed that a weaker body of evidence (eg, poorly designed and executed and potentially biased studies, sparse studies, and/or heterogeneous results) will generate inaccurate estimates of treatment effects more often than a stronger body of evidence. As a result, estimates of effects of health interventions initially based on high certainty (quality) of evidence (CoE) are expected to change less frequently than estimates based on lower CoE, and the estimated magnitude of effect is expected to differ between high and low CoE. Empirical assessment of these foundational principles of evidence-based medicine has been lacking.


The Cochrane Database of Systematic Reviews was reviewed from January 2016 through May 2021 for pairs of original and updated reviews to assess change in CoE assessments based on the Grading of Recommendations Assessment, Development and Evaluation (GRADE) method. The difference in effect sizes between the original and updated reviews was assessed as a function of change in CoE and was reported as a ratio of odds ratios (ROR). The RORs generated in the studies that changed CoE from very low/low (VL/L) to moderate/high (M/H) vs M/H to VL/L were compared. Heterogeneity and inconsistency were assessed using the τ and I² statistics. The change in precision of effect size estimates was assessed by calculating the ratio of standard errors (seR), and the absolute deviation in estimates of treatment effects was assessed with adjusted RORs (aRORs).
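As an illustration of the two comparison measures named above, the following is a minimal sketch of how a ROR and a seR could be computed for one pair of reviews from their reported odds ratios and 95% CIs. It is not the authors' analysis code. The function names, the direction of the ratios (updated divided by original), and the treatment of the two estimates as independent are all assumptions made for illustration; original and updated reviews typically share studies, so the independence assumption in particular is a simplification.

```python
import math


def log_or_and_se(or_value, ci_low, ci_high, z=1.96):
    """Recover log(OR) and its standard error from a reported 95% CI."""
    log_or = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_or, se


def compare_review_pair(original, updated, z=1.96):
    """Compare one original/updated review pair.

    Each argument is a tuple (OR, CI lower bound, CI upper bound).
    Assumption: ratios are taken as updated / original.
    Assumption: the two estimates are treated as independent.
    """
    log_o, se_o = log_or_and_se(*original, z=z)
    log_u, se_u = log_or_and_se(*updated, z=z)

    log_ror = log_u - log_o
    se_log_ror = math.sqrt(se_o ** 2 + se_u ** 2)

    return {
        "ror": math.exp(log_ror),
        "ror_ci": (
            math.exp(log_ror - z * se_log_ror),
            math.exp(log_ror + z * se_log_ror),
        ),
        # seR above 1 means the updated estimate is less precise.
        "seR": se_u / se_o,
    }


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the study.
    result = compare_review_pair(original=(2.0, 1.0, 4.0), updated=(1.0, 0.5, 2.0))
    print(result)
```

Under these assumptions, a ROR near 1 indicates that the updated review reproduced the original effect size, and a seR below 1 indicates that the update narrowed the confidence interval.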


Overall, 419 pairs of reviews were included, of which 414 (207 × 2) informed the CoE appraisal and 384 (192 × 2) the assessment of effect size. Certainty of evidence originally appraised as VL/L had 2.1 times higher odds (95% CI, 1.19-4.12; P = .01) of being changed in future studies than CoE originally appraised as M/H. However, the pooled effect size was not different when the CoE changed from VL/L to M/H (ROR, 1.02; 95% CI, 0.74-1.39) compared with M/H CoE changing to VL/L (ROR, 1.02; 95% CI, 0.44-2.37). Similarly, the adjusted ROR (aROR) overlapped between the VL/L to M/H and the M/H to VL/L subgroups (median [IQR], 1.12; 95% CI, 1.07-1.57 vs 1.21; 95% CI, 1.12-2.43). There was large inconsistency across ROR estimates (I² = 99%). Imprecision in treatment effects was larger when the CoE changed from VL/L to M/H (seR = 1.46) than when it changed from M/H to VL/L (seR = 0.72).
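For readers unfamiliar with the inconsistency statistic reported above, the following minimal sketch shows one common way to obtain I² and τ from a set of log ROR estimates and their standard errors. The abstract does not state which estimator was used; the DerSimonian-Laird moment estimator shown here is an assumption, and the function name and example inputs are hypothetical.

```python
import math


def heterogeneity(log_effects, std_errors):
    """Return Cochran's Q, I-squared (%), and tau for a set of estimates.

    Assumption: tau-squared uses the DerSimonian-Laird moment estimator.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)

    # Fixed-effect (inverse-variance) pooled estimate.
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1

    # I-squared: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird tau-squared, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    return q, i2, math.sqrt(tau2)


if __name__ == "__main__":
    # Hypothetical log RORs and standard errors, not study data.
    q, i2, tau = heterogeneity([0.0, 1.0], [0.1, 0.1])
    print(f"Q = {q:.1f}, I2 = {i2:.1f}%, tau = {tau:.2f}")
```

An I² close to 100%, as reported here, means that almost all of the observed variation across ROR estimates exceeds what sampling error alone would be expected to produce.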


This study found that low-quality evidence changed more often than high-quality evidence. However, the effect size was not systematically different between studies with low vs high CoE, indicating the need for improving contemporary critical appraisal methods.

1Beckman Research Institute, Department of Computational & Quantitative Medicine, City of Hope, Duarte, CA, USA, bdjulbegovic@coh.org; 2Michael G. DeGroote School of Medicine, McMaster University, Hamilton, ON, Canada; 3Department of Mathematics, Indiana University Northwest, Gary, IN, USA; 4Clinic of Orthodontics and Pediatric Dentistry, Center of Dental Medicine, University of Zurich, Zurich, Switzerland; 5Department of Clinical Research, University of Basel, Basel Institute for Clinical Epidemiology & Biostatistics, University Hospital Basel, Basel, Switzerland; 6Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, CA, USA; 7Meta-Research Innovation Center Berlin (METRIC-B), Berlin Institute of Health, Berlin, Germany; 8Anesthesia Informatics and Media Lab, Stanford University, Stanford, CA, USA; 9Universidade Federal de São Paulo, Escola Paulista de Medicina, Brazil (Unifesp), São Paulo, Brazil; 10Department of Epidemiology and Quantitative Methods in Health, National School of Public Health, Fundação Oswaldo Cruz (FIOCRUZ)–Department of Epidemiology, Institute of Social Medicine, Universidade do Estado do Rio de Janeiro (UERJ), Rio de Janeiro, Brazil; 11Department of Community and Preventive Dentistry, Faculty of Dentistry, Universidade do Estado do Rio de Janeiro (UERJ), Rio de Janeiro, Brazil; 12Department of Pharmaceutical Sciences, Universidade Federal de São Paulo (Unifesp), São Paulo, Brazil; 13Department of Medical Oncology and Therapeutics Research, City of Hope, Duarte, CA, USA; 14Centro Universitário São Camilo, São Paulo, Brazil; 15Center of Health Technology Assessment, Hospital Sirio-Libanês–São Paulo, Brazil; 16Department of Intensive Care and Emergency Medicine, Faculdade de Medicina de Petrópolis, Petrópolis, Rio de Janeiro, Brazil; 17Kellogg College, University of Oxford, Oxford, UK; 18Centre for Evidence-Based Medicine, Nuffield Department of Primary Care Health Sciences, Oxford University, Oxford, UK

Conflict of Interest Disclosures

None reported.