Reporting Statistical Inference in Psychiatry 1975-2015: An Analysis of Abstracts in Major Psychiatric Journals

Christopher Baethge,1,2 Markus Deckert,3 Andreas Stang3,4

Objective We analyzed changes in the reporting of statistical inference in the abstracts of psychiatric journals, specifically comparing changes over time in the reporting of P value thresholds (eg, P ≤ .05 or use of the term significant) vs precise P values and the use of confidence intervals.

Design Using an SAS-based algorithm for PubMed, we searched 63,928 abstracts published between 1975 and 2015 in 15 leading psychiatric journals (defined by 2015 impact factors). For validation, we checked 950 randomly selected abstracts.
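The abstract does not describe the SAS algorithm itself, so the following Python sketch is only an illustration of how this kind of abstract screening could work, not the authors' method. The regular expressions, the function name classify_abstract, and the category labels are all assumptions made for this example. It flags P value thresholds (including the term significant), precise P values, and confidence intervals in a single abstract.

```python
import re

# Hypothetical patterns for illustration only; the study's actual SAS
# rules are not given in the abstract.
THRESHOLD_P = re.compile(r"\bp\s*(?:<=|>=|<|>|≤|≥)\s*0?\.\d+", re.IGNORECASE)
PRECISE_P = re.compile(r"\bp\s*=\s*0?\.\d+", re.IGNORECASE)
SIGNIFICANT = re.compile(r"\bsignifican(?:t|tly|ce)\b", re.IGNORECASE)
CONF_INT = re.compile(r"\bconfidence\s+intervals?\b|\b\d{2}%\s*CI\b", re.IGNORECASE)


def classify_abstract(text: str) -> dict:
    """Flag which forms of statistical inference appear in one abstract."""
    # Following the Objective, the term "significant" counts as threshold
    # reporting. This simple rule cannot tell statistical from
    # nonstatistical uses of the word, which is one reason a manual
    # validation sample is needed.
    threshold = bool(THRESHOLD_P.search(text) or SIGNIFICANT.search(text))
    precise = bool(PRECISE_P.search(text))
    interval = bool(CONF_INT.search(text))
    return {
        "threshold_p": threshold,
        "precise_p": precise,
        "confidence_interval": interval,
        "any_inference": threshold or precise or interval,
    }


if __name__ == "__main__":
    example = "Response was higher with treatment (OR, 1.8; 95% CI, 1.2-2.7; P = .004)."
    print(classify_abstract(example))
```

Applied to every abstract in a period, the share of flagged abstracts in each category would give proportions of the kind reported in the Results. A production version would need broader patterns (for example, spelled-out P values) and checking against hand-coded abstracts.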

Results The mean (SD) length of abstracts increased from 716 (317) characters in 1975-1979 to 1928 (1113) characters in 2010-2015, and the use of numbers increased from a median of 2 per abstract (10th to 90th percentile range, 0-10 per abstract) to 14 per abstract (10th to 90th percentile range, 0-72 per abstract). An estimated 3.6% (95% CI, 2.5%-5.2%) of abstracts contained nonstatistical significance terminology only. About 45% of all abstracts reported some form of statistical inference, increasing from 26% in 1975-1979 to 52% in 2010-2015. In those abstracts, statistical inference based on P value thresholds was dominant (Figure), decreasing from 99% (1975-1979) to 66% (2010-2015). Because more articles were being published, however, the absolute number of abstracts that reported P value thresholds increased from 1095 in 1975-1979 to 3806 in 2010-2015. While reporting precise P values did not appear at all in 1975-1979 and remained rare (6% of abstracts in 2010-2015), combining precise and threshold P values is now more common (1% of abstracts in 1975-1979 vs 20% of abstracts in 2010-2015). From 2010 to 2015, 22% of abstracts included confidence intervals and 7% displayed confidence intervals only; from 1975 to 1979, no abstracts included confidence intervals. All results varied widely across journals.

Figure. Statistical Inference in 15 Psychiatric Journals (1975-2015)

The journals included were World Psychiatry, JAMA Psychiatry, the American Journal of Psychiatry, Molecular Psychiatry, Biological Psychiatry, the Schizophrenia Bulletin, Psychotherapy and Psychosomatics, the Journal of the American Academy of Child and Adolescent Psychiatry, the British Journal of Psychiatry, the Journal of Child Psychology and Psychiatry, the Journal of Neurology and Neurosurgery, Neuropsychopharmacology, Acta Psychiatrica Scandinavica, Lancet Psychiatry, and the Journal of Psychiatry and Neuroscience.

Conclusions In the abstracts of psychiatric articles, we detected a shift from reporting P value thresholds only (eg, P ≤ .05) to presenting precise P values and confidence intervals, although the use of P value thresholds and the absence of confidence intervals remain common. Analyzing full-text articles might provide more complete information about these trends, but the findings suggest that the decades-long debate on “estimation over testing” has not gained much ground among researchers in psychiatry and has not led to a substantial replacement of P values by confidence intervals.

1Department of Psychiatry and Psychotherapy, University of Cologne Medical School, Cologne, Germany; 2Deutsches Ärzteblatt and Deutsches Ärzteblatt International, Editorial Offices, Cologne, Germany, baethge@aerzteblatt.de; 3Center of Clinical Epidemiology, c/o Institute of Medical Informatics, Biometry, and Epidemiology, University Hospital of Essen, Essen, Germany; 4Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA

Conflict of Interest Disclosures: None reported.
