Abstract

A Comparison of Self-Acknowledged Limitations With Risk-of-Bias Assessments From Systematic Reviews

Joe D. Menke,1 Mengfei Lan,1 Halil Kilicoglu1

Objective

The importance of self-acknowledged limitations (SALs) has long been recognized.1 Prior work detecting2 and categorizing3 limitation statements using natural language processing (NLP) has made large-scale assessments of limitation reporting possible. By comparing SALs in research articles with limitations identified by systematic reviewers, this study aimed to evaluate the quality and coverage of SALs in relation to risk of bias.

Design

We constructed a dataset pairing SALs with limitations that systematic reviewers identified through risk-of-bias assessments. Reviews from the Cochrane Database of Systematic Reviews available in the open-access subset of PubMed Central were downloaded in September 2024. For papers cited in these reviews, full texts were downloaded when available (December 2024), and existing NLP models3 were used to extract and classify sentences discussing limitations. Identifying information and risk-of-bias assessments for cited articles were extracted from reviews’ Characteristics of Included Studies tables when a Risk of Bias section was present; reviews without this section were excluded. For living reviews, duplicate assessments were removed so that only 1 assessment per cited article was retained across versions of the review. Risk-of-bias statements, originally in tabular form (eg, “Allocation concealment [Selection bias] High risk CBA; no allocation concealment done.”), were rephrased into natural language using a large language model (Llama3.3-70B). Low-risk components were excluded, retaining only high- and unclear-risk items. A random subset of the rephrased statements was manually reviewed for systematic issues; no concerning patterns or hallucinations were found. Paraphrased information from reviews and SALs from cited papers were classified using NLP models that recognize limitation sentences and their types.3 We analyzed and compared the distribution of categorized limitations in both groups to assess the quality of SALs.
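The filtering and comparison steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `RobItem` schema, the function names, the choice to keep the assessment from the newest review version, and the category-level matching between reviewer items and SALs are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobItem:
    """One risk-of-bias judgment for a cited article (hypothetical schema)."""
    review_id: str   # identifier shared by all versions of a living review
    version: int     # version number of the review
    article_id: str  # identifier of the cited article
    category: str    # limitation category, e.g. "blinding"
    risk: str        # "low", "high", or "unclear"


def keep_latest_assessment(items):
    """Keep one assessment per (review, article) pair.

    Assumption: the assessment from the newest review version is retained;
    the abstract does not say which duplicate was kept.
    """
    latest = {}
    for it in items:
        key = (it.review_id, it.article_id)
        latest[key] = max(latest.get(key, it.version), it.version)
    return [it for it in items
            if it.version == latest[(it.review_id, it.article_id)]]


def drop_low_risk(items):
    """Retain only high- and unclear-risk items, as described in the Design."""
    return [it for it in items if it.risk in {"high", "unclear"}]


def unacknowledged_share(reviewer_categories, sal_categories):
    """Fraction of reviewer-identified categories missing from an article's SALs.

    Assumption: a reviewer limitation counts as self-acknowledged when the
    article has at least one SAL sentence in the same category.
    Returns None when the reviewers identified no limitations.
    """
    reviewer = set(reviewer_categories)
    if not reviewer:
        return None
    return len(reviewer - set(sal_categories)) / len(reviewer)
```

Averaging `unacknowledged_share` over cited articles would give one plausible per-article version of the comparison; the abstract does not specify how its averages were computed.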

Results

From 145 reviews, we extracted and classified limitation statements for 803 cited articles (5.54 articles per review). From risk-of-bias tables, 3172 limitations were extracted (3.95 per cited article). From 750 cited articles, 136,290 sentences were processed and classified (170 per article), yielding 5616 SAL sentences (7.49 per article). No SALs were found in 53 articles. Categorized data are reported in Table 25-1017. Outcome measures were the most common SAL category (69.12% of articles), while reviewers noted missing data as the most common high-risk limitation (33.75% of articles) and blinding as the most common unclear-risk limitation (48.19% of articles). On average, 56.67% of reviewer-identified limitations were not self-acknowledged, including 38.10% of high-risk and 53.01% of unclear-risk limitations.
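As a reading aid, the per-unit ratios above can be recomputed from the reported counts. The short Python check below uses only numbers stated in this section; the variable names are illustrative. The reported 170 sentences per article matches division by all 803 cited articles, whereas division by the 750 articles with SALs gives about 182. The abstract does not state which denominator was intended.

```python
# Counts reported in the Results section.
reviews = 145
cited_articles = 803
rob_limitations = 3172
sentences = 136_290
sal_sentences = 5616
articles_without_sals = 53

# Articles with at least one SAL sentence: 803 - 53 = 750.
articles_with_sals = cited_articles - articles_without_sals

ratios = {
    "cited articles per review": cited_articles / reviews,
    "reviewer limitations per cited article": rob_limitations / cited_articles,
    "sentences per article (all 803)": sentences / cited_articles,
    "sentences per article (750 with SALs)": sentences / articles_with_sals,
    "SAL sentences per article (750 with SALs)": sal_sentences / articles_with_sals,
}

for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```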

Conclusions

We present categorized limitations data and compare SALs with reviewer-derived limitations, showing that more than one-third of high-risk limitations are not self-acknowledged. Our findings may be used to better understand the quality of existing SALs and to highlight categories that authors potentially underreport.

References

1. Ioannidis JP. Limitations are not properly acknowledged in the scientific literature. J Clin Epidemiol. 2007;60(4):324-329. doi:10.1016/j.jclinepi.2006.09.011

2. Kilicoglu H, Rosemblat G, Malički M, Ter Riet G. Automatic recognition of self-acknowledged limitations in clinical research literature. J Am Med Inform Assoc. 2018;25(7):855-861. doi:10.1093/jamia/ocy038

3. Lan M, Cheng M, Hoang L, Ter Riet G, Kilicoglu H. Automatic categorization of self-acknowledged limitations in randomized controlled trial publications. J Biomed Inform. 2024;152:104628. doi:10.1016/j.jbi.2024.104628

1School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, US, jmenke2@illinois.edu.

Conflict of Interest Disclosures

None reported.

Funding/Support

This work was supported by the National Library of Medicine of the National Institutes of Health (NIH; R01LM014079).

Role of the Funder/Sponsor

The funder had no role in the study design or in the collection, analysis, or interpretation of data; writing of the report; or decision to submit the abstract for publication.

Additional Information

The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.