Development of a List to Detect Statistical and Methodological Terms in Peer Reviews

Ivan Buljan,1 Daniel Garcia-Costa,2 Francisco Grimaldo,2 Richard A. Klein,3 Marjan Bakker,3 Ana Marušić1


The peer review system used by academic journals to assess the quality of submitted papers is constantly in need of new technologies that can help reviewers and increase the objectivity of the process. The aim of the present study was to develop a comprehensive list of words related to research methods and statistics, which could then be used to automatically analyze the content of peer review reports and the quantity and types of methodological and statistical issues they identify.


The terms for the list were identified from different glossaries.1,2 The terms were organized into subcategories, some of which related more to methodological aspects of research (eg, study design, sampling, or procedures) and others to statistical aspects (eg, parametric descriptive parameters or data presentation). The list was then refined and enriched with additional terms using a computational language model. The final list and the preregistration can be found at the Open Science Framework website (https://osf.io/d34b9/). Term occurrences were counted in the PEERE database3 of peer review reports (N = 496,928), for which data on reviewer gender, continental region, review recommendation, journal impact factor, and subject area were available. Regression models were fitted to estimate the frequency of the terms used as a function of peer review characteristics.
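The counting step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the two term sets below are hypothetical excerpts (the full list is at https://osf.io/d34b9/), and the matching is simple whole-phrase matching on lowercased text, one plausible way to count term occurrences in a report.

```python
import re

# Hypothetical excerpts from the two subcategories of the term list;
# the real list is published at https://osf.io/d34b9/
METHOD_TERMS = {"randomization", "sampling", "study design", "blinding"}
STAT_TERMS = {"confidence interval", "p value", "regression", "median"}

def count_terms(report: str, terms: set) -> int:
    """Count occurrences of whole-phrase terms in a review report."""
    text = report.lower()
    total = 0
    for term in terms:
        # \b anchors ensure whole-word matches, eg "median" not "comedian"
        total += len(re.findall(r"\b" + re.escape(term) + r"\b", text))
    return total

report = ("The sampling strategy is unclear, and the authors should "
          "report a confidence interval rather than only a p value.")
print(count_terms(report, METHOD_TERMS))  # 1 ("sampling")
print(count_terms(report, STAT_TERMS))    # 2
```

Per-report counts like these, together with reviewer and journal covariates, would then serve as the outcome variables in the regression models.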


The mean length of a peer review was 196 words (95% CI, 108-330 words). A total of 26.8% of review reports in the sample contained at least 1 methodological term, while 70.7% contained at least 1 statistical term. Reviews were more likely to contain methodological terms if the reviewer was female and resided in North America or Oceania compared with other continents, if the recommendation was to reject the article or send it for major revisions, if the journal was from the social sciences and humanities, if the journal was in the top impact factor quartile, and if the review was longer. More words related to statistics were observed when the reviewer was female, for reviews recommending any revisions, for journals from health and medical sciences, for journals in the top impact factor quartile, and for longer reviews. Mixed regression models indicated that the scientific area was more strongly associated with words related to statistics than were reviewer recommendations.


The application of this newly developed list showed that methodological and statistical terms are not a common topic in peer review and are not consistently present in review reports across fields. The list successfully covered research terms, as the most frequently identified terms belonged to categories relevant to most types of research. Future analyses should examine the context in which words from the list occur to confirm the validity of the list of terms.


1. Stark PB. Glossary of Statistical Terms. Accessed March 23, 2021. https://www.stat.berkeley.edu/~stark/SticiGui/Text/gloss.htm

2. Framework for Open and Reproducible Research Training. The Glossary of Terms. Accessed July 12, 2021. https://docs.google.com/document/d/1oNsHSRmgAsDGUKvq5joqBpv

3. Squazzoni F, Ahrweiler P, Barros T, et al. Unlock ways to share data on peer review. Nature. 2020;578(7796):512-514. doi:10.1038/d41586-020-00500-y

1Department of Research in Biomedicine and Health, University of Split School of Medicine, Split, Croatia, ibuljan@mefst.hr; 2Department of Computer Science, University of Valencia, Burjassot, Spain; 3Department of Methodology and Statistics, Tilburg University, Tilburg, the Netherlands

Conflict of Interest Disclosures

Ana Marušić is a member of the Peer Review Congress Advisory Board but was not involved in the editorial review or decision for this abstract. No other disclosures were reported.


Funding/Support

This study was funded by the Croatian Science Foundation project “Professionalism in Health–Decision making in practice and research” under grant agreement IP-2019-04-4882. Richard A. Klein received funding from a Replication Studies grant (401.18.053) from the Netherlands Organization for Scientific Research and a Consolidator Grant (IMPROVE) from the European Research Council (grant 726361). Francisco Grimaldo and Daniel Garcia-Costa are partially supported by the Spanish Ministry of Science, Innovation and Universities, the Spanish State Research Agency, and the European Regional Development Fund under project RTI2018-095820-B-I00.

Role of the Funder/Sponsor

The funders had no role in the design and conduct of the study; the collection, management, analysis, and interpretation of the data; the preparation, review, or approval of the abstract; or the decision to submit the abstract for presentation.