Influences of Independent Peer Reviewer Scores on UK National Institute for Health Research Grant Funding Recommendations

Nicola McArdle,1 Helen Payne,1 Sheila Turner,1 Jeremy C. Wyatt2


Research funding boards expend much effort obtaining and considering peer reviews before making funding recommendations. While a reviewer’s textual comments provide the most value, the review score should be a good proxy for the reviewer’s opinion on the funding outcome. The United Kingdom’s National Institute for Health Research (NIHR) is keen to improve review processes, so we studied relationships between reviewer scores, funding board scores, and funding recommendations across 4 programs managed by a single NIHR coordinating center.


This is a cross-sectional study of reviewer and board scores for all full applications assessed in 2015. Both scores can range from 1 to 6 (1 indicates unfundable; 2 and 3, major changes necessary; 4 and 5, fundable with changes; and 6, fundable). Individual and mean reviewer scores were compared with board scores for each application. The influence on board funding recommendations of scores from different reviewer types (health economist, methodologist, clinician, patient, or subject expert) and from reviewers with different numbers of funded applications to the NIHR programs in this study was assessed using logistic regression and receiver operating characteristic (ROC) curves.
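As an illustration of the ROC approach described above, the following minimal sketch computes an area under the ROC curve (AUROC) for mean reviewer score against the board's fund or reject decision. The data are hypothetical toy values, not study data, and the function name `auroc` and the pairwise (Mann-Whitney) formulation are assumptions chosen for clarity; the abstract does not state which software or estimator the authors used.

```python
from statistics import mean


def auroc(scores, funded):
    """Probability that a randomly chosen funded application outscores a
    randomly chosen rejected one, counting ties as half a win."""
    pos = [s for s, f in zip(scores, funded) if f]
    neg = [s for s, f in zip(scores, funded) if not f]
    if not pos or not neg:
        raise ValueError("need at least one funded and one rejected application")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: reviewer scores (1 to 6) per application,
# paired with the board's decision (True = recommended for funding).
applications = [
    ([5, 6, 4, 5], True),
    ([3, 4, 2, 3], False),
    ([4, 5, 5, 4], True),
    ([4, 4, 5, 5], False),
    ([2, 3, 3], False),
    ([6, 5, 5, 6, 4], True),
]

mean_scores = [mean(scores) for scores, _ in applications]
decisions = [funded for _, funded in applications]
result = auroc(mean_scores, decisions)
print(f"AUROC for mean reviewer score: {result:.2f}")
```

Replacing `mean_scores` with the scores of a single reviewer type would give the per-type AUROC in the same way.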


The analysis is based on 1599 reviewer scores from 295 funding applications (54% successful). The number of reviews per application varied from 2 to 9 (mean [SD], 5.5 [1.2] reviews) and was not correlated with the amount of funding requested. Reviewers tended to be more generous than boards to applications scoring in the low- to mid-range. Logistic regression (based on decision to fund or not [265 applications]) showed a large increase in funding probability for each unit increase in mean reviewer score (relative risk, 7.0; 95% CI, 3.7-13.5) but no association with the number of reviewers per application. ROC curves using mean reviewer scores showed fair discrimination (area under ROC [AUROC], 0.77; 95% CI, 0.71-0.82) (Figure), which changed little from 4 to 7 or more reviewers per application. There was a trend toward higher AUROC for individual reviewers who had received more NIHR grants. There were minor differences in AUROC by type of reviewer (0.60 for health economist or methodologist; 0.61, clinician; 0.64, patient representative; 0.66, subject expert), all of which appeared lower than the AUROC for mean reviewer score per application.

Figure. Different Possible Cut Points for Fund or Reject Recommendation.

The area under the curve is 0.97 (95% CI, 0.95-0.99) for the board score and 0.77 (95% CI, 0.71-0.82) for the reviewer mean score.
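To make the idea of cut points concrete, the sketch below shows how each candidate cut point on the 1 to 6 scale yields one point on an ROC curve. It assumes a rule of "recommend funding if score is at or above the cut point", which the abstract does not spell out; the scores and decisions are hypothetical toy values and `roc_points` is an illustrative name.

```python
def roc_points(scores, funded, cut_points):
    """Return (cut point, sensitivity, specificity) for each cut point,
    treating score >= cut point as a predicted 'fund'."""
    n_pos = sum(1 for f in funded if f)
    n_neg = len(funded) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one funded and one rejected application")
    points = []
    for cut in cut_points:
        true_pos = sum(1 for s, f in zip(scores, funded) if f and s >= cut)
        false_pos = sum(1 for s, f in zip(scores, funded) if not f and s >= cut)
        sensitivity = true_pos / n_pos
        specificity = 1 - false_pos / n_neg
        points.append((cut, sensitivity, specificity))
    return points


# Hypothetical toy data: one score per application and the board decision.
toy_scores = [6, 5, 5, 4, 3, 2, 4, 1]
toy_funded = [True, True, True, True, False, False, False, False]

points = roc_points(toy_scores, toy_funded, cut_points=range(1, 7))
for cut, sens, spec in points:
    print(f"cut point {cut}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Plotting sensitivity against 1 minus specificity across the cut points traces the ROC curve; a score that separates funded from rejected applications well gives points near the top-left corner.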


Our analysis covering 4 NIHR programs shows large variation in the number of peer reviews informing recommendations. Our results suggest that sometimes 4 reviewers might suffice. The type of reviewer does not make much difference, but using the mean score from different types of reviewer instead of individual scores does.

1National Institute for Health Research, University of Southampton, Southampton, UK, nicola.mcardle@nihr.ac.uk; 2Wessex Institute, University of Southampton, Southampton, UK

Conflict of Interest Disclosures:

None reported.


This research was supported by the National Institute for Health Research Evaluation, Trials, and Studies Coordinating Centre (NETSCC) through its Research on Research program.

Role of the Funder/Sponsor:

The funder had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; or preparation, review, or approval of the abstract.


The views and opinions expressed are those of the authors and do not necessarily reflect those of the Department of Health or of the NETSCC.