I am shocked – shocked – to read in Nature that the NIH peer review “system fails to account for individual bias, and places undue weight on panel members who have not even read the proposals.”
Valen Johnson, a biostatistician at the University of Texas M.D. Anderson Cancer Center, proposes an innovative statistical model for analyzing R01 peer-review ratings in this week's Proc Natl Acad Sci. Using data from CSR reviews of ~19,000 proposals (involving ~14,000 reviewers or 2.8 reviewers per proposal) from 2005, Johnson looked for evaluation trends. He found that “variability inherent to rater scores, and differences in the criteria used by individual raters to assign scores to proposals, have an enormous impact on funding decisions.” As summarized by Nature, Johnson also “found that the top grants were largely unaffected by reader bias, but that such bias did impact grants closer to the funding cut-off line.”
He notes that his model “accounts for differences in reviewer scoring criteria, provides a model for the sequential rating of items by various subsets of reviewers, and quantifies uncertainty associated with final proposal ratings” (his Bayesian hierarchical statistical model is available as supporting material).
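To give a feel for what "accounting for differences in reviewer scoring criteria" can mean, here is a toy sketch. It is not Johnson's Bayesian hierarchical model. It uses a much cruder stand-in: subtracting each reviewer's average offset from the overall mean before averaging scores per proposal. The reviewer names, proposal labels, and scores are made up for illustration, lower scores are treated as better, and the adjustment assumes every reviewer saw a comparable mix of proposals, which real study sections cannot guarantee.

```python
from collections import defaultdict
from statistics import mean


def adjust_for_reviewer_severity(ratings):
    """Average scores per proposal after removing each reviewer's offset.

    ratings: list of (reviewer, proposal, score) tuples.
    A crude mean-centering adjustment, NOT the hierarchical model in the paper.
    """
    scores_by_reviewer = defaultdict(list)
    for reviewer, _, score in ratings:
        scores_by_reviewer[reviewer].append(score)

    overall = mean(score for _, _, score in ratings)
    # Offset: how far a reviewer's average sits from the overall average.
    offset = {r: mean(s) - overall for r, s in scores_by_reviewer.items()}

    adjusted_by_proposal = defaultdict(list)
    for reviewer, proposal, score in ratings:
        adjusted_by_proposal[proposal].append(score - offset[reviewer])
    return {p: mean(s) for p, s in adjusted_by_proposal.items()}


# Hypothetical data: one harsh and one lenient reviewer.
ratings = [
    ("harsh", "P1", 3.0),
    ("harsh", "P2", 4.0),
    ("lenient", "P2", 2.0),
    ("lenient", "P3", 3.0),
]

raw_by_proposal = defaultdict(list)
for _, proposal, score in ratings:
    raw_by_proposal[proposal].append(score)
raw = {p: mean(s) for p, s in raw_by_proposal.items()}

adjusted = adjust_for_reviewer_severity(ratings)
print("raw:     ", raw)
print("adjusted:", adjusted)
```

In this made-up case the three proposals tie on raw averages but separate once reviewer severity is removed, which is the kind of reshuffling near a payline that the paper is concerned with.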
In discussing his results, he suggests validation studies that CSR could conduct to examine the “discussion effect”, which could in turn be used to “assess the tradeoff between the cost of conducting SRG meetings and the cost of collecting additional, independent ratings of applications.” He proposes an alternative approach to making awards that would change the pool of funded proposals by 25-35% by taking into consideration the cost of proposals that fall close to the payline and rewarding those who ask for less than the maximum amount they can justify.
Nature reports a little tiff between CSR Director Toni Scarpa and Valen Johnson: “After Scarpa became director of the CSR in 2005, it asked Johnson to return the data. Johnson returned the original reviews, but was able to keep copies by placing a Freedom of Information Act request. Scarpa says that the CSR had heard Johnson present preliminary results and was not interested in pursuing the project further. Although the center is interested in revising its scoring procedures, ‘there was relatively little enthusiasm’ about Johnson’s analysis, says Scarpa.”
Not surprisingly (and perhaps correctly), Scarpa “chafes” at the suggestion that funding “less expensive grants would allow the agency to fund more projects … and thus increase the likelihood that they have supported the best applications.” Indeed, “Some studies are inherently more expensive than others, and a proposal that includes a clinical trial should not be penalized for being more expensive than a proposal that does not, he says.”
Certainly a more rigorous statistical analysis of the peer review rating system was needed, and one hopes CSR will not discard these data or the model out of hand.
PhysioProf said
Dog Bites Man. yawn.
BB said
Shocked?
I’m only shocked it took so long to reveal it.
I think a lot of us knew/suspected it.
Of course since I am on study section, I better clean up my act and eliminate bias. Need some anti-bias pills now.
whimple said
The comments on the Nature website seem to have some insightful suggestions.
JSinger said
I liked in the Nature comments where someone responded to Janet Leatherwood with “In response to Dr. Leatherhead…”
As far as the article goes, it makes the same mistake that elaborate schemes for voting make: transparency and comprehensibility are important and outweigh modest theoretical advantages.
writedit said
I’ll add this note about an alternative bit of reviewer bias here, too. Newsweek has a little piece that is meant to take IRBs to task for being too nit-picky but in fact includes a universal truth long touted by our friend Whimple, among many others:
whimple said
For what it’s worth, I haven’t found the IRB to be all that bad. There’s a lot of red tape, but if you keep chipping away at it, you can generally get through it.