Why Clinical R01s Fare Worse than Nonclinical R01s

The July issue of the American Journal of Medicine has an interesting report by Martin et al. that asks “Why Are Peer Review Outcomes Less Favorable for Clinical Science than for Basic Science Grant Applications?”

Jumping ahead, the answer is not entirely satisfying:

“current data suggest that nearly all of the difference in review outcomes for clinical and nonclinical applications is due to a failure to adequately address human subject protection requirements and to a lower rate of submission of competing continuation applications by clinical applicants.”

To arrive at this conclusion, the authors examined CSR-reviewed R01s from October 2000 through May 2004 (12 rounds): the last couple of years of the NIH budget doubling and the couple of years just after (though none of the recent lean years). Clinical research applications were those identified as such on the face page (box checked for human subjects involvement). They used the NIH definition for new investigator.

Aggregate data (new & experienced PIs, A0-A1-A2 submissions, Type 1 & Type 2 applications) show that 22.53% of all nonclinical R01s score within the 20th percentile, compared with 17.85% of clinical R01s. (correction per Whimple’s comment below)

Meanwhile (an unfortunate pattern for clinical research applicants), 28.3% of nonclinical PIs submit Type 2 (competing renewal) applications, compared with 20% of the clinical PI pool. Thus, the overall success rate of clinical R01s takes a hit due to this lower rate of competing renewals. The authors offer a few possible explanations but suggest that empirical data are needed to document why this discrepancy occurs so that effective corrective measures can be taken. Indeed, the authors estimated that:

“the lower rate of submission of competing clinical applications contributes to approximately one half of the aggregate difference in peer review outcomes between clinical and nonclinical applications.”
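The mechanism behind that estimate is easy to sketch with a mixture model. In the toy calculation below, only the Type 2 submission shares (28.3% nonclinical vs. 20% clinical) come from the article; the per-type scoring rates are illustrative assumptions I made up for demonstration. The point is just that even if reviewers scored each application type identically across pools, the difference in renewal mix alone would open a gap:

```python
# Back-of-envelope mixture model: the aggregate "scored within the 20th
# percentile" rate as a weighted average over Type 1 (new) and Type 2
# (competing renewal) applications. The per-type rates below are
# ASSUMED for illustration; only the Type 2 submission shares come
# from the Martin et al. report.
p_type2 = 0.37  # assumed rate for Type 2 apps (renewals fare better)
p_type1 = 0.15  # assumed rate for Type 1 apps

def aggregate(type2_share):
    """Aggregate within-20th-percentile rate for a pool with the given Type 2 share."""
    return type2_share * p_type2 + (1 - type2_share) * p_type1

nonclinical = aggregate(0.283)  # 28.3% of nonclinical PIs submit Type 2
clinical = aggregate(0.20)      # 20% of clinical PIs submit Type 2

print(f"nonclinical: {nonclinical:.3f}")                   # 0.212
print(f"clinical:    {clinical:.3f}")                      # 0.194
print(f"gap from renewal mix alone: {nonclinical - clinical:.3f}")  # 0.018
```

Under these assumed per-type rates, the submission-mix difference by itself produces a gap of nearly two percentage points, which is in the neighborhood of half the observed aggregate difference.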

So then there’s the issue of human subjects concerns, which do affect the priority score. Among all clinical R01 submissions, 14.8% did not adequately address human subjects protection requirements (no comparable data were reported on whether nonclinical R01s using animals adequately addressed vertebrate animal requirements). Overall, 19% of clinical R01s with no human subjects concerns scored within the 20th percentile, compared with 10% of those with human subjects concerns. The authors note that:

“approximately one-half of the observed differences in peer review outcomes for clinical versus basic research applications can be attributed to applicants failing to adequately address human subject concerns in their applications.”
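These figures hang together arithmetically. Since 14.8% of clinical applications drew human subjects concerns, the overall clinical rate should be the weighted average of the two subgroups; this quick consistency check (mine, not the authors’) lands very close to the reported 17.85%:

```python
# Consistency check: the overall clinical "within the 20th percentile"
# rate should be a weighted average of apps with and without human
# subjects (HS) concerns, using shares and rates quoted in the article.
share_with_concerns = 0.148  # 14.8% of clinical R01s had HS concerns
rate_no_concerns = 0.19      # 19% scored within the 20th percentile
rate_with_concerns = 0.10    # 10% scored within the 20th percentile

overall = ((1 - share_with_concerns) * rate_no_concerns
           + share_with_concerns * rate_with_concerns)
print(f"implied overall clinical rate: {overall:.3f}")  # 0.177

# Compare with the 17.85% reported for clinical applications overall,
# versus 22.53% for nonclinical: if every clinical app scored like the
# no-concerns group, the clinical rate would rise to ~19%, closing
# part (though not all) of the gap.
```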

CTSAs have an entire core devoted to regulatory knowledge (we have a regulatory compliance facilitator devoted to helping PIs with their grant application human subjects sections & IRB protocols) and bundle clinical research ethics with the design, biostatistics, and ethics core. With all the reporting data NCRR is collecting, one would hope they could eventually analyze whether these transformative resources have made a difference in the funding of clinical R01s.



  1. Clinical research has become extremely important today. As clinical research is the branch of medical science that determines the safety and effectiveness of medications, devices, and treatment regimens intended for human use, it is directly concerned with treatment. The results of clinical research studies are essential for gaining approval from regulatory agencies throughout the world to market new medications, devices, and other inventions, which further highlights its importance.

  2. BB said

    If your IRB approval has to be signed, sealed, and delivered before you apply, where’s the concern? Is there such a variation in IRB quality among institutions? What am I not understanding here? I posit that once again it’s a case of reviewers run amuck with SRAs asleep on the job.
    BTW, I once sat on a study section for K01 awards, and several apps had incomplete or missing Vertebrate Animals sections, yet the research required animals. I pointed the deficiencies out to the chair, who dismissed them out of hand and gave fundable scores to the apps anyway (he was pretty autocratic and knew most of the mentors on the proposals; if he knew them, he strong-armed reviewers into giving better scores than if he didn’t). The SRA was definitely asleep in the back of the room.

    Briefly, IRB approval (or even submission) is not required at the time of NIH submission. It is only required for applications with fundable scores, as part of the Just-in-Time (JIT) request prior to release of an award. I suspect these deficiencies did not occur (or at least not as often) prior to JIT, but relief was needed from the burden of getting all those approvals for projects that would never be funded (& this also likely supported the escalation in grant application submissions … interesting trend to check – the NIH waiver of IRB submission/approval at the time of application occurred in 2000). Also, I believe concerns related to human and animal research protections have only recently become a factor when assigning a priority score, but I need to check (the 1997 guidelines seem fuzzy; there was a definite priority score impact by 2002, incorporated in the 2005 updated review criteria). Further enlightenment from others would be welcome. -writedit

  3. PhysioProf said

    Interesting post, dude. Another possible factor is the fact (well, I assume this is a fact) that clinical R01 applications have much larger budgets than basic science applications. I would be interested to know what proportion of non-clinical R01s are maxed-out modular budgets compared with clinical apps.

    Yes, you’re correct about clinical applications, especially clinical trials, having larger budgets than nonclinical R01s. However, the authors were looking at scores (well, percentiles), not actual awards, and the budget request itself, unlike human and animal subjects protections, should not affect the score assigned. – writedit

  4. drugmonkey said

    BTW, I once sat on a study section for K01 awards, and several apps had incomplete or missing Vertebrate Animals sections, yet the research required animals. I pointed the deficiencies out to the chair, who dismissed them out of hand and gave fundable scores to the apps anyway (he was pretty autocratic and knew most of the mentors on the proposals; if he knew them, he strong-armed reviewers into giving better scores than if he didn’t). The SRA was definitely asleep in the back of the room.

    I will echo writedit in saying that there has been a change in, IIRC, the last 2-3 rounds. Previously, a deficient Vertebrate Animals section (there does not necessarily have to be an actual concern, btw) was not supposed to affect the score and was discussed after the final scoring of the proposal. In that era, a Vert Animals notation meant that the PI had to supply a response to the critique to the Program staff for review before the grant could be awarded, no matter what priority score was assigned.

    So if BB’s “autocratic chair” was operating in that era, “giving” a “fundable score” is no evidence of a problem. Suppressing the Vert Animals notation would have been a violation of the process, however.

    I should note here that if the reviewer’s concern about animal use was such that s/he judged that it affected the conduct of the science, then this was to be included in the discussion of scoring.

    At present, my understanding is that any Vert Animals concern is to be discussed in the context of scoring. Of course, just because one reviewer thinks something is “deficient” in the application and should affect the scoring does not mean that everyone will vote accordingly…

  5. drugmonkey said

    and the budget request itself, unlike human and animal subjects protections, should not affect the score assigned.


    There is a big gap between the way reviewers are supposed to behave and the way they actually do behave when reviewing grants. The size of the budget is one of the great unknowns, but to think that it doesn’t contribute is dangerously naive.

    It would be really interesting to use budget as a factor in the analysis of scoring outcome, I’m with PP in betting that it would chip away at the perceived disadvantage of clinical apps.

    Aggregating A0-A2 is also a potential danger spot: do we know that physician-scientist PIs stick to the revision process, or do they give it up (disproportionately) because they are running for their dinner instead of their lives? Same thing regarding per-PI submissions over time: do physician PIs have the same opportunity to hone their grant writing and grant-process-navigating as non-physician PIs do? I’d bet not when you take them in aggregate.

    I said “should” … and I’m still not sure it is the budget per se assuming the dollar amount requested is justified by the proposed study. A clinician who puts forth a monster multisite trial without good justification, sound study design, careful data collection and management, rigorous coordination, etc. will not fare well. The budget will be big, but the score will be dinged due to poor study design/rationale rather than the size of the budget. The larger and more complex the trial, the larger the budget, and the more room for reviewer target practice, especially if the PI does not have robust (or any) pilot study data. Not sure you could ever prove cause & effect, but I suspect the trend is indeed that bigger R01s (clinical or nonclinical) with bigger budgets fare worse in review.

    And I agree with your comment about physician PIs not having the time-opportunity (dare I say motivation?) to hone their grant-writing skills. However, the drop from A0 to A1 shown in Table 2 shows a similar decline for clinical and nonclinical R01s, which I actually found surprising. For clinical R01s slammed as A0s, my guess is that many are resubmitted as R21 pilot studies per reviewer suggestions, or perhaps as an R34 to organize a multisite trial. I haven’t seen the same trend in nonclinical R01s here, where PIs tend to pursue the A1-A2 death march to the end. -writedit

  6. BB said

    DM, thank you for the clarification. When I was on the K01 study sections, the rule was that missing or incomplete sections (vertebrate animals & human subjects) tabled the app without a score. The PI is then supposed to fix the missing section(s) and resubmit. Strong-arming by the chair wasn’t supposed to happen to any app, ever. Am I alone in having experienced that on study section?

  7. whimple said

    “current data suggest that nearly all of the difference in review outcomes for clinical and nonclinical applications is due to a failure to adequately address human subject protection requirements…”

    This I do not believe. They have no data on whether these concerns alone negatively affect the application score, or whether failure to adequately address human subjects concerns is simply reflective of poorer overall grant quality.

    ” …and to a lower rate of submission of competing continuation applications by clinical applicants”

    This is easily (speciously?) explained. With a non-clinical study you can just keep on endlessly cranking out data. With a clinical study, the typical goal is to figure out whether something is going to be useful. If the initial data from the first funded grant suggest not, the project is dropped.

    You might think there are more nonclinical R01s submitted, but, surprisingly, you would be wrong: for Type 1 A0, the breakdown is 28% established PI clinical, 27.4% established PI nonclinical, 24.5% new PI clinical, and 20% new PI nonclinical.

    Is this a misreading? (yes – writedit) From the report:
    “From October 2000 through May 2004, CSR reviewed 92,922 R01 applications; 62,735 (67.5%) were considered nonclinical applications and 30,187 (32.5%) were considered clinical applications.”

    Figure 1 of the report (CSR’s currently popular “cumulative percent” vs “percentile” graph) tells the tale: 17.85% of clinical applications score at or better than 20th percentile, while 22.53% of non-clinical applications score at or better than 20th percentile. This directly indicates two things:
    1) clinical applications are being evaluated against non-clinical applications
    2) the non-clinical evaluations are winning

    Figure 3 is very discouraging for potential translational investigators: the grants scoring the best are non-human-subjects competitive renewals, with 37% of applications scoring better than the 20th percentile. The grants scoring the worst are grants from new investigators (which must be new grants, by the definition of “new investigator”) involving human subjects, with 6% scoring better than the 20th percentile. The study sections are heavily favoring more-of-the-same non-translational work at the expense of new translational work. I would guess that as money becomes tighter this tendency will increase. In principle this disparity could be corrected at the council level, if that’s what the NIH wanted to do.

    Thanks for the correction, Whimple. I knew you would examine this article carefully. The Table lists the distribution of applicant characteristics within each grouping (clinical-nonclinical); both columns add up to 100% but not the same N.

    However, I think the problems in the human subjects sections probably are a reasonable “biomarker” (canary?) for the success or failure of clinical applications since problems here would also likely be present in the Research Design & Methods section … and if a reviewer is citing deficiencies here, the reviewer has likely found other reasons to ding the application. I agree that clinical and translational applications will always have a tougher row to hoe since they do not propose experiments that can be tightly controlled and precisely replicated in triplicate. A reviewer so inclined could always think of something that could go wrong with so many people involved. On the other hand, I am distressingly familiar with an abysmal application for a multisite clinical trial that was funded, I assume because the reviewers and Institute wanted to see the study done, and the applicant team was uniquely qualified (the Institute converted the R01 to a U01 to manage the conduct of the study). – writedit

  8. BB said

    Abysmal non-clinical studies get funded too. I know of a P01 that could serve as a case in point. But I wanted to answer what Whimple wrote about the NIH funding more of the same old, same old in a tighter funding environment. Wasn’t the new improved EZ-approved plan to revamp peer review going to reward more innovative research? If the NIH director can’t make it happen, the cynic in me says it never will.
    I wonder if anyone reading this has served on both DoD and NIH study sections. DoD tells you up front that innovation and/or novelty count the most in scoring. Since I’m on a DoD study section, I can say from the other side that the training class I had to take also emphasized innovation and novelty, defined what the DoD means, and gave examples of what they would be in the context of an app. I think this is very different from NIH study sections. Somebody correct me if I am wrong.

  9. drugmonkey said

    The notion that study section members should receive explicit training in how they are supposed to review grants is very different, IME. There are some documents on reviewing behavior that they send out, but those are next to useless when it comes to the important stuff.

    For example:

    CSR directive: “You are supposed to focus on ‘significance’ now”

    Chorus of reviewers: “OK. Which aspect of ‘significance’? Public health? Theoretical resolution? Conflicting data? HighImpactFactor Journal?…etc, etc. We can do it any way you slice it, just want to know what you are thinking here….”

    CSR: “Um, that’s up to you. We don’t want to meddle with the independence of peer review, you know”

  10. BB said

    Maybe DoD SRAs are more proactive than the average NIH SRA. Certainly the training videos I had to watch for the DoD told me what innovation and novelty mean. And I can rewatch them and take notes to be sure I understand what the DoD means by innovation and novelty. Maybe NIH should take a page from DoD.

  11. writedit said

    Newsweek has a little piece that is meant to take IRBs to task for being too nit-picky but in fact includes a universal truth long touted by our friend Whimple, among many others:

    One is that “translational” research, in which fundamental biological discoveries are put to practical use, just doesn’t have the sex appeal of basic science, the kind done in test tubes and lab animals and which yields fundamental new insights into, say, molecular mechanisms of disease. Most of the people who evaluate study proposals for the National Institutes of Health “are basic scientists,” notes Daniel Sessler of the Cleveland Clinic. “In basic science, being cutting edge and innovative is what’s valued. But in clinical research you’d never take something completely innovative and try it in people.” It must have already been proven in animals. “So the answer comes back: this is not innovative enough,” he says. Sessler had been stymied in his attempt to get NIH funding for a study of whether the form of anesthesia—regional or general—used during surgery affects long-term cancer survival, something hinted at in animal studies. “More animal studies won’t help,” he says. “The commitment from the top [of NIH to translational research] is real, but it hasn’t filtered down” to scientists who evaluate grant proposals.
