What is substantial evidence?

How and when does one determine that a study is a positive source of evidence that will be considered 'substantial'? Regulatory assessments of clinical trials ordinarily rely on hypothesis testing employing a two-tailed alpha of 0.05 as the level of the test to reject the null hypothesis of no treatment effect. In the interests of protecting the experiment-wide alpha level, a study's prospectively written protocol should identify the specific hypothesis being tested, the primary outcome variable upon which the analysis turns, and the statistical methods and model that will be used to evaluate that outcome. Regulatory evaluation strategies are designed to guard against the inflation of the experiment-wide alpha through a multiplicity of endpoints and multiple examinations of the evidence.
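To make the multiplicity concern concrete, here is a minimal sketch in Python. It assumes the tests are statistically independent and that every null hypothesis is true; endpoints within a single trial are rarely independent, so the figures are illustrative only. The function names are hypothetical, and the Bonferroni split shown is merely one familiar correction, not a method prescribed by the text above.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """Chance of at least one false positive among k tests, each run at level alpha.

    Assumption: the k tests are independent and every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_level(alpha: float, k: int) -> float:
    """Per-test level that keeps the overall error rate at or below alpha."""
    return alpha / k


if __name__ == "__main__":
    # Compare the unadjusted and Bonferroni-adjusted overall error rates.
    for k in (1, 2, 5, 10):
        unadjusted = familywise_error_rate(0.05, k)
        adjusted = familywise_error_rate(bonferroni_level(0.05, k), k)
        print(f"k={k:2d}  unadjusted={unadjusted:.3f}  adjusted={adjusted:.3f}")
```

Under these assumptions, the unadjusted overall error rate climbs steadily as endpoints or interim looks are added, which is why the protocol is expected to name a single primary outcome and analysis in advance.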

As further safeguards, data-conditioned analyses are eschewed as sources of probative evidence, although they are allowed, even encouraged, for exploratory purposes, and independent substantiation of experimental findings is of pivotal concern.

This 'traditional' approach has been widely criticized and condemned for a host of reasons. No doubt, it is a 'conservative' one. Moreover, the size of the statistical test used in regulatory work is traditional, not rational. Given the nature of the question asked, the use of a bidirectional test is arguably incoherent, and, perhaps worst of all, the test can usually be met by almost any active drug, even one with only a clinically trivial average effect barely distinguishable from placebo, if the study is large enough. Despite these problems, it is not in the least clear what other approach would be better for a regulatory agency to adopt.
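The large-study criticism can be illustrated with a rough power calculation. The sketch below assumes a two-arm trial with equal arm sizes, a normally distributed outcome with known unit variance, and a two-sided z-test; the function name and the standardized effect of 0.05 are hypothetical choices made for illustration, not values taken from any actual trial.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_power(effect_size: float, n_per_arm: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test comparing two equal-sized arms.

    Assumption: effect_size is the true mean difference in standard-deviation
    units, and the outcome variance is known.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Expected value of the z statistic under the assumed true effect.
    shift = effect_size * sqrt(n_per_arm / 2.0)
    # Probability that the statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # A clinically trivial effect: one twentieth of a standard deviation.
    for n in (100, 1_000, 20_000):
        print(f"n per arm = {n:6d}  power = {two_sided_power(0.05, n):.3f}")
```

With these assumptions, the trivial effect is almost never detected at 100 patients per arm (power below 0.1) but is detected with near certainty at 20,000 per arm (power above 0.99), although the effect itself is unchanged.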

To paraphrase Sir Winston Churchill, hypothesis testing may be a terrible way to evaluate evidence of a drug product's effectiveness but, unfortunately, it is the best method we have.

