## Statistical Significance and Clinical Significance Effect Sizes

It must be emphasized that a statistically significant effect of T, with P < 0.05, merely means that the data satisfy minimal requirements to show a nonrandom difference between T and C. If P < 0.01 or P < 0.001, the data satisfy more than the minimal requirements, but the conclusion remains only that there is a nonrandom difference between T and C. Thus, a statistically significant effect of T may or may not be large, important, or of any clinical significance.

To show that a statistically significant effect of T is of clinical significance as well, descriptive statistics and an effect size should be reported (as required by the Consolidated Standards of Reporting Trials [CONSORT] guidelines) (Altman et al. 2001; Rennie 1996). For example, how likely is a patient given T to have a response that is clinically preferable to the response of a patient given C (an effect size called area under the curve [AUC]) (Acion et al. 2006; Grissom 1994; Kraemer and Kupfer 2006)? Or how many patients would have to be given T to have one more "success" than if they had all been given C (another effect size, called number needed to treat) (Altman et al. 2001; Grissom and Kim 2005; Kraemer and Kupfer 2006; Wen et al. 2005)?
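The two effect sizes just described can be illustrated with a minimal sketch. It assumes that higher outcome scores are clinically preferable, that ties count as half a "win," and that "success" is a binary outcome with a known rate in each group. The function names and the data are hypothetical and serve only as an illustration.

```python
from itertools import product


def auc(treated, control):
    """Estimate the probability that a randomly chosen T patient has a
    clinically preferable (here: higher) response than a randomly chosen
    C patient. Ties are counted as half a win (an assumption)."""
    wins = 0.0
    for t, c in product(treated, control):
        if t > c:
            wins += 1.0
        elif t == c:
            wins += 0.5
    return wins / (len(treated) * len(control))


def number_needed_to_treat(success_rate_t, success_rate_c):
    """Number of patients who would have to be given T to obtain one more
    success than if they had all been given C."""
    difference = success_rate_t - success_rate_c
    if difference == 0:
        return float("inf")  # no difference: no finite number suffices
    return 1.0 / difference


# Hypothetical outcome scores, invented for illustration only.
treated_scores = [7, 9, 6, 8, 5]
control_scores = [4, 6, 5, 3, 7]

print(f"AUC = {auc(treated_scores, control_scores):.2f}")
print(f"NNT = {number_needed_to_treat(0.60, 0.40):.1f}")
```

An AUC of 0.5 corresponds to no advantage for either treatment, and an AUC of 1.0 means every T patient does better than every C patient. A smaller number needed to treat indicates a larger effect. A negative value from this sketch would indicate that C has the higher success rate.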

By the time there is rationale and justification to propose an RCT to compare T versus C, it is very unlikely that the null hypothesis of randomness is exactly true (Jones and Tukey 2000; Meehl 1967). In any case, even with the best possible RCT, there is a 5% chance of a false-positive result at the 0.05 significance level when the null hypothesis of randomness is exactly true. Thus, given a large enough sample size, and/or given enough RCTs comparing T versus C, every T could eventually be declared statistically significantly better than any C. Accordingly, in recent years the costs of an exclusive emphasis on statistical significance have been recognized, and there is now growing emphasis on reporting effect sizes that are clinically interpretable (Altman et al. 2001; Grissom and Kim 2005; Kraemer and Kupfer 2006).
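The dependence of statistical significance on sample size can be made concrete with a small calculation. The sketch below rests on simplifying assumptions that are not part of the text: outcomes in both groups are normally distributed with equal, known variance, the groups are of equal size, and the true standardized mean difference `d` is fixed at a very small value. Under these assumptions the AUC equals Φ(d/√2), where Φ is the standard normal distribution function. The function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def two_sided_p_value(d, n_per_group):
    """Two-sided P value of a z-test when the observed standardized mean
    difference equals d and each group has n_per_group patients
    (equal, known variance assumed)."""
    z = d * math.sqrt(n_per_group / 2.0)
    return math.erfc(abs(z) / math.sqrt(2.0))


def auc_from_d(d):
    """AUC implied by a standardized difference d under the
    equal-variance normal model assumed here."""
    return normal_cdf(d / math.sqrt(2.0))


d = 0.05  # a very small standardized difference, chosen for illustration
for n in (100, 1_000, 10_000, 100_000):
    print(f"n per group = {n:>7}: P = {two_sided_p_value(d, n):.4f}, "
          f"AUC = {auc_from_d(d):.3f}")
```

In this sketch the P value shrinks toward zero as the sample size grows, while the effect size stays the same: a patient given T has only a slightly better than even chance of a preferable response. The P value by itself therefore gives no indication of clinical significance.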

