When designing a clinical trial or interpreting its results, the first issue that must be considered is the objective of the trial - specifically, what question is the trial intended to answer? Max [12, 13] emphasized the importance of distinguishing between pragmatic and explanatory clinical trials. Pragmatic clinical trials have the objective of answering practical questions about patient care; for example, are tricyclic antidepressants (TCAs) useful for relieving pain in patients with phantom limb pain? These trials are typically designed to reflect clinical practice to the greatest extent possible, and decisions about various features of the trial are guided by the clinical situation that the results of the trial are intended to inform. The goal of an explanatory clinical trial, however, is to answer a question about the mode of action of a treatment, the etiology of a condition, or both. The methodologic features of an explanatory trial are therefore selected to maximize the likelihood that the trial will answer a specific question about the mechanisms of disease or treatment, without regard to the realities of the clinical situation. In pragmatic trials, the clinical context, tolerability of the treatment, and generalizability of the results are all vitally important, whereas controlling variables and ensuring that a sufficiently large dosage is given become more important considerations in explanatory trials. Of course, answering questions about the likely efficacy of a treatment in clinical practice and about the mechanism of its action are not mutually exclusive. However, these two objectives generally require different outcome measures, and studies with both goals must be carefully planned to ensure that the objectives and outcomes do not interfere with each other.
In considering clinical trials, a distinction is often made between efficacy and effectiveness trials, although some clinical trials combine elements of both. Efficacy trials test the hypothesis of whether or not there are beneficial effects of treatment in a group of patients, and the methods and procedures are tightly controlled and standardized. In such studies, threats to the internal validity of the study (e.g. the integrity of the double blind or the inclusion and exclusion criteria) are minimized to the greatest extent possible so that treatment effects or biologic mechanisms can be evaluated accurately. Effectiveness trials, on the other hand, are conducted to test the value of a treatment as applied in the "real world" of clinical practice, in which, for example, some patients do not take all the medication they are prescribed. Because of the increased variability, such trials are typically larger than efficacy trials and there is often less control of methods and procedures. In effectiveness studies, external validity and generalizability are emphasized and the trial is designed so that conclusions can be drawn about the value of the treatment as it is actually used. A simple example of this distinction would be two medications that are found to have equivalent efficacy but differ in effectiveness because one is taken less consistently as a result of its greater side effects.
In general, cohort studies can demonstrate association but not causation; that is, regardless of the findings of a cohort study, it cannot be concluded that an intervention caused the observed outcomes. Cohort trials lack randomization, which is the most effective method of creating a valid comparison group. As such, cohort trials cannot distinguish the effects of the intervention from other factors that can affect outcomes, such as natural history (e.g. spontaneous remissions), regression to the mean, and placebo effects. Comparisons of outcomes in a treatment cohort with pretreatment values or with historical controls can therefore provide inaccurate or even misleading estimates of treatment benefits. Cohort studies also generally lack blinding, and treatment endpoints can be biased by the expectations of patients and investigators, especially when subjective outcomes such as pain are assessed.
Although the results of cohort trials cannot be used to establish the efficacy of an intervention, they can be useful in providing pilot data showing whether the treatment appears to have a beneficial effect and in demonstrating its safety and tolerability. For example, if no RCTs evaluating an intervention exist but a cohort study demonstrates tolerability, clinicians may feel somewhat reassured that an intervention is likely to be associated with acceptable tolerability. Moreover, large cohort studies are often the best method of detecting rare, serious adverse events, and can be used for confirmation of safety in samples much larger and more representative of the general population than those studied in RCTs.
Randomized clinical trials: general considerations
Randomized clinical trials are generally considered the best design for determining whether an intervention is efficacious. Successful randomization of a large group of patients controls for baseline factors, resulting in groups that are essentially identical except for the study treatment. RCTs are therefore the only type of clinical trial for which inferences of causality are appropriate. For example, outcome differences between an active treatment and a placebo group in a large, well-designed placebo-controlled RCT can be inferred to have been caused by the intervention. In general, the results of RCTs should be considered to overrule contradictory findings from other types of studies; an exception to this statement is that most RCTs of treatments for chronic pain are not adequately powered to detect between-group differences in uncommon adverse events.
Investigations of treatments for chronic pain have typically compared the efficacy, tolerability, and safety of a single treatment with placebo. Few RCTs have compared different treatments [17, 18] and even fewer trials have examined whether combinations of treatments are superior to the component treatments examined separately [19, 20]. Studies of combination treatments can use a 2 × 2 factorial design in which patients are randomized to the combination of two treatments, each of the treatments administered alone (with a placebo matching the other treatment), or double placebo. Such a factorial design not only makes it possible to evaluate the efficacy of the combination, but can also provide a "head-to-head" comparison of the two individual treatments. Given how common combination therapy for patients with chronic pain is in clinical practice, additional combination studies of chronic pain treatments must be conducted to determine which combinations are efficacious and well tolerated and which are not.
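As a minimal illustration (the treatment names are hypothetical, not drawn from any specific trial), the four arms of a 2 × 2 factorial design can be enumerated directly: each subject receives something in both "slots", either the active drug or its matched placebo.

```python
from itertools import product

# Hypothetical treatments A and B; each slot is filled by the active
# drug or its matched placebo so that all arms look identical.
arm_options = {"slot_A": ("A", "placebo_A"), "slot_B": ("B", "placebo_B")}

arms = list(product(*arm_options.values()))
# ('A', 'B')                 -> the combination
# ('A', 'placebo_B')         -> drug A alone; together with
# ('placebo_A', 'B')            drug B alone, this pair gives the
#                               "head-to-head" comparison
# ('placebo_A', 'placebo_B') -> double placebo
```

Comparing the combination arm against each single-drug arm tests whether the combination is superior to its components, while the two single-drug arms provide the head-to-head comparison noted above.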
Even rarer than RCTs of combinations of different medications are studies examining the benefits of combining different modes of treatments, for example, a medication combined with cognitive-behavioral therapy compared with the medication and cognitive-behavioral therapy each administered alone. Such trials have been a major focus of research on the treatment of various psychiatric disorders for many years, and it is unfortunate that so little effort has been devoted to this type of clinical trial in research on the treatment of patients with chronic pain.
Typically, RCTs examining treatments for chronic pain have been designed to determine if one or more different treatments (or different dosages of a single treatment) are superior to placebo with respect to pain reduction and other outcomes. One of the reasons that head-to-head trials of chronic pain treatments have rarely been performed is that the sample size required to show that one efficacious intervention is superior to another would typically be much larger than that required to show that an intervention is superior to placebo. RCTs can also be designed to demonstrate that one treatment is either equivalent to or not inferior to another (typically, first-line) treatment [21-23]. However, equivalence and non-inferiority trials have generally not been conducted for chronic pain treatments, probably because until recently there have been few treatments for chronic pain with such well-established efficacy that they could be considered standards with which another treatment can be compared. Such trials typically require fewer subjects than trials intended to show that one efficacious treatment provides greater benefit than another, making it possible to demonstrate that two treatments have comparable efficacy but that one offers advantages over another in, for example, cost, convenience or tolerability.
In addition, even chronic pain treatments with well-established efficacy may sometimes fail to be superior to placebo in a given trial. When a standard treatment cannot be considered reliably superior to placebo, then an RCT demonstrating that a new treatment is equivalent or noninferior to the standard treatment may simply reflect that in this particular trial, neither the standard treatment nor the new treatment was efficacious. This lack of assay sensitivity in trials that do not have a placebo group is well recognized [24, 25]. For this reason, equivalence and noninferiority trials of chronic pain treatments would still require a placebo group to demonstrate that the standard treatment was superior to placebo. Only if the standard treatment is shown to be superior to placebo does it become possible to conclude that the new treatment is equivalent or noninferior to a standard efficacious treatment.
The critical importance of randomization is demonstrated by the observation that interventions shown to be effective in nonrandomized trials have subsequently been found to be ineffective in randomized trials. There are two primary goals of randomizing subjects. The first is to eliminate both intentional and unintentional bias in the allocation of treatments, which historically has been a significant source of bias in clinical trials. Investigator allocation bias is eliminated by prespecifying a randomization protocol that removes the investigators from the process of selecting which subjects receive which interventions.
The second goal of randomization is to create subject groups that are equivalent in every way except for the intervention. On average, randomization will disperse subject variability evenly between the treatment groups, including both measured variables, such as age and sex, and unmeasured variables, such as pain-relevant genetic polymorphisms that have not yet been identified. The likelihood that randomized groups are truly similar is dependent on the sample size. Smaller groups are more likely to differ in potentially important ways, both among measured and unmeasured variables, whereas between-subject variability is more likely to be dispersed evenly between groups with large sample sizes.
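The dependence of balance on sample size can be made concrete with a small simulation (a sketch under assumed numbers, e.g. a covariate carried by 30% of patients; the 30% figure is illustrative, not from the source): randomize patients into two groups and measure how far apart the groups' covariate proportions land on average.

```python
import random

def mean_imbalance(n_per_group, trials=2000, p=0.3, seed=0):
    """Simulate randomizing patients who carry a binary covariate
    (present in a fraction p of patients) into two groups of equal
    size, and return the average absolute difference between the
    groups in the proportion of carriers."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = sum(rng.random() < p for _ in range(n_per_group))
        b = sum(rng.random() < p for _ in range(n_per_group))
        total += abs(a - b) / n_per_group
    return total / trials

# The average imbalance shrinks roughly as 1/sqrt(n): small trials are
# far more likely to end up with unevenly distributed covariates
# (measured or unmeasured) than large ones.
```

Running this with, say, 10 versus 1000 subjects per group shows the imbalance falling by more than an order of magnitude, which is the sense in which large randomized groups become "essentially identical."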
With cross-over designs in which each subject receives more than one intervention, subjects are randomized to different treatment orders, not different treatments. For example, to provide a valid comparison between interventions A and B, an equal number of subjects should be randomized to receive intervention A first and to receive intervention B first. Randomizing by treatment order serves to spread treatment order-related variables evenly between the interventions; these may include differences related to the order of treatment, carry-over effects, or the natural history of the condition.
Two additional aspects of randomization are blocking and stratification. Blocking is a method for ensuring that small groups of subjects are randomized evenly. For example, if a block size of four is chosen in a study with two treatment groups, the first four subjects could be randomized in any potential combination that would produce an even number of subjects in the two groups (e.g. ABAB, BABA, or BBAA). After the first block is complete, the next four subjects would be assigned to interventions via a newly randomized sequence. Blocked randomization ensures that randomization does not result in substantially different numbers of subjects being allocated to the different interventions by chance. Stratification refers to dividing subjects into groups according to factors associated with treatment response prior to randomization. For example, if depression is thought to affect treatment response, subjects may be separated into those who are and are not depressed prior to randomization; this reduces the likelihood that a greater number of depressed subjects would be randomized to one intervention than the other due to chance, which might affect estimates of overall treatment response.
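The blocking and stratification procedures just described can be sketched in a few lines (a minimal illustration with block size four and a hypothetical depression stratum, not a production randomization system):

```python
import random

def blocked_sequence(n_blocks, block=("A", "A", "B", "B"), seed=42):
    """Build a blocked randomization list: within every block of four
    assignments, each treatment appears exactly twice, in a freshly
    shuffled order, so group sizes can never drift far apart."""
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        b = list(block)
        rng.shuffle(b)       # new random permutation for each block
        sequence.extend(b)
    return sequence

def stratified_lists(n_blocks_per_stratum,
                     strata=("depressed", "not depressed")):
    """Keep a separate blocked list for each stratum so that a
    prognostic factor (here, depression status) stays balanced
    across the two interventions."""
    return {s: blocked_sequence(n_blocks_per_stratum, seed=i)
            for i, s in enumerate(strata)}
```

At enrollment, each subject is assigned the next unused entry from the list belonging to his or her stratum; within every consecutive block of four assignments, two subjects receive A and two receive B.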
Publications reporting the results of clinical trials should include a description of the procedures used for randomization, but many do not. The method of randomization has been included in a scoring system for rating trial quality. In this scoring system, random number generation is considered an appropriate method for randomization, whereas randomization based on patient factors, such as date of birth, hospital number or date of exposure, is considered to potentially introduce bias.
A double-blind RCT is one in which the identity of the interventions is concealed from both the subjects and the investigators; typically, the placebo in studies of medications is inert but appears identical to the active medication in color, shape, size, taste, and even odor. This is the best way to reduce potential bias related to knowledge of the intervention. Unblinded or "open-label" studies typically overestimate treatment effects, and interventions that appear highly efficacious in unblinded studies have been shown to be ineffective in blinded studies. The importance of blinding in estimating the magnitude of treatment effects in RCTs should not be underestimated. The average response in the patients receiving placebo, for example, is often greater than the difference between the average response in the placebo and active treatment groups.
Even within double-blind trials, sometimes subjects and investigators can accurately guess which intervention they are receiving, for example, because of the development of characteristic side effects or the effectiveness of the treatment in reducing symptoms. Following completion of participation, subjects and investigators should be asked which intervention they believe was received (or, in the case of cross-over trials, what the treatment sequence was) and the basis for their guesses. In a clinical trial of an effective treatment, patients being able to tell which group they were in because of beneficial effects is evidence of treatment efficacy and not an indication of compromised blinding. It is only when patients are able to correctly guess their group based on factors that are unrelated to efficacy, such as side effects, that the adequacy of the blinding and the potential for bias must be considered.
In order to improve the blinding within trials, many chronic pain RCTs have employed "active placebos," which are nonanalgesic medications (rather than inert placebos) with side effects that mimic those of the analgesic medication being studied [19, 29, 30]. The use of active placebos in chronic pain RCTs can be an effective strategy for maintaining the double-blind feature of a clinical trial, particularly in cross-over trials where each subject receives multiple interventions and may therefore be more likely to correctly guess when they are receiving an inert placebo. The use of active placebos, however, remains somewhat controversial. It has recently been argued that "the available evidence does not provide a compelling case for the necessity of an active placebo" in studies of antidepressant medications in patients with depression. Given the difficulty of identifying active placebos for many of the medications used in the treatment of chronic pain, it would be important to determine whether active placebos are necessary in chronic pain trials.
As with randomization, the adequacy of the description of blinding procedures is considered a critical feature when evaluating the quality of published RCTs.
Parallel group trials are performed by randomizing each eligible subject to only one of two or more treatment groups (also termed treatment "arms"), and differences between groups in treatment outcomes are evaluated. Parallel group designs are considered by many to be the most informative type of clinical trial because they have the fewest limitations, provided that the sample size is large enough to provide an adequate test of the study's primary hypothesis.
In situations where the treatment effect has a relatively short and predictable duration and the condition being treated remains constant, a cross-over design can be used in which each subject receives each intervention. For example, in a cross-over trial comparing a new medication with placebo, subjects would be randomized to one of two treatment sequences, either medication first followed by placebo or vice versa. Subjects therefore receive either medication or placebo in the first treatment period, which is typically followed by a "washout" period during which subjects receive no treatment, and then subjects receive in the second treatment period whichever intervention they were not administered in the first period. In this manner, each subject serves as his or her own control. At the end of the trial, the responses of the patients when they were treated with the active medication can be compared to their responses during whichever period they received placebo.
The major advantage of cross-over trials is that they are extremely efficient in terms of sample size. Compared to a two-arm parallel group trial, a two-period cross-over design could require as few as one-quarter the number of subjects to show the same size treatment effect because variability is reduced when subjects serve as their own controls. An additional advantage of cross-over designs when two or more treatments are compared is the ability to evaluate treatment response and other outcomes within the same subjects. For example, are the subjects who have the best responses to one treatment also the ones who respond best to a different treatment?
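The sample-size advantage can be made concrete with a standard back-of-the-envelope calculation (a hedged sketch under a simple model, not taken from the source): if outcomes within a subject are correlated with coefficient rho, each within-subject difference in a cross-over trial has variance 2·sigma²·(1 − rho), versus 2·sigma²/n for the difference in group means of a parallel trial with n subjects per arm. Matching precision gives the cross-over a total sample of (1 − rho)/2 times the parallel trial's 2n subjects.

```python
def crossover_fraction(rho):
    """Approximate fraction of a two-arm parallel trial's total sample
    needed by a two-period cross-over for equal precision, assuming
    common outcome variance sigma^2 and within-subject correlation rho.

    Parallel:   var(mean difference) = 2*sigma^2 / n  (2n subjects total)
    Cross-over: each subject's within-subject difference has variance
                2*sigma^2*(1 - rho); m subjects give 2*sigma^2*(1 - rho)/m.
    Equal precision requires m = n*(1 - rho), i.e. a fraction
    (1 - rho)/2 of the parallel trial's 2n subjects.
    """
    return (1 - rho) / 2
```

With rho = 0.5, the fraction is 0.25, which is the "as few as one-quarter the number of subjects" figure cited above; a higher within-subject correlation makes the cross-over design more efficient still.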
One of the central assumptions of cross-over trials is that the outcomes in the two (or more) treatment periods are not affected by the order of treatment. This assumption can be violated in different ways. If the natural history of the disease being studied is such that change during the trial is likely, or if a treatment alters the natural course of the disease, then the outcomes during later treatment periods can be expected to differ from outcomes during earlier periods. Another important concern about cross-over trials is the potential for "carry-over effects," that is, the continued effects of an earlier treatment on the outcomes of later periods. The duration of washout periods between treatment periods is often selected not only so that the medication from the first treatment period will have been eliminated before the beginning of the next period but also so that its effects will have disappeared, because such effects can persist longer than the presence of a medication. Carry-over effects can result in different types of error. Overestimation of the pain relief provided by the second treatment can occur if analgesic effects from the first treatment persist and are added to the true effects of the second treatment. On the other hand, overestimation of the side effects of the second treatment can result if side effects from the first treatment persist and are added to the side effects of the second treatment.
Although the relative impact of each of these effects can be mitigated by the random assignment of treatment order (i.e. approximately equal numbers of subjects will get each of the treatments first in the sequence), the assessment of treatment effects and tolerability will be inaccurate in the presence of carry-over or period effects. There are statistical tests that can detect the presence of treatment-by-period interactions and carry-over effects but these tests will generally be underpowered to adequately exclude the presence of such effects.
Nevertheless, the results of cross-over trials have provided a great deal of information about the treatment of chronic pain. For many types of chronic pain, knowledge of natural history supports the assumption of minimal change in pain during the course of the trial. Cross-over trials examining a variety of different medications have found little evidence of carry-over or treatment-by-period effects [17-19, 30]. It is important to recognize, however, that the statistical analysis of cross-over trials is typically a "completer" analysis (i.e. analyzing the responses of subjects who completed the entire trial) rather than the intention-to-treat (ITT) analysis that is typically used in parallel group studies; this can make comparing the results of parallel group and cross-over trials challenging, as will be further discussed below.