Rating Instruments

Because in psychiatry we do not yet have reliable biological measures (e.g., a serum lipid panel) to assess efficacy, we rely on objective rating instruments applied by an observer or on self-report measures. Before being used in a study of drug efficacy, the instruments need to be assessed for both validity and clinical relevance, and reliability and consistency must be established in the population to be sampled and in the hands of the RCT research staff. This is particularly true in multisite RCTs, where special efforts are necessary to ensure that the primary outcome measure is administered and scored the same way across sites. Lack of reliability, lack of consistency between raters, or lack of consistency across sites in a multisite RCT is likely to lead to a failed RCT, one unable to establish statistically significant differences between T and C.
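To make the cost of unreliability concrete, the following is a minimal sketch in Python (not from the text; all numbers are illustrative) of the classical attenuation relation, under which measurement error shrinks the observable standardized effect to d_obs = d_true × √reliability, and statistical power falls with it:

```python
# Sketch: how unreliable ratings shrink the observable T-vs-C effect
# and the power of a two-arm RCT. All numbers are illustrative only.
from statsmodels.stats.power import TTestIndPower

d_true = 0.50      # hypothetical true standardized difference (T vs. C)
n_per_arm = 85     # hypothetical sample size per arm
alpha = 0.05

power_calc = TTestIndPower()
for reliability in (1.00, 0.80, 0.60, 0.40):
    # Classical attenuation: error variance inflates the outcome SD,
    # so the observed effect size is d_true * sqrt(reliability).
    d_obs = d_true * reliability ** 0.5
    power = power_calc.power(effect_size=d_obs, nobs1=n_per_arm,
                             alpha=alpha, ratio=1.0)
    print(f"reliability={reliability:.2f}  d_obs={d_obs:.2f}  power={power:.2f}")
```

Even moderate unreliability can drop an adequately powered study below conventional power thresholds, which is why rater reliability is established before the trial begins.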

An RCT should have only one primary outcome, or at most a very few, and all decisions in study design, measurement, and analysis are directed toward generating valid and powerful tests for that primary outcome. Other outcomes can be assessed in the study, as long as they do not compromise assessment of the primary outcome and do not impose undue burden on patients and research staff that would lead to dropouts and diminished reliability of measures. Outcome assessments that amplify or elucidate the primary outcome results are generally listed as secondary outcomes. Finally, it is necessary at baseline to collect enough information to characterize the sample well, both sociodemographically and clinically, and to check the adequacy of randomization in producing two groups, assigned to T and C, that are comparable at baseline.
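One common way to check baseline comparability (an approach assumed here, not specified in the text) is to compute a standardized mean difference for each baseline characteristic; a minimal sketch with hypothetical data and column names:

```python
# Sketch: standardized mean differences (SMDs) for baseline covariates
# between the T and C arms. File and column names are hypothetical.
import numpy as np
import pandas as pd

def smd(x_t: pd.Series, x_c: pd.Series) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    pooled_sd = np.sqrt((x_t.var(ddof=1) + x_c.var(ddof=1)) / 2)
    return (x_t.mean() - x_c.mean()) / pooled_sd

df = pd.read_csv("baseline.csv")      # hypothetical file: one row per subject
treated = df[df["arm"] == "T"]
control = df[df["arm"] == "C"]

for covariate in ("age", "baseline_hamd", "illness_duration_yrs"):
    print(covariate, round(smd(treated[covariate], control[covariate]), 3))
# |SMD| values near 0 (conventionally < 0.1) suggest adequate balance.
```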

Such baseline data also are valuable in post hoc exploratory analyses to assess the possibility of moderators of treatment response (identification of subpopulations that have different effect sizes) (Kraemer 2008; Kraemer et al. 2006, 2008).
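In the Kraemer framework, a baseline variable moderates treatment if the T-versus-C effect differs across its levels; in the simplest linear case this is a treatment-by-baseline interaction. A minimal post hoc sketch (hypothetical data and variable names, exploratory only):

```python
# Sketch: post hoc moderator screen as a treatment-by-baseline interaction.
# Column names and data are hypothetical; results generate hypotheses only.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("trial_outcomes.csv")   # one row per subject
# 'improvement' = change on the primary outcome; 'arm' coded T/C;
# 'baseline_severity' measured before randomization.
model = smf.ols("improvement ~ C(arm) * baseline_severity", data=df).fit()
print(model.summary())
# A non-negligible arm-by-severity coefficient would suggest that baseline
# severity moderates the treatment effect -- a hypothesis for a future,
# adequately powered RCT, not a conclusion.
```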

During treatment, there may be repeated assessments of the primary and secondary outcomes, which can then be analyzed with methods such as hierarchical (mixed-effects) modeling that generally deal much more effectively with dropouts and missing data; such methods usually increase the power to detect T-versus-C effects without requiring an increase in sample size. Events or changes that occur during treatment may also be used in post hoc analyses to identify mediators (possible mechanisms) that indicate how and why T works better than C (or does not), which can then lead to improved treatments for evaluation in future studies.
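As one illustration of such a hierarchical analysis, here is a minimal sketch of a linear mixed model fit with statsmodels, using hypothetical long-format data (one row per subject per visit); subjects who drop out still contribute whatever visits they completed:

```python
# Sketch: linear mixed model for repeated outcome assessments.
# Long-format data and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("visits_long.csv")  # columns: subject, week, arm, score
# Random intercept per subject; the week-by-arm interaction estimates
# whether symptom trajectories differ between T and C.
model = smf.mixedlm("score ~ week * C(arm)", data=df,
                    groups=df["subject"]).fit()
print(model.summary())
```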

Such descriptive statistics and the results of post hoc analyses lead not to conclusions but to hypotheses to be tested in future RCTs; even so, they are essential to the advancement of treatments.

In seeking approval for an indication, observer-rated instruments are generally preferred over self-reports, although some studies employ self-report measures to test secondary hypotheses. Over time, a commonly used measure may be supplanted by another believed to reflect more accurately the specific symptoms under study. Some of these scales differ considerably from one another. For example, the Montgomery-Åsberg Depression Rating Scale (MADRS) has been used in antidepressant trials as an alternative to the Hamilton Rating Scale for Depression (Ham-D) because the former focuses less on anxiety symptoms and more on core depressive symptoms. Other scales are expansions of existing ones, adding items to focus on specific symptoms in a particular disorder; an example is the Positive and Negative Syndrome Scale (PANSS) for schizophrenia, an expansion of the classic Brief Psychiatric Rating Scale (BPRS). Detailed descriptions of rating scales can be found in a reference textbook edited by Rush et al. (2008).

Application of a validated and reliable instrument still requires that raters be trained and that a high degree of agreement (interrater reliability) be demonstrated among raters, both cross-sectionally and over time. This generally requires formal training of raters, followed by testing multiple raters on patient vignettes to establish interrater reliability. In many cases, patient interviews are videotaped and randomly selected videotapes submitted to independent experts to assess the reliability and consistency of ratings. Again, this is particularly important in multisite RCTs.
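For ordinal ratings, agreement beyond chance is often quantified with a weighted Cohen's kappa (an intraclass correlation is the usual alternative for continuous scale totals); a minimal sketch with hypothetical ratings:

```python
# Sketch: quadratically weighted Cohen's kappa for two raters scoring
# the same patients on an ordinal item. Ratings below are hypothetical.
from sklearn.metrics import cohen_kappa_score

rater_a = [2, 3, 1, 4, 2, 3, 3, 1, 0, 2]
rater_b = [2, 3, 2, 4, 1, 3, 3, 1, 0, 3]

kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
print(f"weighted kappa = {kappa:.2f}")
# Conventions vary, but values above roughly 0.75 are often treated as
# adequate agreement for RCT raters; lower values call for retraining.
```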
