Parametric vs Nonparametric

Parametric tests are so named because they use parameters, such as mean and standard deviation, in their computations. Student's t-test and the analysis of variance (ANOVA) are examples of this type of test. Parametric tests assume that the data sets are normally distributed and that the various groups have roughly equivalent variance between members of the group. They are prone to falsely indicating a significant difference between the groups if these assumptions are not met. In that case, nonparametric tests provide an alternative. Nonparametric tests rely on ranking of samples relative to each other, and make fewer numerical assumptions. They are less likely to commit type I error, but usually also have less power to reveal true significant differences.

Researchers need to evaluate data with descriptive statistics in order to decide whether the assumptions of the more powerful parametric tests are met or whether nonparametric tests would be the more appropriate choice. Some assumptions apply to both types of test and must also be considered. In that regard, both parametric and nonparametric tests assume that samples included in the analysis are independent of each other. How does one judge whether these assumptions are met, and what should one do about it if they are not?

Independence

To judge whether samples are independent, one needs to ask if within any one group of data included in the analysis, the value of one data point is somehow related to another. If the answer is yes, these samples are not independent. For example, if biomarker expression is measured in two separate tissue samples three separate times each and you enter n = 6 data items into the analysis, you have violated the independence criterion. The three replicates of each sample are related to one another (from the same tissue sample); there are actually only two independent measures, not six. Since n = 2 is not enough samples to analyze statistically, potential plans to redress this analysis error could include either omitting statistical analysis in favor of descriptive results only, or redesigning the experiment to include additional tissue samples.
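The replicate example can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow: the sample IDs, the readings, and the helper name `collapse_replicates` are all hypothetical, and averaging replicates is one common way (assumed here) to reduce each tissue sample to a single independent value.

```python
from statistics import mean

def collapse_replicates(measurements):
    """Average replicate readings so each tissue sample contributes one value.

    `measurements` maps a (hypothetical) sample ID to its replicate readings.
    """
    return {sample_id: mean(values) for sample_id, values in measurements.items()}

# Illustrative numbers only: two tissue samples, three replicates each.
raw = {
    "tissue_A": [1.0, 1.2, 1.1],
    "tissue_B": [2.0, 2.3, 2.0],
}

per_sample = collapse_replicates(raw)

# The number of independent measures is the number of tissue samples,
# not the number of readings entered.
n_independent = len(per_sample)
n_readings = sum(len(values) for values in raw.values())
```

Here `n_readings` is 6 but `n_independent` is 2, which is the n that a test of significance would actually have to work with.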

Often, experiments are designed to measure biomarker expression repeatedly in a single subject, such as in the case of pharmacodynamic measures at different time points following administration of a drug in clinical trial patients. In this case, the independence assumption is not violated, since the related measures will be analyzed as different groups rather than being included within one group. Repeated-measure statistical tests are then applicable, such as the paired t-test (for two measures) or repeated-measure ANOVA (for more than two measures).
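As a rough sketch of the two-measure case, the paired t-test is available in SciPy as `scipy.stats.ttest_rel`. The patient values below are invented for illustration, and the use of SciPy is an assumption of this example rather than something the text prescribes.

```python
from scipy import stats

# Illustrative (made-up) biomarker values for five patients,
# measured before dosing and at one time point after dosing.
before = [10.1, 9.8, 11.2, 10.5, 9.9]
after = [12.0, 11.1, 13.4, 12.2, 11.5]

# ttest_rel pairs values by position, so before[i] and after[i]
# must come from the same patient.
result = stats.ttest_rel(before, after)
t_stat, p_value = result.statistic, result.pvalue
```

The two lists are treated as two related groups, so the repeated measures on each patient are handled by the pairing rather than being pooled into one group.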

To reiterate, then, if samples are not independent of one another and they do not constitute repeated measures, the correct statistical approach is to redesign your experiment. Although more complex analysis strategies do exist for clustering correlated samples within groups, there is no manipulation when using simple tests of significance that will correct for samples in one data set being selected such that the choice of one sample depends on another sample being picked.

Normally Distributed Data

How do you judge if your data are normally distributed? The first step is to perform descriptive statistics and plot the data out as a histogram. In a normal distribution, about 68% of the data falls within 1 standard deviation (1 SD) of the mean, and about 95% within 2 SD of the mean. Many statistics programs will assess normality for you using statistical tests. The Kolmogorov-Smirnov test, for instance, has nothing to do with martinis, being instead one such test for the normal distribution of data. If you have around n = 25 or more in each group, you can safely apply the simple eyeball test to your data, accepting data as normal if points fall into a roughly bell-shaped curve when plotted as a histogram.
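The 68% and 95% figures suggest a quick descriptive check that can accompany the histogram. The sketch below uses only the Python standard library; the function name `sd_coverage` and the sample values are hypothetical, and the result is a rough screen rather than a formal test of normality.

```python
from statistics import mean, stdev

def sd_coverage(data):
    """Return the fractions of values within 1 SD and within 2 SD of the mean."""
    m, s = mean(data), stdev(data)
    within_1 = sum(abs(x - m) <= s for x in data) / len(data)
    within_2 = sum(abs(x - m) <= 2 * s for x in data) / len(data)
    return within_1, within_2

# Illustrative values only.
sample = [4.1, 4.8, 5.0, 5.2, 5.3, 5.5, 5.6, 5.9, 6.1, 6.4, 6.6, 7.3]
within_1, within_2 = sd_coverage(sample)
```

Fractions far from roughly 0.68 and 0.95 hint that the data may not be normal, although with small samples these fractions are themselves noisy.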

What if you are not sure whether your data are normally distributed? If you have small data sets, for instance, normality can be difficult to determine visually. Similarly, statistical tests do not have much power to make the determination in this case. At this point, one's own personal choice comes into play: If you are conservative, you might choose to play safe with a nonparametric test; if not, you might choose a parametric test. As with other life choices, there are risks associated with either: With parametric tests you risk error of type I, whereas with nonparametric ones you risk type II error. The best solution might be to collect reasonably large data sets so as to avoid this uncertainty.

Assuming you do have a large enough data set to be certain, and what you are certain of is that the data are not distributed normally, well, what then? One solution might be to transform data mathematically such that the transformed data sets become normally distributed, in which case the more powerful parametric tests can then be used. Some commonly used transformations and appropriate types of data for their use [35-38] are listed in Table 1. If data are still not normally distributed with equal variances following mathematical transformation, a nonparametric test should be chosen for analysis. Some appropriate nonparametric equivalents of parametric tests are shown in Table 2.

Equivalent Variances Between Groups

As with the question of whether data are normally distributed, equivalence of variance between groups needs to be evaluated prior to choosing appropriate tests of significance. Sometimes, homogeneity of variance from one group of data to another can be amply judged by plotting a frequency histogram of each group and seeing how spread out the data are, or calculating the variance for each group and qualitatively assessing if the numbers are very different or appear close. It is simple enough, however, to apply the appropriate statistical test, the F-test for two groups, or the Fmax-test for more than two.
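Both tests are built on a ratio of sample variances, which is easy to compute by hand. The sketch below returns only the test statistic; it assumes the common convention of placing the larger variance in the numerator, and it leaves the comparison against a critical value (from F or Fmax tables for the relevant degrees of freedom) to the reader. The function name `variance_ratio` is hypothetical.

```python
from statistics import variance

def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance.

    With two groups this is the statistic of the two-sample F-test;
    with more than two groups it is the Fmax statistic.
    Raises ZeroDivisionError if any group has zero variance.
    """
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

# Illustrative values only.
ratio_two_groups = variance_ratio([[1, 2, 3], [2, 4, 6]])
ratio_three_groups = variance_ratio([[1, 2, 3], [2, 4, 6], [3, 6, 9]])
```

A ratio close to 1 is consistent with equivalent variances; how large a ratio is tolerable depends on group sizes and the chosen significance level.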

TABLE 1 Common Data Transformations

Type of Data | Example | Transformation
Proportional data | | Arcsin of the square-root transformation
Poisson distributed (counts of random events) | Number of cells of a particular type in a given volume of blood | Square-root transformation
Variance proportional to mean squared (i.e., standard deviation proportional to mean) | Serum cholesterol levels in patients | Log transformation
Ratio data (normalized to internal standard on a per-assay basis) | Blots normalized to a housekeeping gene | Log transformation
Highly variable quantities, where variance is proportional to the mean to the fourth power (i.e., standard deviation proportional to mean squared) | Serum creatinine levels in patients | Reciprocal transformation
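The transformations named in Table 1 can be sketched with the Python standard library as follows. The dictionary keys and the helper `transform` are hypothetical names chosen for this example, and the natural logarithm is assumed for the log transformation (any base would serve the same purpose).

```python
import math

# Each entry takes one raw value and returns the transformed value.
TRANSFORMS = {
    "arcsine_sqrt": lambda p: math.asin(math.sqrt(p)),  # proportions in [0, 1]
    "sqrt": math.sqrt,                                  # counts (>= 0)
    "log": math.log,                                    # strictly positive values
    "reciprocal": lambda x: 1.0 / x,                    # nonzero values
}

def transform(values, kind):
    """Apply the named transformation to every value in `values`."""
    f = TRANSFORMS[kind]
    return [f(v) for v in values]

# Illustrative values only.
cell_counts = [0, 4, 9, 16]
sqrt_counts = transform(cell_counts, "sqrt")
```

After transforming, the normality and variance checks should be repeated on the transformed values before settling on a parametric test.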

TABLE 2 Nonparametric Equivalents of Parametric Tests

Parametric | Nonparametric
Student's t-test (unpaired) | Mann-Whitney test (Wilcoxon rank sum)
t-test (paired) | Wilcoxon signed-rank test
t-test (paired), in cases where data are not symmetrical around the median | Sign test
ANOVA (without repeated measures) | Kruskal-Wallis test
ANOVA (with repeated measures) | Friedman's test
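For readers working in Python, most of the nonparametric tests in Table 2 have counterparts in `scipy.stats`. The sketch below is illustrative only: the data are invented, the use of SciPy is an assumption of this example, and the sign test is written out by hand as a two-sided binomial calculation (the helper `sign_test_p` is hypothetical).

```python
from math import comb
from scipy import stats

# Illustrative (made-up) measurements.
group_a = [1.1, 2.3, 1.9, 3.0, 2.2, 2.8]
group_b = [3.9, 4.4, 3.1, 5.2, 4.8, 4.1]
group_c = [6.1, 5.7, 7.0, 6.6, 5.9, 6.8]

# Unpaired t-test -> Mann-Whitney test (Wilcoxon rank sum)
mw = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Paired t-test -> Wilcoxon signed-rank test (values paired by position)
wsr = stats.wilcoxon(group_a, group_b)

# ANOVA without repeated measures -> Kruskal-Wallis test
kw = stats.kruskal(group_a, group_b, group_c)

# ANOVA with repeated measures -> Friedman's test
# (each position is treated as one subject measured under three conditions)
fr = stats.friedmanchisquare(group_a, group_b, group_c)

def sign_test_p(x, y):
    """Two-sided sign test p-value for paired samples; zero differences are dropped."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    k = min(sum(d > 0 for d in diffs), sum(d < 0 for d in diffs))
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Paired t-test with asymmetric differences -> sign test
sign_p = sign_test_p(group_a, group_b)
```

The same lists are reused as paired and unpaired data purely to keep the example short; in practice the experimental design determines which row of Table 2 applies.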

If variances differ between groups and are not rendered equivalent by mathematical transformation (see Table 1), you may still be able to use parametric statistics. If, for example, the t-test is appropriate to your experimental design, and variances are unequal between groups but not extremely disparate, you may choose to use the t-test for unequal variances (in the case of extreme differences in variance, however, a nonparametric test is the recommended alternative).
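The t-test for unequal variances is commonly known as Welch's t-test. A minimal standard-library sketch of its statistic and approximate degrees of freedom follows; the function name `welch_t` and the data are hypothetical, and the p-value lookup against the t distribution is left out. In SciPy, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also returns a p-value.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for the unequal-variance (Welch) t-test."""
    # Squared standard error contributed by each group.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Illustrative values only: group_2 has four times the variance of group_1.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]
t_stat, df = welch_t(group_1, group_2)
```

Unlike the standard t-test, the variances are not pooled, which is why the degrees of freedom come out below the usual n1 + n2 - 2.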

If your intended parametric test is ANOVA, on the other hand, the assumption of equivalent variance between test groups is essential. If variances are not equivalent and transformation also does not render them equivalent, you must test statistically using a nonparametric equivalent (see Table 2).