## Information From Data

### Parameter Estimation and the Mean Squared Error

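The mean squared error (MSE) is just what it says: the average of the squared difference between the estimated parameter and the true parameter. The square root of the MSE is often referred to by engineers as the root mean square (RMS). In general,

$$\mathrm{MSE}(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right] = \mathrm{Var}(\hat{\theta}) + \left[\mathrm{Bias}(\hat{\theta})\right]^2$$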

The accent mark over the parameter means that it is estimated from the data; a parameter without the accent mark is a theoretical, or unknown, quantity.
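The MSE of any estimator splits into its variance plus its squared bias. The following is a minimal simulation sketch of that split, not a prescribed method: the true variance, sample size, number of simulations, and the choice of a deliberately biased variance estimator (divisor n) are all assumptions made for illustration.

```python
import math
import random

random.seed(0)
TRUE_VAR = 4.0            # assumed true parameter: the population variance
N, N_SIMS = 10, 20_000    # assumed sample size and number of simulated studies

def biased_variance(sample):
    """Sample variance with divisor n (a deliberately biased estimator)."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

# One estimate per simulated study of N Gaussian observations.
estimates = [
    biased_variance([random.gauss(0.0, math.sqrt(TRUE_VAR)) for _ in range(N)])
    for _ in range(N_SIMS)
]

mean_estimate = sum(estimates) / N_SIMS
mse = sum((e - TRUE_VAR) ** 2 for e in estimates) / N_SIMS
bias = mean_estimate - TRUE_VAR
variance = sum((e - mean_estimate) ** 2 for e in estimates) / N_SIMS

print(f"MSE               = {mse:.4f}")
print(f"variance + bias^2 = {variance + bias ** 2:.4f}")
print(f"RMS error         = {math.sqrt(mse):.4f}")
```

The first two printed values agree up to floating-point rounding, because the decomposition is an algebraic identity rather than an approximation.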

The MSE shows up in statistical estimation theory in the Cramér-Rao inequality [19-21]:

$$\mathrm{MSE}(\hat{\theta}) \, I(\theta) \ge 1$$

The second term, $I(\theta)$, in this inequality is called the Fisher information. The statistician tries to construct an estimator of the unknown parameter $\theta$ to make this product as close to 1 as possible. An estimator that makes the product equal to 1 is an unbiased, minimum-variance (UMV) estimator and derives the maximum Fisher information from the data. Unfortunately, these estimators do not always exist for a given situation. The estimation strategy is usually to minimize the MSE when the UMV estimator is not available. There may be situations where a biased estimator will produce an MSE smaller than the unbiased approach.
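A familiar case where a biased estimator beats the unbiased one on MSE is the variance of Gaussian data. The sketch below compares three divisors for the sum of squares (n - 1, n, and n + 1); the true variance, sample size, and simulation count are assumptions. It also prints the product of the simulated MSE and the Fisher information about the variance, which for n Gaussian observations is n / (2σ⁴). The bound on that product applies to unbiased estimators, so a biased estimator can fall below 1.

```python
import random

random.seed(1)
SIGMA2 = 1.0              # assumed true variance
N, N_SIMS = 10, 20_000    # assumed sample size and number of simulated studies

def sum_of_squares(sample):
    """Sum of squared deviations from the sample mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)

ss = [
    sum_of_squares([random.gauss(0.0, SIGMA2 ** 0.5) for _ in range(N)])
    for _ in range(N_SIMS)
]

# Fisher information about the variance in N Gaussian observations.
fisher_info = N / (2.0 * SIGMA2 ** 2)

mse = {}
for divisor in (N - 1, N, N + 1):
    estimates = [s / divisor for s in ss]
    bias = sum(estimates) / N_SIMS - SIGMA2
    mse[divisor] = sum((e - SIGMA2) ** 2 for e in estimates) / N_SIMS
    print(f"divisor {divisor:2d}: bias = {bias:+.4f}, "
          f"MSE = {mse[divisor]:.4f}, "
          f"MSE x I = {mse[divisor] * fisher_info:.3f}")
```

Only the divisor n - 1 gives an unbiased estimate, yet the two biased divisors produce smaller simulated MSEs under these assumptions.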

The maximum likelihood (ML) estimator is probably the most common estimation procedure used today, although it requires that a model for the probability be written explicitly. It has the nice property that a function of an ML estimate is the ML estimate of the function, allowing the ML parameter estimates to be plugged directly into the function. This property may not be true for other estimation procedures. Since the mathematics is easier when the sample size (n) is large (a euphemism for asymptotic, i.e., when n is "near" infinity), ML estimates are favored because, for large samples, they are approximately Gaussian and UMV even when the underlying distribution of the data is not Gaussian. However, an extremely large n may be needed to get these properties, a fact often overlooked in statistical applications. This is particularly true when estimating some parameter other than the mean, such as the standard deviation or the coefficient of variation (CV).
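The plug-in property and its small-sample cost can be sketched with the CV of Gaussian data. The ML estimates of the mean and variance are the sample mean and the divisor-n variance, so the ML estimate of the CV is the square root of the latter divided by the former. The true mean and SD, the sample sizes, and the simulation count below are assumptions chosen for illustration.

```python
import math
import random

random.seed(2)
MU, SIGMA = 10.0, 2.0     # assumed true values, so the true CV is 0.2

def ml_estimates(sample):
    """Gaussian ML estimates of the mean and variance (divisor n)."""
    n = len(sample)
    mu_hat = sum(sample) / n
    var_hat = sum((x - mu_hat) ** 2 for x in sample) / n
    return mu_hat, var_hat

def ml_cv(sample):
    """Plug-in ML estimate of the CV, using the invariance property."""
    mu_hat, var_hat = ml_estimates(sample)
    return math.sqrt(var_hat) / mu_hat

def average_cv(n, n_sims=5_000):
    """Average plug-in CV over repeated samples of size n."""
    total = 0.0
    for _ in range(n_sims):
        total += ml_cv([random.gauss(MU, SIGMA) for _ in range(n)])
    return total / n_sims

results = {n: average_cv(n) for n in (5, 20, 100)}
for n, value in results.items():
    print(f"n = {n:3d}: mean estimated CV = {value:.4f} (true {SIGMA / MU:.4f})")
```

In this setup the average estimate sits below the true CV for small n and moves toward it as n grows, which illustrates why asymptotic properties should not be assumed for small samples.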

For the analysis of experiments, ordinary least squares (OLS) estimation is used most often. Analysis of variance (ANOVA) and ordinary regression are special cases of OLS estimation. OLS estimation is unbiased if the correct model is chosen. If the residuals (difference between the observation and the model) are Gaussian, independent, and have the same variance, then OLS is equivalent to ML and UMV. When the variance is not constant, such as in experiments where the analytical variation (measurement error) is related to the size of the measurement, various methods, such as a logarithm transformation, need to be used to stabilize the variance; otherwise, OLS does not have these good properties, and signals can be missed or falsely detected.
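The effect of a logarithm transformation on non-constant variance can be sketched with two simulated groups whose measurement error is proportional to the size of the measurement. The group means, the error size of roughly 20%, and the group sizes are hypothetical values chosen for illustration.

```python
import math
import random
import statistics

random.seed(3)
ERROR_SD = 0.2    # assumed: measurement error is about 20% of the true value

def measure(true_value, n):
    """Simulate n measurements with multiplicative (log-normal) error."""
    return [true_value * math.exp(random.gauss(0.0, ERROR_SD)) for _ in range(n)]

low = measure(10.0, 500)      # hypothetical low-concentration group
high = measure(100.0, 500)    # hypothetical high-concentration group

# Ratio of group standard deviations on the raw and the log scale.
raw_ratio = statistics.stdev(high) / statistics.stdev(low)
log_ratio = (statistics.stdev([math.log(v) for v in high])
             / statistics.stdev([math.log(v) for v in low]))

print(f"SD ratio (high/low), raw scale: {raw_ratio:.2f}")
print(f"SD ratio (high/low), log scale: {log_ratio:.2f}")
```

On the raw scale the high group is far more variable than the low group, which violates the equal-variance condition; on the log scale the two standard deviations are nearly equal.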

### Fisher Information

Fisher information (I) provides a limit on how precisely a parameter can be known from a single measurement. This is a form of the uncertainty principle seen most often in physics. In other words, what statistic gives you the best characterization of your biomarker? When measurements are independent, such as those measured on different experimental units, the information is additive, making the total information equal to nI. If you can afford an unlimited number of independent measurements, which is infinite information, the parameter can be known exactly, but this is never the case. For dynamic measurements (i.e., repeated measurements over time on the same experimental unit), although the measurement errors are generally independent, the measurements are not. It is usually cheaper to measure the same subject at multiple times than to make the same number of measurements, one per subject. In addition, information about the time effects within a person cannot be obtained in the latter case. This is key information that is not available when the measurements are not repeated in time. As illustrated in the examples below, this additional information can have major effects on the characteristics of the signal and the ability to detect it.
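Additivity of information for independent measurements can be checked in the simplest case: estimating a Gaussian mean when the SD is known. Each measurement then carries information 1/σ² about the mean, n independent measurements carry n/σ², and the sample mean has variance σ²/n, the reciprocal of the total information. The true mean, SD, and simulation settings below are assumptions for illustration.

```python
import random

random.seed(4)
MU, SIGMA = 5.0, 2.0                        # assumed true mean and known SD
INFO_PER_MEASUREMENT = 1.0 / SIGMA ** 2     # Fisher information about MU

def variance_of_mean(n, n_sims=10_000):
    """Simulated variance of the mean of n independent measurements."""
    means = [sum(random.gauss(MU, SIGMA) for _ in range(n)) / n
             for _ in range(n_sims)]
    grand = sum(means) / n_sims
    return sum((m - grand) ** 2 for m in means) / (n_sims - 1)

results = {}
for n in (1, 4, 16):
    results[n] = variance_of_mean(n)
    total_info = n * INFO_PER_MEASUREMENT
    print(f"n = {n:2d}: total information = {total_info:.2f}, "
          f"1 / information = {1.0 / total_info:.3f}, "
          f"simulated variance = {results[n]:.3f}")
```

This sketch covers only independent measurements on different units; repeated measurements on the same unit are correlated, so their information does not simply add in this way.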

The Fisher information is generally a theoretical quantity for the lower bound of the MSE that involves a square matrix of the expectations of pairwise second-order partial derivatives of the log-likelihood. For those interested, this theory can be found in many books [19-21] and is not covered here. However, some of the results given below for the OU model are used to illustrate the information gain. For this purpose, an ANOVA model will be used for the means at equally spaced time points. In such an experimental design, time changes can be detected using specific orthogonal (independent) contrasts. These can be written as follows:
