## Example Experiments

### In Vitro Experiments

This section explores the behavior of an OU process when only a few experiments are run but the response can be measured at many equally spaced time points. Assay development is not discussed here but is covered in many papers and textbooks, such as that of Burtis and Ashwood [33]. All the experiments that follow assume that the assay has been "optimized" according to the existing standards. One point that needs to be stressed is the relationship between the true value of the biomarker, $\mu$, and the standard deviation of the measurement error, $\sigma_\varepsilon$, assuming that no assay bias is present. Many assume that the coefficient of variation, $\mathrm{CV} = \sigma_\varepsilon/\mu$, is constant. If this relationship really holds, the statistical analyst needs to know that, and approximately what the constant is. Prior to using a biomarker, it is important to define the mathematical relationship between $\sigma_\varepsilon$ and $\mu$ so that the most information can be gained from the modeling. It should also be noted that since $\mu$ is a function of time, the measurement variance will also be a function of time. Most statistical procedures, including those in these examples, assume that the measurement variance is constant. This assumption will tend to hide signals.
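The constant-CV relationship can be written out to show why the measurement variance changes with time. This is only a sketch of that reasoning: $\mu(t)$ denotes the true biomarker value at time $t$, $\sigma_\varepsilon(t)$ the measurement-error standard deviation at that time, and $c$ the assumed constant CV (the symbol $c$ is introduced here for illustration).

$$
\frac{\sigma_\varepsilon(t)}{|\mu(t)|} = c
\quad\Longrightarrow\quad
\sigma_\varepsilon^2(t) = c^2\,\mu(t)^2 .
$$

Under this assumption, any change in $\mu(t)$ changes the error variance by the square of the same proportion, so a procedure that assumes a constant variance is misspecified whenever the signal itself moves.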

In the laboratory, it is relatively easy to obtain measurements at frequent, equally spaced time intervals. If a good biological model is available, this is the place to define the best (maximum Fisher information) mathematical model of the response. If such a model is not known, it should be explored first. This section looks at two simulation experiments comparing the size of $\alpha$ (3 vs. 0.03) with 100% response. Each has n = 5 experimental units per group, and each unit has m = 30 follow-up measurements. These could represent chemical reaction experiments, cell measurements, well measurements, or any similar in vitro biological model for a biomarker. The raw data are shown in Figures 5 and 6, and the statistics are shown in Tables 7a and 7b.
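To make the two settings concrete, the following is a minimal simulation sketch, not the procedure used to generate Figures 5 and 6. It assumes that deviations from a mean curve follow a stationary OU process, that measurements are one time unit apart, and that the stationary standard deviation is 1; the function names, the `mean_fn` argument, `dt`, and `sigma` are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_ou_paths(alpha, n_units=5, n_times=30, dt=1.0, sigma=1.0,
                      mean_fn=lambda t: 0.0, seed=0):
    """Simulate n_units paths: a baseline plus n_times follow-up measurements.

    Deviations from mean_fn(t) follow a stationary OU process with
    mean-reversion rate alpha and stationary standard deviation sigma
    (both the mean curve and sigma are illustrative assumptions).
    """
    rng = random.Random(seed)
    rho = math.exp(-alpha * dt)                    # lag-one correlation
    innov_sd = sigma * math.sqrt(1.0 - rho * rho)  # keeps the variance stationary
    times = [k * dt for k in range(n_times + 1)]   # index 0 is the baseline
    paths = []
    for _ in range(n_units):
        dev = rng.gauss(0.0, sigma)                # baseline from the stationary law
        path = [mean_fn(times[0]) + dev]
        for t in times[1:]:
            dev = rho * dev + rng.gauss(0.0, innov_sd)
            path.append(mean_fn(t) + dev)
        paths.append(path)
    return times, paths


def lag_one_correlation(times, paths, mean_fn=lambda t: 0.0):
    """Pooled sample correlation between successive deviations from the mean."""
    xs, ys = [], []
    for path in paths:
        resid = [x - mean_fn(t) for t, x in zip(times, path)]
        xs.extend(resid[:-1])
        ys.extend(resid[1:])
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Two groups sized as in the text: n = 5 units, m = 30 follow-ups each.
times, weak = simulate_ou_paths(alpha=3.0)
_, strong = simulate_ou_paths(alpha=0.03)
```

Calling `lag_one_correlation(times, weak)` and `lag_one_correlation(times, strong)` gives a rough empirical check of how differently the two values of $\alpha$ behave. With only five units the estimates are noisy, which is in keeping with the imprecise estimates of $\alpha$ discussed in this section.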

When $\alpha$ is large, the measurements are effectively independent and there should be no relationship to the previous measurements, including the baseline. However, if the baselines do not come from the same distribution, the comparability of the results is called into question. In Table 7a, only models 1, 5, and 7 gave reasonable estimates of $\alpha$. In the strong autocorrelation case, Table 7b shows that models 2, 5, 7, 10, 12, 15, 17, and 20 gave estimates of $\alpha$ that, although not very precise, were of the correct order of magnitude.
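The sense in which a large mean-reversion rate makes measurements "effectively independent" follows from the standard autocorrelation of a stationary OU process. The spacing between measurements is not stated here, so the numbers below assume a unit spacing purely for illustration.

$$
\operatorname{Corr}(X_s, X_t) = e^{-\alpha\,|t-s|},
\qquad
e^{-3 \cdot 1} \approx 0.05,
\qquad
e^{-0.03 \cdot 1} \approx 0.97 .
$$

Under that assumed spacing, successive measurements are nearly uncorrelated at a rate of 3 and almost perfectly correlated at a rate of 0.03.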

In Table 7a, every model detected the signal, the polynomial effect, and the intended quadratic effect. However, many p-values show significance where there should not be any. This can lead to spurious conclusions about the nature of the biomarker. In the experiment shown in Table 7b, several models gave reasonable indications of the magnitude of the autocorrelation but did not always find the underlying signal. In the first case, all the biases show overestimates of the time effect, while in the second case, the biases are in the opposite direction and have a larger magnitude, making the MSE larger (that is, less information). The larger, negative bias is exacerbated by the fact that the time model parameter estimates need to be divided by $1 - e^{-\alpha(t-s)}$, which is always less than 1. This correction was not applied because the estimates of $\alpha$ are not good enough to make it reasonable to use. Additionally, the analyst would not ordinarily be estimating $\alpha$, which makes these results more representative of current practice. When $t - s$ is not constant, software for the estimation procedure is not readily available.
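The size of the divisor $1 - e^{-\alpha(t-s)}$ differs sharply between the two experiments, which helps explain why the uncorrected estimates are much more attenuated when autocorrelation is strong. The sketch below evaluates it for both values of $\alpha$; the unit spacing `dt = 1.0` and the function name `attenuation_factor` are assumptions for illustration, since the actual spacing is not given here.

```python
import math


def attenuation_factor(alpha, dt):
    """Return 1 - exp(-alpha * dt), the divisor applied to the time-model estimates."""
    # expm1 keeps precision when alpha * dt is small.
    return -math.expm1(-alpha * dt)


for alpha in (3.0, 0.03):
    factor = attenuation_factor(alpha, dt=1.0)  # dt = 1.0 is an assumed spacing
    print(f"alpha = {alpha}: divisor = {factor:.4f}, implied multiplier = {1.0 / factor:.2f}")
```

With this assumed spacing the divisor is about 0.95 for $\alpha = 3$ and about 0.03 for $\alpha = 0.03$, so in the second case a small error in the estimate of $\alpha$ would translate into a very large error in the corrected coefficients.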

### In Vivo Experiments

This section is meant to illustrate experiments of the size used in animal studies but may also apply to early clinical development. Here there are only m = 4 follow-up measurements, but the number per group (n = 36) was scaled up so that approximately the same number of observations is available (36 × 4 = 144 per group, versus 5 × 30 = 150 in the in vitro experiments). This results in comparable degrees of freedom for the model errors.

In Tables 8a and 8b, the results are analogous to those in Tables 7a and 7b. However, in Table 8a the estimates of $\alpha$ are generally all bad, leading to the conclusion that autocorrelation is present when it really is not. The quadratic

| Model | $\hat{\alpha}$ | Error df | $\sigma$ | All | Poly | Linear | Quad | Cubic | Quartic | Var | Bias |