From External Validation to Clinical Trials: The Importance of Cohort, Technical, and Computational Factors

As biomarker candidates continue to pour out of research laboratories, it has become increasingly evident that validation is much more difficult and complex than discovery. There are a multitude of general and specific considerations and obstacles to address in order to validate biomarker candidates clinically, some of which were discussed earlier and some of which we discuss now.

Translation of a candidate biomarker from the discovery phase into the validation phase of clinical trials imposes huge organizational, monetary, and technical hurdles that need to be considered at the outset. The team needs to meet requirements in sample number and feature distribution (cohort factors), in sample quality and in the accuracy and precision of sample handling and processing (technical factors), and in the analysis (computational factors). As we will see, each of these steps is extremely important and very challenging at the same time.

Cohort Factors

During any phase of external or clinical validation of biomarkers, bias can be introduced unintentionally. This is a major concern in the design, conduct, and interpretation of any biomarker study [35,36]. Starting with the population selection process, variations at the biological level can lead to discernible differences in body fluid and tissue compositions and biomarker measurements [35]. As such, basic criteria such as gender, age, hormonal status, diet, race, disease history, and severity of the underlying condition are all potential sources of variability [35]. Moreover, the patient cohort in the validation phase of a candidate biomarker must be representative of all patients for whom the biomarker is developed. Reducing bias requires the inclusion of hundreds of patients per treatment or disease arm. Not every patient is willing to give biosamples for a biomarker study, however, and only 30 to 50% of the patients in a clinical trial may have signed the informed consent for the biomarker analysis, presenting an important concern for statisticians. There is a risk of population bias, since only a specific subset of patients may donate biosamples, skewing the feature distribution. Another risk factor that needs to be dealt with in some instances is the lack of motivation of clinical centers to continue a study and collect samples. Although it is sometimes possible to compensate for a smaller sample size through the use of different statistical methods, such as the use of local pooled error for microarray analysis, the analysis team also needs to ensure that center selection does not introduce patient selection bias: affluent centers and their patients may have different characteristics from those with other social backgrounds.
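One way to make the population-bias concern concrete is to compare consenting and non-consenting patients on a baseline feature. The following is a minimal sketch, not a prescribed method: the record layout, the `consented` key, and the cohort counts are all made up for illustration, and the standardized difference between two proportions is only one of several possible balance measures.

```python
from math import sqrt


def proportion(records, feature, value):
    """Fraction of records whose `feature` equals `value`."""
    if not records:
        raise ValueError("cannot compute a proportion for an empty group")
    return sum(1 for r in records if r[feature] == value) / len(records)


def standardized_difference(p1, p2):
    """Standardized difference between two proportions."""
    pooled_variance = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    if pooled_variance == 0:
        return 0.0
    return (p1 - p2) / sqrt(pooled_variance)


def consent_imbalance(cohort, feature, value, consent_key="consented"):
    """Compare consenters with non-consenters on one categorical feature."""
    consenters = [r for r in cohort if r[consent_key]]
    decliners = [r for r in cohort if not r[consent_key]]
    return standardized_difference(
        proportion(consenters, feature, value),
        proportion(decliners, feature, value),
    )


# Hypothetical trial of 100 patients, 40 of whom consented to biosampling.
cohort = (
    [{"sex": "F", "consented": True}] * 30
    + [{"sex": "M", "consented": True}] * 10
    + [{"sex": "F", "consented": False}] * 30
    + [{"sex": "M", "consented": False}] * 30
)

d = consent_imbalance(cohort, "sex", "F")
print(f"standardized difference for sex=F: {d:.3f}")
```

In this invented cohort, women make up 75% of consenters but only 50% of non-consenters, so the standardized difference is large. An absolute value above roughly 0.1 is a commonly used rule of thumb for flagging imbalance, although the threshold itself is a judgment call for the study statisticians.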

The lack of available samples or patient enrolment may ultimately translate to a decrease in biomarker confidence or the generalizability of the intended biomarker. In reality, collaboration will probably make biomarker validation more robust and economically feasible than working independently [37]. Since the issue of intellectual property (IP) agreement is minimal for the biomarker validation process, as it is not patentable, open interactions among steering committees of large trials or cohort studies should be encouraged [37,38].

Technical Factors

Sample Collection, Preparation, and Processing

Even with the establishment of a relatively homogeneous cohort for external validation or clinical trials, results from the intended biomarker assays are valid only if sample integrity is maintained and reproducible from sample collection through analysis [22]. Major components of assay and technical validation are:

• Reference materials

• Quality controls

• Quality assurance

• Sensitivity

• Specificity
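For the last two items, a short sketch may help fix the definitions. It assumes the diagnostic sense of the terms (agreement between a binary test call and a reference diagnosis); in assay validation, sensitivity and specificity can also refer to analytical properties such as the limit of detection and cross-reactivity, which this example does not cover. The labels below are invented.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for paired binary labels.

    `truth` holds the reference diagnosis and `predicted` the test call,
    with 1 (or True) meaning disease present / test positive.
    """
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy case")
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: 4 diseased and 6 healthy patients.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```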

Prior to the start of a validation phase, the team needs to decide on the sampling procedure, sample quality parameters, and sample processing. Reproducibility is the key to successful biomarker validation. It is important that standardized operating protocols (SOPs) for sample collection, processing, and storage be established to provide guidance for the centers and the laboratory teams. Nurses and technicians should be trained to minimize the variability in sample collection and handling.

Most biomarkers are endogenous macromolecules which can be measured in human biological fluids or tissues [39]. The collection process for these specimens seems straightforward. However, depending on the type of biomarker (genomic, metabolomic, or proteomic) and the collection methods, various factors may need to be taken into account when designing the SOPs [22]. Several examples are given in Table 1.

Processing the samples in a core facility reduces the risk of handling bias, since the same personnel would handle all samples in the most reproducible way possible. Once the samples are collected, systematic monitoring of their quality over time should also be established. Random tests can be conducted to ensure the short-term, benchtop, or long-term stability of the samples to uphold the integrity of the biolibrary [38].

TABLE 1 Sample Considerations

Collection (Biological Fluids or Tissues)    Preparation and Processing
Type of needle                               Plasma or serum isolation
Type of collection tube or fixation          Temperature of processing
Location of collection                       Reagents used
Time of collection                           Type of storage containers
Status of patient                            Temperature of storage
                                             Duration of storage

Unfortunately, during external and clinical validations, it is likely that independent collaborators will use a completely different set of SOPs. Even though this could ultimately contribute to the robustness of the validation process, it may be useful to establish specific guidelines that would help minimize site-to-site variability. There are several ways to achieve this. Providing the laboratories and centers with kits that contain all chemicals needed for processing the samples can reduce the risk of batch effects. Similarly, as pointed out in Figure 5, it may be feasible to collect a separate set of samples internally (to minimize sample collection variability) but send them to a collaborator's site for external validation using independent processing and analytical SOPs.

It is of utmost importance that any deviation from the SOPs be noted. Statisticians can later account for these deviations to improve their models. Any available information on bias, deviation from protocols, or batch processing is useful in the computational process. Omitting this information may influence the decision as to whether a biomarker was or was not validated.
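As an illustration of how recorded deviations might reach the statisticians, the sketch below stores deviation codes with each sample and expands them into 0/1 covariate columns. The `SampleRecord` structure, its field names, and the deviation codes are hypothetical and would need to match whatever the study's own data management system records.

```python
from dataclasses import dataclass, field


@dataclass
class SampleRecord:
    """Hypothetical per-sample record kept by the processing laboratory."""
    sample_id: str
    site: str
    batch: str
    deviations: set = field(default_factory=set)  # SOP deviation codes


def deviation_covariates(records):
    """Build one row per sample with a 0/1 column per deviation code seen."""
    codes = sorted({code for r in records for code in r.deviations})
    rows = []
    for r in records:
        row = {"sample_id": r.sample_id, "site": r.site, "batch": r.batch}
        for code in codes:
            row[f"dev_{code}"] = int(code in r.deviations)
        rows.append(row)
    return rows


# Invented records: two sites, two batches, two kinds of deviation.
records = [
    SampleRecord("S1", "site_A", "batch_1"),
    SampleRecord("S2", "site_A", "batch_1", {"delayed_freezing"}),
    SampleRecord("S3", "site_B", "batch_2", {"delayed_freezing", "hemolyzed"}),
]

rows = deviation_covariates(records)
for row in rows:
    print(row)
```

Keeping site and batch alongside the deviation flags means the same table can later be used to model batch effects, rather than reconstructing that information after the fact.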

Sustained Quality Assurance, Quality Control, and Validation of the Biomarker Tests or Assays

The aforementioned cohort factors (i.e., patient selection) and technical factors (i.e., sample collection, processing, and storage) can all have a significant impact on any of the phased strategies for biomarker validation shown in Figure 1. In practice, however, validation of the biomarker test is just as important as validation of the biomarker itself. To improve the chance of successful translation from external and clinical validation results to patient care, the analytical validity of the test (does the test measure the biomarker of interest correctly and reliably?) should be closely monitored along with the clinical validity of the biomarker (does the biomarker correlate with the clinical presentation?) [8,12]. Furthermore, to sustain quality assurance and quality control between different stages of biomarker development, it may be necessary to carry out multiple analytical or technical validations when more than one platform is used.

Computational Factors

In addition to the statistical and bioinformatical factors in the validation process described earlier, the team must also be aware that a vast number of samples imposes a huge computational burden on software and hardware. This is especially true when high-throughput and high-performance technologies or high-density arrays need to be used for the validation process. Failing to plan for this burden may ultimately be costly in terms of money and time. To deal with the massive amount of data generated from these technologies, new statistical techniques are continuously being developed. Statistical methods should include those that were used successfully in the discovery phase. If the performance is not as good as in prior analyses, refinement of the algorithms will be necessary ("adaptive statistics").
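One common way to limit memory use with very large data sets is to compute summary statistics in a single streaming pass rather than loading every measurement at once. The sketch below uses Welford's online algorithm for the mean and variance; it is a generic illustration, not a method named in the text, and the feature name and values are invented.

```python
class RunningStats:
    """Welford's online algorithm for mean and sample variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / (self.n - 1) if self.n > 1 else float("nan")


def summarize_stream(rows):
    """Consume an iterable of {feature: value} rows one at a time."""
    stats = {}
    for row in rows:
        for feature, value in row.items():
            stats.setdefault(feature, RunningStats()).update(value)
    return stats


def fake_measurements():
    """Stand-in for reading array data from disk row by row."""
    for value in (1.0, 2.0, 3.0, 4.0):
        yield {"probe_1": value}


stats = summarize_stream(fake_measurements())
print(stats["probe_1"].mean, stats["probe_1"].variance)
```

Because only one row is held in memory at a time, the memory footprint depends on the number of features rather than the number of samples.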
