In Silico ADME Tools


The breadth and predictive power of in silico ADME tools have increased rapidly during the last 10 years. The quality of many models is now such that they can usefully influence decision making in drug discovery and development. In discovery, they can influence decisions about which compounds to synthesize; in development, they can influence the decision to perform certain clinical trials or the design of those trials.


10.1 Abbreviations

10.2 Basic Concepts

10.3 Structure-Based Models

10.4 Physiologically Based Pharmacokinetic Models

References

Additional Reading


10.1 Abbreviations

CAT Compartmental absorption and transit model

PBPK Physiologically based pharmacokinetic model

P450 Cytochrome P450

PLS Partial least squares

SAR Structure-activity relationship

SVM Support vector machine


S.C. Khojasteh et al., Drug Metabolism and Pharmacokinetics Quick Guide, DOI 10.1007/978-1-4419-5629-3_10, © Springer Science+Business Media, LLC 2011

10.2 Basic Concepts

In silico tools have become remarkably powerful and predictive in ADME sciences and are beginning to play a key role in drug discovery and development. They can be employed prior to synthesis of compounds to improve the likelihood of identifying compounds with acceptable ADME properties. Moreover, if a poor ADME property is predicted with confidence, the in silico data can be used to reduce the number of compounds going through a particular assay. In silico models can also be used during lead optimization to help predict human pharmacokinetics. Finally, in silico models are being used in drug development to predict (1) the likelihood of encountering drug-drug interactions (such models can also influence the design of drug-drug interaction studies), (2) formulation and physical form effects, (3) food effects, or (4) effects of different dosing regimens. In silico ADME models can, simplistically, be divided into two categories:

• Quantitative structure-activity (or structure-property) relationship (QSAR) models for specific in vitro ADME assays (such as metabolic stability in microsomes), built from a training set and a range of molecular descriptors that describe the structures of the compounds in the training set.

• Physiologically based pharmacokinetic (PBPK) models reflecting an integrated system.


10.3 Structure-Based Models

A range of software packages is available to predict basic properties such as pKa, log P, log D, and TPSA using the structure of the compound as the input. Nevertheless, some basic properties, such as solubility, are still remarkably hard to predict accurately. Specific ADME models are commercially available as well (see Table 10.1). However, the most successful ADME models tend to be built in-house because pharmaceutical companies have large data archives that were acquired using the same assay format (Gao et al. 2008; Gleeson et al. 2007; Lee et al. 2007; Stoner et al. 2006).

First, a large, structurally diverse training set is used to build the models. It is critical that the training set contains compounds covering the whole range of the ADME property under investigation. Inputs for the model are the structures of the compounds and the measured in vitro ADME parameter (e.g., metabolic stability in microsomes, plasma protein binding). A large number of molecular descriptors are calculated for each compound in the training set, ranging from simple parameters such as molecular weight, log P, log D at a given pH, and TPSA to much more complex parameters reflecting the electronics and/or three-dimensional nature of the compounds.

Next, an analysis is performed to identify those descriptors that correlate most strongly with the measured parameter. Multiple modeling methods are available to facilitate the model building and optimization process. In most cases, models can be defined by their output: classification or numerical. The result of a classification method falls into one of a number of bins, whereas the output of a regression method is numerical (although converting it to bins may be more appropriate to prevent overinterpretation). The most common statistical methodologies used to build DMPK models are:

• Regression methods

- Partial least squares (PLS)

• Bayesian methods

• Supervised learning methods:

- Decision trees (random forest)

- Support vector machine (SVM)

• Neural networks

Finally, one or more models are built using those descriptors that correlate best with the measured parameter. (The number of descriptors should be kept small to prevent overfitting.) To validate the models, a second, independent dataset is used. The calculated values of the parameter of interest are compared with the measured values for this validation set, and the most predictive model is selected. The whole process is illustrated in Fig. 10.1.
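The build-and-validate sequence described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the workflow of any particular package: the descriptor names, the data values, and the helper functions `pearson_r`, `rank_descriptors`, and `fit_line` are all hypothetical, and a single-descriptor least-squares fit stands in for the PLS, Bayesian, or machine learning methods listed earlier.

```python
"""Toy workflow: rank descriptors, fit a model, validate on an independent set."""
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def rank_descriptors(descriptors, measured):
    """Sort descriptor names by |r| against the measured ADME parameter."""
    return sorted(
        descriptors,
        key=lambda name: abs(pearson_r(descriptors[name], measured)),
        reverse=True,
    )


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


# Hypothetical training set: descriptor values per compound plus a measured
# in vitro parameter (all numbers are made up for illustration).
train_descriptors = {
    "logP": [1.0, 2.0, 3.0, 4.0, 5.0, 2.5],
    "mol_weight": [310.0, 295.0, 420.0, 350.0, 380.0, 330.0],
}
train_measured = [12.0, 21.0, 33.0, 39.0, 52.0, 27.0]

# Step 1: keep only the descriptor that correlates best (limits overfitting).
best = rank_descriptors(train_descriptors, train_measured)[0]

# Step 2: build the model on the training set.
slope, intercept = fit_line(train_descriptors[best], train_measured)

# Step 3: validate on a second, independent dataset.
valid_descriptors = {"logP": [1.5, 3.5, 4.5], "mol_weight": [300.0, 400.0, 360.0]}
valid_measured = [16.0, 36.0, 44.0]
valid_predicted = [slope * x + intercept for x in valid_descriptors[best]]
validation_r = pearson_r(valid_predicted, valid_measured)

print(f"selected descriptor: {best}, validation r = {validation_r:.3f}")
```

In practice several candidate models would be built this way and the one with the best validation statistics retained.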

Table 10.1. Most common commercially available software to predict various ADME properties using built-in models or the ability to build new models

Software            Vendor
ADMET Predictor     Simulations Plus
ADME Suite          ACD Labs
Discovery Studio    [not recovered]
[not recovered]     Strand Life Sciences
[not recovered]     Molecular Discovery

Figure 10.1. Flowchart depicting the building and validation of in silico ADME models. PLS = partial least squares; SVM = support vector machine.

Some models may have a continuous output, but frequently the output is based on categories (e.g., compounds are predicted to be metabolically stable, moderately stable, or labile). If the output is continuous, the correlation coefficient between the calculated and measured values for the validation set should be calculated to illustrate the predictive power of the model. If the output is based on categories (i.e., a classification model), the percentages of false positives and false negatives can be calculated. The output of each model should also include the confidence in the prediction, which is usually derived from (1) the structural similarity to compounds in the training set and (2) the number of nearest neighbors. Successful models have been built to predict ADME properties such as metabolic stability in microsomes or hepatocytes, cytochrome P450 (P450) competitive or time-dependent inhibition, permeability, plasma protein binding, and microsomal binding.
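For a classification model, the error percentages and a simple similarity-based confidence measure might be computed as in the sketch below. Several assumptions are made that are not in the text above: percentages are expressed relative to all validation compounds (other conventions divide by the number of true negatives or positives), fingerprints are represented as plain sets of "on" bits, and Tanimoto similarity with a threshold of 0.6 is used as the similarity measure. The function names and data are hypothetical.

```python
def error_percentages(predicted, measured, positive):
    """Return (% false positives, % false negatives) over all validation compounds."""
    pairs = list(zip(predicted, measured))
    false_pos = sum(1 for p, m in pairs if p == positive and m != positive)
    false_neg = sum(1 for p, m in pairs if p != positive and m == positive)
    return 100.0 * false_pos / len(pairs), 100.0 * false_neg / len(pairs)


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints stored as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def neighbor_confidence(query, training_fps, threshold=0.6):
    """Return (highest similarity to the training set, neighbors at or above threshold)."""
    sims = [tanimoto(query, fp) for fp in training_fps]
    return max(sims), sum(1 for s in sims if s >= threshold)


# Hypothetical validation results for a stable/labile classifier.
predicted = ["labile", "stable", "stable", "labile", "stable"]
measured = ["labile", "stable", "labile", "stable", "stable"]
fp_pct, fn_pct = error_percentages(predicted, measured, positive="labile")

# Hypothetical fingerprints: a query compound and three training compounds.
training_fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
best_sim, n_neighbors = neighbor_confidence({1, 2, 3, 4, 5}, training_fps)

print(f"false positives {fp_pct:.0f}%, false negatives {fn_pct:.0f}%")
print(f"closest training compound: {best_sim:.2f}, neighbors: {n_neighbors}")
```

A prediction for a query with few or no close neighbors in the training set would then be flagged as low confidence.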

The following aspects should be considered when using in silico ADME models:

• It is possible to build global models based on a large, structurally diverse training set, but local models (based on data obtained for one particular project or chemotype) may be more predictive.

• It is important that the data used to build the model are obtained under identical or very similar experimental conditions.

• The whole range of the parameters should be covered by the training set and, preferably, to a similar extent.

• For a dynamic model, the calculated versus measured values should be monitored continuously.
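The point about covering the whole parameter range to a similar extent can be checked with a simple histogram of the training data. The sketch below is one possible way to do this, not a prescribed procedure; the bin count, the 50% cut-off used to flag sparse bins, and the example values are assumptions chosen for illustration.

```python
def coverage_by_bin(values, low, high, n_bins):
    """Count training-set values in each of n_bins equal-width bins on [low, high]."""
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for v in values:
        if low <= v <= high:
            # The upper edge of the range is assigned to the last bin.
            idx = min(int((v - low) / width), n_bins - 1)
            counts[idx] += 1
    return counts


def underpopulated_bins(counts, min_fraction=0.5):
    """Indices of bins holding fewer than min_fraction of the average bin count."""
    mean = sum(counts) / len(counts)
    return [i for i, c in enumerate(counts) if c < min_fraction * mean]


# Hypothetical measured values (e.g., % parent remaining in a stability assay).
train_values = [2, 5, 8, 11, 15, 18, 22, 25, 31, 35, 41, 48, 90, 95]
counts = coverage_by_bin(train_values, 0.0, 100.0, 5)
sparse = underpopulated_bins(counts)

print("compounds per bin:", counts)
print("sparsely covered bins:", sparse)
```

Here the 60 to 80 range is empty, so predictions falling in that range would rest on little or no training data.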

10.3.1 Software to Predict Sites of Metabolism

The most widely used software packages to predict the most likely sites of metabolism are META, Meteor, MetabolExpert, StarDrop, and MetaSite (see also Table 10.2).

Table 10.2. Most common commercially available software to predict the most likely sites of metabolism

Software         Vendor
Meteor           Lhasa
META             Multicase
MetabolExpert    CompuDrug
MetaSite         Molecular Discovery
SMARTCyp         University of Copenhagen
StarDrop         Optibrium

1. META, Meteor, and MetabolExpert are rule-based systems built on a large compilation of biotransformation reactions reported in the literature. Predictions are based on substructure-specific metabolism "rules" derived from the database, while ignoring the three-dimensional structure of the cytochrome P450 enzyme and the substrate.

2. The prediction of the site of metabolism in StarDrop is based on two factors: (1) the intrinsic reactivity of each potential site to oxidation by P450 enzymes and (2) the accessibility of the site of metabolism to the active oxy-heme species of P450, which is influenced by the orientation of the substrate in the active site and steric hindrance by nearby groups in the substrate. The reactivity component is the same for every P450 isoform as the reaction mechanism is believed to be similar for all isoforms of P450. However, the orientation and steric accessibility contributions vary between isoforms, reflecting the different binding pockets. The intrinsic reactivity is calculated by estimating the activation energy for the rate-limiting step of the oxidation reaction using AM1, a semi-empirical quantum mechanical method. In the case of metabolism at an aliphatic carbon (leading to aliphatic hydroxylation, N- and O-dealkylation) the rate-limiting step is hydrogen abstraction, and for aromatic sites formation of a tetrahedral intermediate between the substrate and the oxy-heme is rate limiting. The steric accessibility and orientation effects are estimated as contributions to the activation energy using models based on ligand structure that were trained using a large number of substrates for each P450 isoform (CYP2C9, CYP2D6, and CYP3A4). The final activation energies are then used to calculate the relative rates of product formation at the different sites of metabolism and, hence, the predicted regioselectivity.

3. MetaSite uses a slightly different computational procedure than StarDrop and considers the computed three-dimensional structure of the compound and GRID-based representations of P450 enzymes (1A2, 2C9, 2C19, 2D6, and 3A4). Descriptors are calculated for both, and the fingerprints of both are compared. The comparison provides two key parameters: (1) the accessibility of all molecular features of the drug in the active site of the P450 enzyme toward the heme group and (2) the reactivity of the molecular substructure (based on molecular orbital calculations and fragment recognition). Prediction of the site of metabolism is based on a probabilistic calculation taking both proximity to the reactive oxygen species in the P450 binding pocket and reactivity into consideration (Cruciani et al. 2005). Although valuable and quite predictive, MetaSite provides only the site of metabolism, but not the metabolic pathway.
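The last step described for StarDrop in item 2, going from per-site activation energies to relative rates of product formation, can be illustrated with a simple Arrhenius-type calculation. This is a sketch under stated assumptions rather than the vendor's actual algorithm: each site's rate is taken to be proportional to exp(-Ea/RT), all sites are assumed to share the same pre-exponential factor, and the site names and energies are hypothetical.

```python
from math import exp

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def relative_site_rates(activation_energies_kj, temperature_k=310.15):
    """Fraction of product formed at each site, assuming rate ~ exp(-Ea / RT).

    Assumes identical pre-exponential factors for all sites, so only the
    differences in activation energy matter.
    """
    rt = R_KJ_PER_MOL_K * temperature_k
    e_min = min(activation_energies_kj.values())
    # Subtracting the lowest energy keeps the exponentials well scaled.
    weights = {
        site: exp(-(ea - e_min) / rt)
        for site, ea in activation_energies_kj.items()
    }
    total = sum(weights.values())
    return {site: w / total for site, w in weights.items()}


# Hypothetical final activation energies (kJ/mol) for three candidate sites.
site_energies = {"benzylic C-H": 60.0, "N-methyl": 64.0, "aromatic C4": 75.0}
fractions = relative_site_rates(site_energies)
predicted_site = max(fractions, key=fractions.get)

for site, fraction in fractions.items():
    print(f"{site}: {fraction:.3f}")
print("predicted major site:", predicted_site)
```

Because the dependence is exponential, a difference of only a few kJ/mol between two sites already shifts the predicted regioselectivity strongly toward the lower-energy site.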

These in silico models have been successfully integrated in the metabolite identification process and can facilitate or speed up interpretation of data (while keeping in mind that the models will not be 100% predictive, especially if novel biotransformation pathways are involved). However, these models are usually of limited use in predicting the absolute importance of individual metabolic pathways, and StarDrop and MetaSite cannot predict non-P450 mediated metabolism.


10.4 Physiologically Based Pharmacokinetic Models

Physiologically based pharmacokinetic (PBPK) models (Lave et al. 2007) are more sophisticated than allometric approaches or simple in vitro-in vivo extrapolation for PK prediction. PBPK models are built around a wide range of parameters that describe the normal physiology of the human or animal body. These models are made up of multiple compartments, each representing a predefined tissue or organ, that are connected via blood or lymph flow, as depicted in Fig. 10.2. Parameters included in these models are those related to physiology, such as blood flow to organs, weight of organs, drug metabolizing enzymes in the liver and elsewhere in the body, and drug transporters in the body. Oral absorption is frequently modeled using augmented versions of the Compartmental Absorption and Transit (CAT) model. These models describe the various compartments of the intestinal tract and can include details such as:

• Rate constants for gastric emptying

• Intestinal compartmental transit time

• Local permeability of intestinal wall

• pH of each compartment

• Volume and surface area of each compartment

• Enterocytic blood flow

These models integrate the given data to determine the extent of dissolution, absorption, and metabolism (e.g., hepatic first pass, enterocyte metabolism) for each intestinal compartment. A large amount of data tends to be required for PBPK models. The inputs for PBPK models are parameters that specifically describe the compound of interest: pKa, lipophilicity, solubility in various media, in vitro permeability, in vitro metabolic stability, and P450 inhibition, among others. In addition, the dose, dosing route, and dosing regimen need to be specified. Commercially available PBPK models are listed in Table 10.3.

Figure 10.2. Physiological compartments incorporated in physiologically based pharmacokinetic (PBPK) models.
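The compartmental absorption and transit idea described above can be sketched as a chain of intestinal compartments with first-order gastric emptying, transit, and absorption. The code below is a deliberately stripped-down illustration, not the augmented models used in commercial software: it ignores dissolution, pH, regional permeability differences, and metabolism, uses the same rate constants in every compartment, and the default of seven compartments and all numerical values are assumptions. For such a chain the fraction absorbed has the closed form 1 - (1 + ka/kt)^-n, which the simulation should reproduce.

```python
def simulate_cat(dose, k_empty, k_transit, k_abs, n_compartments=7,
                 dt=0.001, t_end=24.0):
    """Euler integration of a basic transit chain; returns the fraction absorbed.

    Rate constants are first order (1/h); dt and t_end are in hours.
    """
    stomach = dose
    gut = [0.0] * n_compartments
    absorbed = 0.0
    for _ in range(round(t_end / dt)):
        inflow = k_empty * stomach * dt       # drug leaving the stomach
        stomach -= inflow
        new_gut = []
        for amount in gut:
            transit = k_transit * amount * dt  # moves to the next compartment
            uptake = k_abs * amount * dt       # absorbed across the gut wall
            new_gut.append(amount + inflow - transit - uptake)
            absorbed += uptake
            inflow = transit                   # feeds the next compartment
        gut = new_gut                          # last transit is lost to the colon
    return absorbed / dose


# Hypothetical first-order rate constants (1/h), chosen only for illustration.
K_EMPTY, K_TRANSIT, K_ABS = 4.0, 2.1, 1.0
fraction_absorbed = simulate_cat(100.0, K_EMPTY, K_TRANSIT, K_ABS)
analytic = 1.0 - (1.0 + K_ABS / K_TRANSIT) ** -7

print(f"simulated fraction absorbed: {fraction_absorbed:.4f}")
print(f"closed-form value:           {analytic:.4f}")
```

Augmented versions replace the single absorption constant with compartment-specific permeability, surface area, pH, and dissolution terms such as those listed above.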

Table 10.3. Most common commercially available PBPK models

Software      Vendor
Cloe PK       Cyprotex
GastroPlus    Simulations Plus
PK-Sim        Bayer
Simcyp        Simcyp Consortium

The output of the PBPK model is a concentration-time profile. For some models, it is possible to provide a population range instead of an average profile. PBPK models are quite powerful, and they are extensively used for predicting and understanding the factors influencing the following phenomena:

• Human and animal pharmacokinetics

• Formulation effects

• Dosing regimen effects

• Drug-drug interactions via competitive and time-dependent P450 inhibition and P450 induction
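How a PBPK model produces a concentration-time profile can be illustrated with a very small flow-limited model consisting of blood, liver, and a lumped "rest of body" compartment after an intravenous bolus. This is a minimal sketch under stated assumptions and not a validated model: the volumes, blood flows, partition coefficients, and intrinsic clearance are round illustrative numbers, protein binding is ignored, and elimination occurs only in the liver. Commercial packages use many more compartments and compound-specific inputs.

```python
def simulate_pbpk(dose_mg, t_end_h=24.0, dt_h=0.001):
    """Euler integration of a three-compartment, flow-limited PBPK sketch.

    Returns (times, blood concentrations, total mass accounted for).
    """
    # Illustrative, unvalidated parameter values (assumptions for this sketch).
    v_blood, v_liver, v_rest = 5.0, 1.8, 60.0   # volumes, L
    q_liver, q_rest = 90.0, 250.0               # blood flows, L/h
    kp_liver, kp_rest = 2.0, 1.5                # tissue:blood partition coefficients
    cl_int = 30.0                               # hepatic intrinsic clearance, L/h

    a_blood, a_liver, a_rest, a_elim = dose_mg, 0.0, 0.0, 0.0  # amounts, mg
    times, conc_blood = [0.0], [a_blood / v_blood]
    for step in range(1, round(t_end_h / dt_h) + 1):
        c_blood = a_blood / v_blood
        c_liver_out = a_liver / v_liver / kp_liver  # concentration leaving liver
        c_rest_out = a_rest / v_rest / kp_rest      # concentration leaving rest
        to_liver = q_liver * (c_blood - c_liver_out)  # net uptake rate, mg/h
        to_rest = q_rest * (c_blood - c_rest_out)
        eliminated = cl_int * c_liver_out             # hepatic elimination, mg/h
        a_blood -= (to_liver + to_rest) * dt_h
        a_liver += (to_liver - eliminated) * dt_h
        a_rest += to_rest * dt_h
        a_elim += eliminated * dt_h
        times.append(step * dt_h)
        conc_blood.append(a_blood / v_blood)
    return times, conc_blood, a_blood + a_liver + a_rest + a_elim


times, conc_blood, total_mg = simulate_pbpk(100.0)

# Print the blood concentration every 4 hours.
for t, c in zip(times[::4000], conc_blood[::4000]):
    print(f"t = {t:5.1f} h  C_blood = {c:8.4f} mg/L")
```

Changing `cl_int` (for example, to mimic enzyme inhibition or induction) or the dosing inputs and rerunning the simulation is the basic mechanism by which such models explore drug-drug interactions and dosing regimens; population ranges follow from sampling the physiological parameters.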


References

Cruciani G, Carosati E, De Boeck B et al (2005) MetaSite: Understanding metabolism in human cytochromes from the perspective of the chemist. J Med Chem 48:6970-6979

Gao H, Yao L, Mathieu HW et al (2008) In silico modeling of nonspecific binding to human liver microsomes. Drug Metab Dispos 36:2130-2135

Gleeson MP, Davis AM, Chohan KK (2007) Generation of in-silico cytochrome P450 1A2, 2C9, 2C19, 2D6, and 3A4 inhibition QSAR models. J Comput Aided Mol Des 21:559-573

Lave T, Parrott N, Grimm HP et al (2007) Challenges and opportunities with modeling and simulation in drug discovery and drug development. Xenobiotica 37:1295-1310

Lee PH, Cucurull-Sanchez L, Lu J et al (2007) Development of in silico models for human liver microsomal stability. J Comput Aided Mol Des 21:665-673

Stoner CL, Troutman M, Gao H et al (2006) Moving in silico screening into practice: A minimalist approach to guide permeability screening. Lett Drug Des Discov 3:575-581

Additional Reading

Espie P, Tytgat D, Sargentini-Maier M-L et al (2009) Physiologically based pharmacokinetics (PBPK). Drug Metab Rev 41:391-407

Hou T, Wang J (2008) Structure-ADME relationship: still a long way to go. Expert Opin Drug Metab Toxicol 4:759-770

Kharkar PS (2010) Two-dimensional (2D) in silico models for absorption, distribution, metabolism, excretion and toxicity (ADME/T) in drug discovery. Curr Top Med Chem 10:116-126
