Constantine Kreatsoulas1, Stephen K. Durham1, Laura L. Custer1, and Greg M. Pearl1

'Bristol-Myers Squibb Princeton, New Jersey 08453

§Delaware Valley College Department of Chemistry Doylestown, PA 18901

Table of Contents

Introduction

Role of Predictive Toxicology in Drug Discovery

Toxicological Endpoints Most Amenable to Modeling Efforts

Strengths and Weaknesses of Various Methodologies

Methods

Compound Selection

SOS Chromotest Assay

Computational Assessment Tools

MultiCASE

Statistical Definitions Used

Model Construction

Results

Thiophenes

Polycyclic Aromatic Compounds

Aromatic and Secondary Amines

Discussion

References

List of Abbreviations

ADAPT Automated Determination of Associations and Patterns Toolkit

CASE Computer Automated Structure Evaluation

CU CASE Unit of activity (scaled activity value for models built in MultiCASE; 0-inactive; 99-highly active)

DEREK Deductive Estimation of Risk from Existing Knowledge

ES Expert System

FN False Negative (an assay positive response assessed to be a negative response)

FP False Positive (an assay negative response assessed to be a positive response)

IVS Independent Vendor Software

LUMO Lowest unoccupied molecular orbital

PS Prediction Set (set of compounds a model is naïve to; in neither the training nor the validation set)

QSAR Quantitative Structure Activity Relationship

SOS SOS Chromotest for Genotoxicity

TN True Negative (an assay negative response assessed to be a negative response)

TP True Positive (an assay positive response assessed to be a positive response)

TS Training Set (set of compounds used to develop a model)

VS Validation Set (set of compounds used to assess the quality of a model)

Key Words

ADME/TOX, Computational Toxicology, Expert System, Genotoxicity, Informatics, Lead Optimization, Mutagenicity, Predictive Toxicology, Quantitative Structure Activity Relationship, Structure Activity Relationship


Introduction

As the pace of pharmaceutical drug discovery quickens and greater numbers of preclinical candidates are identified using combinatorial and other high-throughput methods, the demand on safety assessment assays increases. Because most in vitro toxicology assays are, at best, medium throughput, rapid in silico assessment protocols must be developed and validated for use in the early discovery phase. Regulatory agencies, no strangers to the increased demand for accurate safety assessments of candidate compounds or to the additional constraints imposed by limited resources, have long been at the forefront of utilizing and championing computational methods. As regulatory databases of safety information are populated and legacy data incorporated, methods that extract meaningful information from these data must be developed and validated. Because this is not intended to be an exhaustive review of all in silico tools for toxicology assessment, the reader is referred to a number of recent articles that do an outstanding job of summarizing the algorithms, benefits and shortcomings of many of the commercial packages available (Pearl, Livingston-Carr et al. 2001; Greene 2002; Snyder, Pearl et al. 2004).

Role of Predictive Toxicology in Drug Discovery

During the development process, many competing factors influence the success of a candidate compound. Physical properties such as lipophilicity, solubility and permeability are closely monitored, and in vitro and in silico models are continuously developed and refined on a per-chemotype basis. Empirical metrics such as Lipinski's rule of five (Lipinski, Lombardo et al. 2001) are routinely used in profiling the drug-likeness of candidate compounds. As we strive to increase the success rates of compounds in both the pre-clinical and post-NCE stages, we must be increasingly stringent about the criteria used to determine whether a compound will be successful. Safety assessment is one of the most inflexible of these criteria. Treating safety liabilities as simply another property, in this case one to be minimized, has been a paradigm shift in the pharmaceutical industry over the last decade.
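As an illustration of how an empirical metric of this kind is applied, the following is a minimal sketch of a rule-of-five check. It assumes the four descriptors have already been computed by some other tool; the `Descriptors` class, the function names and the example values are hypothetical, and the tolerance of one violation is a common convention rather than something stated in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float       # molecular weight in daltons
    log_p: float            # calculated octanol-water partition coefficient
    h_bond_donors: int      # number of hydrogen-bond donors
    h_bond_acceptors: int   # number of hydrogen-bond acceptors


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed convention: tolerate at most `max_violations` violations."""
    return rule_of_five_violations(d) <= max_violations


# Two made-up compounds, for illustration only.
small = Descriptors(mol_weight=350.0, log_p=2.1, h_bond_donors=2, h_bond_acceptors=5)
large = Descriptors(mol_weight=650.0, log_p=6.2, h_bond_donors=3, h_bond_acceptors=8)

print(rule_of_five_violations(small), passes_rule_of_five(small))
print(rule_of_five_violations(large), passes_rule_of_five(large))
```

A filter like this treats drug-likeness as a simple count of threshold violations, which is the sense in which a safety liability can be handled as one more property to be profiled alongside the others.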

What remains, though, is that different stages of the development process have dissimilar needs for such a design algorithm. Ideally, one would have an assessment protocol in place that correctly partitions all of the harmful compounds from the innocuous ones in both the training and test/validation sets (TS and VS, respectively). Furthermore, the most usable model is one that can successfully extrapolate beyond both the training and validation sets when encountering novel compounds in a prediction set (PS). This type of model exists only in an idealized context, as neither the data nor our understanding of the biology is usually of sufficient detail to allow us to construct models of such quality. As such, we are often left with three classes of models: those with high overall predictivity (concordance), those with high specificity (ability to partition the inactive subset) and those with high sensitivity (ability to partition the active subset). Most automated algorithms strive for the first goal, a high TS concordance, which is most easily achieved by skewing the model to identify the majority population in the TS. Subjective algorithms, usually involving human intervention and biasing, have been used to achieve high sensitivity or specificity, almost always at the expense of concordance.
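The three quantities named above can be made concrete with a short sketch. It uses the TP, FP, TN and FN counts from the list of abbreviations and the standard textbook formulas for sensitivity, specificity and concordance; the chapter's own statistical definitions are not reproduced here and may differ in detail. The class and function names, and the 10-active/90-inactive data set, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # assay positive, assessed positive
    fp: int  # assay negative, assessed positive
    tn: int  # assay negative, assessed negative
    fn: int  # assay positive, assessed negative


def tally(assay, assessed):
    """Count TP/FP/TN/FN from paired booleans (assay result, model call)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(assay, assessed):
        if actual and predicted:
            tp += 1
        elif actual:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def sensitivity(c):
    """Fraction of assay-positive compounds assessed as positive."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c):
    """Fraction of assay-negative compounds assessed as negative."""
    return _ratio(c.tn, c.tn + c.fp)


def concordance(c):
    """Fraction of all compounds assessed correctly."""
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


# Made-up skewed set: 10 actives, 90 inactives, and a model that
# calls every compound inactive (the majority class).
assay = [True] * 10 + [False] * 90
assessed = [False] * 100
counts = tally(assay, assessed)

print(concordance(counts), sensitivity(counts), specificity(counts))
```

In this made-up case the majority-class model scores a concordance of 0.9 and a specificity of 1.0 while its sensitivity is 0.0, which is the skewing effect described above: a high TS concordance by itself says little about whether the active compounds are being found.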

A simple question arises: why would one want to bias a model toward high sensitivity or specificity? Three issues determine the form of the biasing: a) the predictive ability of the confirmatory assay; b) the stage in the discovery pipeline at which the assessment is performed; and c) the cost of each assay performed compared to the cost of a missed assessment. If the second-tier assay has a high specificity, then a simple approach to achieving high overall process concordance is to use a first-tier, in silico method with high sensitivity (the compounds assessed in the confirmatory assay are then those for which the in vitro assay has the highest predictive ability), as shown in Figure 1. As models become more widely used by bench scientists throughout a research organization, the needs of the various research entities using these models influence how the models should be biased. Scientists involved in the final stages of safety assessment (e.g. before an NCE filing) or formulation tend to be more cautious and favor models with a minimal rate of false negative assessments. Scientists involved in scaffold development or lead identification would typically desire a model with as few false positive assessments as possible, so as not to limit the
