SAR by NMR: Putting the Pieces Together
Abstract
It has been nearly ten years since the introduction of SAR by NMR and the advent of fragment-based drug design. During this time, we have gained a tremendous amount of knowledge about protein druggability, the limits of chemical diversity, and crafting high-affinity ligands from low molecular weight, weakly binding leads. This review will describe the concept of fragment-based drug design, discuss why it works, and illustrate the power of the approach with two case studies on the design of potent inhibitors of matrix metalloproteinases and Bcl-2 family proteins.
Introduction
The goal of all pharmaceutical research is to modulate the activity of a particular target or targets such that a therapeutically beneficial response is evoked (that is, molecular intervention). This is, of course, an exceedingly complex process, but it can be reduced to two very fundamental questions. First, what are the targets that, if sufficiently modulated, will alter either the onset or progression of human disease? And second, what targets can be modulated with known therapeutic approaches? These questions can be recast as the attempt to identify molecular targets that are both disease modifying and “druggable” (i.e., able to bind to a drug with high specificity) (1). Although the notion of druggability is certainly expanding with the advent of protein and RNA therapeutics (2, 3), the bulk of drug discovery research is dedicated to the identification of small, organic molecules that can interfere with protein function. Early drug discovery entailed the in vivo evaluation of a relatively small number of organic compounds. This approach carried a rigid a priori requirement: any molecular targets involved in the observed phenotype were necessarily both disease modifying and druggable. This systems-centered or holistic approach (Figure 1) was the basis for the discovery of many highly successful drugs, including the blockbuster (i.e., billion-dollar selling) valproic acid for bipolar disorder. Such a strategy, however, placed severe restrictions not only on the number of compounds that could be evaluated (limiting the chemical diversity that could be sampled), but also on the properties of the compounds (e.g., bioavailability, solubility, and potency) that could otherwise be addressed with medicinal chemistry. In addition, it was often the case that the molecular target responsible for the observed phenotype remained unknown, hampering efforts to understand the biological mechanism of action of the therapeutic agent under study.
In fact, the enzymes targeted by valproic acid (e.g., histone deacetylases) have only recently been identified (4, 5).
Figure 1. Two different paradigms for drug discovery: the systems-centric (or holistic) approach, shown on the left, and the target-centric (or reductionist) approach, shown on the right. In systems-centric drug discovery, compounds are evaluated directly in relevant in vivo models of disease, by necessity requiring intervention with disease-modifying, druggable targets. In a target-centric paradigm (denoted on the right with blue lines), a discrete molecular target that is thought to be disease modifying is isolated, and compounds are assessed for their ability to modulate the function of the target in vitro. It is only then that selected functional modulators of the target (agonists or antagonists) are considered for evaluation in vivo.
The advent of modern molecular genetics and high-throughput approaches to synthetic chemistry has revolutionized the way that drug discovery is conducted today (6, 7). The ability to produce, through recombinant technology, almost any protein (human or otherwise) has led to a paradigm shift in which a reductionist approach to drug discovery is taken (Figure 1). In this target-centric approach, the first line of molecular interrogation is whether the activity of an isolated protein can be modulated with a small molecule in vitro, independent of the requirement for the compound to be useful in vivo. In parallel, combinatorial chemistry and high-throughput organic synthetic approaches have dramatically expanded the numbers of compounds that can be evaluated for biological activity. Both of these advances have necessitated the creation and standardization of high-throughput screening (HTS) platforms that are capable of testing millions of unique chemical entities for their ability to modulate the activity of discrete molecular functions (8). Such platforms are widespread throughout the pharmaceutical industry today and form the backbone for modern lead discovery. Although this approach has certainly led to the discovery of several novel drugs (e.g., HIV protease inhibitors and Gleevec®, among others), there has been significant criticism recently that, overall, the return on investment in these technologies has been meager and disappointing (9, 10). The question is: Why? Is the reductionist approach to drug discovery inherently flawed? Have we not been able to reliably identify disease-modifying targets? Are the targets that are chosen simply not druggable with compounds that meet the requirements for use in humans? Or do we simply not have the right compounds to do the job, even with the radical expansions in chemical diversity space that we are able to access with modern synthetic approaches?
Although this short review cannot address all of these questions, our recent work with fragment-based approaches to drug design has shed significant light on our ability to potently and specifically modulate protein targets with small organic molecules. Thus, answers to the fundamental questions of what makes a protein druggable—and whether we can ever identify potent small molecules—are becoming attainable.
To Drug Or Not To Drug
Under the reductionist or target-centric drug discovery paradigm, once a discrete molecular target has been prepared, an HTS of a relatively large (~10^6) compound collection typically serves as the entry point to lead identification and optimization. As noted above, however, this approach is failing to produce high-quality clinical candidates at a rate commensurate with the resources that have been dedicated to these efforts. There are several very specific reasons for these trends. First, the leads that have resulted from these large-scale screens tend to lack the appropriate physicochemical properties for clinical success. For example, typical leads tend to be large and lipophilic, significantly reducing the chances that oral bioavailability can be achieved (11–13). Thus, many discovery programs have been hampered by the overzealous pursuit of potency at the expense of other characteristics such as compound solubility, stability, and oral absorption. These observations form the basis for the recent industry-wide movement to prioritize “lead-like” compounds (e.g., lower mass and hydrophobicity) that have a reasonable chance of being “drug-like” after the optimization process (14). In addition, the high-throughput monitoring of ADME (absorption, distribution, metabolism, and excretion) properties in vitro is becoming a critical and early component of lead optimization (15). Both the focus on lead-likeness and the prudent incorporation of high-throughput (HT) ADME into the discovery process promise to increase the success rate of these programs in the future. However, a second reason for failure is more difficult to overcome. It is often the case that an HTS fails to produce any chemical matter that is suitable for further discovery efforts.
In fact, a recent survey of high-throughput screens at GlaxoSmithKline (GSK) indicated that only about 50% of all screens produce quality leads that are worthy of further evaluation (16), which is consistent with our own internal studies (Y. Martin, personal communication). There are two distinct possibilities for such a result that are important to differentiate. The first possibility is that the chemical matter required for potent inhibition of the target protein simply did not exist in the compound collection that was screened. In the GSK study (16), screening additional (i.e., larger) compound libraries did in fact improve screening success rates, supporting the need to improve and increase the size of compound collections. This situation is acknowledged by most pharmaceutical companies and has resulted in extensive efforts to expand both the depth and breadth of the chemical diversity contained in their corporate repositories (17, 18). However, a second and more profound possibility is that the protein under study cannot be targeted with small organic molecules. The ability to distinguish between these two outcomes can enable decisions as to whether small-molecule discovery efforts against a particular target should be redirected or terminated.
All of these concerns have contributed, at least in part, to the growing field of fragment-based screening and fragment-based drug design (19, 20). Fragment-based screening utilizes very low molecular weight compounds (50–250 Da) in an attempt to independently dissect the binding requirements within the different regions of a ligand-binding site. This is shown graphically in Figure 2, where individual “building blocks” are identified that bind to distinct subsites within the pocket. One of the greatest advantages of fragment-based screening as compared to conventional HTS is the substantially greater coverage of chemical diversity space (i.e., the universe of compounds that can potentially exist) that can be achieved. Even though the size of the compound library used in fragment-based screening is relatively small (typically on the order of 10^4 compounds), the total size of the chemical diversity space for fragments is at least ten orders of magnitude smaller than the potential diversity space for larger compounds (Figure 3A) (21, 22). Thus, even small fragment libraries represent a substantially larger fraction of diversity space than conventional compound repositories. Another factor contributing to this increased diversity is that the ultimate goal is to link or tether the individual fragments found at the multiple subsites, as described below. This means that the effective number of ligands that is virtually assessed in a fragment screen scales as the library size raised to the power of the number of subsites evaluated, multiplied by the number of different ways to link the fragments together (23). Figure 3B gives one estimate for these values for a binding site that contains two independent subsites and 10 linking possibilities. In this example, a fragment library of 10^4 compounds virtually represents 10^9 compounds, far larger than all corporate HTS repositories combined.
These dramatic increases in the coverage of chemical diversity space allow a fragment screen to be a robust indicator of protein druggability. In the scenario described above, obtaining no hits from conventional HTS suggests either an undruggable target or the lack of appropriate chemical matter. In contrast, obtaining no hits from a fragment screen is an excellent predictor that the protein simply cannot be targeted with small molecules. This ability of a fragment screen to predict protein druggability has been borne out experimentally (Figure 3C) (24, 25). Obtaining high fragment hit rates (>0.3%) is an excellent predictor that high-affinity, small-molecule ligands can be identified, whereas low hit rates (<0.1%) strongly suggest an undruggable pocket. Thus, positive results from a fragment screen can be used to reliably redirect discovery resources for those targets that might initially fail conventional HTS approaches.
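The hit-rate thresholds quoted above can be written as a simple decision rule. The Python sketch below is purely illustrative: the function name, the output labels, and the treatment of intermediate hit rates (between 0.1% and 0.3%) as "indeterminate" are assumptions made here, not part of the published analysis (24, 25).

```python
def classify_druggability(n_hits: int, n_screened: int) -> str:
    """Bucket a fragment-screen hit rate using the thresholds quoted in the text.

    Assumption: hit rates between 0.1% and 0.3% are labeled "indeterminate",
    because the text only characterizes the high and low regimes.
    """
    if n_screened <= 0:
        raise ValueError("n_screened must be positive")
    if n_hits < 0 or n_hits > n_screened:
        raise ValueError("n_hits must lie between 0 and n_screened")

    hit_rate_percent = 100.0 * n_hits / n_screened
    if hit_rate_percent > 0.3:
        return "likely druggable"
    if hit_rate_percent < 0.1:
        return "likely undruggable"
    return "indeterminate"


if __name__ == "__main__":
    # Hypothetical screens of a 10,000-compound fragment library.
    for hits in (50, 20, 5):
        print(hits, "hits:", classify_druggability(hits, 10_000))
```

With a hypothetical library of 10,000 fragments, 50 hits (0.5%) fall in the high regime and 5 hits (0.05%) in the low regime, while 20 hits (0.2%) sit between the two thresholds.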
Figure 2. The SAR by NMR concept. Multiple, low molecular weight compounds (“fragments”) are identified that can simultaneously bind to proximal sites on the protein surface. Based on structural information, these fragments can be linked to produce a ligand whose binding energy is theoretically the sum of the binding energies of the individual pieces.
Figure 3. Fragment diversity space. A. A plot of the size of the potential chemical diversity space as a function of the maximum molecular weight for the compound collection (extrapolated from 21 and 22). Note that the y-axis is in log-scale and ranges from 10^6 to 10^18. The approximate regions of diversity space for fragment libraries and conventional HTS collections are denoted by green and yellow ovals, respectively. B. A plot of the virtual number of ligands assessed as a function of the actual number of compounds tested for fragment screening (red line) and conventional high-throughput screening (black line). For HTS, as no linking of the individual hits is proposed, the number of ligands assessed is exactly equal to the number tested. However, in a linked fragment approach, the number of virtual ligands is M × N^P, where N is the number of compounds tested, P is the number of independent but proximal subsites on the protein, and M is the number of linking possibilities (23). The example shown here is for (P=2, M=10) as a function of N. C. The observed frequency of being able to inhibit protein targets with potent, non-peptide, non-covalent small molecules as a function of the experimental hit rate from an NMR fragment screen (24).
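The virtual-ligand count described for panel B (M multiplied by N raised to the power P) is easy to reproduce. The short Python sketch below evaluates it for the example given in the text (P = 2, M = 10, N = 10^4); the function and parameter names are illustrative choices, and the expression is only the idealized count, assuming every fragment pair and every linker is actually feasible.

```python
def virtual_ligands(n_tested: int, n_subsites: int = 1, n_linkers: int = 1) -> int:
    """Idealized number of ligands virtually assessed: M * N**P.

    n_tested   -- N, number of compounds actually screened
    n_subsites -- P, number of independent but proximal subsites
    n_linkers  -- M, number of linking possibilities
    With the defaults (P = 1, M = 1) this reduces to conventional HTS,
    where the number assessed equals the number tested.
    """
    if n_tested < 0 or n_subsites < 1 or n_linkers < 1:
        raise ValueError("expected n_tested >= 0, n_subsites >= 1, n_linkers >= 1")
    return n_linkers * n_tested ** n_subsites


if __name__ == "__main__":
    fragment_screen = virtual_ligands(10**4, n_subsites=2, n_linkers=10)
    hts_screen = virtual_ligands(10**6)
    print(f"fragment screen: {fragment_screen:.0e} virtual ligands")
    print(f"conventional HTS: {hts_screen:.0e} ligands")
```

Under these assumptions a 10^4-member fragment library corresponds to 10^9 virtual ligands, three orders of magnitude more than a 10^6-member HTS collection.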
Working With Building Blocks
Although screening libraries of building blocks can give reliable information as to the druggability of a protein target, that is certainly not the major aim. The goal of fragment-based screening is to enable the rapid development of highly potent, novel inhibitors for protein targets. As the fragment leads are intentionally very small, they typically bind with only weak affinity to the protein target (KD values typically >100 μM). These weak binding affinities require special consideration for both detecting and manipulating the hits. The first report of the successful detection and utilization of fragment leads in drug design was the initial description of SAR by NMR (Structure-Activity Relationships by Nuclear Magnetic Resonance) (26). In this work, two-dimensional, isotope-edited NMR spectroscopy was used to detect two fragment leads that bound to proximal sites on a protein surface. Using three-dimensional structural information on the bound ligands, the fragments were successfully joined together to produce a high-affinity ligand. This strategy is known as the linked-fragment approach, as two distinct entities are joined to form a single molecule. The elegance (and the challenge) of the linked-fragment approach is that if the two pieces are tethered properly, the binding energy of the linked compound should be the sum of the binding energies of the individual pieces (Figure 2), with potential additional gains coming from entropy (27). Thus, linking two leads with KD values of 100 μM each (ΔG = −5.5 kcal/mol) should result in a 10 nM lead (ΔG ≤ −11 kcal/mol). There are now numerous examples of the linked-fragment approach in the literature (19). Significantly, in several cases, potent leads were produced using fragment-based drug design even though a conventional high-throughput screen failed to produce productive leads.
This underscores the points made above, that existing repositories may lack chemical matter for certain targets, but that a fragment approach covers sufficient diversity space such that new leads can be constructed.
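The additivity argument for linked fragments can be checked with a few lines of arithmetic using ΔG = RT ln(KD). The Python sketch below assumes a temperature of 298.15 K and a 1 M standard state, neither of which is stated in the text, and it models only ideal additivity: it ignores linker strain as well as the additional entropic gains mentioned above. The function names are illustrative.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def delta_g(kd_molar: float, temp_k: float = 298.15) -> float:
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("kd_molar must be positive")
    return R_KCAL_PER_MOL_K * temp_k * math.log(kd_molar)


def kd_from_delta_g(dg_kcal: float, temp_k: float = 298.15) -> float:
    """Dissociation constant in mol/L from a binding free energy in kcal/mol."""
    return math.exp(dg_kcal / (R_KCAL_PER_MOL_K * temp_k))


def ideal_linked_kd(kd_a: float, kd_b: float, temp_k: float = 298.15) -> float:
    """KD of a perfectly linked pair, assuming the two free energies simply add."""
    total = delta_g(kd_a, temp_k) + delta_g(kd_b, temp_k)
    return kd_from_delta_g(total, temp_k)


if __name__ == "__main__":
    fragment_kd = 100e-6  # 100 uM, as in the example in the text
    print(f"fragment dG: {delta_g(fragment_kd):.2f} kcal/mol")
    linked = ideal_linked_kd(fragment_kd, fragment_kd)
    print(f"linked KD:   {linked * 1e9:.1f} nM")
    print(f"linked dG:   {delta_g(linked):.2f} kcal/mol")
```

Under these assumptions, each 100 μM fragment contributes roughly −5.5 kcal/mol, and the ideally linked compound comes out at 10 nM (about −11 kcal/mol), matching the figures quoted in the text.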
Case Study 1: Matrix Metalloproteinase Inhibitors
In order to illustrate the power of SAR by NMR, two case studies will be described. The first example (and in fact our very first internal application of SAR by NMR) is in the design of potent, orally bioavailable inhibitors of matrix metalloproteinases (28, 29). Matrix metalloproteinases are a family of zinc-dependent endopeptidases that are implicated in a variety of diseases, including arthritis and tumor metastasis. We were initially interested in targeting matrix metalloproteinase-3 (MMP-3, also called stromelysin), and our attempts to identify non-peptide inhibitor leads against this protein using a conventional high-throughput activity screen failed. Thus, early medicinal chemistry efforts were directed at peptidomimetics. In parallel, we performed a fragment screen against MMP-3 and discovered that acetohydroxamate (a zinc-chelating moiety with a KD value of 11 mM for the protein) could bind to the protein simultaneously with a number of biaryl compounds (with KD values in the 20–100 μM range) (Figure 4A). The three-dimensional structure of a ternary complex clearly revealed that these two fragments could be linked. In fact, one of the first linked compounds exhibited an IC50 value of 57 nM and bound to the protein as designed (Figure 4A). This was a breakthrough for the project, and lead optimization then began in earnest in order to improve the potency and oral bioavailability of the series. In the meantime, it was becoming clear that MMP-3 was likely not the relevant biological target, and efforts were refocused to direct potency against MMP-2 and MMP-9 (29, 30). These efforts culminated in ABT-518 (Figure 4), which exhibited excellent oral anti-tumor efficacy in animal trials and was approved for Phase 1 clinical trials. This story illustrates the power of fragment-based approaches to succeed in the de novo development of high-affinity leads, and to augment ongoing lead optimization programs.
Figure 4. SAR by NMR on matrix metalloproteinases and Bcl-2 proteins. In each case, matrix metalloproteinases (A) (28–30) and Bcl-2 proteins (B) (31, 32), the identified fragment leads are shown with cyan carbons, whereas the linked compounds are denoted with green carbon atoms. All structures were experimentally determined by NMR. The measured potencies for the individual fragments and the linked compounds are given in cyan and green spheres, respectively. The chemical structures of the final optimized inhibitors (both of which exhibited sub-nM potency in vitro) are shown to the right.
Case Study 2: Bcl-2 Inhibitors
A second (and our most recent) application of SAR by NMR is in the development of inhibitors against the Bcl-2 family of proteins (31). Bcl-2 family members exhibit both pro- and anti-apoptotic activity, and many cancer cells overexpress the anti-apoptotic family members Bcl-2 and Bcl-xL to evade programmed cell death. Similar to our results with MMP-3, our initial attempts at conventional high-throughput screening against Bcl-xL failed to yield productive leads. However, a fragment-based screen again revealed that small organic molecules could occupy proximal binding sites on the protein surface. This is illustrated in Figure 4B, where a series of biaryl carboxylates (with KD values in the range of 300–1000 μM) could bind to the peptide binding groove in the presence of tetrahydronaphthols and related compounds (with KD values in the range of 2–6 mM). Again, the structure of the ternary complex clearly revealed that a single molecule could be designed that could span both sites, and a molecule was produced that in fact occupied both pockets and exhibited sub-μM potency for the protein. Extensive medicinal chemistry optimization ultimately yielded ABT-737, which also exhibits potent anti-tumor effects in animal models (31). It should be noted that whereas the linking process was quite straightforward for the MMP illustration described above, the tethering process for Bcl-xL was much more difficult, requiring multiple synthetic strategies in order to identify the proper compound (32). Thus, although fragment screening can identify the individual building blocks and their preferred binding orientation in the protein pocket, it can sometimes be a difficult design challenge to link the two moieties while maintaining allowed binding modes for each piece. This sometimes requires creative approaches, and surprises (such as the “bent-back” conformation observed for the Bcl-xL inhibitor, Figure 4B) can and do occur.
The prudent scientist will always allow for some degree of serendipity in the design process, as the unexpected result often provides new opportunities for further research.
Conclusions
Although simple in concept, SAR by NMR (and fragment-based screening in general) requires the careful and reliable identification and structural characterization of weakly binding leads. High-resolution NMR spectroscopy on isotopically labeled proteins is uniquely suited to this role. Since the initial report on SAR by NMR in 1996 (26), NMR-driven linked-fragment strategies have been described for at least eight protein targets (33, 34). In addition, new strategies for identifying and effectively utilizing fragment leads have been described, such as the fragment-elaboration and fragment-merging techniques, where X-ray crystallography is playing a larger role (35–37). These new strategies have resulted in more than forty different applications of fragment-based drug design described in the current literature. All of this is ample evidence of the power of fragment-based drug design to rapidly deliver potent, novel chemical matter that can fuel successful lead optimization programs.
© American Society for Pharmacology and Experimental Therapeutics 2006
Philip J. Hajduk, PhD, did his undergraduate work at the University of Illinois–Urbana, where he received a BS in Chemistry in 1989. He completed his graduate work under the supervision of Laura Lerner at the University of Wisconsin–Madison, where he received his PhD in Chemistry in 1993. He then began work as a post-doctoral fellow at Abbott Laboratories under Stephen Fesik, and was converted to Staff Scientist in 1996. During his post-doctoral research under Fesik, he spearheaded the early development and application of SAR by NMR. In his current role as head of the Protein NMR group at Abbott, PJH continues to apply and expand the utility of structure-based and fragment-based approaches to drug design, along with other applications of NMR spectroscopy to enable and expedite the drug discovery process. PJH has authored or co-authored more than 65 scientific publications and is a co-inventor on seven US patents. E-mail: philip.hajduk{at}abbott.com; fax: (847) 938-2478.