<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art><ui>1471-2105-11-316</ui><ji>1471-2105</ji><fm>
<dochead>Research article</dochead>
<bibl>
<title>
<p>A novel chemogenomics analysis of G protein-coupled receptors (GPCRs) and their ligands: a potential strategy for receptor de-orphanization</p>
</title>
<aug>
<au id="A1" ce="yes"><snm>van der Horst</snm><fnm>Eelke</fnm><insr iid="I1"/><email>e.van.der.horst@lacdr.leidenuniv.nl</email></au>
<au id="A2" ce="yes"><snm>Peironcely</snm><mi>E</mi><fnm>Julio</fnm><insr iid="I1"/><email>peyron@gmail.com</email></au>
<au id="A3"><snm>IJzerman</snm><mi>P</mi><fnm>Adriaan</fnm><insr iid="I1"/><email>ijzerman@lacdr.leidenuniv.nl</email></au>
<au id="A4"><snm>Beukers</snm><mi>W</mi><fnm>Margot</fnm><insr iid="I1"/><email>beukers@lacdr.leidenuniv.nl</email></au>
<au id="A5"><snm>Lane</snm><mi>R</mi><fnm>Jonathan</fnm><insr iid="I1"/><email>jrlane@lacdr.leidenuniv.nl</email></au>
<au id="A6"><snm>van Vlijmen</snm><mi>WT</mi><fnm>Herman</fnm><insr iid="I1"/><email>hvvlijme@its.jnj.com</email></au>
<au id="A7"><snm>Emmerich</snm><mi>TM</mi><fnm>Michael</fnm><insr iid="I2"/><email>emmerich@liacs.nl</email></au>
<au id="A8"><snm>Okuno</snm><fnm>Yasushi</fnm><insr iid="I3"/><email>okuno@pharm.kyoto-u.ac.jp</email></au>
<au ca="yes" id="A9"><snm>Bender</snm><fnm>Andreas</fnm><insr iid="I1"/><insr iid="I4"/><email>bendera@lacdr.leidenuniv.nl</email></au>
</aug>
<insg>
<ins id="I1"><p>Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden University, Einsteinweg 55, 2333CC, The Netherlands</p></ins>
<ins id="I2"><p>Leiden Institute for Advanced Computer Science, University of Leiden, The Netherlands</p></ins>
<ins id="I3"><p>Department of PharmacoInformatics, Center for Integrative Education of Pharmacy Frontier, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan</p></ins>
<ins id="I4"><p>Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK</p></ins>
</insg>
<source>BMC Bioinformatics</source>
<issn>1471-2105</issn>
<pubdate>2010</pubdate>
<volume>11</volume>
<issue>1</issue>
<fpage>316</fpage>
<url>http://www.biomedcentral.com/1471-2105/11/316</url>
<xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-11-316</pubid><pubid idtype="pmpid">20537162</pubid></pubidlist></xrefbib>
</bibl>
<history><rec><date><day>4</day><month>3</month><year>2010</year></date></rec><acc><date><day>10</day><month>6</month><year>2010</year></date></acc><pub><date><day>10</day><month>6</month><year>2010</year></date></pub></history>
<cpyrt><year>2010</year><collab>van der Horst et al; licensee BioMed Central Ltd.</collab><note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec>
<st>
<p>Abstract</p>
</st>
<sec>
<st>
<p>Background</p>
</st>
<p>G protein-coupled receptors (GPCRs) represent a family of well-characterized drug targets with significant therapeutic value. Phylogenetic classifications may help to understand the characteristics of individual GPCRs and their subtypes. Previous phylogenetic classifications were all based on the sequences of receptors, adding only minor information about the ligand binding properties of the receptors. In this work, we compare a sequence-based classification of receptors to a ligand-based classification of the same group of receptors, and evaluate the potential to use sequence relatedness as a predictor for ligand interactions thus aiding the quest for ligands of orphan receptors.</p>
</sec>
<sec>
<st>
<p>Results</p>
</st>
<p>We present a classification of GPCRs that is purely based on their ligands, complementing sequence-based phylogenetic classifications of these receptors. Targets were hierarchically classified into phylogenetic trees, for both sequence space and ligand (substructure) space. The overall organization of the sequence-based tree and substructure-based tree was similar; in particular, the adenosine receptors cluster together as well as most peptide receptor subtypes (<it>e.g</it>. opioid, somatostatin) and adrenoceptor subtypes. In ligand space, the prostanoid and cannabinoid receptors are more distant from the other targets, whereas the tachykinin receptors, the oxytocin receptor, and serotonin receptors are closer to the other targets, which is indicative of ligand promiscuity. In 93% of the receptors studied, de-orphanization of a simulated orphan receptor using the ligands of related receptors performed better than random (AUC &gt; 0.5) and for 35% of receptors de-orphanization performance was good (AUC &gt; 0.7).</p>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>We constructed a phylogenetic classification of GPCRs that is solely based on the ligands of these receptors. The similarities and differences with traditional sequence-based classifications were investigated: our ligand-based classification uncovers relationships among GPCRs that are not apparent from the sequence-based classification. This will shed light on potential cross-reactivity of GPCR ligands and will aid the design of new ligands with the desired activity profiles. In addition, we linked the ligand-based classification with a ligand-focused sequence-based classification described in the literature and demonstrated the potential of this method for de-orphanization of GPCRs.</p>
</sec>
</sec>
</abs>
</fm><bdy>
<sec>
<st>
<p>Background</p>
</st>
<p>G protein-coupled receptors (GPCRs) comprise a large family, more than 800 in human <abbrgrp>
<abbr bid="B1">1</abbr>
</abbrgrp>, of cell surface receptors that consist of seven transmembrane (TM) helices. These receptors are activated by a variety of external stimuli, including light, ions, small molecules, lipids, and proteins; moreover, the majority of therapeutic drugs act on GPCRs <abbrgrp>
<abbr bid="B2">2</abbr>
</abbrgrp>. Because of the limited number of target crystal structures <abbrgrp>
<abbr bid="B3">3</abbr>
<abbr bid="B4">4</abbr>
<abbr bid="B5">5</abbr>
<abbr bid="B6">6</abbr>
</abbrgrp>, GPCR drug design relies largely on ligand-based approaches <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp> such as property-based methods <abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp>, pharmacophore models <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp>, and substructure methods <abbrgrp>
<abbr bid="B10">10</abbr>
</abbrgrp>. These methods do not require any knowledge about the target protein; however, combining them with target information often increases their potential. The resulting so-called 'chemogenomics' approaches thus involve both ligand-based and target-based aspects <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp>. They do not focus on a single group of ligands and one individual target, but rather on groups of ligands against groups of targets. The central idea is that similar targets have similar ligands <abbrgrp>
<abbr bid="B12">12</abbr>
<abbr bid="B13">13</abbr>
</abbrgrp>. Therefore, relationships between targets from the sequence side can be exploited to search for novel receptor ligands on the chemical structure side.</p>
<p>Traditionally, the GPCR superfamily has been classified based on sequence homology of the receptors. Kolakowski grouped all seven transmembrane (7-TM) proteins into classes A to F for receptors proven to bind G-proteins and class O for the other 7-TM proteins <abbrgrp>
<abbr bid="B14">14</abbr>
</abbrgrp>. Class A receptors resemble rhodopsin and form the largest cluster. Later, Fredriksson <it>et al. </it>proposed a more elaborate classification for known and predicted human GPCRs <abbrgrp>
<abbr bid="B1">1</abbr>
</abbrgrp>. Surgand <it>et al. </it>presented a sequence-based phylogenetic classification of GPCRs viewed from a ligand perspective <abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>. By selecting residues pointing inwards into the generic binding pocket of GPCRs, the authors assembled a set of 30 residues most likely to be accessible for ligand binding. Based on these residues, phylogenetic clustering was performed. Although only a subset of residues was used, the classification was similar to classifications based on the full sequence. Applications of a grouping such as that proposed by Surgand <it>et al. </it>include ligand design for related receptors, as well as de-orphanization of GPCRs <abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>. However, the study by Surgand <it>et al. </it>is somewhat limited by the scarcity of structural protein data: the identification of binding site residues was based solely on the structure of bovine rhodopsin. It could not yet take into account recent advances that yielded three pharmacologically relevant X-ray crystal structures, namely those of the human &#946;<sub>2 </sub>and turkey &#946;<sub>1 </sub>adrenoceptors, as well as of the human adenosine A<sub>2A </sub>receptor <abbrgrp>
<abbr bid="B3">3</abbr>
<abbr bid="B5">5</abbr>
<abbr bid="B6">6</abbr>
<abbr bid="B16">16</abbr>
</abbrgrp>. Building further on Surgand's work, Gloriam <it>et al. </it>proposed an extended set of ligand-accessible residues, derived from visual inspection of the newly available X-ray GPCR crystal structures, from supporting mutagenesis data and from the evaluation of previously established residue sets <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>. The resulting set of 44 residues was then applied to cluster class A GPCRs into a phylogenetic tree, which reflected similarities in binding site of the receptors.</p>
<p>Complementary to these sequence-based classifications are the ligand-based classifications of GPCRs. Approaches that use ligand similarity measures for target classification have been previously described <abbrgrp>
<abbr bid="B18">18</abbr>
<abbr bid="B19">19</abbr>
</abbrgrp>. Keiser <it>et al. </it>related targets by pair-wise comparison of their ligands <abbrgrp>
<abbr bid="B20">20</abbr>
</abbrgrp>. From a set of 65,000 ligands, a network was constructed connecting almost all 246 targets through sequential linkage. From this, previously unknown antagonism of methadone on the muscarinic M<sub>3 </sub>receptor and of emetine on the &#945;<sub>2</sub>-adrenoceptor was identified.</p>
<p>While sequence-based similarity relies on comparison of the residues at certain positions in the sequence, there is no unambiguously defined method to measure ligand-based similarity. One way of defining ligand similarity is to consider the overlap of substructures in the molecules. Frequent substructure mining is a method for finding the most common substructures in a set of molecules <abbrgrp>
<abbr bid="B21">21</abbr>
<abbr bid="B22">22</abbr>
<abbr bid="B23">23</abbr>
</abbrgrp>. It evaluates all possible substructures, not only discrete fragments that are present in the molecules; it is therefore an exhaustive approach, resulting in a more complete view of the structural features in the set.</p>
<p>In this study, we employ frequent substructure mining to determine the similarity between groups of ligands in a thorough and unbiased manner. This substructural similarity is then used for classification of GPCRs according to relatedness of substructure profiles of their ligands. The substructure-based classification of GPCRs visualizes relatedness of receptors in the form of a phylogenetic tree, which is then compared to the sequence-based phylogenetic classifications of GPCRs. The differences in tree organization are examined with methods that visualize changes in target position. Taken together, we present a (GPCR) classification from the small molecule (ligand) perspective, which facilitates analysis of target similarities and differences in ligand-binding behavior. In addition, we explore the potential of our ligand-based classification in receptor de-orphanization, <it>i.e</it>. the prediction of new ligands for orphan receptors.</p>
</sec>
<sec>
<st>
<p>Results and Discussion</p>
</st>
<sec>
<st>
<p>Sequence-based classification</p>
</st>
<p>Three types of sequence-based phylogenetic trees were built, namely: one tree that was based on the full 7-TM sequence, one tree employing 30 residues described by Surgand <it>et al. </it>
<abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>, and one tree which was based on the set of 44 residues described by Gloriam <it>et al. </it>
<abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>. Note that the three sequence-based trees presented here are different from those published in the referenced original work <abbrgrp>
<abbr bid="B1">1</abbr>
<abbr bid="B15">15</abbr>
<abbr bid="B17">17</abbr>
</abbrgrp>, since in the current study orphan receptors, receptors with a low number of ligands, and singleton receptors were left out. Singleton receptors are receptors that are the only (available) member in their respective subfamily. Due to the chemogenomic nature of this study, we focus on the phylogenetic tree based on the set of Gloriam <it>et al. </it>since it represents the ligand perspective best; this set is referenced as the GSK set <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>. The two other trees are provided for reference purposes in Additional file <supplr sid="S1">1</supplr> - Phylogenetic trees based on 7TM domain and selected residues. The tree that was built based on the multiple sequence alignment of the GSK set is shown in Figure <figr fid="F1">1</figr>. The GPCR subtypes in this tree are grouped into branches according to subfamily and target, because the tree resembles the sequence-based phylogenetic tree on which GPCR classification is based <abbrgrp>
<abbr bid="B1">1</abbr>
</abbrgrp>. For instance, the opioid receptor subtypes &#948;, &#954;, &#956;, and NOP cluster together, as well as the &#945;- and &#946;-adrenoceptor subtypes. The fact that clustering follows the receptor classification is expected since the classification of GPCRs was based on sequence similarity <abbrgrp>
<abbr bid="B24">24</abbr>
<abbr bid="B25">25</abbr>
</abbrgrp>. Four clusters are clearly defined in the tree: the aminergic receptors, the adenosine receptors, the prostanoid receptors, and the peptide-binding receptors.</p>
<suppl id="S1">
<title>
<p>Additional file 1</p>
</title>
<text>
<p>
<b>Phylogenetic trees based on 7TM domain and selected residues</b>. Two sequence-based phylogenetic trees for the set of Class A GPCRs used in this study: the phylogenetic tree based on the multiple sequence alignment of the 7TM domain and the phylogenetic tree based on 30 selected residues described in Surgand <it>et al. </it>
<abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>. Subfamilies are color-coded according to ligand type whereby the broad ligand types applied by Gloriam <it>et al. </it>
<abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp> were used. Legend: red - receptor with aminergic ligands; pink - peptide ligands; green - lipid ligands; dark blue - purinergic P2Y ligands; light blue - adenosine ligands; brown - melatonin ligands.</p>
</text>
<file name="1471-2105-11-316-S1.PDF">
   <p>Click here for file</p>
</file>
</suppl>
<fig id="F1"><title><p>Figure 1</p></title><caption><p>Phylogenetic tree of human Class A GPCRs based on sequence information (44 residues of the GSK set)</p></caption><text>
   <p><b>Phylogenetic tree of human Class A GPCRs based on sequence information (44 residues of the GSK set)</b>. Human Class A GPCRs are clustered based on the 44 ligand-binding residues as defined in the GSK set. Subfamilies are color-coded according to ligand type whereby the broad ligand types applied by Gloriam <it>et al. </it><abbrgrp><abbr bid="B17">17</abbr></abbrgrp> were used. red - receptor with aminergic ligands; pink - peptide ligands; green - lipid ligands; dark blue - purinergic P2Y ligands; light blue - adenosine ligands; brown - melatonin ligands.</p>
</text><graphic file="1471-2105-11-316-1" hint_layout="double"/></fig>
</sec>
<sec>
<st>
<p>Ligand-based classification</p>
</st>
<p>The ligand-based receptor classification, which we will compare to the sequence-based classification, is provided in Figure <figr fid="F2">2</figr>. Subfamilies in this tree are more scattered; however, most subfamilies cluster together. For instance, except for the two purinergic receptors (P2Y<sub>1 </sub>and P2Y<sub>12</sub>) and the two glycoprotein hormone receptors (FSH and LH), all other receptors represented by only two subtypes, such as the melatonin or the leukotriene B<sub>4 </sub>receptors, are clustered together. The adenosine receptors A<sub>1 </sub>(ADORA1), A<sub>2A </sub>(ADORA2A), A<sub>2B </sub>(ADORA2B), and A<sub>3 </sub>(ADORA3) group together, indicating overlap in ligand profiles. This may imply that ligands for these receptor subtypes are non-selective, such as the adenosine receptor antagonists caffeine and theophylline. Additionally, receptor selectivity may vary with relatively small changes in ligand structure: an 8-cycloalkyl substituent on theophylline confers A<sub>1 </sub>receptor selectivity, whereas a phenylstyryl substituent on the same position in caffeine renders these compounds selective for the A<sub>2A </sub>receptor. The purinergic receptor P2Y<sub>12 </sub>is found near the adenosine receptors owing to the purine core typical for ligands of both these subfamilies. In agreement with the ligand selectivity reported for the &#945;<sub>1</sub>-, &#945;<sub>2</sub>-, and &#946;-adrenoceptor subfamilies, these receptors form three distinct clusters <abbrgrp>
<abbr bid="B26">26</abbr>
</abbrgrp>; furthermore, the &#945;<sub>1B </sub>and &#945;<sub>1D </sub>receptors are the closest in the distance matrix. The muscarinic acetylcholine receptors M<sub>1</sub>, M<sub>3</sub>, M<sub>4</sub>, and M<sub>5 </sub>(CHRM1/3/4/5, in Figure <figr fid="F2">2</figr>) cluster together as one group, supporting the low subtype selectivity of muscarinic antagonists <abbrgrp>
<abbr bid="B27">27</abbr>
</abbrgrp>. However, the acetylcholine receptor M<sub>2 </sub>is found more distant from this cluster. This indicates the presence of distinct chemical classes in the ligand set of the M<sub>2 </sub>receptor, which may be the result of inclusion of allosteric ligands. For instance, gallamine is an allosteric modulator of the muscarinic M<sub>2 </sub>receptor <abbrgrp>
<abbr bid="B28">28</abbr>
</abbrgrp> that is also present in the GLIDA database <abbrgrp>
<abbr bid="B29">29</abbr>
</abbrgrp>, classified as an M<sub>2 </sub>antagonist. In general, the remaining aminergic receptors (serotonergic, dopaminergic, histaminergic and cholinergic) are more scattered throughout the substructure tree. This means that targets share ligands or ligand substructures among subfamilies/subtypes, which is in line with the high level of polypharmacology observed for these aminergic GPCRs <abbrgrp>
<abbr bid="B30">30</abbr>
</abbrgrp>. For instance, the serotonin receptor 5-HT<sub>1A </sub>clusters together with the D<sub>2 </sub>dopamine receptor, which fits with reports on antipsychotic compounds combining dopamine D<sub>2 </sub>receptor antagonism and serotonin 5-HT<sub>1A </sub>receptor agonism <abbrgrp>
<abbr bid="B31">31</abbr>
<abbr bid="B32">32</abbr>
</abbrgrp>. Structurally similar ligands may act on diverse targets, for instance, when ligands have a GPCR-privileged structure at their core <abbrgrp>
<abbr bid="B33">33</abbr>
<abbr bid="B34">34</abbr>
</abbrgrp>. The grouping of the eight prostanoid receptors (Figure <figr fid="F2">2</figr>) indicates similarity in substructure profiles of the ligands. This is based on the fact that most prostanoid receptor ligands are direct derivatives of the endogenous ligands <abbrgrp>
<abbr bid="B35">35</abbr>
<abbr bid="B36">36</abbr>
</abbrgrp>, the so-called eicosanoids. These ligands are highly similar, all consisting of large aliphatic, lipophilic alkyl chains. The presence of the leukotriene and cannabinoid receptors in this lipid cluster may seem strange at first. Leukotrienes are however also eicosanoids, which clarifies the position of the leukotriene B<sub>4 </sub>and cysteinyl-leukotriene receptors in this cluster <abbrgrp>
<abbr bid="B37">37</abbr>
<abbr bid="B38">38</abbr>
</abbrgrp>. In addition, arachidonic acid is the common precursor of the eicosanoids, and two of its derivatives, anandamide and 2-arachidonylglycerol, are endogenous ligands ('endocannabinoids') of the cannabinoid receptors.</p>
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Phylogenetic tree of human Class A GPCRs based on ligand information (frequent substructure mining)</p></caption><text>
   <p><b>Phylogenetic tree of human Class A GPCRs based on ligand information (frequent substructure mining)</b>. Human Class A GPCRs are clustered based on the frequent substructure analysis. Subfamilies are color-coded according to ligand type whereby the broad ligand types applied by Gloriam <it>et al. </it><abbrgrp><abbr bid="B17">17</abbr></abbrgrp> were used. red - receptor with aminergic ligands; pink - peptide ligands; green - lipid ligands; dark blue - purinergic P2Y ligands; light blue - adenosine ligands; brown - melatonin ligands.</p>
</text><graphic file="1471-2105-11-316-2" hint_layout="double"/></fig>
<p>The relationship between target clustering in the substructure tree (Figure <figr fid="F2">2</figr>) and ligand promiscuity suggests that the substructure tree may be used to identify possible side effects on receptors that are close neighbors in this tree. For instance, off-target activity of ligands can be identified. If inspection reveals a ligand to bind to receptor(s) that are phylogenetically related to the target of interest, a more detailed experimental follow-up with respect to receptor selectivity would be worthwhile.</p>
</sec>
<sec>
<st>
<p>Tree comparison</p>
</st>
<p>Visual comparison of the sequence tree (Figure <figr fid="F1">1</figr>) with the substructure tree (Figure <figr fid="F2">2</figr>) reveals that the overall phylogenetic organization is similar. For instance, with the exclusion of the glycoprotein, P2Y, angiotensin, and bradykinin receptors, all other receptors represented by two subtypes occur in pairs in both the ligand tree and the sequence tree. This is also true for receptors with three subtypes present in the dataset, <it>e.g</it>. the three members each of the &#945;<sub>1</sub>-, &#945;<sub>2</sub>-, and &#946;-adrenoceptor subfamilies, as well as the bombesin receptors. Exceptions to this rule are the neuropeptide Y and vasopressin receptors. In addition, the prostanoid receptors largely group together in both trees, as do most of the aminergic receptors.</p>
<p>The clear distinction between the two dopamine receptor types, <it>i.e</it>. D<sub>1 </sub>and D<sub>5 </sub>(D<sub>1</sub>-like) versus D<sub>2</sub>, D<sub>3</sub>, and D<sub>4 </sub>(D<sub>2</sub>-like), exists both in the sequence-based classification and ligand-based classification. This is in agreement with a previous study <abbrgrp>
<abbr bid="B39">39</abbr>
</abbrgrp> and also known from drugs on the market such as the benzazepines that favor D<sub>1</sub>-like over D<sub>2</sub>-like dopamine receptors. Similarly, antipsychotics such as chlorpromazine have a higher affinity for the D<sub>2</sub>-like subtypes than D<sub>1</sub>-like receptors <abbrgrp>
<abbr bid="B40">40</abbr>
</abbrgrp>.</p>
<p>The fact that many clusters arise in both trees indicates that the receptors in these clusters have similar sequences and similar ligands, that is, ligands with substantially overlapping substructure sets. However, there are also receptor targets for which this is clearly not the case. The (qualitative) similarities and differences between the sequence and substructure trees are discussed below. A delta-delta plot was constructed to compare how pairs of receptors change. This plot, provided in Figure <figr fid="F3">3</figr> (and described in detail in the Materials and Methods section), visualizes how receptor distances deviate between the sequence-based tree and the ligand-based classification of receptors. In sequence space, receptor distances indicate the (dis)similarity between protein sequences, while in ligand space, receptor distances reflect the overlap in structural features found in ligands for these receptors. For each receptor, the mean distance to all other receptors is plotted. From the delta-delta plot, it becomes apparent that the prostanoid receptors and P2Y<sub>1 </sub>receptor are on average the most distant receptors from the rest of the classes. The distances of the purine P2Y<sub>1 </sub>receptor, the prostanoid FP receptor, and leukotriene receptor CysLT<sub>2 </sub>towards the other classes are all larger in substructure space than in sequence space, implying that overall their ligands show little resemblance to ligands of the other GPCRs. In contrast, for most aminergic receptors, <it>e.g</it>. for the &#945;<sub>2B</sub>-adrenoceptor and the 5-HT<sub>2B </sub>serotonin receptor in Figure <figr fid="F3">3</figr>, distances are smaller in substructure space compared to sequence space. This, again, corresponds with the high polypharmacology found for aminergic ligands, such as for most atypical antipsychotics <abbrgrp>
<abbr bid="B41">41</abbr>
</abbrgrp>, with clozapine as a prominent example <abbrgrp>
<abbr bid="B42">42</abbr>
</abbrgrp>. With the exception of a few targets (FSH, LH), the distribution of targets in the delta-delta plot is more scattered along the x-axis (substructure space) than the y-axis (sequence space). This may be a reflection of the evolutionary relationship between sequences, which results in coverage of a small region of the overall sequence space. The ligands for these targets do not have such a direct relationship and thus cover a broader range in overall substructure space.</p>
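<p>The coordinates used in such a delta-delta plot can be illustrated with a minimal sketch. It assumes two symmetric receptor-by-receptor distance matrices and takes, for each receptor, the mean distance to all other receptors in each space. The function names, the receptor labels R1 to R3, and the toy distances are hypothetical; the actual procedure, including any normalization, is the one given in the Materials and Methods section.</p>

```python
import numpy as np


def mean_distance_to_others(dist):
    """Mean distance from each receptor to all other receptors."""
    dist = np.asarray(dist, dtype=float)
    if dist.ndim != 2 or dist.shape[0] != dist.shape[1] or 2 > dist.shape[0]:
        raise ValueError("expected a square matrix for at least two receptors")
    n = dist.shape[0]
    # Leave the self-distance on the diagonal out of the average.
    return (dist.sum(axis=1) - np.diag(dist)) / (n - 1)


def delta_delta_points(seq_dist, sub_dist, names):
    """Return (name, x, y, side) per receptor.

    x is the mean distance in substructure space, y the mean distance in
    sequence space. Points above the diagonal (y > x) are more distant in
    sequence space; points below it are more distant in substructure space.
    """
    xs = mean_distance_to_others(sub_dist)
    ys = mean_distance_to_others(seq_dist)
    points = []
    for name, x, y in zip(names, xs, ys):
        if y > x:
            side = "sequence"
        elif x > y:
            side = "substructure"
        else:
            side = "diagonal"
        points.append((name, float(x), float(y), side))
    return points


if __name__ == "__main__":
    # Toy distances for three hypothetical receptors (not data from this study).
    names = ["R1", "R2", "R3"]
    seq = [[0.0, 0.25, 0.5], [0.25, 0.0, 0.75], [0.5, 0.75, 0.0]]
    sub = [[0.0, 0.125, 0.875], [0.125, 0.0, 0.25], [0.875, 0.25, 0.0]]
    for point in delta_delta_points(seq, sub, names):
        print(point)
```

<p>In this toy input, R1 falls below the diagonal (more distant in substructure space), whereas R2 and R3 fall above it (more distant in sequence space).</p>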
<fig id="F3"><title><p>Figure 3</p></title><caption><p>Delta-delta plot visualization of receptor distances in sequence and substructure space</p></caption><text>
   <p><b>Delta-delta plot visualization of receptor distances in sequence and substructure space</b>. The delta-delta plot visualizes how target distances differ between sequence-based classification (GSK set, y-axis) and substructure-based classification (x-axis). The average distance towards the other targets is plotted for sequence and substructure space. A few targets are highlighted in the plot to serve as examples. These are marked by a black dot and a label that denotes the gene symbol. Targets that are, on average, more distant from the rest are plotted further away from the origin; targets plotted above the diagonal are more distant in sequence space, while targets plotted below the diagonal are more distant in substructure space. For example, the FSH receptor (FSHR) is positioned relatively far from the origin and above the diagonal. This indicates that this receptor is, in general, more distant from the other receptors, most prominently in sequence space.</p>
</text><graphic file="1471-2105-11-316-3" hint_layout="double"/></fig>
<p>The difference between ligand-based and target-based classifications may be due to convergent evolution <abbrgrp>
<abbr bid="B43">43</abbr>
</abbrgrp>. Functional convergence denotes how proteins that differ in sequence may fulfill the same protein function. The protein sequence of GPCR subtypes will be similar in parts that are involved in the endogenous ligand recognition but may be different in other parts, for instance those parts that play a role in recognition of other, exogenous, ligands (<it>e.g</it>. synthetic drugs). These may therefore have a different selectivity profile compared to the endogenous ligand.</p>
</sec>
<sec>
<st>
<p>Validation</p>
</st>
<p>To validate how well our method performed as a chemogenomics method, <it>i.e</it>. how well it connects sequence space with small molecule space and how applicable the relationship is in practice, we conducted a 'virtual de-orphanization exercise'. For each receptor in the dataset, we pretended not to know any of its ligands by excluding them from the datasets (we 'orphanized' the receptor in this particular run of the protocol). We next predicted its ligands by considering a model derived from the closest neighbors of the receptor in sequence space (we attempted to 'de-orphanize' the receptor whose ligands we omitted from the study in the previous step). For this calculation, the distance matrix for the GSK residue set was used. The cumulative number of correctly identified ligands of every receptor is plotted against the number of closest neighbors (sequences) included to find these ligands. The (relative) area under the curve (AUC) and shape of the curve are measures of the performance of our method. In 93% of the studied receptors, de-orphanization of the pretended orphan receptor using the ligands of related receptors performed better than random (AUC &gt; 0.5) and for 35% of receptors de-orphanization performance was good (AUC &gt; 0.7). All AUC plots could be divided into four categories according to curve shape and AUC (the complete set of plotted scores is available as additional material in Additional file <supplr sid="S2">2</supplr> - Plotted scores for the <it>leave-one-out </it>validation). Typical examples of the four categories are given in Figure <figr fid="F4">4</figr>. The first category is most abundant and consists of curves with a convex shape and an AUC above 0.5, marking good performance. An example of this category is the muscarinic acetylcholine receptor M<sub>1 </sub>(CHRM1 in Figure <figr fid="F4">4</figr>) with an AUC of 0.7990. 
Curves of the second category display a gradual rise that is approximately equal to the diagonal of the plot. These plots have an AUC near 0.5, indicating performance that is equal to random prediction. An example is the plot of the angiotensin receptor AT<sub>1 </sub>(AGTR1 in Figure <figr fid="F4">4</figr>) with an AUC of 0.5120. Curves of the third category perform worse than random and are characterized by a concave shape and an AUC below 0.5. Clearly the worst example is the P2Y<sub>1 </sub>purinoceptor with an AUC value of 0.0857 (P2RY1 in Figure <figr fid="F4">4</figr>). In contrast to the first three categories, curves of the fourth category do not have a clear AUC range. This category consists of curves that are divided into several discrete parts of alternating rises and plateaus, as shown in the plot of bombesin receptor BB<sub>3 </sub>(BRS3 in Figure <figr fid="F4">4</figr>), with an AUC of 0.8145. Performance varies from good (BRS3) to worse than random, depending on the value of the AUC. An example of such a plot with an AUC value below 0.5 is the FSH receptor (not shown, see: Additional file <supplr sid="S2">2</supplr> - Plotted scores for the <it>leave-one-out </it>validation) with an AUC of 0.4428. The steep rises are caused by a few receptors identifying the majority of ligands. Some of these curves rise steeply at the start, which suggests that part of the receptor's ligand set could be readily identified even though this is not reflected in the AUC. The poor performance concerning the P2Y<sub>1 </sub>receptor is probably due to the nature of its ligands: this set consists of a small number of highly similar ligands that all possess a phosphate group, a feature not found in other ligands in the database. The number of features (substructures) shared between ligands of this receptor and ligands of other receptors is therefore small. 
Interestingly, the adenosine A<sub>1 </sub>and A<sub>3 </sub>receptors, which are also purinergic, identify most (28 out of 42) of the P2Y<sub>1 </sub>ligands. However, in sequence space these receptors are at great distance (at positions 91 and 92, respectively).</p>
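<p>The cumulative retrieval curve and its relative AUC can be sketched as follows. This is a simplified illustration under stated assumptions, not the protocol used in this study: a ligand of the simulated orphan counts as identified as soon as it also occurs in the ligand set of an included neighbor (standing in for the model derived from that neighbor), the curve is normalized by the number of known ligands of the orphan, and the AUC is a trapezoidal area over a curve that starts at the origin. All function names, receptor labels, and ligand labels are hypothetical.</p>

```python
def retrieval_curve(orphan, seq_dist, ligands):
    """Cumulative fraction of the orphan's ligands found per included neighbor.

    seq_dist: dict mapping each receptor to a dict of sequence distances.
    ligands:  dict mapping each receptor to its set of ligand identifiers.
    Neighbors are added in order of increasing sequence distance.
    """
    known = set(ligands[orphan])
    if not known:
        raise ValueError("the simulated orphan needs at least one known ligand")
    neighbors = sorted(
        (r for r in seq_dist[orphan] if r != orphan),
        key=lambda r: seq_dist[orphan][r],
    )
    found = set()
    curve = []
    for receptor in neighbors:
        # Assumption: sharing a ligand with the neighbor counts as identification.
        found.update(known.intersection(ligands[receptor]))
        curve.append(len(found) / len(known))
    return neighbors, curve


def relative_auc(curve):
    """Trapezoidal area under the curve, with both axes scaled to [0, 1]."""
    n = len(curve)
    if n == 0:
        raise ValueError("the curve needs at least one point")
    ys = [0.0] + list(curve)  # the curve starts at the origin
    return sum((ys[i] + ys[i + 1]) / 2 for i in range(n)) / n


if __name__ == "__main__":
    # Toy data for hypothetical receptors R0 to R3 (not data from this study).
    seq_dist = {"R0": {"R1": 0.125, "R2": 0.5, "R3": 0.75}}
    ligands = {
        "R0": {"a", "b", "c", "d"},
        "R1": {"a", "b", "x"},
        "R2": {"b", "c"},
        "R3": {"y"},
    }
    order, curve = retrieval_curve("R0", seq_dist, ligands)
    print(order, curve, relative_auc(curve))
```

<p>With these definitions, a curve that follows the diagonal yields a relative AUC of 0.5, matching the random-prediction reference used in the plots.</p>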
<suppl id="S2">
<title>
<p>Additional file 2</p>
</title>
<text>
<p>
<b>Plotted scores for the <it>leave-one-out </it>validation</b>. The complete set of plotted scores of identified ligands per number of closest neighbors (sequences). For each plot, receptors are ordered along the x-axis (labeled "Number of included receptors") in order of increasing distance in sequence space to the receptor under study. The y-axis (labeled "Ligands identified") indicates the cumulative number of retrieved ligands, normalized linearly to the interval [0;1]. The red curve indicates the number of active ligands that are retrieved when including all (closest) receptors that are listed along the x-axis up to that point. More specifically, the number of correctly predicted ligands is plotted against the number of closely related receptors on which the prediction was based. For example, the plot of the muscarinic acetylcholine receptor M<sub>1 </sub>(CHRM1, third row, third plot from the left) displays a steeply rising curve near the origin, indicating that many of its ligands are retrieved using a small number of closest receptors. The blue diagonal illustrates recovery of ligands when performance is equal to random prediction. The relative area under the curve (AUC) of the red curve is stated at the bottom of each plot. An AUC above 0.5 indicates good performance, while poor performance is indicated by an AUC of 0.5 or below. The plots are sorted according to decreasing (relative) AUC.</p>
</text>
<file name="1471-2105-11-316-S2.PDF">
   <p>Click here for file</p>
</file>
</suppl>
<fig id="F4"><title><p>Figure 4</p></title><caption><p>Examples of plotted scores for the <it>leave-one-out </it>validation</p></caption><text>
   <p><b>Examples of plotted scores for the <it>leave-one-out </it>validation</b>. Example plots expressing the performance of the simulated receptor de-orphanization. Performance plots for the following receptors are provided (from left to right and from top to bottom): CHRM1 - muscarinic acetylcholine receptor M<sub>1 </sub>(first category); AGTR1 - angiotensin receptor AT<sub>1 </sub>(second category); P2RY1 - P2Y<sub>1 </sub>purinoceptor (third category); BRS3 - bombesin receptor BB<sub>3 </sub>(fourth category). These examples are discussed in the text. The full set of plotted scores is provided in Additional file <supplr sid="S2">2</supplr> - Plotted scores for the <it>leave-one-out </it>validation. For each plot, receptors are ordered along the x-axis (labeled "Number of included receptors") in order of increasing distance in sequence space to the receptor under study. On the y-axis (labeled "Ligands identified"), the cumulative number of retrieved ligands is depicted, normalized linearly to the interval [0;1]. The red curve indicates the number of active ligands that are retrieved when including all (closest) receptors that are listed along the x-axis up to that point. For example, the plot of the muscarinic acetylcholine receptor M<sub>1 </sub>(CHRM1) displays a steeply rising curve near the origin, indicating that many of its ligands are retrieved using a small number of closest receptors. The blue diagonal illustrates recovery of ligands when performance is equal to random prediction. The relative area under the curve (AUC) of the red curve is stated at the bottom of each plot. An AUC above 0.5 indicates good performance, while poor performance is indicated by an AUC of 0.5 or below.</p>
</text><graphic file="1471-2105-11-316-4" hint_layout="double"/></fig>
<p>Overall, our method proves useful for receptor de-orphanization, since for 93% of receptors studied de-orphanization performed better than random selection (AUC &gt; 0.5) and for 35% of receptors de-orphanization performed well (AUC &gt; 0.7).</p>
</sec>
<sec>
<st>
<p>Limitations of the work</p>
</st>
<p>In the present study, some targets were excluded due to insufficient availability of ligand data in the source databases. The absence of a receptor may influence the order of other receptors in the trees. Scarcity of ligand data is reflected in the substructure profiles, thereby influencing the correlations among receptors. The issue of data (in)completeness and its effect on interaction networks was recently discussed by Mestres <it>et al. </it>
<abbrgrp>
<abbr bid="B44">44</abbr>
</abbrgrp>. Using three datasets of increasing complexity (more connections) that linked ligands to targets based on full chemical identity, the authors showed that an increase in the number of connections rapidly leads to shifts in connection patterns. However, our study linked targets based on overlap in substructures; as a consequence sharing of substructures rather than of ligands is sufficient for targets to be identified as related. Bender <it>et al. </it>and Keiser <it>et al. </it>already showed that overlapping ligands are not necessary to predict whether targets are close in ligand space <abbrgrp>
<abbr bid="B19">19</abbr>
<abbr bid="B20">20</abbr>
</abbrgrp>. In addition, our method employs an exhaustive approach to analyze the structural features of ligands. Frequent substructure mining considers all possible substructures that occur in the ligands and is therefore unbiased, <it>i.e</it>. all possible substructures were evaluated, not only those intuitive to chemists, such as functional groups, ring systems (e.g. a phenyl ring), and linkers <abbrgrp>
<abbr bid="B45">45</abbr>
</abbrgrp>. As a consequence, less 'obvious' substructures such as ethyl or isobutyl were also considered in the present study <abbrgrp>
<abbr bid="B21">21</abbr>
</abbrgrp>. For a complete discussion on substructure generation and evaluation, see ref. <abbrgrp>
<abbr bid="B46">46</abbr>
</abbrgrp>. Our method is not limited to GPCRs alone; it is easily extended to other protein families for analysis of the differences between subfamily phylogenies, given that sufficient ligand information is available. For instance, it can be applied to the realm of enzymes to complement other chemogenomics analyses <abbrgrp>
<abbr bid="B47">47</abbr>
</abbrgrp>.</p>
</sec>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>In this work, we presented a ligand-based phylogenetic classification that complements the well-established sequence-based classification of proteins, and applied it to the classification of GPCRs. This alternative view may contribute to our understanding of GPCR classification since it reveals relationships that go unnoticed with conventional phylogeny. Targets were analyzed based on the substructure profiles of their ligands using an unbiased approach. The overall organization of the sequence tree and the substructure tree was similar; however, substantial differences were also discovered. In the substructure tree, several clusters of subtypes were identified. For instance, it was found that the adenosine receptors group together, and that certain GPCR subfamilies that do not share sequence homology cluster because of ligand similarity. Thus, receptor similarities that signal potential off-target effects, such as for the serotonergic receptors, are readily identified. In addition, combined with sequence-based classification, the ligand-based classification presented has proven potential (93% of receptors with AUC &gt; 0.5 and 35% with AUC &gt; 0.7) for de-orphanization of receptors.</p>
</sec>
<sec>
<st>
<p>Methods</p>
</st>
<sec>
<st>
<p>Datasets</p>
</st>
<sec>
<st>
<p>Ligands</p>
</st>
<p>Ligands for human GPCRs were collected from three publicly available data sources: the StARLITe database, as made available by ChEBI (EMBL-EBI) as part of the ChEMBL database <abbrgrp>
<abbr bid="B48">48</abbr>
</abbrgrp>, GLIDA <abbrgrp>
<abbr bid="B29">29</abbr>
</abbrgrp>, and KiDB <abbrgrp>
<abbr bid="B49">49</abbr>
</abbrgrp>. ChEMBL consists of a collection of more than 500,000 small molecules annotated with activity. Here, only activity values measured directly from binding studies were included. Compounds with K<sub>i</sub>, IC<sub>50</sub>, or EC<sub>50 </sub>values below 10 &#956;M were considered active. GLIDA provides biological information on GPCRs (sequences) and chemical information about ligand structures. It has links to several external databases: GPCRDB <abbrgrp>
<abbr bid="B25">25</abbr>
</abbrgrp>, UniProt <abbrgrp>
<abbr bid="B50">50</abbr>
</abbrgrp>, PubChem <abbrgrp>
<abbr bid="B51">51</abbr>
</abbrgrp>, and DrugBank <abbrgrp>
<abbr bid="B52">52</abbr>
</abbrgrp>. A compound with a reported affinity in one of these source databases was classified as active, regardless of the affinity value. Ligands are annotated with an activity type, namely: full agonist, partial agonist, agonist, antagonist or inverse agonist. In the present study, we focused only on binding affinity and not on the activity type. This allowed us to merge the set with the rest of the data. KiDB provides information on drugs and molecular compounds that interact with GPCRs, ion channels, transporters, and enzymes. The entries in KiDB are annotated with ligand, K<sub>i </sub>value, radiolabeled ligand, receptor name, source &amp; tissue, species, and PubMed link to the publication(s). Our dataset combined ligands from all three sources, restricted to human GPCR ligands with a molecular weight between 50 and 700 Da. Only targets that had 20 or more ligands listed were used. In this study, we focused on class A (rhodopsin-like) GPCRs since the majority of targets are from class A and only a minor part from class C; combining both classes would have negatively affected homogeneity of the phylogenetic trees, thereby hampering comparison. For the same reason, we removed two singleton targets (targets that are the only member in a subfamily), the gonadotrophin-releasing hormone receptor and the ghrelin receptor. The final set consisted of 102 targets (provided in Table 1 of Additional file <supplr sid="S3">3</supplr> - List of GPCRs used in this study) with 37,350 unique ligands in total.</p>
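<p>The selection criteria described above can be illustrated with a minimal Python sketch. The record layout (keys target, ligand, activity_um, mol_weight) and the function name select_ligands are hypothetical and are not part of the original pipeline; treating the molecular weight bounds as inclusive is also an assumption. Only the thresholds (10 &#956;M, 50 to 700 Da, at least 20 ligands per target) are taken from the text.</p>

```python
from collections import defaultdict

ACTIVITY_CUTOFF_UM = 10.0   # measured activity must be below 10 uM
MW_RANGE = (50.0, 700.0)    # molecular weight window in Da (inclusive: assumption)
MIN_LIGANDS = 20            # minimum number of ligands per target


def select_ligands(records):
    """Return {target: set of ligand ids} for targets passing all filters.

    Each record is a dict with the hypothetical keys 'target', 'ligand',
    'activity_um' and 'mol_weight'. An 'activity_um' of None stands for a
    compound that a source lists as a ligand without a usable value; such
    compounds are kept, mirroring the GLIDA rule in the text.
    """
    low, high = MW_RANGE
    per_target = defaultdict(set)
    for rec in records:
        activity = rec["activity_um"]
        if activity is not None and activity >= ACTIVITY_CUTOFF_UM:
            continue  # measured, but not potent enough
        if rec["mol_weight"] > high or low > rec["mol_weight"]:
            continue  # outside the molecular weight window
        per_target[rec["target"]].add(rec["ligand"])
    # Keep only targets with enough unique ligands.
    return {t: ligs for t, ligs in per_target.items() if len(ligs) >= MIN_LIGANDS}
```

<p>Using a set per target removes duplicate ligands that are reported by more than one source before the minimum-ligand filter is applied.</p>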
<suppl id="S3">
<title>
<p>Additional file 3</p>
</title>
<text>
<p>
<b>List of GPCRs used in this study</b>. The list of GPCRs used in this study (Class A, excluding singletons). Only receptors that are human, non-olfactory, and not orphan were used. For each receptor, the respective (sub)family, gene symbol, official IUPHAR name, and number of ligands are provided.</p>
</text>
<file name="1471-2105-11-316-S3.PDF">
   <p>Click here for file</p>
</file>
</suppl>
</sec>
<sec>
<st>
<p>Sequences</p>
</st>
<p>The multiple sequence alignment of (specific residues of) the 7-TM domain was obtained from GPCRDB <abbrgrp>
<abbr bid="B25">25</abbr>
<abbr bid="B53">53</abbr>
</abbrgrp>. Only human receptors that were non-olfactory and not orphan were used.</p>
</sec>
</sec>
<sec>
<st>
<p>Tree generation</p>
</st>
<sec>
<st>
<p>Frequent Substructure Mining</p>
</st>
<p>For the ligands of each receptor, the most frequently occurring substructures were determined. This was accomplished by using the frequent subgraph-mining algorithm <abbrgrp>
<abbr bid="B54">54</abbr>
</abbrgrp>, which finds all frequent substructures in a set of molecular graphs <abbrgrp>
<abbr bid="B23">23</abbr>
</abbrgrp>. For a description and a quantitative comparison of recent substructure mining algorithms, see <abbrgrp>
<abbr bid="B55">55</abbr>
</abbrgrp>. Briefly, starting from the smallest substructure, namely the single atoms, the algorithm finds the number of molecules in which the substructure occurs. If this occurrence is above a user-defined minimum, the minimum support value, the substructure is stored. Stored substructures are stepwise extended, and tested in a systematic manner, with the aim of testing all possible substructures that have at least one of the stored substructures as their basis. The algorithm seeks ways to test only those substructures that actually occur in the set, and that have a frequency above the set minimum. An important concept of frequent substructure mining is the <it>a priori </it>principle, originating from frequent item set mining <abbrgrp>
<abbr bid="B56">56</abbr>
</abbrgrp>. Algorithms based on the <it>a priori </it>principle exploit the fact that the frequency of a substructure is equal to or lower than the frequency of any substructure it contains. Therefore, whenever the occurrence of a substructure is below the minimum support, all extensions of that substructure are discarded.</p>
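<p>The pruning idea can be sketched in Python for the simpler setting of frequent item set mining, from which the <it>a priori </it>principle originates. This is not the subgraph-mining algorithm used in the study: each molecule is reduced here to a plain set of feature labels, and the function name frequent_itemsets is hypothetical. The sketch only illustrates why extensions of an infrequent candidate never need to be tested.</p>

```python
def frequent_itemsets(transactions, min_support):
    """Level-wise search for item sets that occur in at least the
    fraction `min_support` of the transactions (each a set of items)."""
    min_count = min_support * len(transactions)
    items = sorted({item for t in transactions for item in t})
    # Level 1: single items, the analogue of single atoms in the text.
    current = [frozenset([i]) for i in items]
    frequent = {}
    while current:
        survivors = []
        for cand in current:
            count = sum(1 for t in transactions if cand.issubset(t))
            if count >= min_count:
                frequent[cand] = count
                survivors.append(cand)
        # A priori principle: only frequent sets are extended, because any
        # set that contains an infrequent set cannot itself be frequent.
        next_level = set()
        for cand in survivors:
            for item in items:
                if item not in cand:
                    next_level.add(cand | {item})
        current = sorted(next_level, key=sorted)
    return frequent
```

<p>With four toy molecules {C, N, O}, {C, N}, {C, O} and {C} and a minimum support of 0.5, the pair {N, O} occurs only once and is dropped; the triple {C, N, O} is still generated from the frequent pairs, but it is counted once and rejected.</p>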
<p>Structures were represented as labeled graphs with a special type for aromatic bonds. In this study, the minimum support value was set to 30% of the number of ligands in each activity set. At this value, the algorithm provided a large group of substructures while still being computationally feasible to work with. In addition, molecular structures were sorted in ascending order according to the number of bonds. This allowed the algorithm to prune scarce, complicated substructures that consisted of a large number of bonds, thereby reducing memory requirements. If the set of generated substructures was disproportionately large (more than 1000 times larger) compared to the majority of the other classes, the generated substructures were discarded except for those that also occurred in other classes. This step was performed in order to prevent single targets from dominating the analysis. Since in practice most classes generated sets of fewer than 1000 substructures, a cut-off of 1 million substructures was used. Substructures with a molecular weight below 50 Da were discarded. The frequent substructures of all classes were merged into one set, removing any duplicates. For all substructures in this set, the frequency in each subfamily was determined. To calculate the correlation between two targets, we used the substructure frequencies as features for each target. A correlation matrix was constructed by calculating the Pearson correlation coefficient for each pair of targets. Finally, a distance matrix was constructed by subtracting the values of the correlation matrix from unity and normalizing the results linearly to the interval [0;1].</p>
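<p>The last two steps, from substructure frequency profiles to a normalized distance matrix, can be sketched as follows. The function names and the dictionary layout are hypothetical. The text does not specify how the linear normalization was done; min-max scaling over the whole matrix is assumed here, and dividing by 2 would be an equally plausible reading. Profiles with zero variance are not handled.</p>

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def distance_matrix(profiles):
    """Turn {target: [substructure frequencies]} into (names, distances).

    Distances are 1 - r, rescaled linearly so that the smallest value in
    the matrix maps to 0 and the largest to 1 (assumed normalization).
    """
    names = sorted(profiles)
    raw = [[1.0 - pearson(profiles[a], profiles[b]) for b in names] for a in names]
    lowest = min(min(row) for row in raw)
    highest = max(max(row) for row in raw)
    span = highest - lowest
    scaled = [[(value - lowest) / span for value in row] for row in raw]
    return names, scaled
```

<p>Two targets with perfectly correlated profiles end up at distance 0, and two with perfectly anti-correlated profiles at distance 1.</p>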
</sec>
<sec>
<st>
<p>Phylogenetic Trees</p>
</st>
<p>To study receptor organization, receptors were clustered into a phylogenetic tree using the Neighbor-Joining (NJ) method (Neighbor from the PHYLIP package <abbrgrp>
<abbr bid="B57">57</abbr>
</abbrgrp>). This method infers phylogenies from the pair-wise distances between receptors. Phylogenetic trees built from distance matrices facilitate tree comparison across domains. In addition, NJ clusters each domain equally well since it does not involve an 'evolutionary clock', a concept rooted in evolutionary biology. Two distance matrices represented the similarities of the receptors, one based on the frequent substructures of their ligands and the other on the 7-TM domain sequence alignment. Both were visualized as phylogenetic trees, with receptors as the leaves. The number of branches between two leaves in a tree grows with the dissimilarity of these two leaves.</p>
<p>The protein distances between the aligned sequences were calculated with Protdist from the PHYLIP package version 3.6, using the Jones-Taylor-Thornton matrix (default) <abbrgrp>
<abbr bid="B57">57</abbr>
</abbrgrp>. Both the sequence-based and ligand-based phylogenetic trees were constructed using the neighbor.exe program from the PHYLIP package. Tree construction might be influenced by the order in which targets are provided to the tree constructor. To minimize the influence on the resulting phylogenetic tree, target input order was randomized 10 times and 10 new trees were generated. From these, a consensus tree was built. MEGA4 <abbrgrp>
<abbr bid="B58">58</abbr>
</abbrgrp> was used for editing the layout of the trees and for visualization. Trees were rooted on the mid-points, that is, a root is placed at the mid-point of the longest distance between two taxa of the unrooted tree. Taxa were arranged for balanced shape and trees were visualized as circular trees showing only topology, <it>i.e</it>. branch lengths do not reflect evolutionary distance in a quantitative manner.</p>
</sec>
</sec>
<sec>
<st>
<p>Tree comparison</p>
</st>
<p>For the comparison of trees, several methods and visualizations are available; however, there is not a single definitive measure for tree difference. To visualize how the receptor positions change between two trees we employed a delta-delta plot.</p>
<sec>
<st>
<p>Delta-Delta plots</p>
</st>
<p>The delta-delta plot reveals how receptor locations behave globally with respect to the median of all receptors. It was used to visualize the differences in location of each receptor in sequence space and in substructure space. This plot is an adaptation from the delta-delta plot in Garr <it>et al. </it>
<abbrgrp>
<abbr bid="B59">59</abbr>
</abbrgrp>. It offers a new way of comparing trees that visualizes the differences between them graphically, as opposed to calculating only a numerical distance between two trees, which is not trivial to interpret. For each receptor, the mean distance of that receptor to all other receptors was calculated in each tree. These values were plotted in a scatter plot, with each axis representing the mean distance of the respective node in one of the trees. The interpretation of this plot is as follows. Along both axes, receptors plotted far from the origin are, on average, more distant from the rest of the group, while receptors plotted close to the origin are closer to the rest of the receptors. Receptors plotted near the X = Y diagonal do not change much in their mean distance to other receptors when going from one tree to the other. Receptors plotted above or below the diagonal have a different average distance to the other receptors in the two trees. For instance, consider a delta-delta plot that plots a substructure tree along the x-axis and a sequence tree along the y-axis. If a receptor is plotted above the diagonal, the mean distance of that receptor to the other receptors is larger in the sequence tree than in the substructure tree; for receptors plotted below the diagonal, the opposite is true.</p>
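<p>The coordinates of a delta-delta plot can be sketched in a few lines of Python. The sketch assumes that the mean distances are taken directly from two square distance matrices that share the same receptor order; whether the original analysis used matrix distances or path lengths in the trees is not stated here, and all function names are hypothetical.</p>

```python
def mean_distances(matrix):
    """Mean distance of each receptor to all other receptors."""
    n = len(matrix)
    return [
        sum(row[j] for j in range(n) if j != i) / (n - 1)
        for i, row in enumerate(matrix)
    ]


def delta_delta_points(names, substructure_dist, sequence_dist):
    """Map each receptor to (x, y): x from the substructure matrix,
    y from the sequence matrix, as in the example in the text."""
    xs = mean_distances(substructure_dist)
    ys = mean_distances(sequence_dist)
    return {name: (x, y) for name, x, y in zip(names, xs, ys)}


def side_of_diagonal(point, tol=1e-9):
    """Report whether a point lies above, below or on the X = Y diagonal."""
    x, y = point
    if y - x > tol:
        return "above"  # more distant from the others in sequence space
    if x - y > tol:
        return "below"  # more distant from the others in substructure space
    return "on"
```

<p>A receptor reported as "above" is, on average, further from the other receptors in the sequence matrix than in the substructure matrix, which matches the reading of the plot given in the text.</p>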
</sec>
</sec>
<sec>
<st>
<p>Validation</p>
</st>
<sec>
<st>
<p>Leave-one-out validation</p>
</st>
<p>The validation experiment was repeated for every receptor (the 'orphan receptor') by temporarily removing the ligands of this receptor from the dataset and predicting the position of the molecules of this class in the substructure tree. A molecule from the left-out class counted as a hit when it was predicted to belong to one of the closest classes in sequence space. The closest classes in sequence space were found using the distance matrix from the multiple sequence alignment. Prediction of the class of a molecule was based on the Euclidean distance in substructure space. This distance was calculated as follows: for each substructure, the square of the difference between its relative frequency in the class and in the molecule was calculated. The relative frequency of a substructure in a molecule is either 0 for absence, or 1 for presence of the substructure. The square root of the sum of all squared differences is the Euclidean distance between a molecule and a class. The area under the curve (AUC) of the receiver operating characteristic (ROC) plot served as a quality measure of the predictions for a class.</p>
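<p>The distance calculation described above can be written as a short Python sketch. The data layout is an assumption: each class profile is a dictionary that maps every substructure of the merged set to its relative frequency in that class, and a molecule is given as the set of substructures it contains. The function names are hypothetical.</p>

```python
import math


def euclidean_distance(molecule_substructures, class_frequencies):
    """Euclidean distance between a molecule and a class profile.

    A substructure has frequency 1 in the molecule when present and 0
    when absent; the class profile holds relative frequencies in [0, 1].
    """
    total = 0.0
    for substructure, frequency in class_frequencies.items():
        present = 1.0 if substructure in molecule_substructures else 0.0
        total += (frequency - present) ** 2
    return math.sqrt(total)


def rank_classes(molecule_substructures, class_profiles):
    """Return class names ordered from closest to most distant."""
    return sorted(
        class_profiles,
        key=lambda name: euclidean_distance(molecule_substructures, class_profiles[name]),
    )
```

<p>The first entry of the ranking is the predicted class of the molecule; in the validation it would then be checked against the classes closest to the left-out receptor in sequence space.</p>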
<p>Instead of repeating the substructure mining for every left-out class, a lookup table of substructure occurrence was used. This table related all generated substructures with all molecules in which they occurred. Substructures that had a frequency just above the support threshold in the left-out class were not considered when analysis was performed for molecules of this class.</p>
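<p>How a performance curve and its relative AUC could be derived from the class predictions is sketched below. Several details are assumptions rather than the authors' procedure: the curve starts at the origin, the y-values are normalized by the total number of left-out ligands, and the area is computed with the trapezoidal rule over an x-axis scaled to [0;1]. The function and argument names are hypothetical.</p>

```python
def retrieval_curve(predicted_class, neighbours_by_distance):
    """Fraction of left-out ligands retrieved after including k neighbours.

    `predicted_class` maps each left-out ligand to its predicted receptor;
    `neighbours_by_distance` lists the receptors in order of increasing
    sequence distance to the left-out receptor.
    """
    total = len(predicted_class)
    curve = [0.0]
    included = set()
    for receptor in neighbours_by_distance:
        included.add(receptor)
        hits = sum(1 for cls in predicted_class.values() if cls in included)
        curve.append(hits / total)
    return curve


def relative_auc(curve):
    """Trapezoidal area under the curve with the x-axis scaled to [0, 1]."""
    steps = len(curve) - 1
    area = sum((curve[i] + curve[i + 1]) / 2.0 for i in range(steps))
    return area / steps
```

<p>Under these assumptions a curve that follows the diagonal has a relative AUC of 0.5, a curve that rises early scores higher, and a curve that rises only once distant receptors are included scores lower.</p>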
</sec>
</sec>
</sec>
<sec>
<st>
<p>Authors' contributions</p>
</st>
<p>EH carried out the sequence alignments, frequent substructure mining, analysis and validation, and drafted the manuscript. JEP participated in design of the study and visualization methods, and implementation of analyses. MWB, JRL, and HWTV assisted in study design, interpretation of results, and drafting the manuscript. MTME was involved in algorithm design and data analysis. YO was involved in acquisition of data in GLIDA. APIJ and AB participated in study design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.</p>
</sec>
</bdy><bm>
<ack>
<sec>
<st>
<p>Acknowledgements</p>
</st>
<p>The authors thank all members of the Division of Medicinal Chemistry of the Leiden/Amsterdam Center for Drug Research at Leiden University for helpful discussions. In addition, the authors thank Bas Vroling from the CMBI, Radboud University, for his help with the sequence alignments.</p>
<p>
<it>Funding</it>: This work was supported by the Dutch Top Institute Pharma, project number: D1-105.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>The G-Protein-Coupled Receptors in the Human Genome Form Five Main Families. Phylogenetic Analysis, Paralogon Groups, and Fingerprints</p></title><aug><au><snm>Fredriksson</snm><fnm>R</fnm></au><au><snm>Lagerstrom</snm><fnm>MC</fnm></au><au><snm>Lundin</snm><fnm>L-G</fnm></au><au><snm>Schioth</snm><fnm>HB</fnm></au></aug><source>Molecular Pharmacology</source><pubdate>2003</pubdate><volume>63</volume><issue>6</issue><fpage>1256</fpage><lpage>1272</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1124/mol.63.6.1256</pubid><pubid idtype="pmpid" link="fulltext">12761335</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>The 7 TM G-Protein-Coupled Receptor Target Family</p></title><aug><au><snm>Jacoby</snm><fnm>E</fnm></au><au><snm>Bouhelal</snm><fnm>R</fnm></au><au><snm>Gerspacher</snm><fnm>M</fnm></au><au><snm>Seuwen</snm><fnm>K</fnm></au></aug><source>Chem Med Chem</source><pubdate>2006</pubdate><volume>1</volume><issue>8</issue><fpage>760</fpage><lpage>782</lpage></bibl><bibl id="B3"><title><p>The 2.6 Angstrom Crystal Structure of a Human A2A Adenosine Receptor Bound to an Antagonist</p></title><aug><au><snm>Jaakola</snm><fnm>V-P</fnm></au><au><snm>Griffith</snm><fnm>MT</fnm></au><au><snm>Hanson</snm><fnm>MA</fnm></au><au><snm>Cherezov</snm><fnm>V</fnm></au><au><snm>Chien</snm><fnm>EYT</fnm></au><au><snm>Lane</snm><fnm>JR</fnm></au><au><snm>IJzerman</snm><fnm>AP</fnm></au><au><snm>Stevens</snm><fnm>RC</fnm></au></aug><source>Science</source><pubdate>2008</pubdate><fpage>1164772</fpage></bibl><bibl id="B4"><title><p>G protein-coupled receptor drug discovery: Implications from the crystal structure of rhodopsin</p></title><aug><au><snm>Ballesteros</snm><fnm>J</fnm></au><au><snm>Palczewski</snm><fnm>K</fnm></au></aug><source>Curr Opin Drug Discovery Dev</source><pubdate>2001</pubdate><volume>4</volume><issue>5</issue><fpage>561</fpage><lpage>574</lpage></bibl><bibl id="B5"><title><p>High-Resolution Crystal Structure of an 
Engineered Human &#946;<sub>2</sub>-Adrenergic G Protein Coupled Receptor</p></title><aug><au><snm>Cherezov</snm><fnm>V</fnm></au><au><snm>Rosenbaum</snm><fnm>DM</fnm></au><au><snm>Hanson</snm><fnm>MA</fnm></au><au><snm>Rasmussen</snm><fnm>SGF</fnm></au><au><snm>Thian</snm><fnm>FS</fnm></au><au><snm>Kobilka</snm><fnm>TS</fnm></au><au><snm>Choi</snm><fnm>H-J</fnm></au><au><snm>Kuhn</snm><fnm>P</fnm></au><au><snm>Weis</snm><fnm>WI</fnm></au><au><snm>Kobilka</snm><fnm>BK</fnm></au><etal/></aug><source>Science</source><pubdate>2007</pubdate><volume>318</volume><issue>5854</issue><fpage>1258</fpage><lpage>1265</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1150577</pubid><pubid idtype="pmcid">2583103</pubid><pubid idtype="pmpid" link="fulltext">17962520</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>Structure of a &#946;<sub>1</sub>-adrenergic G-protein-coupled receptor</p></title><aug><au><snm>Warne</snm><fnm>T</fnm></au><au><snm>Serrano-Vega</snm><fnm>MJ</fnm></au><au><snm>Baker</snm><fnm>JG</fnm></au><au><snm>Moukhametzianov</snm><fnm>R</fnm></au><au><snm>Edwards</snm><fnm>PC</fnm></au><au><snm>Henderson</snm><fnm>R</fnm></au><au><snm>Leslie</snm><fnm>AGW</fnm></au><au><snm>Tate</snm><fnm>CG</fnm></au><au><snm>Schertler</snm><fnm>GFX</fnm></au></aug><source>Nature</source><pubdate>2008</pubdate><volume>454</volume><issue>7203</issue><fpage>486</fpage><lpage>491</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature07101</pubid><pubid idtype="pmpid" link="fulltext">18594507</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Drug Design Strategies for Targeting G-Protein-Coupled Receptors</p></title><aug><au><snm>Klabunde</snm><fnm>T</fnm></au><au><snm>Hessler</snm><fnm>G</fnm></au></aug><source>Chem Bio Chem</source><pubdate>2002</pubdate><volume>3</volume><issue>10</issue><fpage>928</fpage><lpage>944</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">12362358</pubid></xrefbib></bibl><bibl 
id="B8"><title><p>Property-Based Design of GPCR-Targeted Library</p></title><aug><au><snm>Balakin</snm><fnm>KV</fnm></au><au><snm>Tkachenko</snm><fnm>SE</fnm></au><au><snm>Lang</snm><fnm>SA</fnm></au><au><snm>Okun</snm><fnm>I</fnm></au><au><snm>Ivashchenko</snm><fnm>AA</fnm></au><au><snm>Savchuk</snm><fnm>NP</fnm></au></aug><source>J Chem Inf Comput Sci</source><pubdate>2002</pubdate><volume>42</volume><issue>6</issue><fpage>1332</fpage><lpage>1342</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">12444729</pubid></xrefbib></bibl><bibl id="B9"><title><p>2,4,6-Trisubstituted Pyrimidines as a New Class of Selective Adenosine A<sub>1 </sub>Receptor Antagonists</p></title><aug><au><snm>Chang</snm><fnm>LCW</fnm></au><au><snm>Spanjersberg</snm><fnm>RF</fnm></au><au><snm>von Frijtag Drabbe-K&#252;nzel</snm><fnm>JK</fnm></au><au><snm>Mulder-Krieger</snm><fnm>T</fnm></au><au><snm>van den Hout</snm><fnm>G</fnm></au><au><snm>Beukers</snm><fnm>MW</fnm></au><au><snm>Brussee</snm><fnm>J</fnm></au><au><snm>IJzerman</snm><fnm>AP</fnm></au></aug><source>J Med Chem</source><pubdate>2004</pubdate><volume>47</volume><issue>26</issue><fpage>6529</fpage><lpage>6540</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/jm049448r</pubid><pubid idtype="pmpid" link="fulltext">15588088</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Privileged Structures in GPCRs</p></title><aug><au><snm>Bywater</snm><fnm>R</fnm></au></aug><source>GPCRs: From Deorphanization to Lead Structure Identification</source><publisher>Springer-Verlag</publisher><editor>Bourne H, Horuk R, Kuhnke J, Michel H</editor><pubdate>2007</pubdate><fpage>75</fpage><lpage>92</lpage><xrefbib><pubid idtype="doi">full_text</pubid></xrefbib></bibl><bibl id="B11"><title><p>Chemogenomics: Looking at biology through the lens of 
chemistry</p></title><aug><au><snm>Doddareddy</snm><fnm>MR</fnm></au><au><snm>Westen</snm><fnm>GJPv</fnm></au><au><snm>Horst</snm><fnm>Evd</fnm></au><au><snm>Peironcely</snm><fnm>JE</fnm></au><au><snm>Corthals</snm><fnm>F</fnm></au><au><snm>IJzerman</snm><fnm>AP</fnm></au><au><snm>Emmerich</snm><fnm>M</fnm></au><au><snm>Jenkins</snm><fnm>JL</fnm></au><au><snm>Bender</snm><fnm>A</fnm></au></aug><source>Statistical Analysis and Data Mining</source><pubdate>2009</pubdate><volume>2</volume><issue>3</issue><fpage>149</fpage><lpage>160</lpage><xrefbib><pubid idtype="doi">10.1002/sam.10046</pubid></xrefbib></bibl><bibl id="B12"><title><p>Chemogenomic data analysis: Prediction of small-molecule targets and the advent of biological fingerprints</p></title><aug><au><snm>Bender</snm><fnm>A</fnm></au><au><snm>Young</snm><fnm>DW</fnm></au><au><snm>Jenkins</snm><fnm>JL</fnm></au><au><snm>Serrano</snm><fnm>M</fnm></au><au><snm>Mikhailov</snm><fnm>D</fnm></au><au><snm>Clemons</snm><fnm>PA</fnm></au><au><snm>Davies</snm><fnm>JW</fnm></au></aug><source>Comb Chem High Throughput Screening</source><pubdate>2007</pubdate><volume>10</volume><issue>8</issue><fpage>719</fpage><lpage>731</lpage><xrefbib><pubid idtype="doi">10.2174/138620707782507313</pubid></xrefbib></bibl><bibl id="B13"><title><p>Chemogenomic approaches to drug discovery: similar receptors bind similar ligands</p></title><aug><au><snm>Klabunde</snm><fnm>T</fnm></au></aug><source>Br J Pharmacol</source><pubdate>2007</pubdate><volume>152</volume><issue>1</issue><fpage>5</fpage><lpage>7</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/sj.bjp.0707308</pubid><pubid idtype="pmcid">1978276</pubid><pubid idtype="pmpid" link="fulltext">17533415</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>GCRDb: a G-protein-coupled receptor database</p></title><aug><au><snm>Kolakowski</snm><fnm>LFJ</fnm></au></aug><source>Recept 
Channels</source><pubdate>1994</pubdate><volume>2</volume><fpage>1</fpage><lpage>7</lpage><xrefbib><pubid idtype="pmpid">8081729</pubid></xrefbib></bibl><bibl id="B15"><title><p>A chemogenomic analysis of the transmembrane binding cavity of human G-protein-coupled receptors</p></title><aug><au><snm>Surgand</snm><fnm>J-S</fnm></au><au><snm>Rodrigo</snm><fnm>J</fnm></au><au><snm>Kellenberger</snm><fnm>E</fnm></au><au><snm>Rognan</snm><fnm>D</fnm></au></aug><source>Proteins: Struct, Funct, Bioinf</source><pubdate>2006</pubdate><volume>62</volume><issue>2</issue><fpage>509</fpage><lpage>538</lpage><xrefbib><pubid idtype="doi">10.1002/prot.20768</pubid></xrefbib></bibl><bibl id="B16"><title><p>Crystal structure of the human &#946;<sub>2 </sub>adrenergic G-protein-coupled receptor</p></title><aug><au><snm>Rasmussen</snm><fnm>SGF</fnm></au><au><snm>Choi</snm><fnm>H-J</fnm></au><au><snm>Rosenbaum</snm><fnm>DM</fnm></au><au><snm>Kobilka</snm><fnm>TS</fnm></au><au><snm>Thian</snm><fnm>FS</fnm></au><au><snm>Edwards</snm><fnm>PC</fnm></au><au><snm>Burghammer</snm><fnm>M</fnm></au><au><snm>Ratnala</snm><fnm>VRP</fnm></au><au><snm>Sanishvili</snm><fnm>R</fnm></au><au><snm>Fischetti</snm><fnm>RF</fnm></au><etal/></aug><source>Nature</source><pubdate>2007</pubdate><volume>450</volume><issue>7168</issue><fpage>383</fpage><lpage>387</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature06325</pubid><pubid idtype="pmpid" link="fulltext">17952055</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>Definition of the G Protein-Coupled Receptor Transmembrane Bundle Binding Pocket and Calculation of Receptor Similarities for Drug Design</p></title><aug><au><snm>Gloriam</snm><fnm>DE</fnm></au><au><snm>Foord</snm><fnm>SM</fnm></au><au><snm>Blaney</snm><fnm>FE</fnm></au><au><snm>Garland</snm><fnm>SL</fnm></au></aug><source>J Med Chem</source><pubdate>2009</pubdate><volume>52</volume><issue>14</issue><fpage>4429</fpage><lpage>4442</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1021/jm900319e</pubid><pubid idtype="pmpid" link="fulltext">19537715</pubid></pubidlist></xrefbib></bibl><bibl id="B18"><title><p>"Bayes Affinity Fingerprints" Improve Retrieval Rates in Virtual Screening and Define Orthogonal Bioactivity Space: When Are Multitarget Drugs a Feasible Concept?</p></title><aug><au><snm>Bender</snm><fnm>A</fnm></au><au><snm>Jenkins</snm><fnm>JL</fnm></au><au><snm>Glick</snm><fnm>M</fnm></au><au><snm>Deng</snm><fnm>Z</fnm></au><au><snm>Nettles</snm><fnm>JH</fnm></au><au><snm>Davies</snm><fnm>JW</fnm></au></aug><source>J Chem Inf Model</source><pubdate>2006</pubdate><volume>46</volume><issue>6</issue><fpage>2445</fpage><lpage>2456</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/ci600197y</pubid><pubid idtype="pmpid" link="fulltext">17125186</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>Analysis of Pharmacology Data and the Prediction of Adverse Drug Reactions and Off-Target Effects from Chemical Structure</p></title><aug><au><snm>Bender</snm><fnm>A</fnm></au><au><snm>Scheiber</snm><fnm>J</fnm></au><au><snm>Glick</snm><fnm>M</fnm></au><au><snm>Davies</snm><fnm>JW</fnm></au><au><snm>Azzaoui</snm><fnm>K</fnm></au><au><snm>Hamon</snm><fnm>J</fnm></au><au><snm>Urban</snm><fnm>L</fnm></au><au><snm>Whitebread</snm><fnm>S</fnm></au><au><snm>Jenkins</snm><fnm>JL</fnm></au></aug><source>ChemMedChem</source><pubdate>2007</pubdate><volume>2</volume><issue>6</issue><fpage>861</fpage><lpage>873</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/cmdc.200700026</pubid><pubid idtype="pmpid" link="fulltext">17477341</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>Relating protein pharmacology by ligand chemistry</p></title><aug><au><snm>Keiser</snm><fnm>MJ</fnm></au><au><snm>Roth</snm><fnm>BL</fnm></au><au><snm>Armbruster</snm><fnm>BN</fnm></au><au><snm>Ernsberger</snm><fnm>P</fnm></au><au><snm>Irwin</snm><fnm>JJ</fnm></au><au><snm>Shoichet</snm><fnm>BK</fnm></au></aug><source>Nat 
Biotech</source><pubdate>2007</pubdate><volume>25</volume><issue>2</issue><fpage>197</fpage><lpage>206</lpage><xrefbib><pubid idtype="doi">10.1038/nbt1284</pubid></xrefbib></bibl><bibl id="B21"><title><p>Substructure Mining of GPCR Ligands Reveals Activity-Class Specific Functional Groups in an Unbiased Manner</p></title><aug><au><snm>van der Horst</snm><fnm>E</fnm></au><au><snm>Okuno</snm><fnm>Y</fnm></au><au><snm>Bender</snm><fnm>A</fnm></au><au><snm>IJzerman</snm><fnm>AP</fnm></au></aug><source>J Chem Inf Model</source><pubdate>2009</pubdate><volume>49</volume><issue>2</issue><fpage>348</fpage><lpage>360</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/ci8003896</pubid><pubid idtype="pmpid">19434836</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>Mining Molecular Fragments: Finding Relevant Substructures of Molecules</p></title><aug><au><snm>Borgelt</snm><fnm>C</fnm></au><au><snm>Berthold</snm><fnm>MR</fnm></au></aug><source>Proceedings of the 2002 IEEE International Conference on Data Mining: 2002</source><publisher>IEEE Computer Society</publisher><pubdate>2002</pubdate><fpage>51</fpage><lpage>58</lpage></bibl><bibl id="B23"><title><p>A quickstart in frequent structure mining can make a difference</p></title><aug><au><snm>Nijssen</snm><fnm>S</fnm></au><au><snm>Kok</snm><fnm>JN</fnm></au></aug><source>Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining: 2004</source><publisher>ACM Press, New York, USA</publisher><pubdate>2004</pubdate><fpage>647</fpage><lpage>652</lpage></bibl><bibl id="B24"><title><p>International Union of Pharmacology. XLVI. 
G Protein-Coupled Receptor List</p></title><aug><au><snm>Foord</snm><fnm>SM</fnm></au><au><snm>Bonner</snm><fnm>TI</fnm></au><au><snm>Neubig</snm><fnm>RR</fnm></au><au><snm>Rosser</snm><fnm>EM</fnm></au><au><snm>Pin</snm><fnm>J-P</fnm></au><au><snm>Davenport</snm><fnm>AP</fnm></au><au><snm>Spedding</snm><fnm>M</fnm></au><au><snm>Harmar</snm><fnm>AJ</fnm></au></aug><source>Pharmacol Rev</source><pubdate>2005</pubdate><volume>57</volume><issue>2</issue><fpage>279</fpage><lpage>288</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1124/pr.57.2.5</pubid><pubid idtype="pmpid" link="fulltext">15914470</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>GPCRDB information system for G protein-coupled receptors</p></title><aug><au><snm>Horn</snm><fnm>F</fnm></au><au><snm>Bettler</snm><fnm>E</fnm></au><au><snm>Oliveira</snm><fnm>L</fnm></au><au><snm>Campagne</snm><fnm>F</fnm></au><au><snm>Cohen</snm><fnm>FE</fnm></au><au><snm>Vriend</snm><fnm>G</fnm></au></aug><source>Nucl Acids Res</source><pubdate>2003</pubdate><volume>31</volume><issue>1</issue><fpage>294</fpage><lpage>297</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkg103</pubid><pubid idtype="pmcid">165550</pubid><pubid idtype="pmpid" link="fulltext">12520006</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>The selectivity of &#946;-adrenoceptor antagonists at the human &#946;<sub>1</sub>, &#946;<sub>2 </sub>and &#946;<sub>3 </sub>adrenoceptors</p></title><aug><au><snm>Baker</snm><fnm>JG</fnm></au></aug><source>Br J Pharmacol</source><pubdate>2005</pubdate><volume>144</volume><issue>3</issue><fpage>317</fpage><lpage>322</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/sj.bjp.0706048</pubid><pubid idtype="pmcid">1576008</pubid><pubid idtype="pmpid" link="fulltext">15655528</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>Muscarinic receptors and drugs in cardiovascular medicine</p></title><aug><au><snm>Van 
Zwieten</snm><fnm>PA</fnm></au><au><snm>Doods</snm><fnm>HN</fnm></au></aug><source>Cardiovascular Drugs and Therapy</source><pubdate>1995</pubdate><volume>9</volume><issue>1</issue><fpage>159</fpage><lpage>167</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1007/BF00877757</pubid><pubid idtype="pmpid">7786837</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><title><p>Allosteric site on muscarinic acetylcholine receptors: identification of two amino acids in the muscarinic M2 receptor that account entirely for the M2/M5 subtype selectivities of some structurally diverse allosteric ligands in N-methylscopolamine-occupied receptors</p></title><aug><au><snm>Voigtl&#228;nder</snm><fnm>U</fnm></au><au><snm>J&#246;hren</snm><fnm>K</fnm></au><au><snm>Mohr</snm><fnm>M</fnm></au><au><snm>Raasch</snm><fnm>A</fnm></au><au><snm>Tr&#228;nkle</snm><fnm>C</fnm></au><au><snm>Buller</snm><fnm>S</fnm></au><au><snm>Ellis</snm><fnm>J</fnm></au><au><snm>H&#246;ltje</snm><fnm>H-D</fnm></au><au><snm>Mohr</snm><fnm>K</fnm></au></aug><source>Molecular Pharmacology</source><pubdate>2003</pubdate><volume>64</volume><issue>1</issue><fpage>21</fpage><lpage>31</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1124/mol.64.1.21</pubid><pubid idtype="pmpid" link="fulltext">12815157</pubid></pubidlist></xrefbib></bibl><bibl id="B29"><title><p>GLIDA: GPCR ligand database for chemical genomics drug discovery database and tools update</p></title><aug><au><snm>Okuno</snm><fnm>Y</fnm></au><au><snm>Tamon</snm><fnm>A</fnm></au><au><snm>Yabuuchi</snm><fnm>H</fnm></au><au><snm>Niijima</snm><fnm>S</fnm></au><au><snm>Minowa</snm><fnm>Y</fnm></au><au><snm>Tonomura</snm><fnm>K</fnm></au><au><snm>Kunimoto</snm><fnm>R</fnm></au><au><snm>Feng</snm><fnm>C</fnm></au></aug><source>Nucl Acids Res</source><pubdate>2008</pubdate><volume>36</volume><issue>suppl_1</issue><fpage>D907</fpage><lpage>912</lpage><xrefbib><pubidlist><pubid idtype="pmcid">2238933</pubid><pubid idtype="pmpid" 
link="fulltext">17986454</pubid></pubidlist></xrefbib></bibl><bibl id="B30"><title><p>Global mapping of pharmacological space</p></title><aug><au><snm>Paolini</snm><fnm>GV</fnm></au><au><snm>Shapland</snm><fnm>RHB</fnm></au><au><snm>van Hoorn</snm><fnm>WP</fnm></au><au><snm>Mason</snm><fnm>JS</fnm></au><au><snm>Hopkins</snm><fnm>AL</fnm></au></aug><source>Nat Biotech</source><pubdate>2006</pubdate><volume>24</volume><issue>7</issue><fpage>805</fpage><lpage>815</lpage><xrefbib><pubid idtype="doi">10.1038/nbt1228</pubid></xrefbib></bibl><bibl id="B31"><title><p>Towards a New Generation of Potential Antipsychotic Agents Combining D2 and 5-HT1A Receptor Activities</p></title><aug><au><snm>Cuisiat</snm><fnm>S</fnm></au><au><snm>Bourdiol</snm><fnm>N</fnm></au><au><snm>Lacharme</snm><fnm>V</fnm></au><au><snm>Newman-Tancredi</snm><fnm>A</fnm></au><au><snm>Colpaert</snm><fnm>F</fnm></au><au><snm>Vacher</snm><fnm>B</fnm></au></aug><source>J Med Chem</source><pubdate>2007</pubdate><volume>50</volume><issue>4</issue><fpage>865</fpage><lpage>876</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/jm061180b</pubid><pubid idtype="pmpid" link="fulltext">17300168</pubid></pubidlist></xrefbib></bibl><bibl id="B32"><title><p>Optimisation of anti-psychotic therapeutics: a balancing act?</p></title><aug><au><snm>Lawrence</snm><fnm>AJ</fnm></au></aug><source>Br J Pharmacol</source><pubdate>2007</pubdate><volume>151</volume><issue>2</issue><fpage>161</fpage><lpage>162</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/sj.bjp.0707164</pubid><pubid idtype="pmcid">2013954</pubid><pubid idtype="pmpid" link="fulltext">17375084</pubid></pubidlist></xrefbib></bibl><bibl id="B33"><title><p>Recognition of Privileged Structures by G-Protein Coupled 
Receptors</p></title><aug><au><snm>Bondensgaard</snm><fnm>K</fnm></au><au><snm>Ankersen</snm><fnm>M</fnm></au><au><snm>Thogersen</snm><fnm>H</fnm></au><au><snm>Hansen</snm><fnm>BS</fnm></au><au><snm>Wulff</snm><fnm>BS</fnm></au><au><snm>Bywater</snm><fnm>RP</fnm></au></aug><source>J Med Chem</source><pubdate>2004</pubdate><volume>47</volume><issue>4</issue><fpage>888</fpage><lpage>899</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/jm0309452</pubid><pubid idtype="pmpid" link="fulltext">14761190</pubid></pubidlist></xrefbib></bibl><bibl id="B34"><title><p>Are Target-Family-Privileged Substructures Truly Privileged?</p></title><aug><au><snm>Schnur</snm><fnm>DM</fnm></au><au><snm>Hermsmeier</snm><fnm>MA</fnm></au><au><snm>Tebben</snm><fnm>AJ</fnm></au></aug><source>J Med Chem</source><pubdate>2006</pubdate><volume>49</volume><issue>6</issue><fpage>2000</fpage><lpage>2009</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/jm0502900</pubid><pubid idtype="pmpid" link="fulltext">16539387</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>The utilization of recombinant prostanoid receptors to determine the affinities and selectivities of prostaglandins and related analogs</p></title><aug><au><snm>Abramovitz</snm><fnm>M</fnm></au><au><snm>Adam</snm><fnm>M</fnm></au><au><snm>Boie</snm><fnm>Y</fnm></au><au><snm>Carri&#232;re</snm><fnm>M-C</fnm></au><au><snm>Denis</snm><fnm>D</fnm></au><au><snm>Godbout</snm><fnm>C</fnm></au><au><snm>Lamontagne</snm><fnm>S</fnm></au><au><snm>Rochette</snm><fnm>C</fnm></au><au><snm>Sawyer</snm><fnm>N</fnm></au><au><snm>Tremblay</snm><fnm>NM</fnm></au><etal/></aug><source>Biochim Biophys Acta, Mol Cell Biol Lipids</source><pubdate>2000</pubdate><volume>1483</volume><issue>2</issue><fpage>285</fpage><lpage>293</lpage><xrefbib><pubid idtype="doi">10.1016/S1388-1981(99)00164-X</pubid></xrefbib></bibl><bibl id="B36"><title><p>Antagonism of the prostaglandin D2 receptors DP1 and CRTH2 as an approach to treat allergic 
diseases</p></title><aug><au><snm>Pettipher</snm><fnm>R</fnm></au><au><snm>Hansel</snm><fnm>TT</fnm></au><au><snm>Armer</snm><fnm>R</fnm></au></aug><source>Nat Rev Drug Discov</source><pubdate>2007</pubdate><volume>6</volume><issue>4</issue><fpage>313</fpage><lpage>325</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nrd2266</pubid><pubid idtype="pmpid" link="fulltext">17396136</pubid></pubidlist></xrefbib></bibl><bibl id="B37"><title><p>A Novel Hepatointestinal Leukotriene B4 Receptor. Cloning and Functional Characterization</p></title><aug><au><snm>Wang</snm><fnm>S</fnm></au><au><snm>Gustafson</snm><fnm>E</fnm></au><au><snm>Pang</snm><fnm>L</fnm></au><au><snm>Qiao</snm><fnm>X</fnm></au><au><snm>Behan</snm><fnm>J</fnm></au><au><snm>Maguire</snm><fnm>M</fnm></au><au><snm>Bayne</snm><fnm>M</fnm></au><au><snm>Laz</snm><fnm>T</fnm></au></aug><source>J Biol Chem</source><pubdate>2000</pubdate><volume>275</volume><issue>52</issue><fpage>40686</fpage><lpage>40694</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1074/jbc.M004512200</pubid><pubid idtype="pmpid" link="fulltext">11006272</pubid></pubidlist></xrefbib></bibl><bibl id="B38"><title><p>A G-protein-coupled receptor for leukotriene B4 that mediates chemotaxis</p></title><aug><au><snm>Yokomizo</snm><fnm>T</fnm></au><au><snm>Izumi</snm><fnm>T</fnm></au><au><snm>Chang</snm><fnm>K</fnm></au><au><snm>Takuwa</snm><fnm>Y</fnm></au><au><snm>Shimizu</snm><fnm>T</fnm></au></aug><source>Nature</source><pubdate>1997</pubdate><volume>387</volume><issue>6633</issue><fpage>620</fpage><lpage>624</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/42506</pubid><pubid idtype="pmpid" link="fulltext">9177352</pubid></pubidlist></xrefbib></bibl><bibl id="B39"><title><p>Dopamine receptors for every species: Gene duplications and functional diversification in Craniates</p></title><aug><au><snm>Le 
Crom</snm><fnm>S</fnm></au><au><snm>Kapsimali</snm><fnm>M</fnm></au><au><snm>Bar&#244;me</snm><fnm>P-O</fnm></au><au><snm>Vernier</snm><fnm>P</fnm></au></aug><source>Journal of Structural and Functional Genomics</source><pubdate>2003</pubdate><volume>3</volume><issue>1</issue><fpage>161</fpage><lpage>176</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1023/A:1022686622752</pubid><pubid idtype="pmpid" link="fulltext">12836695</pubid></pubidlist></xrefbib></bibl><bibl id="B40"><title><p>Dopamine D1 receptor ligands: where are we now and where are we going</p></title><aug><au><snm>Zhang</snm><fnm>J</fnm></au><au><snm>Xiong</snm><fnm>B</fnm></au><au><snm>Zhen</snm><fnm>X</fnm></au><au><snm>Zhang</snm><fnm>A</fnm></au></aug><source>Med Res Rev</source><pubdate>2009</pubdate><volume>29</volume><issue>2</issue><fpage>272</fpage><lpage>294</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/med.20130</pubid><pubid idtype="pmpid" link="fulltext">18642350</pubid></pubidlist></xrefbib></bibl><bibl id="B41"><title><p>Atypical antipsychotic drug actions: unitary or multiple mechanisms for 'atypicality'?</p></title><aug><au><snm>Roth</snm><fnm>BL</fnm></au><au><snm>Sheffler</snm><fnm>D</fnm></au><au><snm>Potkin</snm><fnm>SG</fnm></au></aug><source>Clinical Neuroscience Research</source><pubdate>2003</pubdate><volume>3</volume><issue>1-2</issue><fpage>108</fpage><lpage>117</lpage><xrefbib><pubid idtype="doi">10.1016/S1566-2772(03)00021-5</pubid></xrefbib></bibl><bibl id="B42"><title><p>General pharmacology of clozapine</p></title><aug><au><snm>Coward</snm><fnm>DM</fnm></au></aug><source>The British Journal of Psychiatry Supplement</source><pubdate>1992</pubdate><issue>17</issue><fpage>5</fpage><lpage>11</lpage><xrefbib><pubid idtype="pmpid">1358127</pubid></xrefbib></bibl><bibl id="B43"><title><p>Convergent Evolution on the Molecular Level</p></title><aug><au><snm>Zakon</snm><fnm>HH</fnm></au></aug><source>Brain, Behavior and 
Evolution</source><pubdate>2002</pubdate><volume>59</volume><issue>5-6</issue><fpage>250</fpage><lpage>261</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1159/000063562</pubid><pubid idtype="pmpid" link="fulltext">12207082</pubid></pubidlist></xrefbib></bibl><bibl id="B44"><title><p>Data completeness--the Achilles heel of drug-target networks</p></title><aug><au><snm>Mestres</snm><fnm>J</fnm></au><au><snm>Gregori-Puigjane</snm><fnm>E</fnm></au><au><snm>Valverde</snm><fnm>S</fnm></au><au><snm>Sole</snm><fnm>RV</fnm></au></aug><source>Nat Biotech</source><pubdate>2008</pubdate><volume>26</volume><issue>9</issue><fpage>983</fpage><lpage>984</lpage><xrefbib><pubid idtype="doi">10.1038/nbt0908-983</pubid></xrefbib></bibl><bibl id="B45"><title><p>The Properties of Known Drugs. 1. Molecular Frameworks</p></title><aug><au><snm>Bemis</snm><fnm>GW</fnm></au><au><snm>Murcko</snm><fnm>MA</fnm></au></aug><source>J Med Chem</source><pubdate>1996</pubdate><volume>39</volume><issue>15</issue><fpage>2887</fpage><lpage>2893</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/jm9602928</pubid><pubid idtype="pmpid" link="fulltext">8709122</pubid></pubidlist></xrefbib></bibl><bibl id="B46"><title><p>Computational Approaches to Fragment and Substructure Discovery and Evaluation</p></title><aug><au><snm>van der Horst</snm><fnm>E</fnm></au><au><snm>IJzerman</snm><fnm>AP</fnm></au></aug><source>Fragment-Based Drug Discovery: A Practical Approach</source><publisher>Chichester, West Sussex, U.K.: John Wiley &amp; Sons, Ltd</publisher><editor>Zartler ER, Shapiro MJ</editor><pubdate>2008</pubdate></bibl><bibl id="B47"><title><p>A Chemogenomic Analysis of the Human Proteome: Application to Enzyme 
Families</p></title><aug><au><snm>Bernasconi</snm><fnm>P</fnm></au><au><snm>Min</snm><fnm>C</fnm></au><au><snm>Galasinski</snm><fnm>S</fnm></au><au><snm>Popa-Burke</snm><fnm>I</fnm></au><au><snm>Bobasheva</snm><fnm>A</fnm></au><au><snm>Coudurier</snm><fnm>L</fnm></au><au><snm>Birkos</snm><fnm>S</fnm></au><au><snm>Hallam</snm><fnm>R</fnm></au><au><snm>Janzen</snm><fnm>WP</fnm></au></aug><source>J Biomol Screen</source><pubdate>2007</pubdate><volume>12</volume><issue>7</issue><fpage>972</fpage><lpage>982</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1177/1087057107306759</pubid><pubid idtype="pmpid" link="fulltext">17942790</pubid></pubidlist></xrefbib></bibl><bibl id="B48"><title><p>ChEMBL</p></title><url>http://www.ebi.ac.uk/chembl/</url></bibl><bibl id="B49"><title><p>Screening the receptorome to discover the molecular targets for plant-derived psychoactive compounds: a novel approach for CNS drug discovery</p></title><aug><au><snm>Roth</snm><fnm>BL</fnm></au><au><snm>Lopez</snm><fnm>E</fnm></au><au><snm>Beischel</snm><fnm>S</fnm></au><au><snm>Westkaemper</snm><fnm>RB</fnm></au><au><snm>Evans</snm><fnm>JM</fnm></au></aug><source>Pharmacol Ther</source><pubdate>2004</pubdate><volume>102</volume><issue>2</issue><fpage>99</fpage><lpage>110</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.pharmthera.2004.03.004</pubid><pubid idtype="pmpid" link="fulltext">15163592</pubid></pubidlist></xrefbib></bibl><bibl id="B50"><title><p>The Universal Protein Resource (UniProt)</p></title><aug><au><cnm>The UniProt Consortium</cnm></au></aug><source>Nucl Acids Res</source><pubdate>2008</pubdate><volume>36</volume><issue>suppl_1</issue><fpage>D190</fpage><lpage>195</lpage><xrefbib><pubidlist><pubid idtype="pmcid">2238893</pubid><pubid idtype="pmpid" link="fulltext">18045787</pubid></pubidlist></xrefbib></bibl><bibl id="B51"><title><p>Database resources of the National Center for Biotechnology 
Information</p></title><aug><au><snm>Wheeler</snm><fnm>DL</fnm></au><au><snm>Barrett</snm><fnm>T</fnm></au><au><snm>Benson</snm><fnm>DA</fnm></au><au><snm>Bryant</snm><fnm>SH</fnm></au><au><snm>Canese</snm><fnm>K</fnm></au><au><snm>Chetvernin</snm><fnm>V</fnm></au><au><snm>Church</snm><fnm>DM</fnm></au><au><snm>DiCuccio</snm><fnm>M</fnm></au><au><snm>Edgar</snm><fnm>R</fnm></au><au><snm>Federhen</snm><fnm>S</fnm></au><etal/></aug><source>Nucl Acids Res</source><pubdate>2008</pubdate><issue>36 Database</issue><fpage>D13</fpage><lpage>D21</lpage><xrefbib><pubidlist><pubid idtype="pmcid">2238880</pubid><pubid idtype="pmpid" link="fulltext">18045790</pubid></pubidlist></xrefbib></bibl><bibl id="B52"><title><p>DrugBank: a comprehensive resource for in silico drug discovery and exploration</p></title><aug><au><snm>Wishart</snm><fnm>DS</fnm></au><au><snm>Knox</snm><fnm>C</fnm></au><au><snm>Guo</snm><fnm>AC</fnm></au><au><snm>Shrivastava</snm><fnm>S</fnm></au><au><snm>Hassanali</snm><fnm>M</fnm></au><au><snm>Stothard</snm><fnm>P</fnm></au><au><snm>Chang</snm><fnm>Z</fnm></au><au><snm>Woolsey</snm><fnm>J</fnm></au></aug><source>Nucl Acids Res</source><pubdate>2006</pubdate><volume>34</volume><issue>suppl_1</issue><fpage>D668</fpage><lpage>672</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkj067</pubid><pubid idtype="pmcid">1347430</pubid><pubid idtype="pmpid" link="fulltext">16381955</pubid></pubidlist></xrefbib></bibl><bibl id="B53"><title><p>GPCRDB</p></title><url>http://www.gpcr.org/7tm/</url></bibl><bibl id="B54"><title><p>GASTON</p></title><url>http://www.liacs.nl/~snijssen/gaston/</url></bibl><bibl id="B55"><title><p>A Quantitative Comparison of the Subgraph Miners MoFa, gSpan, FFSM, and Gaston</p></title><aug><au><snm>W&#246;rlein</snm><fnm>M</fnm></au><au><snm>Meinl</snm><fnm>T</fnm></au><au><snm>Fischer</snm><fnm>I</fnm></au><au><snm>Philippsen</snm><fnm>M</fnm></au></aug><source>Knowledge Discovery in Databases: PKDD 
2005</source><pubdate>2005</pubdate><fpage>392</fpage><lpage>403</lpage></bibl><bibl id="B56"><title><p>Fast Algorithms for Mining Association Rules in Large Databases</p></title><aug><au><snm>Agrawal</snm><fnm>R</fnm></au><au><snm>Srikant</snm><fnm>R</fnm></au></aug><source>Proceedings of the 20th International Conference on Very Large Data Bases: September 12 - 15 1994</source><publisher>Morgan Kaufmann Publishers, San Francisco, CA</publisher><pubdate>1994</pubdate><fpage>487</fpage><lpage>499</lpage></bibl><bibl id="B57"><title><p>PHYLIP (Phylogeny Inference Package) version 3.6. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle</p></title><aug><au><snm>Felsenstein</snm><fnm>J</fnm></au></aug><pubdate>2005</pubdate></bibl><bibl id="B58"><title><p>MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) Software Version 4.0</p></title><aug><au><snm>Tamura</snm><fnm>K</fnm></au><au><snm>Dudley</snm><fnm>J</fnm></au><au><snm>Nei</snm><fnm>M</fnm></au><au><snm>Kumar</snm><fnm>S</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2007</pubdate><volume>24</volume><issue>8</issue><fpage>1596</fpage><lpage>1599</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/molbev/msm092</pubid><pubid idtype="pmpid" link="fulltext">17488738</pubid></pubidlist></xrefbib></bibl><bibl id="B59"><title><p>Solution Phase Synthesis of Chemical Libraries for Lead Discovery</p></title><aug><au><snm>Garr</snm><fnm>CD</fnm></au><au><snm>Peterson</snm><fnm>JR</fnm></au><au><snm>Schultz</snm><fnm>L</fnm></au><au><snm>Oliver</snm><fnm>AR</fnm></au><au><snm>Underiner</snm><fnm>TL</fnm></au><au><snm>Cramer</snm><fnm>RD</fnm></au><au><snm>Ferguson</snm><fnm>AM</fnm></au><au><snm>Lawless</snm><fnm>MS</fnm></au><au><snm>Patterson</snm><fnm>DE</fnm></au></aug><source>J Biomol Screen</source><pubdate>1996</pubdate><volume>1</volume><issue>4</issue><fpage>179</fpage><lpage>186</lpage><xrefbib><pubid 
idtype="doi">10.1177/108705719600100404</pubid></xrefbib></bibl></refgrp>
</bm></art>