SH3 domains are peptide recognition modules that mediate the assembly of diverse biological complexes. [...] its strong preference for an Asp-Tyr motif downstream of the Arg residue of the canonical class II motif (Figure 1).

Figure 2. The intrinsic specificities of yeast SH3 domains.

To assess the specificity contribution from different elements in the binding profiles, we separately quantified the scores for positions within and outside the core motif for the various specificity profiles (Figure 2B). The core positions for classes I and II contribute only roughly half of the value, with the other half contributed by other positions that define distinct specificity niches. Analogously, residues outside the core positions contribute approximately the same level of specificity for the unique sets of ligands recognized by Lsb1/Pin3 and Boi1/Boi2 (Figure 2B). For class III domains, we found that recognition of proline accounts for approximately 60% of the [...] is the size of the yeast proteome) was determined.

We find that approximately 10% of two-hybrid positives rank among the top ten hits predicted by the PWM of the associated SH3 domain (Figure 4, dashed line). The fraction of yeast two-hybrid hits with peptide sequences ranked among the top ten PWM-predicted ligands increases to more than 25% when considering only interactions that were captured at least six times, suggesting that these interactions have a higher likelihood of representing bona fide SH3 domain ligands (Figure 4, solid line). The high fraction of yeast two-hybrid positives with high-scoring PWM matches, compared to the fraction predicted for random interactions, suggests that the detailed binding specificity uncovered by phage-derived PWMs was recapitulated using the yeast two-hybrid system.

Figure 4. Yeast two-hybrid hits contain high-scoring PWM matches.
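The ranking analysis described above can be illustrated with a minimal sketch. It assumes a log-odds PWM built from aligned peptides against a uniform amino acid background, and it scores each protein by its best-matching window. The function names, the pseudocount, and the uniform background are illustrative assumptions, not the scoring scheme used in this study.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequency


def build_pwm(aligned_peptides, pseudocount=1.0):
    """Build a log-odds PWM from equal-length, pre-aligned peptides."""
    width = len(aligned_peptides[0])
    pwm = []
    for pos in range(width):
        # Start every residue at the pseudocount so unseen residues get a finite score.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for peptide in aligned_peptides:
            counts[peptide[pos]] += 1
        total = sum(counts.values())
        pwm.append({aa: math.log2(counts[aa] / total / BACKGROUND)
                    for aa in AMINO_ACIDS})
    return pwm


def best_window_score(pwm, sequence):
    """Return the highest PWM score over all windows of the sequence."""
    width = len(pwm)
    if len(sequence) < width:
        return float("-inf")
    return max(
        # Residues outside the 20 standard amino acids contribute a neutral 0.0.
        sum(pwm[i].get(sequence[start + i], 0.0) for i in range(width))
        for start in range(len(sequence) - width + 1)
    )


def fraction_in_top_n(pwm, proteome, hits, n=10):
    """Fraction of experimental hits ranked among the top n PWM-scored proteins."""
    ranked = sorted(proteome,
                    key=lambda name: best_window_score(pwm, proteome[name]),
                    reverse=True)
    top = set(ranked[:n])
    return sum(hit in top for hit in hits) / len(hits)
```

Here `proteome` is a hypothetical mapping from protein name to sequence, and `hits` is a list of two-hybrid partners for one SH3 domain. Repeating the calculation with hits filtered by the number of times they were captured would give the comparison between the dashed and solid lines.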
Generation of a High-Confidence SH3 Domain Protein-Protein Interaction Network Using Bayesian Integration

Each experimental method has different strengths and biases, and the integration of data from independent techniques substantially increases the accuracy of the resulting dataset. We generated a yeast SH3 domain protein-protein interaction network and used a statistical approach based on Bayesian networks to assign each interaction a probability score. This score is based on the confidence level of the experimental data that defined the interaction, benchmarked against the gold-standard set (see Materials and Methods and Table S11). A Bayesian network formalism was chosen for the machine learning because it has been shown previously to perform well at integrating heterogeneous biological data. The gold-standard set is a list of manually curated interactions known to be mediated by a specific SH3 domain, compiled through an exhaustive literature search. Each interaction in the gold-standard set is supported by multiple experiments reported in one or more focused studies, which show the direct binding of the SH3 domain to its target and its functional relevance.

Each technique used in our analysis provides a quantitative measure: first, the phage-derived PWMs accurately represent relative binding affinities; second, interactions identified by SPOT peptide arrays can be binned and ranked based on intensity (see Materials and Methods); and third, interactions captured multiple times by yeast two-hybrid can be assigned a higher score than those captured only once. Furthermore, the different methods have complementary features. Whereas the phage display and SPOT peptide array signals correlate with and predict binding affinity, the yeast two-hybrid system identifies putative in vivo interactors of SH3 domains. We therefore integrated these datasets into a Bayesian model to identify highly likely SH3 domain-ligand interactions.
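As a rough illustration of how binned evidence from several methods can be combined into one probability, the sketch below uses a naive Bayes simplification in which the evidence sources are treated as conditionally independent. This is an assumption made for brevity and is not necessarily the network structure used in this study. The bin labels, the prior, and the function names are hypothetical.

```python
from collections import Counter


def likelihood_ratios(positives, negatives, pseudocount=1.0):
    """Per-bin ratio P(bin | gold-standard positive) / P(bin | negative).

    `positives` and `negatives` are lists of bin labels observed for one
    evidence source in the positive and negative benchmark sets.
    """
    bins = set(positives) | set(negatives)
    pos_counts, neg_counts = Counter(positives), Counter(negatives)
    # The pseudocount avoids zero or infinite ratios for sparsely populated bins.
    pos_total = len(positives) + pseudocount * len(bins)
    neg_total = len(negatives) + pseudocount * len(bins)
    return {
        b: ((pos_counts[b] + pseudocount) / pos_total)
           / ((neg_counts[b] + pseudocount) / neg_total)
        for b in bins
    }


def posterior_probability(evidence, ratio_tables, prior=0.01):
    """Combine evidence bins into a posterior probability of interaction.

    `evidence` maps a source name to the bin observed for one candidate
    interaction; sources or bins without a ratio are treated as uninformative.
    The prior value is an arbitrary placeholder.
    """
    odds = prior / (1.0 - prior)
    for source, bin_label in evidence.items():
        odds *= ratio_tables.get(source, {}).get(bin_label, 1.0)
    return odds / (1.0 + odds)
```

One ratio table would be estimated per method, for example from PWM rank bins, SPOT intensity bins, and yeast two-hybrid capture counts, and a candidate interaction supported by several informative bins then receives a higher posterior than one supported by a single weak observation.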
All interactions in the gold-standard set were mapped specifically to an SH3 domain and, where applicable, to the peptide sequence within the interacting partner (see Materials and Methods and Table S12). We generated a negative set using random protein pairs under the constraint of never sharing or being in adjacent cellular [...]