What is PepSite?

Short answer:

Given a peptide sequence and a protein structure, PepSite predicts where on the protein surface the peptide is likely to bind.

Long answer:

Protein-protein interactions are vital for all cellular mechanisms. A major class of such interactions is that in which a globular domain in one protein binds to a short peptide stretch in another. Phosphorylation/dephosphorylation events, other post-translational modifications, and dozens of signaling, targeting and trafficking processes are known to occur by way of this kind of interaction. Often, proteins sharing an interaction partner contain a common peptide pattern or linear motif known to mediate the interaction. For example, SH3 domains bind to a PxxP pattern, WW domains to PPxY, 14-3-3 domains to RxSxP, etc. Hundreds of new peptides are now known or predicted to interact with particular proteins. At the same time, structural genomics initiatives and the generally increased pace of structure determination have provided representative structures for many, if not most, globular domains. Together, these developments suggest the need for methods that specifically fit a peptide onto the surface of a 3D structure.
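To make the motif notation concrete, here is a minimal Python sketch that scans a sequence for the three example patterns, treating "x" as any residue. It is an illustration only and is not part of PepSite: the names MOTIFS, motif_to_regex and find_motifs are hypothetical, and a sequence match says nothing about whether the peptide actually binds.

```python
import re

# Example linear motifs from the text; "x" stands for any residue.
MOTIFS = {
    "SH3": "PxxP",
    "WW": "PPxY",
    "14-3-3": "RxSxP",
}


def motif_to_regex(motif):
    """Translate a motif such as 'PxxP' into a regex string ('x' -> '.')."""
    return "".join("." if c == "x" else re.escape(c) for c in motif)


def find_motifs(sequence):
    """Return {domain: [0-based start positions]} for motifs found in sequence."""
    hits = {}
    for domain, motif in MOTIFS.items():
        # A lookahead reports overlapping occurrences as well.
        pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
        starts = [m.start() for m in pattern.finditer(sequence)]
        if starts:
            hits[domain] = starts
    return hits
```

For instance, find_motifs("AAPPLPRAA") reports a single PxxP occurrence starting at position 2 (0-based).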

PepSite is just such an approach. We have constructed spatial position-specific scoring matrices (PSSMs) from a database of protein-peptide structures; these capture the preferred environment of each amino acid when bound as part of a peptide. The matrices are then used to score the surface of target proteins in order to find candidate binding sites for each residue of a particular peptide, and these per-residue sites are combined to suggest the potential binding site and rough orientation of the peptide. The method performs well in a benchmark, and we have shown that it is capable of identifying the true binding site of peptides and roughly orienting them on protein surfaces.

How can I use PepSite?

Using the form on the left-hand side of the page, provide the PDB code and chain identifier for your protein of interest (or upload your own structure in PDB format), as well as the query peptide sequence. Specify the peptide residues using standard one-letter codes. The supported residues are listed below.

To try PepSite on an example protein-peptide pair, hit Example on the form on the left.

Can I access this web server programmatically?

Yes! Please read the documentation.

What are the supported peptide residues?

PepSite supports all 20 standard amino acid residues, plus three post-translational modifications. Unknown residues are supported by the original PepSite web server, but not (yet) by this server.

The table below lists the supported non-standard residues, along with their three-letter codes used by the PDB and one-letter codes used by PepSite.

residue	PDB code	PepSite code
phosphoserine	SEP	J
phosphothreonine	TPO	Z
o-phosphotyrosine	PTR	B
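As an illustration of how the table can be applied, the following Python sketch converts a peptide given as PDB three-letter codes into a PepSite one-letter sequence and checks that a sequence uses only supported letters. The modified-residue mapping comes from the table above; the function names are hypothetical and this is not code from the PepSite server.

```python
# PDB three-letter code -> PepSite one-letter code (from the table above).
MODIFIED_RESIDUES = {"SEP": "J", "TPO": "Z", "PTR": "B"}

# The 20 standard amino acids with their usual one-letter codes.
STANDARD_RESIDUES = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

SUPPORTED_LETTERS = set(STANDARD_RESIDUES.values()) | set(MODIFIED_RESIDUES.values())


def to_pepsite_sequence(three_letter_codes):
    """Convert a list of PDB three-letter codes into a PepSite sequence string."""
    table = {**STANDARD_RESIDUES, **MODIFIED_RESIDUES}
    letters = []
    for code in three_letter_codes:
        key = code.upper()
        if key not in table:
            # Unknown residues are not supported by this server.
            raise ValueError(f"Unsupported residue: {code}")
        letters.append(table[key])
    return "".join(letters)


def is_supported_peptide(sequence):
    """True if the (upper-case) sequence is non-empty and uses only supported letters."""
    return bool(sequence) and all(ch in SUPPORTED_LETTERS for ch in sequence)
```

For example, to_pepsite_sequence(["ARG", "SER", "SEP", "ALA", "PRO"]) gives "RSJAP". Note that B, Z and J have different meanings in the IUPAC ambiguity notation (Asx, Glx and Leu/Ile), so sequences written for other tools may need checking before being submitted.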

How are molecular visualizations generated?

Molecular visualizations in this website are currently generated using JSmol v14.6.1.

Can I provide a UniProt ID or accession instead of a PDB code?

The standard usage is to provide a receptor protein structure via a PDB code (or by uploading a structure in PDB format). It is also possible to provide a UniProt ID or accession instead, but this option is currently only available via the PepSite API. In the current implementation, if a receptor protein is specified via a UniProt ID or accession, the corresponding PDB entries are identified using data from SIFTS and the structure with the best resolution is chosen. Resolution is currently the only criterion used by PepSite to select the PDB structure.
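The selection rule described above can be sketched in a few lines of Python. This is only an illustration of "pick the structure with the best resolution", not the server's actual code: the Candidate type and pick_best_resolution are hypothetical names, the candidate list is assumed to have been obtained from SIFTS beforehand, and skipping entries without a reported resolution is an assumption made here.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    pdb_id: str
    chain: str
    resolution: Optional[float]  # in angstroms; None if not reported


def pick_best_resolution(candidates):
    """Return the candidate with the lowest (best) resolution, or None."""
    # Assumption: entries with no reported resolution are ignored.
    resolved = [c for c in candidates if c.resolution is not None]
    if not resolved:
        return None
    # A lower value in angstroms means a better-resolved structure.
    return min(resolved, key=lambda c: c.resolution)
```

With placeholder entries such as Candidate("1abc", "A", 2.5), Candidate("2xyz", "B", 1.8) and Candidate("3nmr", "A", None), the function returns the 1.8 angstrom entry. Because resolution is the only criterion, a higher-resolution structure that covers less of the protein would still be preferred.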

How should I cite this web server?

When using PepSite 2, please cite:

PepSite: prediction of peptide-binding sites from protein surfaces.
Trabuco LG, Lise S, Petsalaki E, and Russell RB.
Nucleic Acids Res. 2012; 40(Web Server issue):W423-426.

Original PepSite reference:

Accurate prediction of peptide binding sites on protein surfaces.
Petsalaki E, Stark A, García-Urdiales E, and Russell RB.
PLoS Comput Biol. 2009 Mar;5(3):e1000335.

Is PepSite merely finding cavities or conserved surfaces or both?

In short, no, or at least not exclusively. We have rigorously tested whether PepSite predictions are simply conserved sites, and this is clearly not the case: PepSite greatly outperforms conservation-based binding site predictors at finding correct binding sites (see Petsalaki et al., PLoS Comput Biol, 2009). This is not that surprising, as binding sites for many peptides are known not to be conserved over the same evolutionary distances as catalytic or active sites.

Regarding cavities, it is true that many predictions overlap with the deepest cavities on protein surfaces, which is also generally true for many peptide binding sites (e.g., 14-3-3 domains, kinase ligands, etc.). However, this is by no means always the case. Consider, for example, the TPX2 peptide core YDAP with Aurora A kinase (2j4z chain A); here the correct binding site is identified and clearly does not overlap with the deepest cavity on the structure (the ATP binding site). We have done numerous tests in which we mutated the peptide and the binding protein in silico for well-known examples and found that these mutations always lead to poorer p-values, different binding sites, or both, indicating that the residue preferences encoded in the PepSite profiles play a clear role in identifying the correct binding site. We plan to describe this in more detail in a forthcoming paper.