A Combined Cheminformatic and Bioinformatic Approach to Address the Proteolytic Stability Challenge in Peptide-Based Drug Discovery

We have created models to predict cleavage sites for several human proteases, including caspase-1, caspase-3, caspase-6, caspase-7, cathepsin B, cathepsin D, cathepsin G, cathepsin K, cathepsin L, elastase-2, granzyme A, granzyme B, matrix metallopeptidase-2 (MMP2), MMP7, MMP9, thrombin, and trypsin-1. Rather than representing the sequence pattern around a potential cleavage site as a series of flags, one for each of the 20 standard amino acids, we first represent each amino acid by its calculated properties. For these properties, we use validated cheminformatic descriptors of the individual amino acids, such as molecular weight, logP, and polar surface area. The cleavage-site-specific descriptors are then calculated through various combinations of the individual amino acid descriptors for the residues surrounding the cleavage site. Some of these combinations ignore the location of a residue, as long as it lies within a prescribed neighborhood of the potential cleavage site, whereas others are sensitive to the precise order of the residues in the sequence. The key advantage of this approach is that it allows meaningful calculations with nonstandard amino acids for which little or no data exist. Finally, using both docking and molecular dynamics simulations, we examine the potential and the limitations of protease crystal structures for guiding the design of proteolytically stable peptides.
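The descriptor construction described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions: the function name site_features, the window of four residues on each side of the scissile bond, the choice of mean/min/max as the order-insensitive combinations, and the use of the Kyte-Doolittle hydropathy index as a stand-in for a computed logP. The norleucine (Nle) hydropathy value is a placeholder copied from leucine purely for illustration.

```python
from statistics import mean

# Per-residue descriptors: (molecular weight in g/mol, Kyte-Doolittle hydropathy).
# Assumption: hydropathy stands in for a computed logP; this is NOT the
# descriptor set used in the paper, only an illustrative substitute.
DESCRIPTORS = {
    "A": (89.09, 1.8), "R": (174.20, -4.5), "N": (132.12, -3.5), "D": (133.10, -3.5),
    "C": (121.16, 2.5), "Q": (146.15, -3.5), "E": (147.13, -3.5), "G": (75.07, -0.4),
    "H": (155.16, -3.2), "I": (131.17, 4.5), "L": (131.17, 3.8), "K": (146.19, -3.9),
    "M": (149.21, 1.9), "F": (165.19, 2.8), "P": (115.13, -1.6), "S": (105.09, -0.8),
    "T": (119.12, -0.7), "W": (204.23, -0.9), "Y": (181.19, -1.3), "V": (117.15, 4.2),
}


def site_features(residues, site, flank=4, table=DESCRIPTORS):
    """Build features for a candidate cleavage between residues[site-1] and residues[site].

    `residues` is a list of residue codes, so multi-letter nonstandard codes work.
    Returns (pooled, positional):
      pooled     - order-insensitive: mean, min, max of each descriptor over the window
      positional - order-sensitive: descriptors concatenated in sequence order
    """
    if site - flank < 0 or site + flank > len(residues):
        raise ValueError("window extends past the end of the sequence")
    window = residues[site - flank: site + flank]
    vectors = [table[r] for r in window]

    pooled = []
    for column in zip(*vectors):  # one column per descriptor
        pooled.extend([mean(column), min(column), max(column)])

    positional = [value for vector in vectors for value in vector]
    return pooled, positional


# Standard residues: DEVD|G is a caspase-3-like motif, cleaved after the second D.
peptide = list("DEVDGSLK")
pooled, positional = site_features(peptide, site=4)

# Nonstandard residue: only a descriptor entry is needed, not a new flag.
# Nle has the same molecular formula as Leu; the hydropathy value is a placeholder.
extended = dict(DESCRIPTORS, Nle=(131.17, 3.8))
nle_pooled, nle_positional = site_features(
    ["D", "E", "V", "D", "G", "Nle", "L", "K"], site=4, table=extended
)
```

Either feature list could be passed to any standard classifier. The point of the sketch is that a nonstandard residue enters through one new row of the descriptor table, whereas a one-flag-per-amino-acid encoding would have no column for it.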