Authors
George Khoury, James B. Smadbeck, Chris A. Kieslich, Christodoulos A. Floudas
Abstract
• Interplay between accurate protein structure prediction and successful de novo protein design.
• Reviews current state-of-the-art protein structure prediction methods and challenges.
• Reviews features of successful de novo protein designs.
• Biotechnology applications in therapeutics, biocatalysts, and nanomaterials are summarized.

In the postgenomic era, the medical and biological fields are advancing faster than ever. However, before the power of full-genome sequencing can be fully realized, the connection between amino acid sequence and protein structure, known as the protein folding problem, needs to be elucidated. The protein folding problem remains elusive, with significant difficulties still arising when modeling amino acid sequences that lack an identifiable template. Understanding protein folding will allow for unforeseen advances in protein design, often referred to as the inverse protein folding problem. Despite challenges in protein folding, de novo protein design has recently demonstrated significant success via computational techniques. We review advances and challenges in protein structure prediction and de novo protein design, and highlight their interplay in successful biotechnological applications.
Glossary
Cartesian space minimization: a process operating on the variables as 3D vectors of x, y, and z coordinates in order to reduce the potential energy of a conformer.
EC50: a metric for the concentration of compound at half the maximal value on a dose-response curve. The curve is usually sigmoidal.
A distance-dependent statistical potential that scores models to aid in selecting near-native conformations of a target protein. It utilizes information about the relative plane orientation of interacting pairs of atoms.
A metric that approximately represents the percentage of residues located in the correct position after structural alignment. It is more robust than RMSD.
Hot spots: key interactions at the interface of a protein–protein complex. Many hot spots include salt bridges where oppositely charged side chains attract, hydrogen bonds, and/or ideal van der Waals interactions subject to shape complementarity.
IC50: a metric for the half-maximal inhibitory concentration in a competitive binding assay. The curve is usually sigmoidal.
A structure prediction method using multiple threading alignments to templates and fragment assembly.
A method that generates structure predictions from high-scoring alignments of a target sequence to a template, using information from ten threading programs.
A protein structure homology modeling program that generates structures satisfying spatial constraints.
Molecular dynamics: an algorithm that solves the equations of motion iteratively over time and is used to sample conformational space in a physically meaningful way.
Monte Carlo: an algorithm reliant on randomly sampling the sequence or structural space according to a probability distribution.
A difficult class of decision problems that have not been proven to be solvable by an algorithm within polynomial time, O(n^k).
A protein structure prediction program that assembles fragments without any global template information.
A program that constructs a full protein model using only α-carbon traces.
RMSD (root-mean-square deviation): a metric that measures the average distance between two structurally aligned sets of atoms. It is often used as a metric for the quality of a prediction, and is often computed with the α-carbon atoms. A predicted structure with an RMSD to the native of ≤3 Å is considered good enough for subsequent computational studies.
Rotamer: a statistically abundant side-chain conformation.
A fold recognition method that combines structural information through sequence profiles of structure fragments, secondary structure predictions, and dynamic programming to generate an alignment of a target sequence to a template.
Torsion angle space minimization: a process operating on a reduced set of variables representing torsion angles, which control the distance between the first and fourth atom in a series of four atoms, in order to reduce the potential energy of a conformer.
Z-score: computed as (Value − Mean)/StdDev, it is a metric denoting the separation of a value from its counterparts. It is useful for assessing the significance of top structure predictions compared to the entire population of predictions from other methods.
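
Two of the metrics defined above lend themselves to a short worked example: the average distance between two structurally aligned sets of atoms (RMSD) and the normalized separation (Value − Mean)/StdDev. The following is a minimal Python sketch, not a reference implementation. The function names `rmsd` and `z_score` and all coordinates and scores are hypothetical. It assumes the two structures have already been superimposed, and it uses the population standard deviation, which the definition above does not specify.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized, already superimposed coordinate sets.

    Each set is a list of (x, y, z) tuples, e.g. alpha-carbon positions in Å.
    No superposition is performed here (assumption: done beforehand).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def z_score(value, population):
    """(value - mean) / standard deviation, using the population std (assumption)."""
    if not population:
        raise ValueError("population must be non-empty")
    mean = sum(population) / len(population)
    variance = sum((x - mean) ** 2 for x in population) / len(population)
    std = math.sqrt(variance)
    if std == 0.0:
        raise ValueError("standard deviation is zero; Z-score is undefined")
    return (value - mean) / std


# Hypothetical toy data: a two-atom "native" and a model shifted by 3 Å along z.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
print(rmsd(native, model))  # 3.0, right at the ≤3 Å threshold quoted above

# Hypothetical scores from a population of predictions.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(8, scores))  # 1.5
```

In practice the superposition step matters as much as the distance calculation, so a real comparison would first align the model onto the native structure before calling a function like `rmsd`.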