Several single-domain proteins fold without any detectable intermediates, crossing a single rate-limiting barrier, and are termed "two-state folders". Since transition-state structures are highly unstable, the folding mechanism of such proteins is commonly deciphered using site-directed mutagenesis, in which the folding rates of the mutants are measured relative to the wild type. However, such experiments are highly time- and resource-intensive. An alternative is to employ sequence- and structure-based properties to predict the changes in the rate constants. While it is relatively easy to account for mutational effects on stability, it is highly challenging to predict the corresponding effect on rate constants. We propose here a simple knowledge-based methodology, formulated from amino-acid properties using a multiple linear regression approach, to predict the change in folding rate upon mutation.


We benchmark this method against an experimental database of 790 single point mutations from 23 different two-state-like proteins. Mutants were classified according to secondary structure, accessible surface area and position in the sequence. For each class, the three features showing the best relationship with the change in folding rate were shortlisted. Moreover, the prediction for each class was optimized using a window length to account for the influence of neighboring residues on the mutated position. We obtained a self-consistency mean absolute error (MAE) of 0.36 s⁻¹ and a Pearson correlation coefficient (PCC) of 0.81. Jackknife cross-validation resulted in an MAE of 0.42 s⁻¹ and a PCC of 0.73. Our protocol outperforms the only other method available for predicting the change in folding rate upon point mutation, which shows a PCC of 0.54 after 10-fold cross-validation. A web server, "Folding RaCe", has been developed for prediction purposes.
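
The evaluation scheme described above (a three-feature multiple linear regression, window-averaged residue properties, self-consistency versus jackknife assessment by MAE and PCC) can be illustrated with a minimal sketch. It is not the Folding RaCe implementation: the property scale `TOY_SCALE`, the truncation of windows at the chain termini, the use of ordinary least squares, and the synthetic data in `demo` are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np

# Hypothetical per-residue property scale (illustrative values, not from the paper).
TOY_SCALE = {"A": 1.8, "G": -0.4, "L": 3.8, "K": -3.9, "E": -3.5, "V": 4.2}


def window_average(sequence, position, window, scale):
    """Mean property over `window` residues centred on `position` (0-based).

    Assumption: the window is simply truncated at the chain termini.
    """
    half = window // 2
    start = max(0, position - half)
    end = min(len(sequence), position + half + 1)
    return sum(scale[aa] for aa in sequence[start:end]) / (end - start)


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply fitted coefficients to a feature matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


def pcc(y_true, y_pred):
    """Pearson correlation coefficient."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def jackknife_predictions(X, y):
    """Leave-one-out: refit without mutant i, then predict mutant i."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        coef = fit_mlr(X[mask], y[mask])
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    return preds


def demo():
    """Compare self-consistency and jackknife scores on synthetic data."""
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 3))            # three features per mutant
    y = 0.5 + X @ np.array([1.0, -0.7, 0.3]) + rng.normal(scale=0.4, size=40)

    self_pred = predict_mlr(fit_mlr(X, y), X)   # fitted and tested on the same data
    jack_pred = jackknife_predictions(X, y)     # each mutant held out in turn

    print("self-consistency: MAE=%.2f PCC=%.2f" % (mae(y, self_pred), pcc(y, self_pred)))
    print("jackknife:        MAE=%.2f PCC=%.2f" % (mae(y, jack_pred), pcc(y, jack_pred)))


if __name__ == "__main__":
    demo()
```

Because the jackknife prediction for each mutant comes from a model that never saw that mutant, its error is typically somewhat larger than the self-consistency error, which is the pattern reported above (MAE 0.42 versus 0.36).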
