Predicting protein pK(a) by environment similarity
STORCHI, LORIANO;
2009-01-01
Abstract
A statistical method to predict protein pKa has been developed by using the 3D structure of a protein and a database of 434 experimental protein pKa values. Each pKa in the database is associated with a fingerprint that describes the chemical environment around an ionizable residue. A computational tool, MoKaBio, has been developed to automatically identify ionizable residues in a protein, generate fingerprints that describe the chemical environment around such residues, and predict pKa from the experimental pKa values in the database by using a similarity metric. The method, which correctly retrieved the pKa of 429 of the 434 ionizable sites in the database, was cross-validated by leave-one-out and yielded a root mean square error (RMSE) of 0.95, a result that is superior to that obtained by using the Null Model (RMSE 1.07) and other well-established protein pKa prediction tools. This novel approach is suitable for rationalizing protein pKa by comparing the region around the ionizable site with similar regions whose ionizable site pKa is known. The pKa of residues that have a unique environment not represented in the training set cannot be predicted accurately; however, the method offers the advantage of being trainable to increase its predictive power.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.