| BindN-RF: Prediction of DNA-Binding Residues Using Random Forests | ||
|
BindN-RF applies Random Forests (RFs) to the
sequence-based prediction of DNA-binding residues in proteins, using
biochemical features and evolutionary information. The RF
classifier was constructed using the PDNA-62 dataset, which was extracted from the
Protein Data
Bank. For a query sequence, the system performs
a three-iteration PSI-BLAST search against the
UniProtKB
database to derive evolutionary information. Although BindN-RF
runs more slowly than our previous system, BindN, the RF classifier achieves higher accuracy
for DNA-binding site prediction (78.06% sensitivity and
78.22% specificity, estimated from cross-validation). The
performance of BindN-RF was further verified on two
separate test datasets (TestPDB and TestSP). Please send your comments to liangjw@clemson.edu.
|
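
The sensitivity and specificity figures quoted above are per-residue measures. The page does not spell out how they are computed, so the sketch below assumes the standard definitions: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), with binding residues as the positive class. The function name and the example labels are hypothetical and are not taken from BindN-RF or its datasets.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for per-residue binary labels.

    Assumed encoding: 1 = DNA-binding residue, 0 = non-binding residue.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1          # binding residue predicted as binding
        elif truth == 1:
            fn += 1          # binding residue missed
        elif pred == 0:
            tn += 1          # non-binding residue predicted as non-binding
        else:
            fp += 1          # non-binding residue predicted as binding

    # Guard against division by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical labels for an 8-residue sequence (illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2%} specificity={spec:.2%}")
```

Reporting both values matters for this task because binding residues are typically a minority of a protein's residues, so a single overall accuracy could look high even if most binding sites were missed.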