BindN-RF: Prediction of DNA-Binding Residues Using Random Forests
 
BindN-RF applies Random Forests (RFs) to sequence-based prediction of DNA-binding residues in proteins using biochemical features and evolutionary information.  The RF classifier has been constructed using the PDNA-62 dataset extracted from the Protein Data Bank.  For a query sequence, the system performs a three-iteration PSI-BLAST search against the UniProtKB database to derive evolutionary information.  Although BindN-RF runs more slowly than our previous system, BindN, the RF classifier achieves higher accuracy for DNA-binding site prediction (78.06% sensitivity and 78.22% specificity, estimated from cross-validation).  The performance of BindN-RF has further been verified using two separate test datasets (TestPDB and TestSP).  Please send your comments to liangjw@clemson.edu.
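The sensitivity and specificity figures quoted above are per-residue rates. As a minimal sketch of how such rates are computed from labelled residues (this is not the BindN-RF code; the function name and the toy labels are hypothetical, and 1 is assumed to mean "DNA-binding"):

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary per-residue labels.

    Assumption: 1 marks a DNA-binding residue, 0 a non-binding residue.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: fraction of true binding residues that were predicted as binding.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of true non-binding residues predicted as non-binding.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels for eight residues (made up for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Because binding residues are typically a minority of a protein sequence, reporting both rates together is more informative than overall accuracy alone.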

 


Paste your amino acid sequence in FASTA format: 
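For readers preparing input offline, the following is a minimal sketch of reading FASTA-formatted text into header/sequence pairs. It is not part of the server; the function name and example record are hypothetical, and whether BindN-RF accepts several records or a sequence without a header line is not stated here.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            # Assumption: a '>' header line is required before sequence data.
            raise ValueError("sequence data found before a '>' header line")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record, with the sequence wrapped over two lines.
example = """>example_protein hypothetical record
MKRHQ
LLTPE
"""
records = parse_fasta(example)
```

Wrapped sequence lines are joined into a single string, so the residue positions used for prediction are independent of how the input was line-wrapped.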


 


Reference: Wang, L., Yang, M.Q. and Yang, J.Y. (2009) Prediction of DNA-binding residues from protein sequence information using random forests. BMC Genomics, 10(Suppl 1):S1.

2011 Clemson University Department of Genetics and Biochemistry and Greenwood Genetic Center.