Mittal Meetanshu, Singh Rahul, Raj Singh Tiratha
RNA-binding proteins play a significant role in regulating patterns of gene expression during development. To improve our understanding of organism development, there is therefore a continuing need for a comprehensive a priori method for predicting RNA-binding protein pockets. We present here a Support Vector Machine (SVM) based approach for predicting these pockets. The method employs two datasets: the protein sequences of RNA-binding protein pockets and those of non-RNA-binding protein pockets, which form the positive and negative datasets, respectively, that are fed into the SVM model. Before being fed to the SVM, both datasets were passed through several sorting steps in WEKA, which refined the selection process and yielded ranked features for each dataset. The analysis was applied to three different feature sets, viz. FPOCKET, Zernike and shell features. The results suggest that the top 10 shell features are the most important and play a pivotal role in the classification and prediction of ligand binding sites in RNA-binding proteins. On evaluation, the method achieved an accuracy of 89.3%. This study demonstrates that it is possible to predict ligand binding sites in RNA-binding protein pockets from their sequences.
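The workflow outlined above (rank the features, keep the top-ranked ones, train an SVM on positive and negative examples) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the data are synthetic rather than FPOCKET, Zernike or shell descriptors, the class-separation score is a simple stand-in for the WEKA ranking step, the linear SVM is trained by plain stochastic subgradient descent on the hinge loss, and all function names and parameter values (including k = 3) are hypothetical.

```python
import random
import statistics


def make_data(n, rng, n_features=10, informative=(0, 1, 2)):
    """Synthetic stand-in for pocket descriptors; labels are +1 / -1."""
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 1.5 * label  # shift only the informative features
        X.append(row)
        y.append(label)
    return X, y


def rank_features(X, y):
    """Return feature indices sorted by a simple class-separation score.

    Assumption: this filter score merely stands in for a WEKA ranking step.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    scores = []
    for j in range(len(X[0])):
        p = [row[j] for row in pos]
        n = [row[j] for row in neg]
        spread = statistics.pstdev(p) + statistics.pstdev(n) + 1e-12
        scores.append(abs(statistics.fmean(p) - statistics.fmean(n)) / spread)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def select(X, cols):
    """Keep only the given feature columns."""
    return [[row[j] for j in cols] for row in X]


def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=30, seed=0):
    """Linear SVM via stochastic subgradient descent on the L2-regularised hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj - lr * lam * wj for wj in w]  # regularisation step
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, row):
    score = sum(wj * xj for wj, xj in zip(w, row)) + b
    return 1 if score >= 0 else -1


def run_demo(k=3, seed=0):
    rng = random.Random(seed)
    X_train, y_train = make_data(200, rng)
    X_test, y_test = make_data(100, rng)
    top = rank_features(X_train, y_train)[:k]  # keep the k top-ranked features
    w, b = train_linear_svm(select(X_train, top), y_train)
    hits = sum(predict(w, b, row) == label
               for row, label in zip(select(X_test, top), y_test))
    return top, hits / len(y_test)


top_features, accuracy = run_demo()
print("top-ranked features:", top_features)
print("held-out accuracy:", accuracy)
```

In practice the ranking and classification would be applied to each real feature set separately, and a kernel SVM from an established library would likely replace this hand-written linear model.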