Volume 6 Number 2 (Feb. 2011)
JCP 2011 Vol.6(2): 321-328 ISSN: 1796-203X
doi: 10.4304/jcp.6.2.321-328

Protein Remote Homology Detection and Fold Recognition based on Features Extracted from Frequency Profiles

Lei Lin1, 2, Bin Liu3, Xiaolong Wang1, 3, Xuan Wang3, Buzhou Tang3
1School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
2Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China
3Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China


Abstract—Protein remote homology detection and fold recognition are central problems in bioinformatics. Discriminative methods based on support vector machines (SVMs) are currently the most effective and accurate approaches to these problems. Because the performance of an SVM depends on how proteins are vectorized, a suitable representation of the protein sequence is a key step in SVM-based methods. This paper presents two kinds of profile-level building blocks of proteins, binary profiles and N-nary profiles, which capture the evolutionary information contained in the protein sequence frequency profile. The frequency profiles, calculated from the multiple sequence alignments output by PSI-BLAST, are converted into binary profiles or N-nary profiles. Each protein sequence is then transformed into a fixed-dimension feature vector by counting the occurrences of each binary profile or N-nary profile, and the resulting vectors are input to support vector machines. The latent semantic analysis (LSA) model, an efficient feature extraction algorithm, is adopted to further improve performance. Experiments on protein remote homology detection and fold recognition show that the methods based on profile-level building blocks give better results than related methods.

Index Terms—fold recognition; remote homology detection; support vector machine; latent semantic analysis; frequency profiles
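
The vectorization step described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a frequency profile is a list of columns, each holding 20 amino-acid frequencies, and that a column becomes a binary profile by marking every residue whose frequency exceeds a threshold. The threshold value, the helper names (`toy_column`, `to_binary_profile`, `build_vocabulary`, `vectorize`), and the toy data are all hypothetical; N-nary profiles and the LSA step are not covered.

```python
from collections import Counter

# Standard 20 amino acids; the ordering is an assumption of this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Placeholder cut-off, not a value taken from the paper.
THRESHOLD = 0.13


def toy_column(freqs):
    """Build one 20-value frequency column from a {residue: frequency} dict."""
    return [freqs.get(aa, 0.0) for aa in AMINO_ACIDS]


def to_binary_profile(column, threshold=THRESHOLD):
    """Mark each residue whose frequency exceeds the threshold with '1'."""
    return "".join("1" if f > threshold else "0" for f in column)


def build_vocabulary(profiles, threshold=THRESHOLD):
    """Collect the distinct binary profiles seen in a set of proteins."""
    seen = {to_binary_profile(col, threshold) for p in profiles for col in p}
    return sorted(seen)


def vectorize(profile, vocabulary, threshold=THRESHOLD):
    """Count how often each vocabulary entry occurs in one protein."""
    counts = Counter(to_binary_profile(col, threshold) for col in profile)
    return [counts[bp] for bp in vocabulary]


# Two toy "proteins", each a short list of frequency-profile columns.
protein_a = [
    toy_column({"A": 0.6, "G": 0.4}),
    toy_column({"A": 0.6, "G": 0.4}),
    toy_column({"L": 1.0}),
]
protein_b = [
    toy_column({"L": 1.0}),
    toy_column({"K": 0.5, "R": 0.5}),
]

vocabulary = build_vocabulary([protein_a, protein_b])
vector_a = vectorize(protein_a, vocabulary)
vector_b = vectorize(protein_b, vocabulary)
```

Both proteins map to vectors of the same length regardless of sequence length, which is what allows them to be passed to a standard SVM. In practice the vocabulary would be built from the training set, and the columns would come from PSI-BLAST alignments rather than hand-written dictionaries.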


Cite: Lei Lin, Bin Liu, Xiaolong Wang, Xuan Wang, Buzhou Tang, "Protein Remote Homology Detection and Fold Recognition based on Features Extracted from Frequency Profiles," Journal of Computers, vol. 6, no. 2, pp. 321-328, 2011.
