Optimization of Discrete Wavelet Transform Feature Representation and Hierarchical Classification of G-Protein Coupled Receptor Using Firefly Algorithm and Particle Swarm Optimization

Featured Application: This article belongs to the Section Computing and Artificial Intelligence. Ineffective protein feature representation poses problems in protein classification in hierarchical structures. Discrete wavelet transform (DWT) is a feature representation method which generates global...

Full description

Bibliographic Details
Published in:Applied Sciences (Switzerland)
Main Author: Kamal N.A.M.; Bakar A.A.; Zainudin S.
Format: Article
Language:English
Published: MDPI 2022
Online Access:https://www.scopus.com/inward/record.uri?eid=2-s2.0-85143647233&doi=10.3390%2fapp122312011&partnerID=40&md5=7b93f9a450064a25bbc61a7cbc22abaf
Description
Summary:Featured Application: This article belongs to the Section Computing and Artificial Intelligence. Ineffective protein feature representation poses problems in protein classification in hierarchical structures. Discrete wavelet transform (DWT) is a feature representation method which generates global and local features based on different wavelet families and decomposition levels. To represent protein sequences, the proper wavelet family and decomposition level must be selected. This paper proposed a hybrid optimization method using particle swarm optimization and the firefly algorithm (FAPSO) to choose the suitable wavelet family and decomposition level of wavelet transformation for protein feature representation. The suggested approach improved on the work of earlier researchers who, in most cases, manually selected the wavelet family and level of decomposition based solely on experience and not on data. The paper also applied the virtual class methods to overcome the error propagation problems in hierarchical classification. The effectiveness of the proposed method was tested on a G-Protein Coupled Receptor (GPCR) protein data set consisting of 5 classes at the family level, 38 classes at the subfamily level, and 87 classes at the sub-subfamily level. Based on the result obtained, the most selected wavelet family and decomposition level chosen to represent GPCR classes by FAPSO are Biorthogonal wavelets and decomposition level 1, respectively. The experimental results show that the representation of GPCR protein using the FAPSO algorithm with virtual classes can yield 97.9%, 86.9%, and 81.3% classification accuracy at the family, subfamily, and sub-subfamily levels, respectively. In conclusion, the result shows that the selection of optimized wavelet family and decomposition level by the FAPSO algorithm, and the virtual class method can be potentially used as the feature representation method and a hierarchical classification method for GPCR protein. © 2022 by the authors.
ISSN:20763417
DOI:10.3390/app122312011