Global Vectors Representation of Protein Sequences and Its Application for Predicting Self-Interacting Proteins with Multi-Grained Cascade Forest Model.

Global Vectors Representation of Protein Sequences and Its Application for Predicting Self-Interacting Proteins with Multi-Grained Cascade Forest Model.

Chen, Zhan-Heng;You, Zhu-Hong;Zhang, Wen-Bo;Wang, Yan-Bin;Cheng, Li;Alghazzawi, Daniyal;
genes 2019 Vol. 10
181
chen2019globalgenes

Abstract

Self-interacting proteins (SIPs) is of paramount importance in current molecular biology. There have been developed a number of traditional biological experiment methods for predicting SIPs in the past few years. However, these methods are costly, time-consuming and inefficient, and often limit their usage for predicting SIPs. Therefore, the development of computational method emerges at the times require. In this paper, we for the first time proposed a novel deep learning model which combined natural language processing (NLP) method for potential SIPs prediction from the protein sequence information. More specifically, the protein sequence is de novo assembled by . Then, we obtained the global vectors representation for each protein sequences by using natural language processing (NLP) technique. Finally, based on the knowledge of known self-interacting and non-interacting proteins, a multi-grained cascade forest model is trained to predict SIPs. Comprehensive experiments were performed on and datasets, which obtained an accuracy rate of 91.45% and 93.12%, respectively. From our evaluations, the experimental results show that the use of amino acid semantics information is very helpful for addressing the problem of sequences containing both self-interacting and non-interacting pairs of proteins. This work would have potential applications for various biological classification problems.

Access

Citation

ID: 66008
Ref Key: chen2019globalgenes
Use this key to autocite in SciMatic or Thesis Manager

References

Blockchain Verification

Account:
NFT Contract Address:
0x95644003c57E6F55A65596E3D9Eac6813e3566dA
Article ID:
66008
Unique Identifier:
E924
Network:
Scimatic Chain (ID: 481)
Loading...
Blockchain Readiness Checklist
Authors
Abstract
Journal Name
Year
Title
5/5
Creates 1,000,000 NFT tokens for this article
Token Features:
  • ERC-1155 Standard NFT
  • 1 Million Supply per Article
  • Transferable via MetaMask
  • Permanent Blockchain Record
Blockchain QR Code
Scan with Saymatik Web3.0 Wallet

Saymatik Web3.0 Wallet