Year: 2021 Volume: 45 Issue: 2 Page Range: 138-148 Text Language: English DOI: 10.3906/biy-2009-4 Index Date: 29-07-2022

Prediction of host-pathogen protein interactions by extended network model

Abstract:
Knowledge of the pathogen-host interactions between species is essential for developing a solution strategy against infectious diseases. In vitro methods take extended periods of time to detect interactions and provide very few of the possible interaction pairs. Hence, modelling interactions between proteins has necessitated the development of computational methods. The main scope of this paper is integrating the known protein interactions between the host and pathogen organisms to improve the prediction success rate of unknown pathogen-host interactions. Thus, the true positive rate of the predictions was expected to increase. To perform this study extensively, several protein encoding methods and learning algorithms were tested. Along with human as the host organism, two different pathogen organisms were used in the experiments. For each combination of protein encoding and prediction method, the original prediction algorithm was first tested using only pathogen-host interactions, and the same method was then tested again after integrating the known protein interactions within each organism. The effect of merging the networks of pathogen-host interactions of different species on the prediction performance of state-of-the-art methods was also observed. Success was measured in terms of Matthews correlation coefficient, precision, recall, F1 score, and accuracy. Empirical results showed that integrating the host and pathogen interactions consistently yields better performance in almost all experiments.
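The five evaluation metrics named in the abstract can all be computed from the counts of a binary confusion matrix (interacting vs. non-interacting protein pairs). The following is a minimal sketch using the standard textbook definitions; the function name `classification_metrics` and the example counts are illustrative assumptions, not taken from the paper.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute MCC, precision, recall, F1 and accuracy from confusion counts.

    tp/fp/tn/fn are counts of true positives, false positives, true negatives
    and false negatives. Any ratio with a zero denominator is reported as 0.0
    (a common convention, assumed here).
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient: uses all four cells, which makes it
    # informative when positive and negative classes are imbalanced.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "mcc": mcc,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in classification_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Reporting the Matthews correlation coefficient alongside accuracy is a reasonable choice for interaction prediction, since known interacting pairs are usually far fewer than non-interacting ones and accuracy alone can look high for a trivial classifier.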
Keywords: machine learning; bioinformatics; protein networks; host-pathogen interactions; infectious diseases; protein-protein interactions

Document Type: Article Article Type: Research Article Access Type: Open Access
APA Kösesoy İ, Gök M, Kahveci T (2021). Prediction of host-pathogen protein interactions by extended network model. Turkish Journal of Biology, 45(2), 138-148. 10.3906/biy-2009-4
Chicago Kösesoy İrfan, Gök Murat, Kahveci Tamer. Prediction of host-pathogen protein interactions by extended network model. Turkish Journal of Biology 45, no. 2 (2021): 138-148. 10.3906/biy-2009-4
MLA Kösesoy İrfan, Gök Murat, Kahveci Tamer. Prediction of host-pathogen protein interactions by extended network model. Turkish Journal of Biology, vol. 45, no. 2, 2021, pp. 138-148. 10.3906/biy-2009-4
AMA Kösesoy İ, Gök M, Kahveci T. Prediction of host-pathogen protein interactions by extended network model. Turkish Journal of Biology. 2021; 45(2): 138-148. 10.3906/biy-2009-4
Vancouver Kösesoy İ, Gök M, Kahveci T. Prediction of host-pathogen protein interactions by extended network model. Turkish Journal of Biology. 2021; 45(2): 138-148. 10.3906/biy-2009-4
IEEE Kösesoy İ, Gök M, Kahveci T, "Prediction of host-pathogen protein interactions by extended network model." Turkish Journal of Biology, 45, pp. 138-148, 2021. 10.3906/biy-2009-4
ISNAD Kösesoy, İrfan et al. "Prediction of host-pathogen protein interactions by extended network model". Turkish Journal of Biology 45/2 (2021), 138-148. https://doi.org/10.3906/biy-2009-4