Mansur Alp TOCOĞLU
(Manisa Celal Bayar University, Department of Software Engineering, Manisa, Turkey)
Year: 2020, Volume: 3, Issue: 3, ISSN: 2636-8129, Pages: 296-308, Language: English

Sentiment Analysis for Software Engineering Domain in Turkish
This study provides a model for identifying the sentiment of comments about software engineering education and professional life posted on social media and microblogging sites. Such a pre-trained model can be useful for evaluating feedback from students and software engineers about software engineering. The problem is treated as a supervised text classification problem, which requires a dataset for the training process. To build this dataset, a survey was conducted among students of a software engineering department. In the classification phase, we represent the corpus using conventional and word-embedding text representation schemes and report accuracy, recall, and precision for conventional supervised machine learning classifiers and well-known deep learning architectures. In the experimental analysis, we first obtain classification results using three conventional text representation schemes and three N-gram models in conjunction with five classifiers (i.e., naïve Bayes, k-nearest neighbor, support vector machines, random forest, and logistic regression). In addition, we evaluate the performance of three ensemble learners and three deep learning architectures (i.e., convolutional neural network, recurrent neural network, and long short-term memory). The empirical results indicate that deep learning architectures outperform conventional supervised machine learning classifiers and ensemble learners.
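As a rough illustration of one combination named in the abstract (an N-gram bag-of-words representation fed to a naïve Bayes classifier), the sketch below may help. It is not the study's implementation: the helper `ngrams`, the class `NaiveBayes`, the add-one smoothing, and the toy English sentences are all assumptions made for this example, and the study's own survey data and settings are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return word n-grams of length 1..n_max from a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        feats.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return feats


class NaiveBayes:
    """Hypothetical multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels, n_max=2):
        self.n_max = n_max
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngrams(text, n_max)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for feat in ngrams(text, self.n_max):
                if feat in self.vocab:  # skip n-grams never seen in training
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, invented examples (assumption: NOT the survey data from the study).
train_texts = [
    "i love writing code",
    "the courses are great and useful",
    "debugging is boring and stressful",
    "the workload is terrible",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("the courses are useful")
print(prediction)
```

Changing `n_max` switches between unigram, bigram, and trigram feature sets, which is one plausible reading of the "three N-gram models" mentioned above. A real experiment would more likely rely on an established library and a held-out test set to compute accuracy, recall, and precision.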
Journal, Research Article, Open Access
