Year: 2020 Volume: 8 Issue: 2 Page Range: 94 - 101 Text Language: English Index Date: 31-10-2020

NOVEL OPINION MINING SYSTEM FOR MOVIE REVIEWS IN TURKISH

Abstract:
Opinion Mining (OM) turns opinions available online into useful knowledge. This paper presents a novel opinion mining system for reviews in Turkish. The proposed system uses Word2Vec, a state-of-the-art text feature extraction method, together with an ensemble learning algorithm for classification. The challenging benchmark “IMDB Movies Reviews” dataset is used for experimental comparison and verification. The performance of the proposed method is compared with well-known machine learning algorithms: Support Vector Machine (SVM), K-Nearest Neighbours (KNN), and Naive Bayes (NB). The ensemble methods tested are Random Forest (RF), AdaBoost Classifier, and Gradient-Boosting Classifier (GBC). The experiments on this dataset show that SVM, KNN, and NB perform comparably. However, the performance, robustness, and stability of the system improve significantly when RF ensemble learning is adopted together with the Word2Vec feature vector and suitable pre-processing of the data. In addition, the proposed method is compared with a state-of-the-art ensemble method and shows superior performance.
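The pipeline outlined in the abstract (word vectors as features, then an ensemble of classifiers) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the three-dimensional `TOY_EMBEDDINGS` table stands in for vectors trained with Word2Vec, averaging is one common way to pool word vectors into a review-level feature vector, and the three threshold rules with a majority vote stand in for a real ensemble such as Random Forest.

```python
from collections import Counter

# Hypothetical 3-dimensional embeddings; a real system would load vectors
# trained with Word2Vec on a Turkish corpus.
TOY_EMBEDDINGS = {
    "harika": [0.9, 0.1, 0.0],    # "great"
    "güzel": [0.7, 0.3, 0.0],     # "nice"
    "kötü": [-0.8, 0.2, 0.1],     # "bad"
    "sıkıcı": [-0.6, 0.4, 0.1],   # "boring"
    "film": [0.0, 0.0, 1.0],      # "movie"
}


def review_vector(tokens, embeddings=TOY_EMBEDDINGS, dim=3):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def majority_vote(predictions):
    """Combine base-classifier labels by picking the most common one."""
    return Counter(predictions).most_common(1)[0][0]


# Three hypothetical base classifiers, each a simple threshold rule
# on the pooled review vector.
BASE_CLASSIFIERS = [
    lambda v: "pos" if v[0] > 0.0 else "neg",
    lambda v: "pos" if v[0] > 0.2 else "neg",
    lambda v: "pos" if v[0] - v[1] > 0.0 else "neg",
]


def classify(tokens):
    """Pool the tokens into one vector, then let the ensemble vote."""
    vec = review_vector(tokens)
    return majority_vote([clf(vec) for clf in BASE_CLASSIFIERS])
```

Calling `classify(["harika", "film"])` pools the two word vectors and lets the three rules vote on the result. In practice the base learners would be trained decision trees rather than hand-written rules.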
Keywords:

Document Type: Article Article Type: Research Article Access Type: Open Access
APA Abdul Hafez A (2020). NOVEL OPINION MINING SYSTEM FOR MOVIE REVIEWS IN TURKISH. International Journal of Intelligent Systems and Applications in Engineering, 8(2), 94 - 101.
Chicago Abdul Hafez Abdul Hafez NOVEL OPINION MINING SYSTEM FOR MOVIE REVIEWS IN TURKISH. International Journal of Intelligent Systems and Applications in Engineering 8, no.2 (2020): 94 - 101.
MLA Abdul Hafez Abdul Hafez NOVEL OPINION MINING SYSTEM FOR MOVIE REVIEWS IN TURKISH. International Journal of Intelligent Systems and Applications in Engineering, vol.8, no.2, 2020, pp.94 - 101.
AMA Abdul Hafez A NOVEL OPINION MINING SYSTEM FOR MOVIE REVIEWS IN TURKISH. International Journal of Intelligent Systems and Applications in Engineering. 2020; 8(2): 94 - 101.
Vancouver Abdul Hafez A NOVEL OPINION MINING SYSTEM FOR MOVIE REVIEWS IN TURKISH. International Journal of Intelligent Systems and Applications in Engineering. 2020; 8(2): 94 - 101.
IEEE Abdul Hafez A "NOVEL OPINION MINING SYSTEM FOR MOVIE REVIEWS IN TURKISH." International Journal of Intelligent Systems and Applications in Engineering, 8, pp.94 - 101, 2020.
ISNAD Abdul Hafez, Abdul Hafez. "NOVEL OPINION MINING SYSTEM FOR MOVIE REVIEWS IN TURKISH". International Journal of Intelligent Systems and Applications in Engineering 8/2 (2020), 94-101.