Year: 2019 Volume: 27 Issue: 6 Page Range: 4102 - 4117 Text Language: English DOI: 10.3906/elk-1812-18 Index Date: 20-05-2020

A hybrid feature-selection approach for finding the digital evidence of web application attacks

Abstract:
The most critical challenge of web attack forensic investigations is the sheer amount of data and level of complexity. Machine learning technology might be an efficient solution for web attack analysis and investigation. Consequently, machine learning applications have been applied in various areas of information security and digital forensics, and have improved over time. Moreover, feature selection is a crucial step in machine learning; in fact, selecting an optimal feature subset could enhance the accuracy and performance of the predictive model. To date, there has not been an adequate approach to select optimal features for the evidence of web attack. In this study, a hybrid approach that selects the relevant web attack features by combining the filter and wrapper methods is proposed. This approach has been validated by experimental measurements on 3 web attack datasets. The results show that our proposed approach can find the evidence with high recall, high accuracy, and low error rates. We believe that the results presented herein may help us to improve accuracy and recall of machine learning techniques; particularly, in the field of web attack investigation. The tools that use this approach may help digital forensic professionals and law enforcement in finding the evidence much more efficiently and faster.
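
The abstract names a filter stage combined with a wrapper stage but does not spell out either one. The following is a minimal, generic sketch of that two-stage idea, not the authors' algorithm. It assumes absolute Pearson correlation as the filter score, greedy forward selection around a nearest-centroid classifier as the wrapper, and synthetic data in place of the web attack datasets. Every function name and parameter here is hypothetical.

```python
import random
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)


def filter_stage(X, y, keep):
    """Filter: rank features by |correlation with the label| and keep the top `keep`."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]


def centroid_accuracy(X_train, y_train, X_val, y_val, features):
    """Accuracy of a nearest-centroid classifier restricted to `features`."""
    centroids = {}
    for label in set(y_train):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroids[label] = [mean(r[j] for r in rows) for j in features]
    correct = 0
    for row, label in zip(X_val, y_val):
        point = [row[j] for j in features]
        pred = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += pred == label
    return correct / len(y_val)


def wrapper_stage(X_train, y_train, X_val, y_val, candidates):
    """Wrapper: greedily add the candidate that most improves validation accuracy."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        best_j = None
        for j in candidates:
            if j in selected:
                continue
            acc = centroid_accuracy(X_train, y_train, X_val, y_val, selected + [j])
            if acc > best:
                best, best_j, improved = acc, j, True
        if improved:
            selected.append(best_j)
    return selected, best


def make_data(n, seed):
    """Synthetic stand-in data: features 0 and 1 carry signal, 2 to 5 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [label * 3 + rng.gauss(0, 1), label * 1 + rng.gauss(0, 1)]
        row += [rng.gauss(0, 1) for _ in range(4)]
        X.append(row)
        y.append(label)
    return X, y


X_train, y_train = make_data(200, seed=1)
X_val, y_val = make_data(100, seed=2)
candidates = filter_stage(X_train, y_train, keep=3)
selected, accuracy = wrapper_stage(X_train, y_train, X_val, y_val, candidates)
print("filter candidates:", candidates)
print("wrapper selection:", selected, "validation accuracy:", accuracy)
```

In this sketch the cheap filter pass shrinks the search space first, so the more expensive wrapper only evaluates classifier accuracy over the surviving candidates. The filter score, the classifier, and the stopping rule are all interchangeable choices.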

Subjects: Engineering, Electrical and Electronic; Computer Science, Software Engineering; Computer Science, Cybernetics; Computer Science, Information Systems; Computer Science, Hardware and Architecture; Computer Science, Theory and Methods; Computer Science, Artificial Intelligence
Document Type: Article Article Type: Research Article Access Type: Open Access
APA BABIKER M, Karaarslan E, HOŞCAN Y (2019). A hybrid feature-selection approach for finding the digital evidence of web application attacks. Turkish Journal of Electrical Engineering and Computer Sciences, 27(6), 4102 - 4117. 10.3906/elk-1812-18
Chicago BABIKER Mohammed, Karaarslan Enis, HOŞCAN Yaşar. A hybrid feature-selection approach for finding the digital evidence of web application attacks. Turkish Journal of Electrical Engineering and Computer Sciences 27, no. 6 (2019): 4102 - 4117. 10.3906/elk-1812-18
MLA BABIKER Mohammed, Karaarslan Enis, HOŞCAN Yaşar. A hybrid feature-selection approach for finding the digital evidence of web application attacks. Turkish Journal of Electrical Engineering and Computer Sciences, vol. 27, no. 6, 2019, pp. 4102 - 4117. 10.3906/elk-1812-18
AMA BABIKER M, Karaarslan E, HOŞCAN Y. A hybrid feature-selection approach for finding the digital evidence of web application attacks. Turkish Journal of Electrical Engineering and Computer Sciences. 2019; 27(6): 4102 - 4117. 10.3906/elk-1812-18
Vancouver BABIKER M, Karaarslan E, HOŞCAN Y. A hybrid feature-selection approach for finding the digital evidence of web application attacks. Turkish Journal of Electrical Engineering and Computer Sciences. 2019; 27(6): 4102 - 4117. 10.3906/elk-1812-18
IEEE BABIKER M, Karaarslan E, HOŞCAN Y, "A hybrid feature-selection approach for finding the digital evidence of web application attacks." Turkish Journal of Electrical Engineering and Computer Sciences, 27, pp. 4102 - 4117, 2019. 10.3906/elk-1812-18
ISNAD BABIKER, Mohammed et al. "A hybrid feature-selection approach for finding the digital evidence of web application attacks". Turkish Journal of Electrical Engineering and Computer Sciences 27/6 (2019), 4102-4117. https://doi.org/10.3906/elk-1812-18