Year: 2020 Volume: 3 Issue: 3 Page Range: 169-182 Text Language: English DOI: 10.35377/saucis.03.03.776573 Index Date: 15-05-2021

A Comparison of the State-of-the-Art Deep Learning Platforms: An Experimental Study

Abstract:
Deep learning, a subfield of machine learning, has proved its efficacy on a wide range of applications including but not limited to computer vision, text analysis and natural language processing, algorithm enhancement, computational biology, physical sciences, and medical diagnostics by producing results superior to the state-of-the-art approaches. When it comes to the implementation of deep neural networks, there exist various state-of-the-art platforms. Starting from this point of view, a qualitative and quantitative comparison of the state-of-the-art deep learning platforms is proposed in this study in order to shed light on which platform should be utilized for the implementations of deep neural networks. Two state-of-the-art deep learning platforms, namely, (i) Keras and (ii) PyTorch, were included in the comparison within this study. The deep learning platforms were quantitatively examined through models based on the three most popular deep neural networks, namely, (i) Feedforward Neural Network (FNN), (ii) Convolutional Neural Network (CNN), and (iii) Recurrent Neural Network (RNN). The models were evaluated on three evaluation metrics, namely, (i) training time, (ii) testing time, and (iii) prediction accuracy. According to the experimental results, while Keras provided the best performance for both FNNs and CNNs, PyTorch provided the best performance for RNNs except for one evaluation metric, which was the testing time. This experimental study should help deep learning engineers and researchers choose the most suitable platform for the implementations of their deep neural networks.
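As summarized above, the comparison builds equivalent models on each platform and scores them on training time, testing time, and prediction accuracy. As a rough illustration of how such a measurement can be instrumented in Python, the sketch below defines the same small feedforward classifier in both Keras and PyTorch and times its training and evaluation; the architecture, the synthetic data, and all hyperparameters here are illustrative assumptions, not the configurations used in the paper.

import time

import numpy as np
import tensorflow as tf
import torch
import torch.nn as nn

# Synthetic 10-class data so the sketch is self-contained; with random labels,
# accuracy stays near chance (~0.1), so only the timing harness is meaningful.
rng = np.random.default_rng(0)
x_train = rng.standard_normal((1024, 64)).astype("float32")
y_train = rng.integers(0, 10, size=1024)
x_test = rng.standard_normal((256, 64)).astype("float32")
y_test = rng.integers(0, 10, size=256)

# Keras: define, compile, and time a small feedforward network.
keras_model = tf.keras.Sequential([
    tf.keras.Input(shape=(64,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
keras_model.compile(optimizer="adam",
                    loss="sparse_categorical_crossentropy",
                    metrics=["accuracy"])

start = time.perf_counter()
keras_model.fit(x_train, y_train, epochs=5, batch_size=32, verbose=0)
keras_train_time = time.perf_counter() - start

start = time.perf_counter()
_, keras_acc = keras_model.evaluate(x_test, y_test, verbose=0)
keras_test_time = time.perf_counter() - start

# PyTorch: the same architecture, trained with an explicit mini-batch loop.
torch_model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.Adam(torch_model.parameters())
loss_fn = nn.CrossEntropyLoss()  # expects raw logits; applies softmax internally
xt, yt = torch.from_numpy(x_train), torch.from_numpy(y_train)

start = time.perf_counter()
for _ in range(5):                   # epochs
    for i in range(0, len(xt), 32):  # mini-batches of 32
        optimizer.zero_grad()
        loss = loss_fn(torch_model(xt[i:i + 32]), yt[i:i + 32])
        loss.backward()
        optimizer.step()
torch_train_time = time.perf_counter() - start

start = time.perf_counter()
with torch.no_grad():
    preds = torch_model(torch.from_numpy(x_test)).argmax(dim=1)
    torch_acc = (preds == torch.from_numpy(y_test)).float().mean().item()
torch_test_time = time.perf_counter() - start

print(f"Keras:   train {keras_train_time:.2f}s, test {keras_test_time:.2f}s, acc {keras_acc:.3f}")
print(f"PyTorch: train {torch_train_time:.2f}s, test {torch_test_time:.2f}s, acc {torch_acc:.3f}")

Because time.perf_counter measures wall-clock time, the numbers such a micro-benchmark produces vary with hardware, batch size, and library versions; the results reported in the paper come from its own controlled experimental setup.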


Document Type: Article Article Type: Research Article Access Type: Open Access
APA Kabakus, A. T. (2020). A Comparison of the State-of-the-Art Deep Learning Platforms: An Experimental Study. Sakarya University Journal of Computer and Information Sciences (Online), 3(3), 169-182. https://doi.org/10.35377/saucis.03.03.776573
Chicago Kabakus, Abdullah Talha. "A Comparison of the State-of-the-Art Deep Learning Platforms: An Experimental Study." Sakarya University Journal of Computer and Information Sciences (Online) 3, no. 3 (2020): 169-182. https://doi.org/10.35377/saucis.03.03.776573
MLA Kabakus, Abdullah Talha. "A Comparison of the State-of-the-Art Deep Learning Platforms: An Experimental Study." Sakarya University Journal of Computer and Information Sciences (Online), vol. 3, no. 3, 2020, pp. 169-182. https://doi.org/10.35377/saucis.03.03.776573
AMA Kabakus AT. A Comparison of the State-of-the-Art Deep Learning Platforms: An Experimental Study. Sakarya University Journal of Computer and Information Sciences (Online). 2020;3(3):169-182. doi:10.35377/saucis.03.03.776573
Vancouver Kabakus AT. A Comparison of the State-of-the-Art Deep Learning Platforms: An Experimental Study. Sakarya University Journal of Computer and Information Sciences (Online). 2020;3(3):169-182. doi:10.35377/saucis.03.03.776573
IEEE A. T. Kabakus, "A Comparison of the State-of-the-Art Deep Learning Platforms: An Experimental Study," Sakarya University Journal of Computer and Information Sciences (Online), vol. 3, no. 3, pp. 169-182, 2020, doi: 10.35377/saucis.03.03.776573.
ISNAD Kabakus, Abdullah Talha. "A Comparison of the State-of-the-Art Deep Learning Platforms: An Experimental Study". Sakarya University Journal of Computer and Information Sciences (Online) 3/3 (2020), 169-182. https://doi.org/10.35377/saucis.03.03.776573