A particular vision on deep learning: Perspective analysis and projection of our research

Article Type: Short Commentary, Volume 1, Issue 1

Roberto Rodríguez Morales*

Institute of Cybernetics, Mathematics & Physics (ICIMAF), Havana 10 400, Cuba.

*Corresponding author: Roberto Rodríguez Morales
Institute of Cybernetics, Mathematics & Physics (ICIMAF), Havana 10 400, Cuba.

Email: rrm@icimaf.cu
Received: Sep 02, 2024
Accepted: Oct 03, 2024
Published Online: Oct 10, 2024
Journal: Journal of Artificial Intelligence & Robotics

Copyright: © Rodríguez Morales R (2024). This Article is distributed under the terms of Creative Commons Attribution 4.0 International License.

Citation: Rodríguez Morales R. A particular vision on deep learning: Perspective analysis and projection of our research. J Artif Intell Robot. 2024; 1(1): 1006.

Abstract

In this short communication, we present a particular view of deep learning (DL) from the perspective of the research carried out by our group. The advantages and disadvantages of DL are briefly discussed, and DL is compared with a well-established machine learning (ML) technique, taking the database size as reference. Finally, we mention a hybrid structure of convolutional neural networks in which we propose to intercalate filtering layers.

Keywords: Machine learning; Deep learning; Supervised learning; Convolutional neural networks; Neural network architectures.

Introduction

In the last decade, machine learning (ML), and especially its subfield deep learning (DL), has achieved impressive results, many of them superior to those of previous state-of-the-art approaches. DL has proven effective in a wide range of applications including, among others, computer vision, natural language processing, text analysis, computational biology, physical sciences, autonomous driving and medical diagnostics.

DL is used intensively in many imaging applications because it does not need human-designed rules to work; rather, it uses a large amount of data to establish a mapping from a given input to specific outputs. In classification tasks, DL can automatically learn a feature set from the input data, unlike traditional ML, which requires several sequential steps (pre-processing, feature extraction and careful feature selection, among others). In the field of medical sciences, the successes of DL have been outstanding, remarkably improving the quality of human life through greater accuracy in the diagnosis of diseases, the estimation of natural disasters and the discovery of new drugs, among many other good results. Studies in the literature have reported that the average disease diagnosis accuracy of a DL network can be superior to that of specialist physicians.

The main disadvantage of DL is that it requires large databases to train Convolutional Neural Networks (CNNs). When the database is small, there are many numerical methods and transformations that can be used to augment it [1], and these can be effective for some purposes; in real medical imaging problems, however, it is best to have large real databases. Such transformations can result in poor-quality training and undesired results, especially in classification and recognition problems. We have used another way to enlarge the database, namely filtering with the mean shift iterative algorithm [2]. This, too, mitigates the problem rather than solving it.
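As an illustrative sketch of the kind of geometric transformations commonly used to augment a small image database, the following Python code generates flipped and rotated copies of an image stored as a list of rows. It is not the specific procedure of [1] or [2]; the function names and the choice of transformations are assumptions made for illustration.

```python
def flip_horizontal(image):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the order of the rows (up-down flip)."""
    return [row[:] for row in image[::-1]]


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def augment(image):
    """Return a copy of the original image plus flipped and rotated copies."""
    variants = [[row[:] for row in image]]
    variants.append(flip_horizontal(image))
    variants.append(flip_vertical(image))
    rotated = image
    for _ in range(3):  # rotations by 90, 180 and 270 degrees
        rotated = rotate_90(rotated)
        variants.append(rotated)
    return variants
```

Each input image yields six samples. Whether such copies remain valid examples of their class depends on the imaging problem, which is one reason why synthetic augmentation cannot fully replace real data.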

Supervised learning requires a set of ground truth data and prior knowledge of the intended outcome. A DL-based prediction and classification algorithm requires a sufficient number of training samples. In many cases, such as in medical data analysis, specialists lack a suitably large dataset. For this reason, traditional machine learning methods should not always be discarded, especially those that have proven to be efficient.

Comparison of CNNs with support vector machine (SVM)

Some works have compared CNNs with classical machine learning methods. In particular, in [3] the results obtained with CNNs were superior, in terms of training time and accuracy, to those obtained with K-NN and XGBoost. That paper also verified that the CNN results were better even without Dropout.

However, in recent research we have found that, especially when the database is not large enough, the results obtained with a support vector machine (SVM) are very similar to those achieved with DL, and in some cases superior, with shorter training time and higher prediction accuracy [4,5].

These experimental works do not claim to reach definitive conclusions; they are only the beginning of deeper research that requires continuous growth of the database, in order to verify whether both models (DL and SVM) follow the same trend in the evaluation metrics, or whether there is an inflection point from which the growth of the database makes DL start to outperform the SVM. Nevertheless, established machine learning methods cannot be completely discarded.
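One possible way to look for such an inflection point, once accuracies have been measured at several database sizes, is sketched below in Python. The function name find_crossover and the numbers in the usage lines are hypothetical placeholders chosen for illustration; they are not results from [4,5].

```python
def find_crossover(sizes, acc_dl, acc_svm):
    """Return the smallest database size from which DL stays ahead of the SVM.

    Returns None if DL is not ahead at the largest size.
    """
    if not (len(sizes) == len(acc_dl) == len(acc_svm)):
        raise ValueError("all three sequences must have the same length")
    crossover = None
    for size, dl, svm in sorted(zip(sizes, acc_dl, acc_svm)):
        if dl > svm:
            if crossover is None:
                crossover = size
        else:
            crossover = None  # DL fell behind again, so reset
    return crossover


# Placeholder values for illustration only; NOT experimental results.
sizes = [100, 200, 400, 800]
acc_dl = [0.70, 0.78, 0.86, 0.91]
acc_svm = [0.80, 0.82, 0.84, 0.85]
print(find_crossover(sizes, acc_dl, acc_svm))  # prints 400
```

In practice each accuracy would be averaged over several training runs, since a single run at each size may be too noisy to locate the crossover reliably.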

Convolutional neural network architecture

An example of a basic convolutional neural network architecture, the one we used in [5], is shown in Figure 1. With this deep convolutional neural network architecture, we detected COVID or non-COVID features in chest radiographic images (CT scans).

A typical CNN architecture for classification consists of one or more convolutional layers alternating with pooling layers, which perform feature extraction and subsampling. The final output features of those layers are flattened and fully connected (FC) to all nodes of the output layer, which carries out the classification.
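The pipeline described above (convolution, pooling, flattening and a fully connected output layer) can be sketched in plain Python as follows. This is a minimal forward pass with a single filter and a single convolutional layer, written only to make the data flow concrete; it is not the architecture of Figure 1 or of [5], and the ReLU activation, the softmax output and all function names are assumptions.

```python
import math


def conv2d(image, kernel):
    """Valid 2-D cross-correlation (the 'convolution' used in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Element-wise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling (subsampling)."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


def flatten(fmap):
    """Turn a 2-D feature map into a 1-D feature vector."""
    return [v for row in fmap for v in row]


def fully_connected(features, weights, biases):
    """One output score per row of `weights`."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Convert scores to class probabilities (numerically stable form)."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(image, kernel, weights, biases):
    """Convolution -> ReLU -> pooling -> flatten -> FC -> softmax."""
    features = flatten(max_pool(relu(conv2d(image, kernel))))
    return softmax(fully_connected(features, weights, biases))
```

For a 5x5 input and a 2x2 kernel, the convolution gives a 4x4 feature map, pooling reduces it to 2x2, and flattening yields four features, so `weights` needs one row of four values per output class.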

Currently, our group is performing experiments to turn this structure into a hybrid one by intercalating a type of filtering in each convolution layer, taking as a starting point a theorem that we proposed in 2017 [6] and that we are now generalizing.

It has been common practice to develop advanced techniques for training large neural networks in order to overcome overfitting and improve prediction performance. This has been achieved by adding an extra step between some of the training steps in the layers. Two important techniques that have contributed greatly to the development of deep learning in this direction are Dropout and Batch normalization [7, 8]. Batch normalization simply standardizes the mini-batch of linearly calculated output values in each layer to mean 0 and variance 1 prior to the activation function [3].
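Both techniques can be illustrated with a short Python sketch. The batch_norm function below performs only the standardization step described above and omits the learnable scale and shift parameters introduced in [8]. The dropout function is the "inverted" variant common in current practice, which rescales the surviving units during training; this is an assumption and differs in detail from the formulation in [7]. All names are illustrative.

```python
import math
import random


def batch_norm(batch, eps=1e-5):
    """Standardize each feature over the mini-batch to mean 0, variance 1.

    `batch` is a list of samples, each a list of pre-activation values.
    """
    n = len(batch)
    n_features = len(batch[0])
    means = [sum(s[k] for s in batch) / n for k in range(n_features)]
    variances = [sum((s[k] - means[k]) ** 2 for s in batch) / n
                 for k in range(n_features)]
    return [
        [(s[k] - means[k]) / math.sqrt(variances[k] + eps)
         for k in range(n_features)]
        for s in batch
    ]


def dropout(values, rate, rng):
    """Zero each unit with probability `rate`; rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


mini_batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(mini_batch)
dropped = dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=random.Random(0))
```

The small constant eps only guards against division by zero, so the resulting variance is very slightly below 1.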

Currently, we are working on proving the generalization of the theorem mentioned above. In our experiments, filtering after each convolution should yield a more pronounced convergence in training, since the entropy of the whole system will be decreasing.
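The intuition that filtering lowers entropy can be illustrated on toy data with the Python sketch below. It applies a simplified one-dimensional mean shift filter (flat kernel, intensity values only) and compares the Shannon entropy of the values before and after. This is neither the theorem of [6] nor the filtering layer under study; the bandwidth, the rounding and the sample values are assumptions chosen for illustration, and a single toy example proves nothing in general.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mean_shift_filter(values, bandwidth, tol=1e-6, max_iter=100):
    """Move every value to the mode it converges to (flat kernel, 1-D)."""
    filtered = []
    for v in values:
        x = float(v)
        for _ in range(max_iter):
            # Neighbours are always taken from the original, unfiltered data.
            window = [u for u in values if abs(u - x) <= bandwidth]
            if not window:
                break
            new_x = sum(window) / len(window)
            converged = abs(new_x - x) < tol
            x = new_x
            if converged:
                break
        filtered.append(round(x, 3))
    return filtered


# Toy intensity values, for illustration only.
values = [10, 11, 12, 50, 51, 52]
filtered = mean_shift_filter(values, bandwidth=5)
print(filtered)  # [11.0, 11.0, 11.0, 51.0, 51.0, 51.0]
print(round(shannon_entropy(values), 3), shannon_entropy(filtered))  # 2.585 1.0
```

Here the six distinct values collapse onto two modes, so the entropy drops from about 2.585 bits to 1 bit.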

In future work, we will deepen these experiments, increasing the database, analyzing the trend of models and trying to find an inflection point that determines when the CNN model outperforms the ML model.

Figure 1: Architecture of a CNN model.

References

1. Md K Islam, Sultana U Habiba, Tahsin A Khan, Farzana Tasnim. COV-RadNet: A Deep Convolutional Neural Network for Automatic Detection of COVID-19 from Chest X-Rays and CT Scans, Computer Methods and Programs in Biomedicine Update. 2022; 2: 100064. https://doi.org/10.1016/j.cmpbup.2022.100064.
2. Rodríguez R, Garcés Y, Torres E, Sossa H, Tovar R. A vision from a physical point of view and the information theory on the image segmentation, Journal of Intelligent & Fuzzy Systems. IOS Press. 2019; 37: 2835-2845. DOI: 10.3233/JIFS-190030. https://content.iospress.com/articles/journal-of-intelligent-and-fuzzy-systems/ifs190030.
3. Hagyeong L, Jongwoo S. Introduction to convolutional neural network using Keras; an understanding from a statistician, Communications for Statistical Applications and Methods. 2019; 26(6): 591-610. https://doi.org/10.29220/CSAM.2019.26.6.591.
4. Laura Brito, Roberto Rodríguez. Classification of some epidemics through microscopic images by using deep learning. Comparison, Imaging and Radiation Research. 2024; 6(1). https://doi.org/10.24294/irr.v6i1.5451.
5. Roberto R, Anthony L, Laura B, Rocio C. An Experimental Study for Automatic Detection of COVID-19 from Chest CT Scans. A Comparison through Deep Learning and Support Vector Machine, Journal of Mathematical Techniques and Computational Mathematics. 2024; 3(8): 4.
6. R Rodríguez, E Torres, J H Sossa, Y Garcés. A new stopping criterion for the mean shift iterative algorithm. Its use in image segmentation, International Journal of Imaging and Robotics. 2017; 17(2). ISSN: 2231-525X.
7. Hinton G E, Srivastava N, Krizhevsky A, Sutskever I, Salakhutdinov RR. Improving neural networks by preventing co-adaptation of feature detectors, arXiv: 1207.0580. 2012.
8. Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning. 2015; 37: 448-456. arXiv: 1502.03167.