TY - GEN
T1 - An Image Pixel Interval Power (IPIP) Method Using Deep Learning Classification Models
AU - Anorboev, Abdulaziz
AU - Musaev, Javokhir
AU - Hong, Jeongkyu
AU - Nguyen, Ngoc Thanh
AU - Hwang, Dosam
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - The implementation of deep learning (DL) in various fields is becoming common, and demand for higher-accuracy models continues to grow at the same rate as other fields of science. Using all available DL tools in the development of computer vision (CV) is a fundamental aspect of its future development. With these tools in mind, we studied the effect of data representation on final classification accuracy and propose image pixels’ double representation (IPDR) and image pixels’ multiple representation (IPMR), which skip certain pixel ranges in the images of a dataset. Because image pixel values range from 0 to 255, we propose incorporating the knowledge contained in different pixel intervals. With IPDR, we first trained a model on the original dataset and obtained its prediction probabilities for the classification task. Next, we created two datasets from the existing one: the first kept only image pixels higher than 127, with all other pixels set to zero, and the second kept only pixels equal to or lower than 127. Models with the same architecture were trained on these two datasets, and their prediction probabilities for classification were ensembled with those of the main model. With IPMR, we applied the same procedure, but instead of two intervals ([0, 127] and (127, 255]) we used multiple intervals of width 50 (i.e., [0:50], (50:100], (100:150], (150:200], and (200:255]) for the Cifar10 dataset. The number of intervals depends on the dataset. Applying our method, we achieved 89.46%, 98.90%, and 73.38% accuracy on the Fashion MNIST, MNIST, and Cifar10 datasets, respectively, whereas their classification accuracies under classic training were 89.27%, 98.65%, and 71.29%, respectively. The advantage of this method is that it can be applied to any classification task and adds only extra knowledge about the training data; it can also be used alongside other DL ensemble models simultaneously.
AB - The implementation of deep learning (DL) in various fields is becoming common, and demand for higher-accuracy models continues to grow at the same rate as other fields of science. Using all available DL tools in the development of computer vision (CV) is a fundamental aspect of its future development. With these tools in mind, we studied the effect of data representation on final classification accuracy and propose image pixels’ double representation (IPDR) and image pixels’ multiple representation (IPMR), which skip certain pixel ranges in the images of a dataset. Because image pixel values range from 0 to 255, we propose incorporating the knowledge contained in different pixel intervals. With IPDR, we first trained a model on the original dataset and obtained its prediction probabilities for the classification task. Next, we created two datasets from the existing one: the first kept only image pixels higher than 127, with all other pixels set to zero, and the second kept only pixels equal to or lower than 127. Models with the same architecture were trained on these two datasets, and their prediction probabilities for classification were ensembled with those of the main model. With IPMR, we applied the same procedure, but instead of two intervals ([0, 127] and (127, 255]) we used multiple intervals of width 50 (i.e., [0:50], (50:100], (100:150], (150:200], and (200:255]) for the Cifar10 dataset. The number of intervals depends on the dataset. Applying our method, we achieved 89.46%, 98.90%, and 73.38% accuracy on the Fashion MNIST, MNIST, and Cifar10 datasets, respectively, whereas their classification accuracies under classic training were 89.27%, 98.65%, and 71.29%, respectively. The advantage of this method is that it can be applied to any classification task and adds only extra knowledge about the training data; it can also be used alongside other DL ensemble models simultaneously.
KW - Image pixel double representation
KW - Image pixel multiple representation
KW - Model ensemble
KW - Prediction scope
UR - http://www.scopus.com/inward/record.url?scp=85145162927&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-21743-2_16
DO - 10.1007/978-3-031-21743-2_16
M3 - Conference contribution
AN - SCOPUS:85145162927
SN - 9783031217425
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 196
EP - 208
BT - Intelligent Information and Database Systems - 14th Asian Conference, ACIIDS 2022, Proceedings
A2 - Nguyen, Ngoc Thanh
A2 - Trawiński, Bogdan
A2 - Tran, Tien Khoa
A2 - Tukayev, Ualsher
A2 - Hong, Tzung-Pei
A2 - Szczerbicki, Edward
PB - Springer Science and Business Media Deutschland GmbH
T2 - 14th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2022
Y2 - 28 November 2022 through 30 November 2022
ER -
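
The pixel-interval splitting and probability ensembling described in the abstract can be sketched briefly. The snippet below is an illustrative sketch only: the function names, the uniform ensemble weights, and the toy data are assumptions, not the authors' implementation.

```python
# Sketch of IPDR-style pixel-interval splitting and ensembling (assumptions noted below).
import numpy as np

def split_by_pixel_interval(images, threshold=127):
    """Create two views of uint8 images, as described in the abstract:
    one keeping only pixels > threshold, one keeping only pixels <= threshold,
    with all other pixels set to zero."""
    high = np.where(images > threshold, images, 0)
    low = np.where(images <= threshold, images, 0)
    return high, low

def ensemble_probabilities(prob_main, prob_high, prob_low):
    """Combine class-probability outputs of the three models.
    Uniform averaging is an assumption; the paper only states the predictions are ensembled."""
    return (prob_main + prob_high + prob_low) / 3.0

# Toy usage with random data standing in for a real dataset such as MNIST.
images = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)
high_view, low_view = split_by_pixel_interval(images)
print(high_view.min(), low_view.max())  # high_view keeps only values > 127, low_view only values <= 127
```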