Improving oral cancer classification via segment-driven photographic deep learning imaging
Parola M.; Malaspina E.; Cimino M. G. C. A.
2025-01-01
Abstract
Oral cancer is a major health problem that requires accurate healthcare support systems, and deep learning (DL) based medical imaging has proven to be an effective solution. This work addresses the oral cancer classification task by employing different convolutional architectures. Our goal is to improve classification by incorporating segmentation information. We propose two segment-driven strategies to strengthen traditional classification training. The first trains a dedicated neural network (NN) to predict masks, which are then applied to the images so that the classifier receives masked inputs with irrelevant information hidden. Specifically, we introduce a soft-mask approach that weighs the contribution of each pixel to the final classification, as an alternative to the previously proposed hard-mask strategy. The second trains the NN with CrossEntropyIoU, a loss function that combines the cross-entropy, which drives the prediction of the correct label, with the Intersection over Union (IoU), which measures the mismatch between the activation map and the mask. Experiments show that the segment-driven strategies improve accuracy and training speed with both convolutional and transformer architectures.
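The two strategies in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 threshold for the hard mask, the soft (differentiable) form of the IoU, and the way the two loss terms are added with a `weight` factor are all assumptions made for illustration, since the abstract does not specify them. Images, masks, and activation maps are represented as nested lists of floats in [0, 1].

```python
import math


def apply_hard_mask(image, mask, threshold=0.5):
    """Keep a pixel only if its mask value reaches the (assumed) threshold."""
    return [[px if m >= threshold else 0.0 for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def apply_soft_mask(image, mask):
    """Weigh each pixel by its mask value instead of discarding it outright."""
    return [[px * m for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample, computed stably."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


def soft_iou(activation, mask, eps=1e-7):
    """Soft IoU between an activation map and a mask (assumed formulation)."""
    inter = sum(a * m for a_row, m_row in zip(activation, mask)
                for a, m in zip(a_row, m_row))
    total = sum(a + m for a_row, m_row in zip(activation, mask)
                for a, m in zip(a_row, m_row))
    union = total - inter
    return (inter + eps) / (union + eps)


def cross_entropy_iou(logits, label, activation, mask, weight=1.0):
    """Hypothetical combination: cross-entropy plus weighted (1 - IoU)."""
    mismatch = 1.0 - soft_iou(activation, mask)
    return cross_entropy(logits, label) + weight * mismatch
```

In this sketch the soft mask keeps a down-weighted version of uncertain pixels, whereas the hard mask removes them entirely. The combined loss grows when the activation map falls outside the lesion mask, even if the predicted label is correct.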


