LiwTERM-r: A Revised Lightweight Transformer-based Model for Multimodal Skin Lesion Detection Robust to Incomplete Input
DOI:
https://doi.org/10.5753/jbcs.2026.5871
Keywords:
Deep Learning, Skin Lesion Detection, Transformers, Lightweight Architectures
Abstract
Skin cancer is the most common type of cancer worldwide, accounting for approximately 30% of all diagnosed tumor-based lesions. Early diagnosis can reduce mortality and prevent disfigurement in different skin regions. In recent years, machine learning techniques, especially deep learning, have achieved promising results in this task, and studies have demonstrated that combining patients' clinical anamneses with images of the lesion is essential for improving the classification of skin lesions. Even so, how to make meaningful use of anamneses together with multiple images of the same skin lesion still requires further investigation. This work therefore aims to contribute to the development of multimodal machine learning-based models for skin lesion classification by employing a lightweight transformer model that is robust to missing clinical information. Our main hypothesis is that models can be fed multiple images from different sources along with clinical anamneses from the patient's historical evaluations, leading to a more factual and trustworthy diagnosis. Our model addresses the nontrivial task of combining images and clinical information about skin lesions in a lightweight transformer architecture that demands neither high computational resources nor complete anamnesis information, yet still achieves competitive classification results.
License
Copyright (c) 2026 Luis Antonio de Souza Júnior, André Georghton Cardoso Pacheco, Thiago Oliveira dos Santos, Wyctor Fogos da Rocha, Pedro Henrique Bouzon, Christoph Palm, João Paulo Papa

This work is licensed under a Creative Commons Attribution 4.0 International License.

