A Semiparametric Approach to Mitigating the Impact of Outliers in ROC Curve Generation for Image Analysis
DOI:
https://doi.org/10.5753/jbcs.2025.5288
Keywords:
Receiver Operating Characteristic, ROC Curve, ROC Analysis, Image Analysis, Outliers
Abstract
Artificial intelligence enables the development of machine learning algorithms that identify and categorize patterns in large amounts of data across many domains. Computational tools have been created to analyze these algorithms, allowing their results to be validated and compared. The Receiver Operating Characteristic (ROC) is an important statistical technique for analyzing binary classification models. In image analysis, a ROC curve is commonly used as a validation metric to compare images generated by a classification model with reference images created by humans, referred to as Ground Truth (GT). Currently, machine learning algorithms produce ROC curves with a limited number of points, even when trained on large-scale datasets. As a result, outliers can significantly distort the ROC curve, potentially leading to inaccurate conclusions about the model's performance. This study introduces a novel method, SPROC, for preventing outliers in the creation of ROC curves, guaranteeing a reliable and robust evaluation of image classification models. We implemented our algorithm in Python using a dataset of 1000 grayscale contour images. Performance was compared against Logistic Regression, SVM, Random Forests, and scikit-learn using ROC curves, AUC, precision, accuracy, and F1-score. Statistical significance was assessed via paired t-tests, with Cohen’s d for effect size and outlier detection via Local Outlier Factor. Results showed that SPROC produced a more refined curve with more precise AUC values on noisy images than the machine learning approaches.
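To make the point about ROC curves built from a limited number of points concrete, the following is a minimal sketch of an empirical ROC curve for a predicted grayscale image compared pixel by pixel with a binary Ground Truth mask. It is not the SPROC method described in the abstract, whose details are not given here. The function names `roc_points` and `trapezoid_auc` and the toy pixel values are assumptions made only for this illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(scores: Sequence[float], truth: Sequence[int],
               thresholds: Sequence[float]) -> List[Point]:
    """Empirical ROC points, one per threshold, plus the (0,0) and (1,1) ends.

    scores: flattened predicted pixel intensities (hypothetical toy data).
    truth:  flattened Ground Truth mask, 1 = contour pixel, 0 = background.
    """
    positives = sum(truth)
    negatives = len(truth) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("ground truth must contain both classes")
    points = {(0.0, 0.0), (1.0, 1.0)}
    for t in thresholds:
        # A pixel is predicted positive when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, truth) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, truth) if s >= t and y == 0)
        points.add((fp / negatives, tp / positives))
    # Sorting by FPR, then TPR, orders the points along the curve.
    return sorted(points)


def trapezoid_auc(points: Sequence[Point]) -> float:
    """Area under the piecewise-linear curve through the sorted points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy example: eight pixels, four contour and four background.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
truth = [1, 1, 0, 1, 0, 1, 0, 0]

coarse = roc_points(scores, truth, [0.5])               # a single threshold
fine = roc_points(scores, truth, sorted(set(scores)))   # every distinct score

print(trapezoid_auc(coarse))  # 0.625
print(trapezoid_auc(fine))    # 0.8125
```

With only one threshold the curve has a single interior point and the estimated AUC is 0.625, while using every distinct score gives 0.8125 for the same data. When the curve rests on so few points, each point carries a large share of the area, which is why a single distorted point can change the conclusion drawn from the curve.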
License
Copyright (c) 2025 Regis Cortez Bueno, Renan Gimeniz Marques, Raphael Antonio de Souza, Sivanilza Teixeira Machado

This work is licensed under a Creative Commons Attribution 4.0 International License.

