Comparative Analysis of Machine Learning Algorithms for Predicting Prostate Cancer Recurrence Using Data from the Oncocentro Foundation of São Paulo
DOI:
https://doi.org/10.5753/reic.2026.6648
Keywords:
Prostate Cancer, Machine Learning, Recurrence Prediction, Imbalanced Data, Supervised Classification
Abstract
Prostate cancer is the most common neoplasm among men in Brazil, which makes recurrence a topic of considerable interest to medicine. This work applies supervised machine learning to predict recurrence through a comparative analysis of four classification algorithms: Random Forest, XGBoost, HistGradientBoosting, and Naive Bayes. The goal is to evaluate how well these models predict the recurrence attribute from the other characteristics present in the dataset of the Fundação Oncocentro de São Paulo. The study aims to contribute to the understanding of how effective different machine learning approaches are on open, imbalanced clinical data on prostate cancer. The results are significant and indicate that the simpler models perform better in this scenario; they also suggest that the study could be continued with more advanced techniques.
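The abstract does not state which evaluation metrics were used. As a minimal sketch of why imbalanced data complicates a model comparison, the Python snippet below contrasts plain accuracy with recall, F1, and balanced accuracy. The labels and both sets of predictions are hypothetical toy values chosen for illustration; they are not drawn from the Fundação Oncocentro de São Paulo dataset and do not reproduce the paper's results. The names `Confusion`, `confusion`, and `report` are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # recurrence cases correctly flagged
    fp: int  # non-recurrence cases wrongly flagged
    fn: int  # recurrence cases missed
    tn: int  # non-recurrence cases correctly cleared


def confusion(y_true, y_pred):
    """Count outcomes for binary labels (1 = recurrence, 0 = no recurrence)."""
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
    )


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def report(c):
    """Compute common binary classification metrics from a confusion matrix."""
    recall = safe_div(c.tp, c.tp + c.fn)
    specificity = safe_div(c.tn, c.tn + c.fp)
    precision = safe_div(c.tp, c.tp + c.fp)
    return {
        "accuracy": safe_div(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn),
        "recall": recall,
        "precision": precision,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Hypothetical toy data: 10 recurrences among 100 patients (NOT real data).
y_true = [1] * 10 + [0] * 90
# A trivial "classifier" that never predicts recurrence.
always_negative = [0] * 100
# A hypothetical model that finds 7 of 10 recurrences with 10 false alarms.
model_like = [1] * 7 + [0] * 3 + [1] * 10 + [0] * 80

baseline = report(confusion(y_true, always_negative))
candidate = report(confusion(y_true, model_like))

for name, scores in (("always_negative", baseline), ("model_like", candidate)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

In this toy setting the trivial predictor has the higher accuracy even though it never detects a recurrence, while recall, F1, and balanced accuracy all favor the hypothetical model. This is one reason comparisons on imbalanced data are usually reported with more than accuracy.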
License
Copyright (c) 2026 The authors

This work is licensed under a Creative Commons Attribution 4.0 International License.
