Comparative Analysis of Machine Learning Algorithms for Predicting Prostate Cancer Recurrence Using Data from the Oncocentro Foundation of São Paulo

Authors

G. Reis, G. Ruppert

DOI:

https://doi.org/10.5753/reic.2026.6648

Keywords:

Prostate Cancer, Machine Learning, Recurrence Prediction, Imbalanced Data, Supervised Classification

Abstract

Prostate cancer is the most common neoplasm among men in Brazil, which makes recurrence a topic of considerable clinical interest. This work applies supervised machine learning to predict recurrence, comparing four classification algorithms: Random Forest, XGBoost, HistGradientBoosting, and Naive Bayes. The goal is to evaluate how well each model predicts the recurrence attribute from the other characteristics recorded in the dataset of the Fundação Oncocentro de São Paulo. The study aims to contribute to the understanding of how effective different machine learning approaches are on open, imbalanced clinical data on prostate cancer. The results are relevant: they indicate that simpler models perform better in this scenario and suggest that further studies with more advanced techniques are worthwhile.
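Because the abstract stresses that the data are imbalanced, a short sketch may help show why accuracy alone can mislead when models are compared in such a setting. The labels below are hypothetical and do not come from the Fundação Oncocentro de São Paulo dataset. The choice of metrics (precision, recall, F1, balanced accuracy) is also an assumption for illustration, since the abstract does not state which metrics the paper reports.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float
    balanced_accuracy: float


def evaluate(y_true, y_pred):
    """Compute metrics for binary labels; 1 marks the minority (recurrence) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return BinaryMetrics(
        accuracy=(tp + tn) / len(pairs),
        precision=precision,
        recall=recall,
        f1=f1,
        balanced_accuracy=(recall + specificity) / 2,
    )


# Hypothetical labels: 95 cases without recurrence (0) and 5 with recurrence (1).
y_true = [0] * 95 + [1] * 5

# A trivial baseline that never predicts recurrence.
always_negative = [0] * 100

# A hypothetical model that finds 4 of the 5 recurrences
# at the cost of 10 false alarms.
some_model = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1

baseline = evaluate(y_true, always_negative)
candidate = evaluate(y_true, some_model)

print(f"baseline : {baseline}")
print(f"candidate: {candidate}")
```

In this toy example the trivial baseline has the higher accuracy even though it detects no recurrence at all, while the candidate has the higher recall and balanced accuracy. This is one reason a comparison on imbalanced clinical data would typically look beyond accuracy.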

Published

2026-04-02

How to Cite

Reis, G., & Ruppert, G. (2026). Comparative Analysis of Machine Learning Algorithms for Predicting Prostate Cancer Recurrence Using Data from the Oncocentro Foundation of São Paulo. Electronic Journal of Undergraduate Research on Computing, 24(1), 192–197. https://doi.org/10.5753/reic.2026.6648

Section

Full Papers