Exploring the Use of Clustering Algorithms and LLMs to Identify Programming Strategies

Authors

DOI:

https://doi.org/10.5753/rbie.2026.6550

Keywords:

Programming strategies, LLMs, Clustering, Programming learning

Abstract

In programming courses, students may use different strategies to solve the same problem. Understanding these strategies helps instructors assess student progress and provide targeted feedback. Traditional clustering methods have been widely used to group similar programming solutions, but they often rely on syntactic similarities that do not always reflect the strategy used to solve a problem. This research uses clustering algorithms and Large Language Models (LLMs) to identify programming strategies. We ran the experiments on a dataset of correct student solutions collected from ten Algorithms and Data Structures classes at the Federal University of Amazonas. Although the Mean Shift and Affinity Propagation clustering algorithms produced visually well-separated clusters, quantitative results showed that they did not group strategies accurately. In contrast, LLMs were better able to identify strategies aligned with human labels. The results suggest that LLMs can be valuable tools for helping programming instructors analyze student solutions.
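To make the idea of comparing automatic groupings against human labels concrete, the sketch below scores two groupings with the adjusted Rand index. This is only an illustration: the abstract does not state which quantitative measure the study used, so the choice of metric is an assumption, and the labels (`human_labels`, `syntax_clusters`, `llm_labels`) are invented toy data, not results from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two partitions (1.0 = identical)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0
    # Count pairs of solutions placed together in both partitions,
    # and in each partition on its own.
    together_both = sum(
        comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values()
    )
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = together_true * together_pred / total_pairs
    max_index = (together_true + together_pred) / 2
    if max_index == expected:
        return 1.0  # both partitions are trivial and identical
    return (together_both - expected) / (max_index - expected)


# Hypothetical toy data: six correct solutions to one exercise.
human_labels = ["loop", "loop", "loop", "recursion", "recursion", "formula"]
# Hypothetical output of a syntax-driven clustering algorithm.
syntax_clusters = [0, 1, 0, 1, 0, 1]
# Hypothetical strategy names produced by an LLM (names may differ from the
# human ones; only the grouping matters for this metric).
llm_labels = ["iterative", "iterative", "iterative",
              "recursive", "recursive", "closed-form"]

ari_syntax = adjusted_rand_index(human_labels, syntax_clusters)
ari_llm = adjusted_rand_index(human_labels, llm_labels)
print(f"syntax clusters vs. human: {ari_syntax:.3f}")
print(f"LLM labels vs. human:      {ari_llm:.3f}")
```

A metric of this kind ignores how well separated the clusters look in a plot and asks only whether solutions that humans grouped together also end up together, which is why visually clean clusters can still score poorly.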

Published

2026-02-09

How to Cite

MELO, R.; SOUZA, T.; PIRES, F.; OLIVEIRA, E.; CARVALHO, L.; PESSOA, M.; FERNANDES, D. Exploring the Use of Clustering Algorithms and LLMs to Identify Programming Strategies. Revista Brasileira de Informática na Educação, [S. l.], v. 34, p. 59–82, 2026. DOI: 10.5753/rbie.2026.6550. Available at: https://journals-sol.sbc.org.br/index.php/rbie/article/view/6550. Accessed: 19 Feb. 2026.

Section

Award-Winning Papers :: EduComp