Analysis of Responses from LLMs Regarding Introductory Programming Content: A Comparative Study between ChatGPT and Gemini

Authors

Pereira Filho, L. C.; Souza, T. de P. C. de; Paula, L. B. de

DOI:

https://doi.org/10.5753/rbie.2025.4477

Keywords:

Programming teaching, Programming for beginners, ChatGPT, Gemini, LLM

Abstract

Large Language Models (LLMs) for natural language processing have recently stood out among current technologies. They have opened up a range of applications in many areas, including programming education, because they can generate program code. Two of the best-known models are OpenAI's ChatGPT and Google's Gemini, both of which can create, correct, and explain code in various programming languages. In a previous work, ChatGPT's responses to introductory programming content were tested and analyzed from the perspective of beginners in the subject. This work extends that research by adding tests with Gemini on the same content. The goal is to determine whether these models are suitable for beginner programming students and whether they can be used to learn this content. As in the previous work, two kinds of tests were conducted: qualitative tests, in which follow-up interactions with the model were made when the initial response was unsatisfactory, and quantitative tests, in which no such follow-up was made. All tests were run on both ChatGPT and Gemini, and their responses were analyzed. Both showed potential to answer correctly and to explain the code they generated, but with caveats. Overall, the share of correct responses was ~78.2% for ChatGPT and ~69.6% for Gemini. Despite this potential to assist in learning programming, LLM-generated responses should not be assumed to be entirely correct; users need prior knowledge to evaluate and make use of them.

Published

2025-07-09

How to Cite

PEREIRA FILHO, L. C.; SOUZA, T. de P. C. de; PAULA, L. B. de. Analysis of Responses from LLMs Regarding Introductory Programming Content: A Comparative Study between ChatGPT and Gemini. Brazilian Journal of Computers in Education, [S. l.], v. 33, p. 722–747, 2025. DOI: 10.5753/rbie.2025.4477. Available at: https://journals-sol.sbc.org.br/index.php/rbie/article/view/4477. Accessed: 18 Dec. 2025.

Section

Awarded Papers :: CBIE