A detailed analysis of the learning performance teaching Machine Learning in K-12 Education applying Item Response Theory

Authors

  • Marcelo Fernando Rauber Programa de Pós-Graduação em Ciência da Computação – Universidade Federal de Santa Catarina (UFSC) - Florianópolis - SC - Brasil, and Instituto Federal Catarinense (IFC) - Camboriú - SC - Brasil https://orcid.org/0000-0001-5653-7155
  • Christiane Gresse von Wangenheim Programa de Pós-Graduação em Ciência da Computação – Universidade Federal de Santa Catarina (UFSC) - Florianópolis - SC - Brasil https://orcid.org/0000-0002-6566-1606
  • Adriano Ferreti Borgatto Programa de Pós-Graduação em Métodos e Gestão em Avaliação - Universidade Federal de Santa Catarina (UFSC) - Florianópolis - SC - Brasil https://orcid.org/0000-0001-6280-2525
  • Ramon Mayor Martins Programa de Pós-Graduação em Ciência da Computação – Universidade Federal de Santa Catarina (UFSC) - Florianópolis - SC - Brasil https://orcid.org/0000-0002-1952-0909

DOI:

https://doi.org/10.5753/rbie.2023.3442

Keywords:

Learning Assessment, Machine Learning, Item Response Theory, IRT, Middle and High School

Abstract

The growing presence of Machine Learning (ML) in everyday life underscores the importance of teaching ML concepts as early as middle and high school. This trend brings with it the need to assess that learning. In this paper, we present the design, development, and implementation of an ML learning assessment model, with emphasis on evaluating the validity and reliability of a rubric for the performance-based assessment of learning outcomes in the application of ML concepts by middle and high school students. Adopting Item Response Theory (IRT), we present a preliminary proposal for constructing a scale of student learning levels. The results of the detailed analysis show that the IRT parameters can be calibrated with satisfactory indices of reliability and validity, which demonstrates the rubric's potential to help both students and teachers and thereby promote the teaching of ML at this educational stage.

Published

2023-12-18

How to Cite

RAUBER, M. F.; GRESSE VON WANGENHEIM, C.; BORGATTO, A. F.; MARTINS, R. M. A detailed analysis of the learning performance teaching Machine Learning in K-12 Education applying Item Response Theory. Brazilian Journal of Computers in Education, [S. l.], v. 31, p. 1031–1056, 2023. DOI: 10.5753/rbie.2023.3442. Available at: https://journals-sol.sbc.org.br/index.php/rbie/article/view/3442. Accessed: 22 Nov. 2024.

Section

Awarded Papers :: EduComp 2023
