Predicting Students' Academic Success: An Analysis of the Demand for a Transfer Learning-Based Approach
DOI:
https://doi.org/10.5753/rbie.2019.27.01.01
Keywords:
Academic Success Prediction, Learning Analytics, Educational Data Mining, Transfer Learning, Covariate Shift
Abstract
Student interactions with Virtual Learning Environments (VLEs) generate logs that make it possible to reconstruct every activity performed. Analysis of these data has provided a better understanding of student behavior and of teaching and learning processes. In this context, numerous studies have reported promising results on the task of predicting student performance, allowing proactive measures to be taken to prevent academic failure. Data mining techniques used to build predictive models usually rely on historical (past) records, and therefore assume that the resulting predictor will make predictions in future contexts similar to the (past) contexts used to build it. Although it is reasonable to assume that the diversity of existing educational contexts is reflected in the data they generate, few studies discuss the impact of this assumption in the field of Educational Data Mining (EDM), which results in models that may perform poorly when used under unforeseen educational conditions. This work proposes an empirical analysis to look for evidence of differences between data from distinct educational contexts in the task of predicting students' academic failure. It uses log data from more than 3,000 higher-education students in distance learning. The methodology is based on the supervised classification approach itself, commonly used in prediction tasks; specifically, it seeks to verify whether distinct educational contexts are in fact separable in terms of the data they generate. Even though the data scenario involves activities common to students of the same course, the experiments indicate an accuracy of up to 83% in separating data from different academic terms.
Although empirical, the results point in a direction similar to that indicated by other studies, supporting the need for transfer learning and/or domain adaptation techniques in the design of predictive models aimed at preventing academic failure.
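The separability check described in the abstract can be illustrated with a minimal sketch: label each sample with the academic term it came from, train a classifier to predict that label, and read the cross-validated accuracy. An accuracy near 0.5 suggests the terms are indistinguishable, while a value well above 0.5 hints at covariate shift. The abstract does not specify the classifier or the features used, so everything here is an assumption made for illustration: the nearest-centroid classifier, the synthetic Gaussian features, and the names `domain_separability` and `make_period` are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def domain_separability(period_a, period_b, k=5, seed=0):
    """k-fold accuracy of a nearest-centroid classifier (an assumed,
    illustrative choice) that predicts which period a sample came from."""
    data = [(x, 0) for x in period_a] + [(x, 1) for x in period_b]
    random.Random(seed).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    correct = 0
    for i, test in enumerate(folds):
        # Train on every fold except the held-out one.
        train = [s for j, f in enumerate(folds) if j != i for s in f]
        c0 = centroid([x for x, y in train if y == 0])
        c1 = centroid([x for x, y in train if y == 1])
        for x, y in test:
            pred = 0 if sq_dist(x, c0) <= sq_dist(x, c1) else 1
            correct += pred == y
    return correct / len(data)


def make_period(n, shift, rng):
    """Hypothetical data: 3 interaction-count-like features per student."""
    return [[rng.gauss(10 + shift, 3) for _ in range(3)] for _ in range(n)]


rng = random.Random(42)
# Two terms drawn from the same distribution.
same = domain_separability(make_period(300, 0.0, rng), make_period(300, 0.0, rng))
# Two terms whose feature means differ.
shifted = domain_separability(make_period(300, 0.0, rng), make_period(300, 3.0, rng))
print(f"same distribution: {same:.2f}")
print(f"shifted distribution: {shifted:.2f}")
```

Under these assumptions, the first score should stay close to chance and the second should be clearly higher. On real log data the same procedure would be applied to the feature vectors of students from two terms, and any stronger classifier could replace the nearest-centroid rule.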
License
Copyright (c) 2019 Daniel A. Guimarães De Los Reyes, Everton André Thomas, Lilian Landvoigt da Rosa, Wilson P. Gavião Neto
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.