Off-Topic Essay Detection: A comparative study on the Portuguese language

Authors

  • Guilherme Passero, University of Vale do Itajaí (UNIVALI)
  • Rafael Ferreira, Federal Rural University of Pernambuco (UFRPE)
  • Rudimar Luís Scaranto Dazzi, University of Vale do Itajaí (UNIVALI)

DOI:

https://doi.org/10.5753/rbie.2019.27.03.177

Keywords:

Natural Language Processing, Semantic analysis, Text classification, Automated essay evaluation

Abstract

Advances in automated essay grading over the last sixty years have enabled its application in real scenarios, such as classrooms and high-stakes testing. The recognition of off-topic essays is one of the tasks addressed in automated essay grading. An essay is regarded as off-topic when the student, sometimes purposely, does not develop the expected prompt-related concepts. Off-topic essays may receive a zero score in high-stakes tests. An off-topic essay detection mechanism may be used in parallel with, or embedded in, an automated essay grading system to improve its performance. In this context, the main goal of this study is to evaluate the existing approaches for automated off-topic essay detection. A previous systematic review of the literature showed some deficiencies in the state of the art, including the low accuracy of current approaches, the use of artificial validation sets, and the lack of studies focused on the Portuguese language. In this study, the approaches found in the literature, originally proposed for the English language, were adapted for the Portuguese language and compared in an experiment using a public corpus of 2164 essays related to 111 prompts. The experiment used a set of artificial off-topic examples, and the best-performing algorithm achieved higher accuracy than that reported in the literature for the English language (96.76% vs. 94.75%). The results suggest that off-topic essay detection mechanisms can be applied in the Brazilian educational context to benefit students, through computer-generated feedback, and educational institutions, through automated essay grading. Some suggestions for future research are presented, including the need to address off-topic essay detection as a multiclass problem and to reproduce the experiment with a larger and more representative set of real off-topic essays.

Author Biographies

Guilherme Passero, University of Vale do Itajaí (UNIVALI)

Laboratory of Applied Intelligence

Rafael Ferreira, Federal Rural University of Pernambuco (UFRPE)

Informatics Center

Rudimar Luís Scaranto Dazzi, University of Vale do Itajaí (UNIVALI)

Laboratory of Applied Intelligence

Published

2019-09-01

How to Cite

PASSERO, G.; FERREIRA, R.; DAZZI, R. L. S. Off-Topic Essay Detection: A comparative study on the Portuguese language. Brazilian Journal of Computers in Education, [S. l.], v. 27, n. 3, p. 177–190, 2019. DOI: 10.5753/rbie.2019.27.03.177. Available at: https://journals-sol.sbc.org.br/index.php/rbie/article/view/4727. Accessed: 21 Dec. 2024.

Section

Articles
