Hate Speech Detection Against Women in Brazilian Portuguese Texts: Construction of the MINA-BR Database and Classification Model


  • Hannah O. Plath State University of Campinas
  • Maria Estela O. Paiva State University of Campinas
  • Danielle L. Pinto State University of Campinas
  • Paula D. P. Costa State University of Campinas


Hate Speech, Database, Misogyny, Machine Learning


Hate speech has gained prominence, due in part to the wide use of social networks; it is sometimes motivated by impunity and sometimes associated with freedom of expression. One reason hate speech recognition is a difficult task is the scarcity of adequate databases, especially for languages other than English and for specific domains such as misogyny. This article describes MINA-BR, a database in Brazilian Portuguese that can be used to classify hate speech against women. It also reports a preliminary study in which established hate speech classification algorithms were applied to determine a baseline for the dataset. The highest F1-score obtained was 0.57, achieved by the SVM algorithm.




Plath, H. O., Paiva, M. E. O., Pinto, D. L., & Costa, P. D. P. (2022). Hate Speech Detection Against Women in Brazilian Portuguese Texts: Construction of the MINA-BR Database and Classification Model. Eletronic Journal of Undergraduate Research on Computing, 20(3). Retrieved from https://journals-sol.sbc.org.br/index.php/reic/article/view/2696



