Stance Detection and Twitter Users Automatic Labelling: the case of Covid-19's CPI

DOI:

https://doi.org/10.5753/isys.2023.3008

Keywords:

Stance Detection, User Automatic Labelling, Polarization, Twitter

Abstract

With the growing influence of social media on public opinion, the automated identification of political positions has become a key challenge for information systems and political science. This article aims to detect the stance of, and label, a large number of Twitter users in Brazil on a politically controversial and polarized topic, in an automated way that does not depend on annotated datasets. The proposed method uses the Covid-19 Parliamentary Inquiry Committee (CPI) as a case study. Users are classified along a spectrum spanning favorable and unfavorable positions towards the CPI, based on a sentiment score and two complementary metrics: balance degree and engagement. Unsupervised computational approaches, such as dimensionality reduction methods and clustering algorithms, were combined with minimally supervised techniques such as topic modeling and contextualized embeddings. Together with social factors such as homophily and network structure, this approach enabled the automatic labeling of approximately 98% of the users in the studied datasets with minimal supervision. The strategy may have significant implications for analyzing public opinion on controversial issues, offering insight into the distribution of political positions and the engagement strategies that social media users may employ.

Author Biographies

Patrícia Dias dos Santos, Universidade Federal do ABC (UFABC)

PhD in Computer Science (2023), MSc in Information Engineering, and BSc in Computer Science from the Universidade Federal do ABC (UFABC).

Topics of interest: machine learning, natural language processing, social network analysis, social computing.

Denise Hideko Goya, Universidade Federal do ABC (UFABC)

PhD in Computer Science from the Universidade de São Paulo (2011), where she also earned her master's and undergraduate degrees (Department of Computer Science, Institute of Mathematics and Statistics, USP). She works in cryptography and information security, data science and social media, educational technologies, and gender in science and technology.

Topics of interest: post-quantum cryptography, provable security, elliptic curve cryptography, analysis of social phenomena through communication and information media, meta-learning, serious games methodologies, distance education, and sociocultural aspects related to gender, science, and technology.

Published

2024-01-14

How to Cite

Dias dos Santos, P., & Hideko Goya, D. (2024). Stance Detection and Twitter Users Automatic Labelling: the case of Covid-19’s CPI. ISys - Brazilian Journal of Information Systems, 16(1), 15:1 – 15:24. https://doi.org/10.5753/isys.2023.3008

Section

Extended versions of selected articles