Fake News Detection in Tweets: Challenges and Adaptations Imposed by COVID-19

Authors

Santana, C., Claro, D. B., Souza, M.

DOI:

https://doi.org/10.5753/isys.2022.2286

Keywords:

fake news, covid, misinformation, detection

Abstract

A large volume of misinformation has plagued citizens' lives, especially on social networks. During the coronavirus pandemic, the sheer amount of false or inaccurate news about the virus led the World Health Organization to declare the situation an infodemic. However, few resources are available to combat misinformation in a new and evolving domain such as the coronavirus pandemic. The problem is aggravated by the speed at which social networks spread such misinformation. In this setting, the lack of resources such as generic methods, tools, and corpora about the coronavirus makes it difficult to combat this misinformation. Our work aimed to evaluate existing resources and their potential uses for the specific and ephemeral domain of COVID-19. In addition, we analyzed different writing styles and the need to create an annotation mechanism for a COVID dataset in Portuguese in order to improve detection mechanisms. Our results indicate the type of resources for combating misinformation during the pandemic and report the F1 scores of our approaches for detecting misinformation on the Twitter social network in the COVID domain.
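For readers unfamiliar with the F1 score mentioned in the abstract, the following is a minimal sketch of how it is commonly computed for a binary classifier, as the harmonic mean of precision and recall. The function name, the label encoding (1 for misinformation, 0 otherwise), and the toy labels are illustrative assumptions and are not taken from the article.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (not data from the article): 1 = misinformation.
gold = [1, 1, 1, 0, 0, 1, 0, 0]
pred = [1, 0, 1, 0, 1, 1, 0, 0]
score = f1_score(gold, pred)
print(f"F1 = {score:.2f}")
```

With these toy labels there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and so is the F1 score.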

Published

2022-10-18

How to Cite

Santana, C., Claro, D. B., & Souza, M. (2022). Detecção de fake news em Tweets: Desafios e Adaptações impostas pela COVID-19. ISys - Revista Brasileira De Sistemas De Informação, 15(1), 11:1–11:26. https://doi.org/10.5753/isys.2022.2286

Issue

Vol. 15, No. 1 (2022)

Section

Special Issue Articles