Fake News Detection in Tweets: Challenges and Adaptations imposed by the COVID-19

Authors

Santana, C., Claro, D. B., & Souza, M.

DOI:

https://doi.org/10.5753/isys.2022.2286

Keywords:

fake news, COVID, misinformation, detection

Abstract

Misinformation has plagued citizens’ lives, especially on social networks. During the COVID-19 pandemic, the proliferation of competing narratives and the dissemination of false or inaccurate news reached such a scale that the World Health Organization classified it as an infodemic. However, few resources are available to combat misinformation in this new and evolving domain, especially given how quickly social networks allow false narratives to spread. This lack of resources, such as methods, tools, and reliable information on the virus, hinders our ability to combat the misinformation. In this work, we investigate the application of Text Analysis methods to help health-related science communicators produce educational material to combat misinformation. The study was conducted in association with the Scientific Communication sector of FIOCRUZ, a health research institution in Brazil, with the aim of monitoring COVID-19-related fake news and producing educational material on a weekly basis, given the ephemeral nature of COVID-19 misinformation on social media. Our main contributions are: (1) a pipeline for automatically collecting and analyzing news and social media posts about COVID-19, which provides science communicators with a weekly contextualized view of COVID-19-related topics on social media; (2) an analysis of the effect of different resources and methods on the analytical tools employed in this work for detecting health-related misinformation in Portuguese; and (3) computational tools, provided to journalists and science communicators at FIOCRUZ, that automatically monitor COVID-related misinformation on social media, focusing on Twitter, and contribute to defining the institution’s weekly science communication agenda. We also indicate the types of resources needed to combat misinformation during the pandemic, and our approach can handle the detection of misinformation on Twitter within the COVID-19 domain.

Published

2022-10-18

How to Cite

Santana, C., Claro, D. B., & Souza, M. (2022). Fake News Detection in Tweets: Challenges and Adaptations imposed by the COVID-19. ISys - Brazilian Journal of Information Systems, 15(1), 11:1–11:26. https://doi.org/10.5753/isys.2022.2286

Section

Special issues articles
