Foreword:
We are delighted to present this special issue of the Journal of the Brazilian Computer Society dedicated to Language Models in Portuguese. The recent advancement of large language models (LLMs) has profoundly transformed the way we interact with information, create content, and develop intelligent applications. However, much of this progress has been concentrated in languages with ample data and resource availability, highlighting the need for initiatives focused on languages with less technological coverage, such as Portuguese.
Portuguese is currently the fifth most spoken language in the world, present on multiple continents and in many cultural contexts, and represents a fertile field for artificial intelligence research. The challenge of building, evaluating, and applying robust language models for Portuguese involves not only technical aspects, such as corpus availability, efficient architectures, and evaluation metrics, but also social, ethical, and cultural dimensions.
This special issue brings together articles that explore different perspectives on the topic, including:
- Comparative and critical analyses of language models
- Social, ethical, financial, and ecological issues related to language models
- Discussion of alternative solutions for language models
- Domain-specific language models
- Suitability of narrow language models for specific tasks
- Multilingual vs. Portuguese-specific models
- Semantic issues in language models
- Cultural issues in language models
- Resources for training language models
- Language model evaluation
By bringing together contributions from researchers from Brazil and abroad, this edition seeks to strengthen the Natural Language Processing (NLP) community in Portuguese, foster collaborations, and highlight scientific advances that position the Portuguese language as a key player in the era of language models.
We thank the authors who submitted their papers, the reviewers for their thoughtful and generous efforts, and the scientific community that has been mobilizing to consolidate this emerging field. We are convinced that the articles published here will contribute not only to academic advancement but also to the development of more inclusive, sustainable, and culturally sensitive technologies.
Articles:
Renato Moraes Silva, Hazem Amamou, Lucca Baptista Silva Ferraz, Fabio Kauê Araujo da Silva, Anderson Raymundo Avila. Fake News Detection in Portuguese Under Large Language Model-Generated Content. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1150–1167. DOI:https://doi.org/10.5753/jbcs.2025.5525
Emanuelle Marreira, Tiago de Melo, Miguel de Oliveira, Carlos M. S. Figueiredo. Rating Prediction in Brazilian Portuguese: A Benchmark of Large Language Models. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 828–839. DOI:https://doi.org/10.5753/jbcs.2025.5667
Breno O. Funicheli, Kenzo Sakiyama, Rodrigo Nogueira, Roseli A. F. Romero. Enhancing Brazilian Legal Information Retrieval: An Automated Keyphrase Generation. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1188–1202. DOI:https://doi.org/10.5753/jbcs.2025.5711
Eduardo Darrazão, Krerley Oliveira, Luiz Celso Gomes-Jr. Sequence Labeling in Product Descriptions on Invoices: Comparing LLM-based settings with a CRF baseline. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1203–1212. DOI:https://doi.org/10.5753/jbcs.2025.5743
Pablo Rodríguez, Pablo Gamallo, Daniel Santos, Susana Sotelo, Silvia Paniagua, José Ramom Pichel, Pedro Salgueiro, Vítor Nogueira, Paulo Quaresma, Marcos Garcia, Senén Barro. Enhancing Large Language Models for Underrepresented Varieties: Pretraining Strategies in the Galician-Portuguese Diasystem. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1050–1063. DOI:https://doi.org/10.5753/jbcs.2025.5766
Letícia C. Navarro, Filipe Mutz, Thiago M. Paixão, Guilherme G. Zanetti, Claudine Badue, Alberto F. De Souza, Thiago Oliveira-Santos. RagPharma: A RAG-Based Chatbot for Medicine Leaflets with a Dual-Dataset Evaluation Framework. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1137–1149. DOI:https://doi.org/10.5753/jbcs.2025.5767
Thales Sales Almeida, Rodrigo Nogueira, Helio Pedrini. Building High-Quality Datasets for Portuguese LLMs: From Common Crawl Snapshots to Industrial-Grade Corpora. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1247–1263. DOI:https://doi.org/10.5753/jbcs.2025.5788
William Alberto Cruz-Castañeda, Marcellus Amadeus. Large Language Models in Brazilian Portuguese: A Chronological Survey. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1168–1187. DOI:https://doi.org/10.5753/jbcs.2025.5789
Felipe Oliveira do Espírito Santo, Sarajane Marques Peres, Bernardo Gonçalves, Fabio José Muneratti Ortega, Vinícius Bitencourt Matos, André Paulino Lima, Anarosa Alves Franco Brandão, Fábio Gagliardi Cozman. The Cocoruta Hub: Open and Curated Corpora, Datasets and Language Models on Brazilian Ocean Law. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 808–827. DOI:https://doi.org/10.5753/jbcs.2025.5791
João Gondim, Daniela Barreiro Claro, Marlo Souza. A bilingual analysis of multi-head attention mechanism for image captioning based on morphosyntactic information. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1064–1077. DOI:https://doi.org/10.5753/jbcs.2025.5792
André da Fonseca Schuck, Gabriel Lino Garcia, João Renato Ribeiro Manesco, Pedro Henrique Paiola, João Paulo Papa. Evaluating Large Language Models for Brazilian Portuguese Sentiment Analysis: A Comparative Study of Multilingual State-of-the-Art vs. Brazilian Portuguese Fine-Tuned LLMs. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 885–917. DOI:https://doi.org/10.5753/jbcs.2025.5793
René Vieira Santin, Ricardo Marcondes Marcacini, Solange Oliveira Rezende. Domain Learning from Data for Large Language Model Translation and Adaptation. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1089–1119. DOI:https://doi.org/10.5753/jbcs.2025.5795
Thales Sales Almeida, Giovana Kerche Bonás, João Guilherme Alves Santos. BRoverbs - Measuring how much LLMs understand Portuguese proverbs. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1078–1088. DOI:https://doi.org/10.5753/jbcs.2025.5797
Mariana O. Silva, Michele A. Brandão, Mirella M. Moro. Rewriting Stories with LLMs: Gender Bias in Generated Portuguese-language Narratives. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1120–1136. DOI:https://doi.org/10.5753/jbcs.2025.5799
Sérgio S. Mucciaccia, Thiago M. Paixão, Filipe Mutz, Alberto F. De Souza, Claudine S. Badue, Thiago Oliveira-Santos. Pt-HotpotQA: Evaluating Multi-Hop Question Answering on Original and Portuguese-translated Datasets Using LLMs. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 872–884. DOI:https://doi.org/10.5753/jbcs.2025.5801
Felipe Coelho de Abreu Pinna, Victor Takashi Hayashi, João Carlos Néto, Isabella Sadakata Takara, Stephan Kovach, Lucas Gaspar Mendonça, Romeo Bulla Junior, João Victor Sá, Wilson Vicente Ruggiero. Complex Interactions in Dialog Systems for Brazilian Portuguese: A Comparison of RAG Approaches. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1301–1319. DOI:https://doi.org/10.5753/jbcs.2025.5806
Hugo A. P. G. de Camargo, Pedro Henrique Paiola, Gabriel Lino Garcia, João Paulo Papa. Abstractive Summarization with LLMs for Texts in Brazilian Portuguese. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1031–1049. DOI:https://doi.org/10.5753/jbcs.2025.5811
Pedro Henrique Paiola, Gabriel Lino Garcia, João Vitor Mariano Correia, João Renato Ribeiro Manesco, Ana Lara Alves Garcia, João Paulo Papa. The Bode Family of Large Language Models: Investigating the Frontiers of LLMs in Brazilian Portuguese. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 918–939. DOI:https://doi.org/10.5753/jbcs.2025.5812
Gabriel Assis, Cláudia Freitas, Aline Paes. Exploring Brazil's LLM Fauna: Investigating the Generative Performance of Large Language Models in Portuguese. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 940–972. DOI:https://doi.org/10.5753/jbcs.2025.5814
José Victor de Souza, Hazem Amamou, Rubing Chen, Elmira Salari, Reto Gubelmann, Christina Niklaus, Talita Serpa, Marcela Marques de Freitas Lima, Paula Tavares Pinto, Shruti Kshirsagar, Alan Davoust, Siegfried Handschuh, Anderson Raymundo Avila. Cross-Lingual Keyword Extraction for Pesticide Terminology in Brazilian Portuguese and English. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 973–990. DOI:https://doi.org/10.5753/jbcs.2025.5815
André Barbosa, Igor Cataneo Silveira, Denis Deratani Mauá. An Empirical Analysis of Large Language Models for Automated Cross-Prompt Essay Trait Scoring in Brazilian Portuguese. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 858–871. DOI:https://doi.org/10.5753/jbcs.2025.5817
Eugénio Ribeiro, David Antunes, Nuno Mamede, Jorge Baptista. Exploring Few-Shot Approaches to Automatic Text Complexity Assessment in European Portuguese. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 690–710. DOI:https://doi.org/10.5753/jbcs.2025.5820
David Eduardo Pereira, Daniela Thuaslar Simão Gomes, Claudio E. C. Campelo. Evaluating LLMs on Argument Mining Tasks in Brazilian Portuguese Debate Data. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1280–1300. DOI:https://doi.org/10.5753/jbcs.2025.5824
Uriel Lasheras, Elioenai Alves, Caio Ponte, Carlos Caminha, Vládia Pinheiro. Open LLMs Meet Causality in Portuguese: A Corpus-Based Fine-Tuning Approach. Journal of the Brazilian Computer Society, Vol. 31, No. 1 (2025), 1005–1030. DOI:https://doi.org/10.5753/jbcs.2025.5825
J. Vicentini, R. B. de M. Rodrigues, A. C. Junior, I. R. Guilherme. Comparing Explainable AI Techniques In Language Models: A Case Study For Fake News Detection in Portuguese. Journal of the Brazilian Computer Society, Vol. 32, No. 1 (2026), 1–12. DOI:https://doi.org/10.5753/jbcs.2026.5787

