Abstractive Summarization with LLMs for Texts in Brazilian Portuguese

Authors

de Camargo, H. A. P. G., Paiola, P. H., Garcia, G. L., and Papa, J. P.

DOI:

https://doi.org/10.5753/jbcs.2025.5811

Keywords:

Large Language Models, Abstractive Summarization, Machine Learning, Generative AI, Natural Language Processing

Abstract

This study compares large language models (LLMs) on the task of abstractive summarization of Portuguese-language texts. A dataset of 8,116 samples was used, containing the original texts and their corresponding reference summaries. First, a preliminary experiment compared three different prompts under zero-shot, one-shot, and few-shot techniques, processing 100 samples with four of the six models (those that accept instructions as part of their input), in order to select the best-performing prompt for the full-scale experiment. With the prompt selected, a second experiment ran all six models on the 8,116 samples and evaluated summarization quality using the BLEU and ROUGE metrics, along with the Compression Rate and Inference Time of the generated summaries. Finally, a third experiment analyzed the impact of 4-bit and 8-bit quantization, assessing how these configurations affect the generated summaries, the evaluation metrics, the Compression Rate, and the Inference Time.
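
As an illustration of the prompting setups compared in the preliminary experiment, the sketch below builds zero-shot, one-shot, and few-shot prompts in Python. The instruction wording, the example pair, and the helper name build_prompt are illustrative assumptions, not the prompts actually used in the study.

```python
# Minimal sketch of zero-/one-/few-shot prompt construction for
# Portuguese summarization. Instruction text and examples are
# hypothetical placeholders, not the study's actual prompts.

INSTRUCTION = "Resuma o texto a seguir:"  # hypothetical instruction


def build_prompt(text: str, examples: list[tuple[str, str]]) -> str:
    """Zero-shot when `examples` is empty; one-/few-shot otherwise."""
    parts = [f"{INSTRUCTION}\n{src}\nResumo: {summ}" for src, summ in examples]
    parts.append(f"{INSTRUCTION}\n{text}\nResumo:")
    return "\n\n".join(parts)


# Zero-shot: the model sees only the instruction and the input text.
zero_shot = build_prompt("O governo anunciou novas medidas econômicas...", [])

# One-shot: a single in-context demonstration precedes the input.
one_shot = build_prompt(
    "O governo anunciou novas medidas econômicas...",
    [("A prefeitura inaugurou um novo parque na zona sul.",
      "Novo parque é inaugurado na zona sul.")],
)
```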
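For the evaluation step, BLEU and ROUGE are available through Hugging Face's evaluate package, while Compression Rate and Inference Time can be measured directly. The sketch below assumes a word-level definition of Compression Rate (summary length over source length); the paper's exact definition may differ.

```python
import time

import evaluate  # pip install evaluate rouge_score

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")


def score_summary(summarize, source: str, reference: str) -> dict:
    """Score one (source, reference) pair with any text -> text callable."""
    start = time.perf_counter()
    summary = summarize(source)
    elapsed = time.perf_counter() - start
    return {
        "bleu": bleu.compute(predictions=[summary],
                             references=[[reference]])["bleu"],
        "rougeL": rouge.compute(predictions=[summary],
                                references=[reference])["rougeL"],
        # Assumed definition: generated length relative to source length.
        "compression_rate": len(summary.split()) / len(source.split()),
        "inference_time_s": elapsed,
    }
```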
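The 4-bit and 8-bit settings examined in the final experiment correspond to standard bitsandbytes quantization configurations in the transformers library, sketched below; the checkpoint name is a placeholder rather than one of the six models evaluated.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL = "org/model-checkpoint"  # placeholder, not one of the paper's models

# 8-bit weight quantization.
config_8bit = BitsAndBytesConfig(load_in_8bit=True)

# 4-bit NF4 quantization with bfloat16 compute, a common low-memory setup.
config_4bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    quantization_config=config_4bit,  # swap in config_8bit for the 8-bit run
    device_map="auto",
)
```

Quantization trades numerical precision for memory, so it can shift both summary quality and inference time, which is precisely what the third experiment measures.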

References

Akashvarma, M. et al. (2024). A comprehensive review of large language models in abstractive summarization of news articles. In Proceedings of the 2024 Asia Pacific Conference on Innovation in Technology (APCIT), pages 1-6. IEEE. DOI: 10.1109/apcit62007.2024.10673650.

Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Koehn, P., Palmer, M., and Schneider, N. (2013). Abstract meaning representation for sembanking. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pages 178-186, Bulgaria. Association for Computational Linguistics. Available online [link].

Barzilay, R. and Lapata, M. (2008). Modeling local coherence: An entity-based approach. Computational Linguistics, 34(1):1-34. DOI: 10.1162/coli.2008.34.1.1.

Basyal, L. and Sanghvi, M. (2023). Text summarization using large language models: A comparative study of MPT-7B-Instruct, Falcon-7B-Instruct, and OpenAI ChatGPT models. arXiv preprint arXiv:2310.10449. DOI: 10.48550/arXiv.2310.10449.

Brown, T. et al. (2020). Language models are few-shot learners. In Advances in Neural Information Processing Systems (NeurIPS), volume 33. DOI: 10.48550/arXiv.2005.14165.

Cardoso, P. C., Maziero, E. G., Jorge, M. L. C., Seno, E. M., Di Felippo, A., Rino, L. H. M., Nunes, M. d. G. V., and Pardo, T. A. (2011). CSTNews: A discourse-annotated corpus for single and multi-document summarization of news texts in Brazilian Portuguese. In Proceedings of the 3rd RST Brazilian Meeting, pages 88-105. Available online [link].

Collovini, S., Carbonel, T. I., Fuchs, J. T., Coelho, J. C., Rino, L., and Vieira, R. (2007). Summ-it: Um corpus anotado com informações discursivas visando a sumarização automática [Summ-it: A corpus annotated with discourse information for automatic summarization]. In Proceedings of the 5th Workshop in Information and Human Language Technology (NILC). Available online [link].

de Vargas Feijó, D. and Moreira, V. P. (2018). RulingBR: A summarization dataset for legal texts. In International Conference on Computational Processing of the Portuguese Language, pages 255-264. Springer. DOI: 10.1007/978-3-319-99722-3_26.

Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., Mathur, A., Schelten, A., Yang, A., Fan, A., et al. (2024). The Llama 3 herd of models. arXiv preprint arXiv:2407.21783. DOI: 10.48550/arXiv.2407.21783.

Erkan, G. and Radev, D. R. (2004). LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research, 22:457-479. DOI: 10.1613/jair.1523.

Fabbri, A. R., Kryscinski, W., McCann, B., Xiong, C., Socher, R., and Radev, D. (2021). SummEval: Re-evaluating summarization evaluation. Transactions of the Association for Computational Linguistics, 9:391-409. DOI: 10.1162/tacl_a_00373.

Fonseca, E. B., Antonitsch, A., Collovini, S., Amaral, D., Vieira, R., and Figueira, A. (2016). Summ-it++: An enriched version of the Summ-it corpus. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 2047-2051. Available online [link].

Galley, M. (2006). Skip-chain conditional random fields for ranking meeting utterances by importance. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pages 364-372. DOI: 10.7916/D8RV0X5W.

Garcia, G. L., Paiola, P. H., Garcia, E., Manesco, J. R. R., and Papa, J. P. (2024a). GemBode and PhiBode: Adapting small language models to Brazilian Portuguese. In Iberoamerican Congress on Pattern Recognition (CIARP). DOI: 10.1007/978-3-031-76607-7_17.

Garcia, G. L., Paiola, P. H., and Papa, J. P. (2024b). ChatBode. Available online [link].

Gehring, J., Auli, M., Grangier, D., and Dauphin, Y. N. (2017). Convolutional sequence to sequence learning. In Proceedings of the 34th International Conference on Machine Learning, pages 1243-1252. DOI: 10.48550/arxiv.1705.03122.

Ghalandari, D. G., Hokamp, C., Pham, N. T., Glover, J., and Ifrim, G. (2020). A large-scale multi-document summarization dataset from the wikipedia current events portal. arXiv preprint arXiv:2005.10070. DOI: 10.48550/arXiv.2005.10070.

Gong, Y. and Liu, X. (2001). Generic text summarization using relevance measure and latent semantic analysis. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 19-25. DOI: 10.1145/383952.383955.

Gupta, S. and Gupta, S. (2019). Abstractive summarization: An overview of the state of the art. Expert Systems with Applications, 121:49-65. DOI: 10.1016/j.eswa.2018.12.011.

Kupiec, J., Pedersen, J., and Chen, F. (1995). A trainable document summarizer. In Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 68-73, New York. ACM. DOI: 10.1145/215206.215333.

Kågebäck, M., Mogren, O., Tahmasebi, N., and Dubhashi, D. (2014). Extractive summarization using continuous vector space models. In Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC), pages 31-39. DOI: 10.3115/v1/w14-1504.

Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V., and Zettlemoyer, L. (2020). BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7871-7880. DOI: 10.18653/v1/2020.acl-main.703.

Lin, C.-Y. (2004). ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74-81, Barcelona, Spain. Association for Computational Linguistics. Available online [link].

Liu, X., Zheng, Y., Du, Z., Ding, M., Qian, Y., Yang, Z., and Tang, J. (2021). GPT understands, too. arXiv preprint arXiv:2103.10385. DOI: 10.1016/j.aiopen.2023.08.012.

Liu, Y., Jia, Q., and Zhu, K. (2022). Reference-free summarization evaluation via semantic correlation and compression ratio. In Proceedings of NAACL-HLT, pages 2109-2115, Seattle, WA, USA. Association for Computational Linguistics. DOI: 10.18653/v1/2022.naacl-main.153.

Liu, Y. and Lapata, M. (2019). Fine-tune BERT for extractive summarization. arXiv preprint arXiv:1903.10318. DOI: 10.48550/arXiv.1903.10318.

Marcu, D. (2000). The theory and practice of discourse parsing and summarization. MIT press. DOI: 10.7551/mitpress/6754.001.0001.

Maynez, J., Narayan, S., Bohnet, B., and McDonald, R. (2020). On faithfulness and factuality in abstractive summarization. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1906-1919. DOI: 10.18653/v1/2020.acl-main.173.

Mihalcea, R. and Tarau, P. (2004). TextRank: Bringing order into texts. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 404-411. Available online [link].

Nallapati, R., Zhai, F., Zhou, B., Gulcehre, C., and Xiang, B. (2016). Abstractive text summarization using sequence-to-sequence RNNs and beyond. In Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning, pages 280-290. DOI: 10.18653/v1/k16-1028.

Nenkova, A. and McKeown, K. (2011). Automatic summarization. Now Publishers Inc, Boston. DOI: 10.1561/9781601984715.

Paiola, P. H., de Rosa, G. H., and Papa, J. P. (2022). Deep learning-based abstractive summarization for brazilian portuguese texts. In Xavier-Junior, J. C. and Rios, R. A., editors, BRACIS 2022: Intelligent Systems, pages 479-493. Springer International Publishing, Cham. DOI: 10.1007/978-3-031-21689-3_34.

Paiola, P. H., Garcia, G. L., Jodas, D. S., Correia, J. V. M., Sugi, L. A., and Papa, J. P. (2024). RecognaSumm: A novel Brazilian summarization dataset. In Gamallo, P., Claro, D., Teixeira, A., Real, L., Garcia, M., Oliveira, H. G., and Amaro, R., editors, Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1, pages 575-579, Santiago de Compostela, Galicia/Spain. Association for Computational Linguistics. Available online [link].

Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. (2002). BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 311-318. Association for Computational Linguistics. DOI: 10.3115/1073083.1073135.

Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., and Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1-67. DOI: 10.48550/arXiv.1910.10683.

Sellam, T., Das, D., and Parikh, A. P. (2020). BLEURT: Learning robust metrics for text generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), pages 7881-7892. DOI: 10.18653/v1/2020.acl-main.704.

Gemma Team: Mesnard, T., Hardin, C., Dadashi, R., Bhupatiraju, S., Sifre, L., Rivière, M., Kale, M. S., Love, J., Tafti, P., Hussenot, L., et al. (2024). Gemma. Available online [link].

Van Veen, D., Van Uden, C., Blankemeier, L., Delbrouck, J.-B., Aali, A., Bluethgen, C., Pareek, A., Polacin, M., Reis, E. P., Seehofnerova, A., et al. (2023). Clinical text summarization: Adapting large language models can outperform human experts. Research Square preprint. DOI: 10.21203/rs.3.rs-3483777/v1.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (NeurIPS). DOI: 10.48550/arXiv.1706.03762.

Wang, P., Li, J., and Zhu, X. (2008). Multi-document summarization using sentence-based topic models. In 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, volume 1, pages 551-554. IEEE. DOI: 10.3115/1667583.1667675.

Widyassari, A. P. et al. (2022). Review of automatic text summarization techniques & methods. Journal of King Saud University - Computer and Information Sciences, 34(4):1029-1046. DOI: 10.1016/j.jksuci.2020.05.006.

Zhang, H., Yu, P. S., and Zhang, J. (2024). A systematic survey of text summarization: From statistical methods to large language models. ACM Computing Surveys. DOI: 10.1145/3731445.

Zhang, J., Zhao, Y., Saleh, M., and Liu, P. (2020). PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. In III, H. D. and Singh, A., editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 11328-11339. PMLR. Online. DOI: 10.48550/arXiv.1912.08777.

Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q., and Artzi, Y. (2019). BERTScore: Evaluating text generation with BERT. In International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.1904.09675.

Zhao, W., Peyrard, M., Liu, F., Gao, Y., Meyer, C. M., and Eger, S. (2019). MoverScore: Text generation evaluating with contextualized embeddings and Earth Mover Distance. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 563-578. DOI: 10.18653/v1/d19-1053.

Zhao, W. et al. (2021). Calibrate before use: Improving few-shot performance of language models. arXiv preprint arXiv:2102.09690. DOI: 10.48550/arxiv.2102.09690.

Zhong, M., Liu, P., Chen, Y., Wang, D., Qiu, X., and Huang, X. (2020). Extractive summarization as text matching. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6197-6208, Online. Association for Computational Linguistics. DOI: 10.18653/v1/2020.acl-main.552.

Published

2025-10-14

How to Cite

de Camargo, H. A. P. G., Paiola, P. H., Garcia, G. L., & Papa, J. P. (2025). Abstractive Summarization with LLMs for Texts in Brazilian Portuguese. Journal of the Brazilian Computer Society, 31(1), 1031–1049. https://doi.org/10.5753/jbcs.2025.5811

Issue

Vol. 31 No. 1 (2025)

Section

Articles