Complex Interactions in Dialog Systems for Brazilian Portuguese: A Comparison of RAG Approaches
DOI: https://doi.org/10.5753/jbcs.2025.5806
Keywords: Banking, Dialog Systems, Finance, Generative AI, LLM, RAG, Brazilian Portuguese
Abstract
Retrieval-Augmented Generation (RAG) has emerged as a key technique for enhancing the capabilities of Large Language Models (LLMs) by incorporating external knowledge sources into the response generation process. This paper presents a comparative analysis of various RAG approaches applied to dialog systems in Brazilian Portuguese. The study explores multiple retrieval strategies, including VectorRAG, GraphRAG, MemoRAG, HybridRAG, and HippoRAG, assessing their performance in handling complex queries, multi-turn conversations, and contextual disambiguation. We evaluate these approaches in the banking context using real-world datasets from two case studies, and the analysis highlights the strengths and limitations of each method. Experimental results indicate that context-aware retrieval strategies improve response accuracy when addressing ambiguous or multi-faceted user queries; however, trade-offs in computational efficiency and response time remain critical challenges. Our findings provide insights into optimizing dialog systems for Brazilian Portuguese, paving the way for domain-specific conversational agents in financial and other specialized applications.
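To make the comparison concrete, the minimal Python sketch below illustrates the vector-retrieval (VectorRAG) pattern that the study takes as a baseline: embed a document collection, retrieve the chunks most similar to a query, and prepend them to the prompt before generation. The bag-of-words embedding, the toy banking corpus, and the generate() stub are illustrative assumptions for exposition only, not the pipeline evaluated in the paper.

from collections import Counter
from math import sqrt

# Toy banking corpus standing in for the case-study documents (hypothetical).
CORPUS = [
    "Savings accounts accrue interest monthly.",
    "Wire transfers above the daily limit require extra verification.",
    "Credit card limits are reviewed every six months.",
]

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real encoder)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank corpus chunks by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(CORPUS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def generate(prompt: str) -> str:
    """Stub for the LLM call; a deployed system would query a hosted model here."""
    return f"[LLM response conditioned on a prompt of {len(prompt)} characters]"

query = "How often is interest added to a savings account?"
context = "\n".join(retrieve(query))
print(generate(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"))

Broadly, the other strategies compared in the paper differ in what replaces the retrieve step: GraphRAG traverses a knowledge graph of extracted entities, MemoRAG uses a memory model to generate retrieval clues, HybridRAG combines graph and vector evidence, and HippoRAG ranks passages via personalized PageRank over an entity graph.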
References
Almazrouei, E., Alobeidli, H., Alshamsi, A., Cappelli, A., Cojocaru, R., Debbah, M., Goffinet, É., Hesslow, D., Launay, J., Malartic, Q., et al. (2023). The Falcon series of open language models. arXiv preprint arXiv:2311.16867. DOI: 10.48550/arXiv.2311.16867.
Chan, B. J., Chen, C.-T., Cheng, J.-H., and Huang, H.-H. (2025). Don't do RAG: When cache-augmented generation is all you need for knowledge tasks. In Companion Proceedings of the ACM on Web Conference 2025, pages 893-897. DOI: 10.1145/3701716.3715490.
Edge, D., Trinh, H., Cheng, N., Bradley, J., Chao, A., Mody, A., Truitt, S., Metropolitansky, D., Ness, R. O., and Larson, J. (2024). From local to global: A Graph RAG approach to query-focused summarization. arXiv preprint arXiv:2404.16130. DOI: 10.48550/arXiv.2404.16130.
Guo, Y., Tao, Y., Ming, Y., Nowak, R. D., and Liang, Y. (2025). Retrieval-augmented generation as noisy in-context learning: A unified theory and risk bounds. arXiv preprint arXiv:2506.03100. DOI: 10.48550/arXiv.2506.03100.
Gutiérrez, B. J., Shu, Y., Gu, Y., Yasunaga, M., and Su, Y. (2024). HippoRAG: Neurobiologically inspired long-term memory for large language models. arXiv preprint arXiv:2405.14831. DOI: 10.48550/arXiv.2405.14831.
Hadi, M. U., Qureshi, R., Shah, A., Irfan, M., Zafar, A., Shaikh, M. B., Akhtar, N., Wu, J., Mirjalili, S., et al. (2023). A survey on large language models: Applications, challenges, limitations, and practical usage. Authorea Preprints, 3. DOI: 10.36227/techrxiv.23589741.v1.
Liang, L., Bo, Z., Gui, Z., Zhu, Z., Zhong, L., Zhao, P., Sun, M., Zhang, Z., Zhou, J., Chen, W., et al. (2025). KAG: Boosting LLMs in professional domains via knowledge augmented generation. In Companion Proceedings of the ACM on Web Conference 2025, pages 334-343. DOI: 10.1145/3701716.3715240.
Liu, Y., Han, T., Ma, S., Zhang, J., Yang, Y., Tian, J., He, H., Li, A., He, M., Liu, Z., et al. (2023). Summary of ChatGPT-related research and perspective towards the future of large language models. Meta-Radiology, 1(2):100017. DOI: 10.1016/j.metrad.2023.100017.
Manning, C. D., Raghavan, P., and Schütze, H. (2009). Introduction to Information Retrieval. Cambridge University Press. Book.
Meta (2025). Llama 3.2. Available at: [link].
Moreira, V. P. (2023). Capítulo 19: Recuperação de informação. Brasileiras em PLN. Available at: [link].
OpenAI (2025). Models. Available at: [link].
Ozdemir, S. (2023). Quick start guide to large language models: strategies and best practices for using ChatGPT and other LLMs. Addison-Wesley Professional. Book.
Peng, B., Zhu, Y., Liu, Y., Bo, X., Shi, H., Hong, C., Zhang, Y., and Tang, S. (2024). Graph retrieval-augmented generation: A survey. arXiv preprint arXiv:2408.08921. DOI: 10.48550/arXiv.2408.08921.
Phan, H., Acharya, A., Meyur, R., Chaturvedi, S., Sharma, S., Parker, M., Nally, D., Jannesari, A., Pazdernik, K., Halappanavar, M., et al. (2024). Examining long-context large language models for environmental review document comprehension. arXiv preprint arXiv:2407.07321. DOI: 10.48550/arXiv.2407.07321.
Pinna, F. C. d. A., Hayashi, V. T., Néto, J. C., Marquesone, R. d. F. P., Duarte, M. C., Okada, R. S., and Ruggiero, W. V. (2024). A modular framework for domain-specific conversational systems powered by never-ending learning. Applied Sciences, 14(4). DOI: 10.3390/app14041585.
Qian, H., Zhang, P., Liu, Z., Mao, K., and Dou, Z. (2024). MemoRAG: Moving towards next-gen RAG via memory-inspired knowledge discovery. arXiv preprint arXiv:2409.05591. DOI: 10.48550/arXiv.2409.05591.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8):9. Available at: [link].
Sarmah, B., Hall, B., Rao, R., Patel, S., Pasquali, S., and Mehta, D. (2024). HybridRAG: Integrating knowledge graphs and vector retrieval augmented generation for efficient information extraction. arXiv preprint arXiv:2408.04948. DOI: 10.1145/3677052.3698671.
Sarmah, B., Zhu, T., Mehta, D., and Pasquali, S. (2023). Towards reducing hallucination in extracting information from financial reports using large language models. In Proceedings of the Third International Conference on AI-ML Systems, pages 1-5. DOI: 10.48550/arXiv.2310.10760.
Es, S., James, J., Espinosa-Anke, L., and Schockaert, S. (2023). RAGAs: Automated evaluation of retrieval augmented generation. arXiv preprint arXiv:2309.15217. DOI: 10.18653/v1/2024.eacl-demo.16.
Soudani, H., Kanoulas, E., and Hasibi, F. (2024). Fine tuning vs. retrieval augmented generation for less popular knowledge. In Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, pages 12-22. DOI: 10.1145/3673791.3698415.
Deep Search Team (2024). Docling technical report. arXiv preprint arXiv:2408.09869. DOI: 10.48550/arXiv.2408.09869.
Ultes, S., Barahona, L. M. R., Su, P.-H., Vandyke, D., Kim, D., Casanueva, I., Budzianowski, P., Mrkšić, N., Wen, T.-H., Gašić, M., et al. (2017). PyDial: A multi-domain statistical dialogue system toolkit. In Proceedings of ACL 2017, System Demonstrations, pages 73-78. DOI: 10.18653/v1/p17-4013.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30. DOI: 10.48550/arXiv.1706.03762.
Wang, J., Ma, W., Sun, P., Zhang, M., and Nie, J.-Y. (2024). Understanding user experience in large language model interactions. arXiv preprint arXiv:2401.08329. DOI: 10.48550/arXiv.2401.08329.
Yang, A. (2024). Old wine in a new bottle: How HippoRAG revolutionizes retrieval with knowledge graphs. Available at: [link]. Accessed March 30, 2025.
Yu, H., Gan, A., Zhang, K., Tong, S., Liu, Q., and Liu, Z. (2024). Evaluation of retrieval-augmented generation: A survey. In CCF Conference on Big Data, pages 102-120. Springer. DOI: 10.1007/978-981-96-1024-2_8.
Zhang, B., Liu, Z., Cherry, C., and Firat, O. (2024). When scaling meets llm finetuning: The effect of data, model and finetuning method. arXiv preprint arXiv:2402.17193. DOI: 10.48550/arXiv.2402.17193.
Zhao, S., Yang, Y., Wang, Z., He, Z., Qiu, L. K., and Qiu, L. (2024). Retrieval augmented generation (rag) and beyond: A comprehensive survey on how to make your llms use external data more wisely. arXiv preprint arXiv:2409.14924. DOI: 10.48550/arXiv.2409.14924.
Zhu, Y., Yuan, H., Wang, S., Liu, J., Liu, W., Deng, C., Chen, H., Liu, Z., Dou, Z., and Wen, J.-R. (2023). Large language models for information retrieval: A survey. arXiv preprint arXiv:2308.07107. DOI: 10.1145/3748304.
License
Copyright (c) 2025 Felipe Coelho de Abreu Pinna, Victor Takashi Hayashi, João Carlos Néto, Isabella Sadakata Takara, Stephan Kovach, Lucas Gaspar Mendonça, Romeo Bulla Junior, João Victor Sá, Wilson Vicente Ruggiero

This work is licensed under a Creative Commons Attribution 4.0 International License.

