Topic Taxonomy Generation with LLMs for Enriched Transaction Tagging
DOI: https://doi.org/10.5753/jbcs.2025.5470
Keywords: Taxonomy Generation, Large Language Models, Natural Language Processing, Web Scraping, Transactions, Tagging
Abstract
This work presents an unsupervised method for tagging banking consumers' transactions using automatically constructed and expanded topic taxonomies. First, we enrich the bank transactions via web scraping, collecting relevant merchant descriptions that are then preprocessed with NLP techniques to generate candidate terms. Topic taxonomies are built with instruction fine-tuned Large Language Models (LLMs). To expand an existing taxonomy with new terms, we use zero-shot prompting to determine where new nodes should be added. The resulting taxonomies are used to assign descriptive tags that characterize the transactions in a retail bank dataset. For evaluation, 12 volunteers completed a two-part form assessing the quality of the taxonomies and of the tags assigned to merchants. The evaluation revealed a coherence rate exceeding 90% for the chosen taxonomies. Additionally, taxonomy expansion with LLMs showed promising results for parent node prediction, with F1-scores of 89% and 70% for the Food and Shopping taxonomies, respectively.
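As an illustration of the candidate-term step, the sketch below extracts keyphrases from a scraped merchant description using the open-source YAKE keyword extractor. This is a minimal sketch under the assumption that a YAKE-style extractor stands in for the paper's full NLP preprocessing pipeline; the description text and parameter values are invented for the example.

```python
# Minimal sketch of candidate-term generation from an enriched
# transaction description. YAKE is an assumed stand-in for the
# paper's preprocessing; parameters here are illustrative.
import yake

def extract_candidate_terms(description: str, top_k: int = 5) -> list[str]:
    """Pull short keyphrases from a scraped merchant description."""
    extractor = yake.KeywordExtractor(lan="en", n=2, top=top_k)
    # YAKE returns (phrase, score) pairs; lower scores rank higher.
    ranked = extractor.extract_keywords(description)
    return [phrase for phrase, _score in ranked]

# Hypothetical description scraped for a pizzeria merchant.
description = ("Family-run pizzeria serving wood-fired pizzas, fresh pasta "
               "and Italian desserts for dine-in and delivery.")
print(extract_candidate_terms(description))
```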
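For the expansion step, the following sketch shows one way zero-shot parent-node prediction can work: the current taxonomy is serialized into a prompt, and the model is asked, with no worked examples, which existing node should parent a new term. The nested-dict taxonomy encoding, prompt wording, and `call_llm` stub are illustrative assumptions, not the authors' exact setup.

```python
# Minimal sketch of zero-shot parent-node prediction for taxonomy
# expansion. Taxonomy layout, prompt text, and the model stub are
# assumptions for illustration only.

def flatten_nodes(taxonomy: dict, prefix: str = "") -> list[str]:
    """List every node in the taxonomy as a slash-separated path."""
    paths = []
    for node, children in taxonomy.items():
        path = f"{prefix}/{node}" if prefix else node
        paths.append(path)
        paths.extend(flatten_nodes(children, path))
    return paths

def build_parent_prompt(taxonomy: dict, new_term: str) -> str:
    """Zero-shot prompt: task description only, no worked examples."""
    nodes = "\n".join(flatten_nodes(taxonomy))
    return (
        "Given the topic taxonomy below (one node path per line):\n"
        f"{nodes}\n\n"
        f"Which single existing node is the best parent for the new term "
        f"'{new_term}'? Answer with the node path only."
    )

def call_llm(prompt: str) -> str:
    """Stand-in for a real chat-completion call; plug in any model client."""
    raise NotImplementedError

# Example: attaching a new term to a toy Food taxonomy.
food = {"Food": {"Restaurants": {"Pizzeria": {}}, "Groceries": {}}}
prompt = build_parent_prompt(food, "sushi bar")
# predicted_parent = call_llm(prompt)  # expected answer: "Food/Restaurants"
```

The predicted path can then be parsed and the new term attached as a child of the returned node, which is how a parent-prediction step of this kind plugs into iterative taxonomy expansion.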
License
Copyright (c) 2025 Daniel de S. Moraes, Polyana B. da Costa, Pedro T. Cutrim dos Santos, Ivan de J. P. Pinto, Sergio Colcher, Antonio J. G. Busson, Matheus A. S. Pinto, Rafael H. Rocha, Rennan Gaio, Gabriela Tourinho, Marcos Rabaioli, David Favaro

This work is licensed under a Creative Commons Attribution 4.0 International License.