AI-Driven Hierarchical Taxonomy Generation from Emergency Call Transcripts
DOI:
https://doi.org/10.5753/jbcs.2026.6635
Keywords:
Hierarchical Text Classification, Emergency Call Analysis, BERTopic, Large Language Models, Natural Language Processing, Multilingual NLP, Emergency Communication Systems
Abstract
This article presents a case study on hierarchical topic modeling for emergency call transcripts from Ecuador's ECU 911 service. We introduce a hybrid methodology that first generates a taxonomy from unlabeled data using BERTopic and agglomerative clustering, and then uses embedding-based similarity for multi-label classification. Using multilingual embeddings (LaBSE), dimensionality reduction (UMAP), and density-based clustering (HDBSCAN), we identified 23 coherent topics, a practical balance between accuracy and operational applicability. The key result is a significant reduction in Hamming Loss and an F1-score of 0.4951, achieved without pre-labeled data. The method's main practical value is a scalable, automated way for emergency management centers to categorize complex incidents rapidly, improving situational awareness and resource allocation. Integrating LLaMA 3 for automated label generation further improved semantic interpretation of the topics, highlighting the potential of language models in critical, resource-constrained domains.
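The classification and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The three-dimensional vectors are toy stand-ins for LaBSE embeddings, and the topic names, the 0.6 similarity threshold, and the choice of micro-averaged F1 are all assumptions made for illustration. A transcript receives every topic whose centroid is sufficiently similar to it, and the predicted label sets are then scored with Hamming Loss and F1.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def assign_labels(doc_vec, topic_vecs, threshold):
    """Multi-label step: keep every topic whose similarity reaches the threshold."""
    return {name for name, vec in topic_vecs.items()
            if cosine(doc_vec, vec) >= threshold}

def hamming_loss(true_sets, pred_sets, n_labels):
    """Fraction of (document, label) decisions that are wrong."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)

def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1 over all (document, label) pairs."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical topic centroids (toy vectors, not real LaBSE output).
topic_vecs = {
    "traffic": [1.0, 0.0, 0.0],
    "medical": [0.0, 1.0, 0.0],
    "fire":    [0.0, 0.0, 1.0],
}

# Hypothetical transcript embeddings and reference labels.
doc_vecs = [[0.9, 0.8, 0.0], [0.0, 0.1, 1.0], [1.0, 0.0, 0.1]]
reference = [{"traffic", "medical"}, {"fire", "medical"}, {"traffic"}]

THRESHOLD = 0.6  # assumed value; in practice this would need tuning
predicted = [assign_labels(d, topic_vecs, THRESHOLD) for d in doc_vecs]
loss = hamming_loss(reference, predicted, len(topic_vecs))
f1 = micro_f1(reference, predicted)

print(predicted)
print(round(loss, 4), round(f1, 4))
```

In this toy run the second transcript misses its "medical" label because its similarity to that centroid falls below the threshold. This shows how the threshold trades missed labels against spurious ones, and Hamming Loss and F1 both reflect that trade-off.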
License
Copyright (c) 2026 Juan Gabriel Flores Sanchez, Marcos Orellana, Patricio Santiago García-Montero, Jorge Luis Zambrano-Martinez

This work is licensed under a Creative Commons Attribution 4.0 International License.

