Advancing Chatbot Conversations: A Review of Knowledge Update Approaches




Chatbots, Natural Language Processing, Artificial Intelligence, Data Extraction


Conversational systems such as chatbots have emerged as powerful tools for automating interactive tasks that traditionally required human involvement. Fundamental to a chatbot's functionality is its knowledge base, the foundation of its reasoning processes. A pivotal challenge is that chatbots cannot seamlessly integrate changes to their knowledge base, which hinders their ability to provide real-time responses. The growing attention in the literature to effective knowledge base updates, which we term content update, underscores the significance of this topic. This work provides an overview of content update methodologies in the context of conversational agents. We examine state-of-the-art approaches to natural language understanding, such as language models and the like, which are essential for turning data into knowledge. Additionally, we discuss turning-point strategies and primary resources, such as deep learning, which are crucial for supporting language models. As our principal contribution, we review and discuss the core techniques underpinning information extraction, as well as knowledge base representation and update, in the context of conversational agents.









How to Cite

da Costa, L. A. L. F., Melchiades, M. B., Girelli, V. S., Colombelli, F., Araújo, D. A. de, Rigo, S. J., Ramos, G. de O., da Costa, C. A., Righi, R. da R., & Barbosa, J. L. V. (2024). Advancing Chatbot Conversations: A Review of Knowledge Update Approaches. Journal of the Brazilian Computer Society, 30(1), 55–68.