A hereditary attentive question answering framework for knowledge bases

Authors

R. C. de Mello, J. Gomes Jr., J. F. de Souza, V. Ströele

DOI:

https://doi.org/10.5753/jis.2025.5428

Keywords:

KBQA, C-KBQA, Entity Recognition, Property Extraction, LC-QUAD

Abstract

Background. The rapid growth of online data has made retrieving relevant information difficult, prompting the rise of Knowledge Base Question Answering (KBQA) systems that handle complex, multi-hop queries. Purpose. This extended work refines our previous pipeline by introducing structured dummy templates and a Hereditary Tree-LSTM (HTL) for classification, and by analyzing entity recognition, property extraction, and SPARQL assembly in greater depth. Methods. We enhanced the LC-QUAD 2.1 dataset with standardized templates and evaluated a flexible pipeline that integrates DeepPavlov, Falcon, spaCy, qualifier constraints, and reverse lookups. Results. Our experiments show that multi-tool entity recognition outperforms single-tool methods, while property extraction benefits from extended property sets and refined ranking strategies. Overall SPARQL correctness reaches 70–80% on moderately complex queries but remains lower on domain-specific subsets. Conclusion. Combining NLP tools with refined dummy templates increases coverage for complex KBQA, though further improvements in morphological handling and specialized embeddings may be needed to handle challenging multi-hop or niche queries.
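
As a rough illustration of the two ideas the abstract combines (merging entity candidates from several tools, then filling a query template), a minimal Python sketch follows. It is not the authors' implementation: the voting rule, the function names `merge_entities` and `assemble_sparql`, the one-hop template, and the hand-written tool outputs are all assumptions made for this example, and no real DeepPavlov, Falcon, or spaCy API is called.

```python
from collections import Counter
from string import Template

# Hypothetical one-hop template; the placeholder names (entity, prop) are
# illustrative and do not reproduce the paper's dummy-template format.
ONE_HOP = Template("SELECT ?answer WHERE { wd:$entity wdt:$prop ?answer . }")


def merge_entities(candidates_per_tool):
    """Rank entity IDs by how many tools proposed them.

    Assumed rule for this sketch: simple majority voting, with ties kept
    in first-seen order.
    """
    votes = Counter()
    for candidates in candidates_per_tool.values():
        # Count each ID at most once per tool.
        for entity_id in dict.fromkeys(candidates):
            votes[entity_id] += 1
    return [entity_id for entity_id, _ in votes.most_common()]


def assemble_sparql(entity_id, property_id):
    """Fill the hypothetical template with a linked entity and property."""
    return ONE_HOP.substitute(entity=entity_id, prop=property_id)


# Hand-written stand-ins for the outputs of three entity linkers.
candidates = {
    "tool_a": ["Q937", "Q42"],
    "tool_b": ["Q937"],
    "tool_c": ["Q937", "Q1"],
}
ranked = merge_entities(candidates)
query = assemble_sparql(ranked[0], "P19")
print(query)
```

Voting is only one possible way to combine tools; a real pipeline could instead weight each tool by its measured precision or fall back to a second tool only when the first returns nothing.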

Published

2025-08-22

How to Cite

MELLO, R. C. de; GOMES JR., J.; SOUZA, J. F. de; STRÖELE, V. A hereditary attentive question answering framework for knowledge bases. Journal on Interactive Systems, Porto Alegre, RS, v. 16, n. 1, p. 606–620, 2025. DOI: 10.5753/jis.2025.5428. Available at: https://journals-sol.sbc.org.br/index.php/jis/article/view/5428. Accessed: 5 Dec. 2025.

Issue

Vol. 16 No. 1 (2025)

Section

Regular Paper