Semantic Search to Foster Scientific Findability: A Systematic Literature Review
DOI:
https://doi.org/10.5753/jidm.2021.1919
Keywords:
Open Science, Semantic Search, Scientific collaboration
Abstract
One of the main goals of the Open Science movement is to leverage scientific collaboration by, among other means, promoting the sharing and reuse of research outputs such as publications, data and software. Sharing is enabled by public and accessible scientific repositories where these outputs are managed throughout their lifecycle. In this context, finding these digital artifacts has become a key problem. Semantic search mechanisms have emerged as a means to solve this issue. However, implementing and integrating them into scientific repositories presents many challenges. This article presents a systematic literature review of research efforts on mechanisms for supporting search for scientific papers, data and processes. Our investigation is based on extracting and analyzing the entire contents of nine digital libraries using the associated search engines – in alphabetical order: ACM Digital Library, arXiv, Engineering Village, IEEE Xplore, SBC OpenLib, Springer Link, Scopus, Wiley Online Library and Web of Science. After retrieving a combined total of 5012 documents, we identified 2054 unique papers that were used as a basis for our analysis. Among other findings, we provide a new categorization of the literature on search and discuss unexplored gaps, thereby contributing to advancing research on semantic search mechanisms to support Open Science.