A framework for automatic construction of search profiles based on semantic topic modeling
DOI: https://doi.org/10.5753/reic.2022.2687
Keywords: Topic modeling, Word embedding
Abstract
Recent efforts have used topic modeling to identify researchers' expertise, with the aim of identifying multidisciplinary teams and detecting co-authorship networks. Though promising, none of these efforts performs a real-life evaluation of the quality of the topics built. This paper proposes a framework that summarizes articles written by researchers to automatically build research profiles and supports online evaluation of those profiles. We perform a set of experiments on the Lattes repository, contrasting two types of evaluation: (1) an offline evaluation, in which we use a traditional metric, normalized pointwise mutual information (NPMI); and (2) an online evaluation, in which researchers assess their own automatically built profiles. We observed that combining the two is important for a comprehensive quality evaluation.
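To make the offline metric concrete, the following is a minimal sketch of how an NPMI-based coherence score for a single topic might be computed. It is not the authors' implementation: the function name `npmi_topic_coherence`, the use of document-level co-occurrence, and the conventions for word pairs that never or always co-occur are all assumptions made for illustration.

```python
import math
from itertools import combinations


def npmi_topic_coherence(top_words, documents):
    """Average NPMI over all pairs of a topic's top words.

    Probabilities are estimated from document-level co-occurrence
    (an assumption; sliding windows are another common choice).
    """
    if len(top_words) < 2:
        raise ValueError("need at least two top words")
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # convention: never co-occur -> lower bound
        elif p12 == 1.0:
            scores.append(1.0)   # convention: avoids 0/0 when both words are in every document
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))  # normalize PMI into [-1, 1]
    return sum(scores) / len(scores)


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["topic", "model", "text"],
    ["topic", "model"],
    ["graph", "network"],
    ["graph", "network", "text"],
]
coherent = npmi_topic_coherence(["topic", "model"], docs)
incoherent = npmi_topic_coherence(["topic", "graph"], docs)
```

In this toy corpus, "topic" and "model" always appear together, so their score sits at the upper end of the range, while "topic" and "graph" never co-occur and score at the lower end. An online evaluation, in contrast, asks the profiled researchers themselves whether the resulting topics describe their work.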
References
Bangor, A., Kortum, P. T., and Miller, J. T. (2008). An empirical evaluation of the system usability scale. International Journal of Human–Computer Interaction, 24(6):574–594.
de Siqueira, G. O., Canuto, S. D., Gonçalves, M. A., and Laender, A. H. F. (2020). A pragmatic approach to hierarchical categorization of research expertise in the presence of scarce information. Int. J. Digit. Libr., 21(1):61–73.
Gusenbauer, M. (2019). Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases. Scientometrics, 118(1):177–214.
Kocaballi, A. B., Laranjo, L., and Coiera, E. (2018). Measuring user experience in conversational interfaces: a comparison of six questionnaires. In Proceedings of the 32nd International BCS Human Computer Interaction Conference, page 21. BCS Learning & Development Ltd.
Lee, D. D. and Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755):788–791.
Nikolenko, S. I. (2016). Topic quality metrics based on distributed word representations. In SIGIR’16.
Nunes, D. A. P., de Matos, D. M., Ferreira-Gomes, J., and Neto, F. (2021). Chronic pain and language: A topic modelling approach to personal pain descriptions. CoRR, abs/2109.00402.
Pedro, A., Pereira, A., Cecilio, P., Pena, N., Viegas, F., Tuler, E., Dias, D. R., and Rocha, L. (2021). An Article-Oriented Framework for Automatic Semantic Analysis of COVID-19 Researches, pages 172–187. Springer International Publishing.
Salton, G. and Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing & Management, 24(5):513–523.
Viegas, F., Canuto, S., Gomes, C., Luiz, W., Rosa, T., Ribas, S., Rocha, L., and Gonçalves, M. A. (2019). Cluwords: exploiting semantic word clustering representation for enhanced topic modeling. pages 753–761.
License
Copyright (c) 2022 Electronic Journal of Undergraduate Research on Computing

This work is licensed under a Creative Commons Attribution 4.0 International License.
