A framework for automatic construction of search profiles based on semantic topic modeling
Keywords:
Topic modeling, Word embedding
Abstract
Recent efforts have focused on identifying multidisciplinary teams and detecting co-authorship networks by using topic modeling to identify researchers' expertise. Though promising, none of these efforts evaluates the quality of the resulting topics in a real-life setting. This paper proposes a framework that summarizes the articles written by researchers to automatically build research profiles and to support online evaluation of those profiles. We perform a set of experiments on the Lattes repository, contrasting two types of evaluation: (1) an offline evaluation based on a traditional metric, normalized pointwise mutual information (NPMI); and (2) an online evaluation in which researchers assess their own automatically built profiles. We observed that combining the two is important for a comprehensive quality evaluation.
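For readers unfamiliar with the offline metric, the following is a minimal Python sketch of NPMI-based topic coherence, not the implementation used in the paper. It assumes probabilities are estimated from document-level co-occurrence and that a topic's score is the average NPMI over all pairs of its top words. The function name `npmi_coherence`, the input format (lists of tokens), and the conventions for pairs that never or always co-occur are illustrative assumptions.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Average NPMI over all pairs of a topic's top words.

    Assumption: P(w) is the fraction of documents containing w, and
    P(w1, w2) is the fraction of documents containing both words.
    """
    if len(top_words) < 2:
        raise ValueError("need at least two top words to form a pair")

    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every given word.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0.0:
            # Assumed convention: words that never co-occur score -1.
            scores.append(-1.0)
        elif p12 == 1.0:
            # Both words appear in every document; -log(1) = 0, so use
            # the limiting value for perfect co-occurrence, 1.
            scores.append(1.0)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["topic", "model"],
    ["topic", "model"],
    ["search", "engine"],
    ["search", "engine"],
]
coherent = npmi_coherence(["topic", "model"], docs)
incoherent = npmi_coherence(["topic", "search"], docs)
```

Scores lie in [-1, 1]: words that always appear together push a topic toward 1, and words that never share a document push it toward -1. A score of this kind measures statistical co-occurrence only, which is one reason to complement it with researchers' own judgments of their profiles.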