A framework for automatic construction of search profiles based on semantic topic modeling

Authors

  • Pablo Cecilio (UFSJ)
  • Antônio Pereira (UFSJ)
  • Leonardo Rocha (UFSJ)
  • Felipe Viegas (UFMG)

Keywords:

Topic modeling, Word embedding

Abstract

Recent efforts have focused on identifying multidisciplinary teams and detecting co-authorship networks by using topic modeling to identify researchers' expertise. Though promising, none of these efforts performs a real-life evaluation of the quality of the topics built. This paper proposes a framework that summarizes the articles written by researchers to automatically build research profiles and supports online evaluation of those profiles. We perform a set of experiments on the Lattes repository, contrasting two types of evaluation: (1) an offline evaluation using a traditional metric, normalized pointwise mutual information (NPMI); and (2) an online evaluation in which researchers assess their own automatically built profiles. We observed that combining the two is important for a comprehensive quality evaluation.
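As an illustration of the offline metric named in the abstract, the following is a minimal sketch of NPMI topic coherence computed from document-level co-occurrence. It is not the authors' implementation: the function name `npmi_coherence`, the toy corpus, and the handling of edge cases (pairs that never co-occur score -1, pairs present in every document score 1) are assumptions made for this example.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Average NPMI over all word pairs of one topic.

    Probabilities are estimated as the fraction of documents that
    contain a word (or both words of a pair).
    """
    if len(topic_words) < 2:
        raise ValueError("need at least two topic words")
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents containing every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # assumption: never co-occurring pairs score -1
        elif p12 == 1:
            scores.append(1.0)   # assumption: avoids division by log(1) = 0
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Hypothetical toy corpus of tokenized documents.
corpus = [
    ["topic", "model"],
    ["topic", "model"],
    ["graph", "network"],
    ["graph", "network"],
]
coherent = npmi_coherence(["topic", "model"], corpus)
incoherent = npmi_coherence(["topic", "graph"], corpus)
```

Words that always appear together score close to 1, and words that never share a document score -1. A score of this kind only measures statistical co-occurrence, which is one reason the abstract pairs it with an online evaluation by the researchers themselves.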

Published

2022-07-21

How to Cite

Cecilio, P., Pereira, A., Rocha, L., & Viegas, F. (2022). A framework for automatic construction of search profiles based on semantic topic modeling. Eletronic Journal of Undergraduate Research on Computing, 20(3). Retrieved from https://journals-sol.sbc.org.br/index.php/reic/article/view/2687

Issue

Section

Special Issue: CTIC/CSBC