New Metrics for Assessing the Quality of Hierarchical Topic Modeling Strategies


Topic modeling, Automatic evaluation, Word embeddings, Hierarchical topic modeling


Hierarchical Topic Modeling (HTM) strategies aim to automatically extract consistent semantic topics from textual documents while respecting the hierarchy in which the information is structured. Current evaluation metrics for these approaches typically measure the quality of each topic individually. In HTM, however, other issues also need to be considered: (i) redundancy of topics; (ii) semantic diversity of the constructed topics; and (iii) topological consistency. In this work, we propose and evaluate three new metrics that address these issues, complementing the methodology for evaluating HTM approaches from the perspective of the hierarchical structure in which the topics are constructed.


