Semantically Time Tracking of Events from Web Documents

Authors

Keywords:

Tracking Events, Word Embeddings, Search Engines

Abstract

Traditional search engines are impractical for demanding users who need to explore the large news collections produced by media outlets. We therefore propose a temporal exploration tool that makes these collections easier to consult. Our work focuses on two fronts: (i) letting users enrich their queries with information drawn from documents represented by word embeddings, and (ii) developing a strategy for retrieving temporal information to generate timelines, which are presented through a dedicated interface. We evaluated the solution on a collection from a Brazilian newspaper and showed that it can produce distinct timelines covering different subtopics of the same theme.
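
The two fronts named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors, the function names (expand_query, build_timeline), the centroid-based expansion, and the monthly bucketing are all assumptions chosen for illustration. A real system would presumably use embeddings trained on the news collection and a proper ranking function.

```python
from collections import defaultdict
from datetime import date
from math import sqrt

# Hypothetical toy word vectors; real embeddings would be learned from the corpus.
TOY_VECTORS = {
    "election": (0.9, 0.1, 0.0),
    "vote": (0.8, 0.2, 0.1),
    "ballot": (0.7, 0.3, 0.0),
    "soccer": (0.0, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(terms, vectors, k=2):
    """Append the k vocabulary words closest to the mean query vector."""
    known = [vectors[t] for t in terms if t in vectors]
    if not known:
        return list(terms)
    centroid = tuple(sum(c) / len(known) for c in zip(*known))
    candidates = [w for w in vectors if w not in terms]
    candidates.sort(key=lambda w: cosine(centroid, vectors[w]), reverse=True)
    return list(terms) + candidates[:k]


def build_timeline(docs, query_terms):
    """Group titles of documents matching any query term by (year, month)."""
    wanted = set(query_terms)
    timeline = defaultdict(list)
    for doc in docs:
        words = set(doc["text"].lower().split())
        if words & wanted:
            key = (doc["date"].year, doc["date"].month)
            timeline[key].append(doc["title"])
    return dict(sorted(timeline.items()))


# Hypothetical documents, invented for this example.
docs = [
    {"title": "A", "date": date(2020, 10, 5), "text": "Ballot boxes arrive"},
    {"title": "B", "date": date(2020, 11, 15), "text": "Record vote turnout"},
    {"title": "C", "date": date(2020, 11, 20), "text": "Soccer final tonight"},
]
expanded = expand_query(["election"], TOY_VECTORS, k=2)
timeline = build_timeline(docs, expanded)
```

In this sketch the original query term matches none of the three documents, while the expanded query retrieves two of them and places them in separate months, which is the kind of effect the abstract attributes to combining embedding-based queries with temporal retrieval.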

Published

2021-06-03

How to Cite

Santos, W., & Rocha, L. (2021). Semantically Time Tracking of Events from Web Documents. Eletronic Journal of Undergraduate Research on Computing, 19(2). Retrieved from https://journals-sol.sbc.org.br/index.php/reic/article/view/2085

Section

Special Issue: CTIC/CSBC