Analysing Temporal Evolution of Complex Data Using Similarity Queries
DOI:
https://doi.org/10.5753/jidm.2021.1920
Keywords:
similarity queries, complex data, temporal evolution
Abstract
Regardless of the data domain, some applications must track the temporal evolution of data elements. Based on the instances present in the database, the goal is to estimate the state of a given element at a time instant other than those available in the database. This kind of task is common in many database application domains, such as medicine, meteorology, agriculture, and finance. In content-based retrieval with complex data (such as images, sounds and videos), data are usually represented in metric spaces, where only the distances between elements are available. Without dimensional coordinates, it is not possible to simply add a time dimension for trajectory estimation in these spaces, as can be done in multidimensional spaces. In this article we propose to map the metric data to a multidimensional space so that we can estimate the element’s state at a given time instant, based on known states of the same element. As it is not possible to create the complex data element equivalent to its estimated position in the mapped space, we propose to apply similarity queries using this position as the query center. We then estimate what the element would look like by retrieving the real data elements present in the database that are close to the estimate. In addition to the k-nearest neighbor query (k-NN), we propose to use two other queries: kAndRange and kAndRev. With both queries, we aim to prune non-relevant elements from the query results, retrieving only the elements that are really close to the estimates. We present experiments with different query scenarios, evaluating the effects of varying the input parameters of the proposed queries.
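The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the elements are taken to be already mapped to coordinates, the estimate is obtained by simple linear interpolation or extrapolation between two known states, and kAndRange is read as "the k nearest neighbors that also fall within a given radius". The function names, the sample data, and the radius are hypothetical. kAndRev is left out because the abstract does not define it.

```python
import math


def estimate_position(t0, p0, t1, p1, t):
    """Assumed estimator: linear interpolation/extrapolation in the mapped space."""
    w = (t - t0) / (t1 - t0)
    return tuple(a + w * (b - a) for a, b in zip(p0, p1))


def knn(points, center, k):
    """Return the k (id, distance) pairs closest to the query center."""
    ranked = sorted(
        ((pid, math.dist(p, center)) for pid, p in points.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:k]


def k_and_range(points, center, k, radius):
    """Assumed reading of kAndRange: k-NN results that also lie within the radius."""
    return [(pid, d) for pid, d in knn(points, center, k) if d <= radius]


# Hypothetical mapped coordinates of the real elements stored in the database.
points = {
    "a": (0.0, 0.0),
    "b": (1.0, 1.0),
    "c": (2.0, 2.1),
    "d": (5.0, 5.0),
    "e": (2.2, 1.9),
}

# Known states of one element at t=0 and t=1; estimate its position at t=2.
estimate = estimate_position(0, (0.0, 0.0), 1, (1.0, 1.0), 2)

print(knn(points, estimate, 3))               # three nearest real elements
print(k_and_range(points, estimate, 3, 0.5))  # same, pruned to the radius
```

In this toy data the plain k-NN query always returns three elements, even when the third one is far from the estimate, while the range condition drops it. That is the pruning effect the abstract attributes to the proposed queries.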