J-EDA: A workbench for tuning similarity and diversity search parameters in content-based image retrieval


  • João V. O. Novaes University of São Paulo (ICMC/USP)
  • Lúcio F. D. Santos Federal Institute of Technology of North of Minas Gerais (IFNMG)
  • Luiz Olmes Carvalho Federal University of Itajubá (UNIFEI)
  • Daniel de Oliveira Fluminense Federal University (IC/UFF)
  • Marcos V. N. Bedo Fluminense Federal University (INFES/UFF)
  • Agma J. M. Traina University of São Paulo (ICMC/USP)
  • Caetano Traina Jr. University of São Paulo (ICMC/USP)




Content-based image retrieval, Result diversification, Similarity searching


Similarity searches can be modeled by means of distances following the Metric Spaces Theory and constitute a fast and explainable query mechanism behind content-based image retrieval (CBIR) tasks. However, classical distance-based queries, e.g., Range and k-Nearest Neighbors, may be unsuitable for exploring large datasets because the retrieved elements are often similar among themselves. Although similarity searching can be enriched with rules that foster result diversification, the fine-tuning of diversity queries is still an open issue, and it is usually carried out through a non-optimal and computationally expensive inspection. This paper introduces J-EDA, a practical workbench implemented in Java that supports the tuning of similarity and diversity search parameters by enabling the automatic and parallel exploration of multiple search settings for a user-posed content-based image retrieval task. J-EDA implements a wide variety of classical and diversity-driven search queries, as well as many CBIR settings such as feature extractors for images, distance functions, and relevance feedback techniques. Accordingly, users can define multiple query settings and inspect their performance to spot the most suitable parameterization for the content-based image retrieval problem at hand. The workbench reports experimental performance with several internal and external evaluation metrics, such as Precision × Recall (P × R) and Mean Average Precision (mAP), which are computed through either incremental or batch procedures performed with or without human interaction.
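To make the contrast between a classical k-Nearest Neighbors query and a diversity-driven query concrete, the following is a minimal Java sketch. It is not J-EDA's API: the class `DiversitySketch`, its method names, the Euclidean distance, and the trade-off parameter `lambda` are all assumptions chosen for illustration. The diversified variant uses a greedy, MMR-style selection, which is only one of many possible diversification rules.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; names and scoring rule are assumptions, not J-EDA code.
public class DiversitySketch {

    // Euclidean (L2) distance between two feature vectors of equal length.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Classical k-NN: indices of the k elements closest to the query.
    static List<Integer> kNearest(double[][] data, double[] query, int k) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            ids.add(i);
        }
        ids.sort(Comparator.comparingDouble(i -> distance(data[i], query)));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }

    // Greedy MMR-style selection: each step picks the candidate minimizing
    // (1 - lambda) * d(candidate, query) - lambda * min d(candidate, selected).
    // lambda = 0 reduces to plain k-NN; larger values favor diversity.
    static List<Integer> diverseKNearest(double[][] data, double[] query,
                                         int k, double lambda) {
        List<Integer> selected = new ArrayList<>();
        boolean[] used = new boolean[data.length];
        while (selected.size() < k && selected.size() < data.length) {
            int best = -1;
            double bestScore = Double.POSITIVE_INFINITY;
            for (int c = 0; c < data.length; c++) {
                if (used[c]) {
                    continue;
                }
                // Distance to the closest already-selected element (0 if none yet).
                double minToSelected = selected.isEmpty() ? 0.0 : Double.POSITIVE_INFINITY;
                for (int s : selected) {
                    minToSelected = Math.min(minToSelected, distance(data[c], data[s]));
                }
                double score = (1.0 - lambda) * distance(data[c], query)
                        - lambda * minToSelected;
                if (score < bestScore) {
                    bestScore = score;
                    best = c;
                }
            }
            used[best] = true;
            selected.add(best);
        }
        return selected;
    }

    public static void main(String[] args) {
        // Toy 2-D "feature vectors": three near-duplicates plus two distinct points.
        double[][] data = {{1, 0}, {1.1, 0}, {1.2, 0}, {0, 1.5}, {-2, 0}};
        double[] query = {0, 0};
        System.out.println(kNearest(data, query, 3));
        System.out.println(diverseKNearest(data, query, 3, 0.5));
    }
}
```

On this toy dataset the plain query returns the three near-duplicate elements (indices 0, 1, 2), whereas the diversified query with `lambda = 0.5` returns indices 0, 4, 3, trading some closeness to the query for results that differ from one another. Choosing `lambda` is exactly the kind of parameter tuning the abstract describes as an open issue.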








How to Cite

Novaes, J. V. O., Santos, L. F. D., Olmes Carvalho, L., de Oliveira, D., Bedo, M. V. N., Traina, A. J. M., & Traina Jr., C. (2021). J-EDA: A workbench for tuning similarity and diversity search parameters in content-based image retrieval. Journal of Information and Data Management, 12(2). https://doi.org/10.5753/jidm.2021.1990



SBBD 2020 - Demonstrations and Applications