Integrating Metadata and Interface Components in Mobile Applications: The AID and UID Datasets as Support for Data-Driven Interactive System Design
DOI: https://doi.org/10.5753/jis.2026.7356
Keywords: Dataset, Mobile Application, User Interface, Machine Learning, Natural Language Processing
Abstract
The success of mobile applications is intrinsically linked to the quality of their User Interfaces (UIs), yet an open gap remains in the systematic integration of app store metadata with semantic interface components. This study addresses this challenge by employing the Design Science Research Methodology to develop and analyze two comprehensive and complementary artifacts: the Automated Insights Dataset (AID), containing 48 technical and market metadata types from 6,400 applications, and the User Interface Depth Dataset (UID), which features a detailed manual mapping of 50 UI component types and 1,948 screenshots from 400 high-quality apps. Moving beyond descriptive statistics, this research performs a multidimensional analysis that uncovers latent design patterns and correlations between interface elements, application categories, and visual identities (characteristic colors). Furthermore, we demonstrate the practical utility of these datasets through a predictive modeling experiment using Natural Language Processing, which successfully infers UI composition from textual descriptions with accuracy levels exceeding 90% in controlled evaluations. The results provide a robust empirical foundation for data-driven design, offering actionable insights for researchers and practitioners to ground their decisions in real-world market evidence and established design conventions.
License
Copyright (c) 2026 Jonathan Cesar Kuspil, Guilherme Corredato Guerino, Gislaine Camila L. Leal, Renato Balancieri

This work is licensed under a Creative Commons Attribution 4.0 International License.
JIS is free of charge for authors and readers, and all papers published by JIS follow the Creative Commons Attribution 4.0 International (CC BY 4.0) license.


