UX-MAPPER: An automated approach to analyze app store reviews with a focus on UX

Authors

DOI:

https://doi.org/10.5753/jis.2025.4099

Keywords:

User experience, User reviews, Machine learning, App stores

Abstract

The mobile app market has grown substantially over the past decades, and the myriad options in the app stores have made users less tolerant of low-quality apps. In this competitive scenario, User eXperience (UX) has emerged as an essential factor for standing out from competitors. By understanding which factors affect UX, practitioners can foster those that lead to positive UX while mitigating those that affect it negatively. In this context, app store reviews have emerged as a valuable resource for investigating these influential factors. However, analyzing millions of reviews can be costly and time-consuming. This article introduces UX-MAPPER, a tool designed to analyze app store reviews and help practitioners pinpoint the factors that impact UX. We applied the Design Science Research method to develop UX-MAPPER iteratively, grounding it in a robust theoretical background. We performed exploratory studies to investigate the problem, a systematic mapping study to identify UX-affecting factors, and an empirical study to assess the relevance and acceptance of UX-MAPPER among practitioners. In general, participants recognized the relevance and utility of UX-MAPPER for enhancing the quality of existing apps and for exploring reviews of competing apps to identify user preferences, requests, and critiques regarding functionalities and features. However, the quality of its output requires refinement to better convey the benefits of the results, especially for practitioners with prior experience with automated approaches. From the participants' feedback, we derived a set of suggestions for extracting more useful features, which can contribute to future studies involving user review analysis. Based on the results of this research, we present contributions to the HCI area and possible directions for future research.
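The abstract does not detail UX-MAPPER's internal pipeline, but the general technique such tools rely on, automatically classifying app store reviews into categories of interest, can be illustrated with a minimal, self-contained sketch. The review texts, labels, and the from-scratch multinomial Naive Bayes classifier below are purely hypothetical illustrations, not the authors' actual model or data:

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple whitespace tokenizer for short review texts."""
    return text.lower().split()


class NaiveBayesReviewClassifier:
    """Multinomial Naive Bayes with Laplace smoothing for short app reviews."""

    def fit(self, reviews, labels):
        self.classes = set(labels)
        # Log-priors from class frequencies in the training labels.
        self.priors = {c: math.log(labels.count(c) / len(labels))
                       for c in self.classes}
        self.word_counts = {c: Counter() for c in self.classes}
        vocab = set()
        for text, label in zip(reviews, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            vocab.update(tokens)
        self.vocab_size = len(vocab)
        self.totals = {c: sum(self.word_counts[c].values())
                       for c in self.classes}
        return self

    def predict(self, text):
        best, best_score = None, float("-inf")
        for c in self.classes:
            score = self.priors[c]
            for tok in tokenize(text):
                # Laplace (+1) smoothing avoids zero probability for
                # words never seen with this class during training.
                score += math.log((self.word_counts[c][tok] + 1) /
                                  (self.totals[c] + self.vocab_size))
            if score > best_score:
                best, best_score = c, score
        return best


# Toy training data (hypothetical reviews, two illustrative categories).
reviews = [
    "app crashes when I open the camera",
    "constant crash after the last update",
    "please add a dark mode option",
    "would love an option to export my data",
]
labels = ["bug report", "bug report", "feature request", "feature request"]

clf = NaiveBayesReviewClassifier().fit(reviews, labels)
print(clf.predict("the app crashes on startup"))     # → bug report
print(clf.predict("add an option for offline use"))  # → feature request
```

Working in log-space keeps the per-word probability products numerically stable; real pipelines would replace the toy tokenizer and classifier with more capable components, but the classification principle is the same.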

Published

2025-01-01

How to Cite

NAKAMURA, W. T.; OLIVEIRA, E. C. C. de; OLIVEIRA, E. H. T. de; CONTE, T. UX-MAPPER: An automated approach to analyze app store reviews with a focus on UX. Journal on Interactive Systems, Porto Alegre, RS, v. 16, n. 1, p. 54–74, 2025. DOI: 10.5753/jis.2025.4099. Available at: https://journals-sol.sbc.org.br/index.php/jis/article/view/4099. Accessed: 5 Dec. 2025.

Issue

Section

Regular Paper