A Human-Centered Multiperspective and Interactive Visual Tool For Explainable Machine Learning

Authors

B. Lopes, L. S. Soares, M. A. Gonçalves, and R. O. Prates

DOI:

https://doi.org/10.5753/jbcs.2025.3982

Keywords:

Interpretability, Machine Learning Model, Human-Computer Interaction, Visualization

Abstract

Understanding why a trained machine learning (ML) model makes certain decisions is paramount to trusting the model and applying its recommendations in real-world applications. In this article, we present the design and development of an interactive and visual approach to support the use, interpretation, and refinement of ML models, whose development was guided by users' needs. We also present Explain-ML, an interactive tool that implements a visual multi-perspective approach to support the interpretation of ML models. Explain-ML's development followed a Human-Centered Machine Learning strategy guided by the demands of its target (knowledgeable) users, resulting in a multi-perspective approach in which interpretability is supported by a set of complementary visualizations under several perspectives (e.g., global and local). We performed a qualitative evaluation of the tool's approach to interpretation with a group of target users, focused on their perception of Explain-ML's helpfulness and usefulness in comprehending the outcomes of ML models. The evaluation also explored users' ability to apply the knowledge obtained from the tool's explanations to adapt and improve the current models. Results show that Explain-ML provides a broad account of the model's executions (including their history), offering users an ample and flexible exploration space in which to make different decisions and conduct distinct analyses. Users stated that the tool was very useful and that they would be interested in using it in their daily activities.

Published

2025-01-20

How to Cite

Lopes, B., Soares, L. S., Gonçalves, M. A., & Prates, R. O. (2025). A Human-Centered Multiperspective and Interactive Visual Tool For Explainable Machine Learning. Journal of the Brazilian Computer Society, 31(1), 11–35. https://doi.org/10.5753/jbcs.2025.3982

Issue

Vol. 31 No. 1 (2025)

Section

Articles