Evaluation of explainable artificial intelligence techniques in the context of credit card fraud detection

Authors

de Lima, G. M. and Pisani, P. H.
DOI:

https://doi.org/10.5753/jbcs.2026.5376

Keywords:

machine learning, explainable artificial intelligence, credit card fraud detection

Abstract

Artificial intelligence has been employed in several applications in the financial sector. This paper addresses one of these applications: fraud detection in credit card transactions. In this context, a number of machine learning algorithms can be used to build models that automate the classification of a transaction as fraudulent or genuine. However, some of these algorithms are not directly interpretable. This paper presents an evaluation of the explainable artificial intelligence techniques SHAP and LIME applied to models for fraud detection in credit card transactions. Along with the results of the evaluation, the paper discusses the effectiveness of, and the need for, explainable artificial intelligence techniques. It extends a previous paper by including hyperparameter tuning, new results and an evaluation of the processing time required to obtain explanations. The reported results suggest that SHAP obtains better results than LIME, although LIME requires less processing time once the LIME explainer has been obtained.
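To illustrate the kind of model-agnostic explanation the paper compares, the sketch below implements a minimal LIME-style local surrogate around a black-box fraud classifier. It is not the authors' experimental code: the synthetic imbalanced dataset, the Gaussian perturbation scale, and the exponential proximity kernel are all illustrative assumptions, standing in for a real transaction dataset and the tuned models evaluated in the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# Toy stand-in for an imbalanced credit card transaction dataset
# (~5% "fraud" labels), with a random forest as the black-box model.
X, y = make_classification(n_samples=2000, n_features=6,
                           weights=[0.95, 0.05], random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def lime_style_explanation(model, x, n_samples=500, kernel_width=1.0, seed=0):
    """LIME-style local surrogate: perturb the instance x, weight the
    perturbations by proximity to x, and fit a weighted linear model to
    the black box's predicted fraud probabilities. The coefficients act
    as per-feature attributions for this single prediction."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.size))  # local perturbations
    preds = model.predict_proba(Z)[:, 1]                     # black-box outputs
    dists = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dists ** 2) / kernel_width ** 2)      # proximity kernel
    surrogate = Ridge(alpha=1.0).fit(Z, preds, sample_weight=weights)
    return surrogate.coef_                                   # one weight per feature

attributions = lime_style_explanation(model, X[0])
print(attributions.shape)  # one attribution per feature: (6,)
```

In practice the paper uses the SHAP and LIME libraries directly; this sketch only shows why such local explanations incur per-instance processing cost, since each explained transaction requires many calls to the black-box model.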


References

Aldeia, G. S. I. and de França, F. O. (2022). Interpretability in symbolic regression: a benchmark of explanatory methods using the feynman data set. Genetic Programming and Evolvable Machines, 23:309-349. DOI: 10.1007/s10710-022-09435-x.

Alfaiz, N. S. and Fati, S. M. (2022). Enhanced credit card fraud detection model using machine learning. Electronics (Switzerland), 11. DOI: 10.3390/electronics11040662.

Alvarez-Melis, D. and Jaakkola, T. S. (2018). On the robustness of interpretability methods. DOI: 10.48550/arxiv.1806.08049.

Aros, L. H., Molano, L. X. B., Gutierrez-Portela, F., Hernandez, J. J. M., and Barrero, M. S. R. (2024). Financial fraud detection through the application of machine learning techniques: a literature review. Humanities and Social Sciences Communications. DOI: 10.1057/s41599-024-03606-0.

Bischl, B., Binder, M., Lang, M., Pielok, T., Richter, J., Coors, S., Thomas, J., Ullmann, T., Becker, M., Boulesteix, A.-L., Deng, D., and Lindauer, M. (2023). Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges. WIREs Data Mining and Knowledge Discovery, 13(2):e1484. DOI: 10.1002/widm.1484.

Bourdonnaye, F. D. L. and Daniel, F. (2021). Evaluating categorical encoding methods on a real credit card fraud detection database. Available at: https://arxiv.org/pdf/2112.12024.

Bussmann, N., Giudici, P., Marinelli, D., and Papenbrock, J. (2021). Explainable machine learning in credit risk management. Computational Economics, 57:203-216. DOI: 10.1007/s10614-020-10042-0.

Bücker, M., Szepannek, G., Gosiewska, A., and Biecek, P. (2022). Transparency, auditability, and explainability of machine learning models in credit scoring. Journal of the Operational Research Society, 73(1):70-90. DOI: 10.1080/01605682.2021.1922098.

Capuano, N., Fenza, G., Loia, V., and Stanzione, C. (2022). Explainable artificial intelligence in cybersecurity: A survey. IEEE Access, 10:93575-93600. DOI: 10.1109/ACCESS.2022.3204171.

Carcillo, F., Dal Pozzolo, A., Le Borgne, Y.-A., Caelen, O., Mazzer, Y., and Bontempi, G. (2017). SCARFF: a scalable framework for streaming credit card fraud detection with Spark. Information Fusion, 41. DOI: 10.1016/j.inffus.2017.09.005.

Carcillo, F., Le Borgne, Y.-A., Caelen, O., Kessaci, Y., Oblé, F., and Bontempi, G. (2019). Combining unsupervised and supervised learning in credit card fraud detection. Information Sciences. DOI: 10.1016/j.ins.2019.05.042.

Černevičienė, J. and Kabašinskas, A. (2024). Explainable artificial intelligence (XAI) in finance: a systematic literature review. Artificial Intelligence Review, 57(8):216. DOI: 10.1007/s10462-024-10854-8.

Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer, W. P. (2002). SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16(1):321-357. DOI: 10.1613/jair.953.

Dal Pozzolo, A., Boracchi, G., Caelen, O., Alippi, C., and Bontempi, G. (2017). Credit card fraud detection: A realistic modeling and a novel learning strategy. IEEE Transactions on Neural Networks and Learning Systems, PP:1-14. DOI: 10.1109/TNNLS.2017.2736643.

Dal Pozzolo, A., Caelen, O., Le Borgne, Y.-A., Waterschoot, S., and Bontempi, G. (2014). Learned lessons in credit card fraud detection from a practitioner perspective. Expert Systems with Applications, 41:4915–4928. DOI: 10.1016/j.eswa.2014.02.026.

Doshi-Velez, F. and Kim, B. (2017). Towards a rigorous science of interpretable machine learning. DOI: 10.48550/arxiv.1702.08608.

Dwivedi, R., Dave, D., Naik, H., Singhal, S., Omer, R., Patel, P., Qian, B., Wen, Z., Shah, T., Morgan, G., and Ranjan, R. (2023). Explainable AI (XAI): Core ideas, techniques, and solutions. ACM Computing Surveys, 55(9). DOI: 10.1145/3561048.

European Union (2016). Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). Official Journal L110, 59:1-88. Available at:[link].

Gee, J., Button, M., and Brooks, G. (2019). The financial cost of fraud: what data from around the world shows. MacIntyre Hudson. Book.

Hanif, A. (2021). Towards explainable artificial intelligence in banking and financial services. DOI: 10.48550/arxiv.2112.08441.

Hsin, Y.-Y., Dai, T.-S., Ti, Y.-W., and Huang, M.-C. (2021). Interpretable electronic transfer fraud detection with expert feature constructions. In CIKM Workshops. Available at: [link].

Ji, Y. (2021). Explainable AI methods for credit card fraud detection: Evaluation of LIME and SHAP through a user study. Available at: [link].

Chaudhary, K., Yadav, J., and Mallick, B. (2012). A review of fraud detection techniques: Credit card. International Journal of Computer Applications, 45(1):39-44. DOI: 10.5120/6748-8991.

Kim, J. and Canny, J. (2017). Interpretable learning for self-driving cars by visualizing causal attention. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 2961-2969. DOI: 10.1109/ICCV.2017.320.

Krawczyk, B. (2016). Learning from imbalanced data: open challenges and future directions. Progress in Artificial Intelligence, 5:221-232. DOI: 10.1007/s13748-016-0094-0.

Le Borgne, Y.-A., Siblini, W., Lebichot, B., and Bontempi, G. (2022). Reproducible Machine Learning for Credit Card Fraud Detection - Practical Handbook. Université Libre de Bruxelles. Available at:[link].

Lima, G. M. d. and Pisani, P. H. (2024). Comparativo de técnicas de inteligência artificial explicável na detecção de fraudes em transações com cartão de crédito [A comparison of explainable artificial intelligence techniques for credit card fraud detection]. In Anais Estendidos do XXIV Simpósio Brasileiro de Segurança da Informação e de Sistemas Computacionais, pages 244-255, Porto Alegre, RS, Brasil. SBC. DOI: 10.5753/sbseg_estendido.2024.243180.

Lipton, Z. C. (2018). The mythos of model interpretability. Commun. ACM, 61(10):36–43. DOI: 10.1145/3233231.

Lundberg, S. M. and Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, page 4768-4777, Red Hook, NY, USA. Curran Associates Inc. DOI: 10.48550/arxiv.1705.07874.

Makki, S., Assaghir, Z., Taher, Y., Haque, R., Hacid, M. S., and Zeineddine, H. (2019). An experimental study with imbalanced classification approaches for credit card fraud detection. IEEE Access, 7:93010-93022. DOI: 10.1109/ACCESS.2019.2927266.

Marcinkevičs, R. and Vogt, J. E. (2023). Interpretability and explainability: A machine learning zoo mini-tour. DOI: 10.48550/arxiv.2012.01805.

Martins, T., de Almeida, A. M., Cardoso, E., and Nunes, L. (2024). Explainable artificial intelligence (XAI): A systematic literature review on taxonomies and applications in finance. IEEE Access, 12:618-629. DOI: 10.1109/ACCESS.2023.3347028.

Miller, T. (2019). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267:1-38. DOI: 10.1016/j.artint.2018.07.007.

Miller, T. (2023). Explainable AI is Dead, Long Live Explainable AI! Hypothesis-driven decision support. Available at:[link].

Miller, T., Howe, P., and Sonenberg, L. (2017). Explainable AI: Beware of inmates running the asylum. Available at: [link].

Moepya, S. O., Akhoury, S. S., Nelwamondo, F. V., and Twala, B. (2016). The role of imputation in detecting fraudulent financial reporting. International Journal of Innovative Computing, Information and Control ICIC International c, 12:333-356. Available at:[link].

Molnar, C. (2022). Interpretable Machine Learning. 2 edition. Available at:[link].

Padhi, I., Schiff, Y., Melnyk, I., Rigotti, M., Mroueh, Y., Dognin, P., Ross, J., Nair, R., and Altman, E. (2021). Tabular transformers for modeling multivariate time series. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 3565-3569. DOI: 10.1109/ICASSP39728.2021.9414142.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825-2830. Available at: [link].

Petsiuk, V., Das, A., and Saenko, K. (2018). Rise: randomized input sampling for explanation of black-box models. In Proceedings of the British machine vision conference, pages 1-13. DOI: 10.48550/arxiv.1806.07421.

Dal Pozzolo, A., Caelen, O., Johnson, R. A., and Bontempi, G. (2015). Calibrating probability with undersampling for unbalanced classification. In 2015 IEEE Symposium Series on Computational Intelligence, pages 159-166. DOI: 10.1109/SSCI.2015.33.

Psychoula, I., Gutmann, A., Mainali, P., Lee, S. H., Dunphy, P., and Petitcolas, F. (2021). Explainable machine learning for fraud detection. Computer, 54(10):49-59. DOI: 10.1109/MC.2021.3081249.

Ribeiro, M. T., Singh, S., and Guestrin, C. (2016a). Model-agnostic interpretability of machine learning. In ICML Workshop on Human Interpretability in Machine Learning (WHI), pages 91-95. DOI: 10.48550/arxiv.1606.05386.

Ribeiro, M. T., Singh, S., and Guestrin, C. (2016b). "Why should I trust you?" explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135-1144. Association for Computing Machinery. DOI: 10.1145/2939672.2939778.

Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206-215. DOI: 10.1038/s42256-019-0048-x.

Schwalbe, G. and Finzel, B. (2024). A comprehensive taxonomy for explainable artificial intelligence: a systematic survey of surveys on methods and concepts. Data Mining and Knowledge Discovery, 38:3043-3101. DOI: 10.1007/s10618-022-00867-8.

Shapley, L. S. (2016). A value for n-person games, pages 307-318. Princeton University Press, Princeton. DOI: 10.1515/9781400881970-018.

Sulaiman, R. B., Schetinin, V., and Sant, P. (2022). Review of machine learning approach on credit card fraud detection. Human-Centric Intelligent Systems, pages 55-68. DOI: 10.1007/s44230-022-00004-0.

Sundararajan, M., Taly, A., and Yan, Q. (2017). Axiomatic attribution for deep networks. In Precup, D. and Teh, Y. W., editors, Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 3319-3328. PMLR. DOI: 10.48550/arxiv.1703.01365.

Thabtah, F., Hammoud, S., Kamalov, F., and Gonsalves, A. (2020). Data imbalance in classification: Experimental evaluation. Information Sciences, 513:429-441. DOI: 10.1016/j.ins.2019.11.004.

Wu, T.-Y. and Wang, Y.-T. (2021). Locally interpretable one-class anomaly detection for credit card fraud detection. In 2021 International Conference on Technologies and Applications of Artificial Intelligence (TAAI), pages 25-30. DOI: 10.1109/TAAI54685.2021.00014.

Published

2026-03-25

How to Cite

de Lima, G. M., & Pisani, P. H. (2026). Evaluation of explainable artificial intelligence techniques in the context of credit card fraud detection. Journal of the Brazilian Computer Society, 32(1), 484–497. https://doi.org/10.5753/jbcs.2026.5376

Issue

Section

Regular Issue