Customer segmentation in e-commerce: a context-aware quality model for comparing clustering algorithms

DOI:

https://doi.org/10.5753/jisa.2024.3851

Keywords:

e-commerce, personalization, segmentation, quality framework, clustering

Abstract

E-commerce platforms evolve constantly to meet the changing needs and preferences of online shoppers. Clustering techniques are an increasingly popular way to deliver a more personalised and efficient user experience. However, the choice of clustering algorithm should depend on the specific business context, project requirements, data characteristics, and computational resources. This paper presents a quality framework for comparing clustering approaches that takes into account the business context in which the results will be applied. The proposed approach was validated by comparing three methods: K-means, K-medians, and BIRCH. One possible application of the generated clusters is a platform supporting multiple variants of the e-commerce user interface, which requires selecting an optimal algorithm against several quality criteria. The paper's contribution is a framework that accounts for the business context of e-commerce customer clustering, together with its practical validation. The results confirm that the analysed clustering techniques can differ significantly when applied to e-commerce customer behaviour data. The quality framework is flexible and can be extended and adapted to the specifics of different e-commerce systems.

Published

2024-07-25

How to Cite

Wasilewski, A. (2024). Customer segmentation in e-commerce: a context-aware quality model for comparing clustering algorithms. Journal of Internet Services and Applications, 15(1), 160–178. https://doi.org/10.5753/jisa.2024.3851

Section

Research article