A Novel Forgetting Technique with Random Walk Sampling for Scalable and Adaptive Stream-Based Recommender Systems
DOI:
https://doi.org/10.5753/jbcs.2026.5496
Keywords:
Recommender Systems, Data Streams, Random Walks, Forgetting, Online Learning
Abstract
The rapid growth of user-generated data in online services creates a need for scalable recommender systems that can learn from data streams. Stream-based recommender systems are specifically devised for these scenarios, and have seen a recent increase in interest. These systems rely on incremental approaches that incorporate newly generated data in a single pass, resulting in a model that is always up-to-date. A known limitation of only adding data to a model is the accumulation of old data, which degrades predictive performance and eventually raises scalability issues. Therefore, an explicit mechanism to forget such data and remove it from the model is required. In this work, we present a graph-based recommender system that recommends items based on random walk sampling, and that incorporates new information while forgetting obsolete information. Information obtained from random walk sampling is used not only to recommend relevant items, but also to capture structural information from the graph. We devise a forgetting function that prunes obsolete edges based on this information, and also on the recency, popularity and acceptance ratio of items. Our experiments highlight the importance of forgetting obsolete information and suggest the effectiveness of our method, which leads to scalability, accuracy and diversity improvements.
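As an illustration of the general idea only, the following minimal Python sketch shows an item graph that is updated one interaction at a time, recommends by counting visits of short random walks, and prunes edges by age. It is not the authors' algorithm: the class name `StreamGraphRecommender`, its parameters (`max_age`, `walks`, `length`) and the recency-only pruning rule are assumptions made for this example, and the popularity, acceptance-ratio and structural signals mentioned in the abstract are left out.

```python
import random
from collections import Counter, defaultdict


class StreamGraphRecommender:
    """Toy item graph updated one interaction at a time (hypothetical sketch)."""

    def __init__(self, max_age=100, seed=0):
        self.edges = defaultdict(dict)  # item -> {next_item: step of last update}
        self.last_item = {}             # user -> most recent item
        self.max_age = max_age          # assumed recency threshold, in steps
        self.step = 0
        self.rng = random.Random(seed)

    def update(self, user, item):
        """Single-pass update: link the user's previous item to the new one."""
        self.step += 1
        prev = self.last_item.get(user)
        if prev is not None and prev != item:
            self.edges[prev][item] = self.step
        self.last_item[user] = item

    def forget(self):
        """Prune edges not refreshed within the last max_age steps (recency only)."""
        cutoff = self.step - self.max_age
        for src in list(self.edges):
            kept = {dst: t for dst, t in self.edges[src].items() if t > cutoff}
            if kept:
                self.edges[src] = kept
            else:
                del self.edges[src]

    def recommend(self, user, n=5, walks=200, length=3):
        """Rank items by how often short random walks from the last item visit them."""
        start = self.last_item.get(user)
        if start is None:
            return []
        visits = Counter()
        for _ in range(walks):
            node = start
            for _ in range(length):
                neighbours = list(self.edges.get(node, ()))
                if not neighbours:
                    break
                node = self.rng.choice(neighbours)
                if node != start:
                    visits[node] += 1
        return [item for item, _ in visits.most_common(n)]
```

In this sketch, calling `forget()` periodically keeps the graph bounded in size, which is the scalability argument the abstract makes; a fuller forgetting function would weigh several signals rather than age alone.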
License
Copyright (c) 2026 Murilo F. L. Schmitt, Eduardo J. Spinosa

This work is licensed under a Creative Commons Attribution 4.0 International License.

