Integrating counterfactual assessments into traditional interactive recommendation frameworks
DOI: https://doi.org/10.5753/reic.2023.3418
Keywords: Multi-Armed Bandit, Recommender System, Counterfactual
Abstract
The online recommendation task has long been framed as a Multi-Armed Bandit (MAB) problem. Despite recent advances, there is still no consensus on best practices for evaluating such bandit solutions. Recently, two complementary frameworks have emerged that allow bandit solutions to be evaluated more accurately: iRec and OBP. The first provides a complete set of datasets, metrics, and implemented MAB models, but supports only traditional offline evaluation of these solutions. The second, in contrast, implements only a few bandit solutions but offers more current metrics and methodologies, such as counterfactual evaluation. In this work, we propose and evaluate an integration of these two frameworks, demonstrating the potential and richness of the analyses that this combination makes possible.
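To illustrate the kind of counterfactual evaluation the abstract refers to, the sketch below implements the standard Inverse Propensity Weighting (IPW) estimator on a toy bandit log. This is a minimal illustration in plain Python, not code from either framework; the log format and function names are assumptions for the example, and OBP's actual API differs.

```python
def ipw_estimate(logs, target_policy):
    """Inverse Propensity Weighting (IPW) estimate of a target
    policy's value from logged bandit feedback.

    Each log entry is (context, action, reward, propensity), where
    `propensity` is the probability that the logging policy assigned
    to the action it actually chose.
    """
    total = 0.0
    for context, action, reward, propensity in logs:
        # A logged reward counts only when the target policy would
        # have chosen the same action; dividing by the propensity
        # corrects the logging policy's selection bias.
        if target_policy(context) == action:
            total += reward / propensity
    return total / len(logs)

# Tiny synthetic log: a uniform logging policy over 2 arms
# (so every propensity is 0.5).
logs = [
    ("u1", 0, 1.0, 0.5),
    ("u1", 1, 0.0, 0.5),
    ("u2", 0, 1.0, 0.5),
    ("u2", 1, 1.0, 0.5),
]

# Deterministic target policy that always plays arm 0.
always_arm_0 = lambda context: 0

print(ipw_estimate(logs, always_arm_0))  # -> 1.0
```

Under this estimator, an offline log collected by one policy can score any other policy, which is the methodological gap that integrating OBP's counterfactual tooling with iRec's datasets and MAB models is meant to close.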
References
Auer, P., Cesa-Bianchi, N., and Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine learning, 47(2-3):235–256.
Liu, Y., Yen, J.-N., Yuan, B., Shi, R., Yan, P., and Lin, C.-J. (2022). Practical counterfactual policy learning for top-k recommendations. In ACM SIGKDD, pages 1141–1151.
Pan, W., Cui, S., Wen, H., Chen, K., Zhang, C., and Wang, F. (2021). Correcting the user feedback-loop bias for recommendation systems. arXiv preprint arXiv:2109.06037.
Saito, Y., Aihara, S., Matsutani, M., and Narita, Y. (2020). Open bandit dataset and pipeline: Towards realistic and reproducible off-policy evaluation. arXiv preprint arXiv:2008.07146.
Sanz-Cruzado, J., Castells, P., and López, E. (2019). A simple multi-armed nearest-neighbor bandit for interactive recommendation. In RecSys, pages 358–362.
Shams, S., Anderson, D., and Leith, D. (2021). Cluster-based bandits: Fast cold-start for recommender system new users.
Silva, T., Silva, N., Werneck, H., Mito, C., Pereira, A. C., and Rocha, L. (2022). iRec: An interactive recommendation framework. In SIGIR, pages 3165–3175.
Wu, Q., Iyer, N., and Wang, H. (2018). Learning contextual bandits in a non-stationary environment. In SIGIR, pages 495–504.
Yang, Y., Xia, X., Lo, D., and Grundy, J. (2022). A survey on deep learning for software engineering. ACM Computing Surveys (CSUR), 54(10s):1–73.
Zhou, S., Dai, X., Chen, H., Zhang, W., Ren, K., Tang, R., He, X., and Yu, Y. (2020). Interactive recommender system via knowledge graph-enhanced reinforcement learning. In SIGIR, pages 179–188.
License
Copyright (c) 2023 The authors

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
