Using Non-Local Connections to Augment Knowledge and Efficiency in Multiagent Reinforcement Learning: an Application to Route Choice


  • Ana L. C. Bazzan Universidade Federal do Rio Grande do Sul
  • H. U. Gobbi Universidade Federal do Rio Grande do Sul
  • G. D. dos Santos Universidade Federal do Rio Grande do Sul



Keywords: multiagent reinforcement learning, non-local information, urban mobility


Providing timely information to drivers is proving valuable in urban mobility applications. There have been several attempts to tackle this question, both from a transportation engineering and from a computer science point of view. In this paper we use reinforcement learning to let driver agents learn how to select a route. In previous works, vehicles and the road infrastructure exchanged information to allow drivers to make better-informed decisions. In the present paper, we extend this line of work in two directions. First, we use non-local information to augment the knowledge that some elements of the infrastructure have; by non-local we mean information that is not available in the immediate neighborhood. This is done by constructing a graph in which elements of the infrastructure are connected according to a similarity measure over their patterns. Patterns here relate to a set of different attributes: we consider not only travel time, but also emission of gases. The second extension refers to the environment: the road network now contains signalized intersections. Our results show that using augmented information leads to higher efficiency. In particular, we measure travel time and CO emission over time, and show that the agents learn to use routes that reduce both measures and that, when non-local information is used, the learning task is accelerated.
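The non-local graph construction described above can be sketched as follows. This is an illustrative assumption, not the authors' implementation: the attribute values, the min-max normalization, and the similarity measure (one minus the mean absolute difference of normalized attributes) are all hypothetical choices standing in for whatever the paper actually uses.

```python
def normalize(patterns):
    """Min-max normalize each attribute (e.g., travel time, CO emission)
    across all infrastructure elements, so attributes on different scales
    contribute comparably to the similarity."""
    ids = list(patterns)
    dims = len(next(iter(patterns.values())))
    lo = [min(patterns[i][d] for i in ids) for d in range(dims)]
    hi = [max(patterns[i][d] for i in ids) for d in range(dims)]
    return {i: [(patterns[i][d] - lo[d]) / (hi[d] - lo[d]) if hi[d] > lo[d] else 0.0
                for d in range(dims)]
            for i in ids}

def similarity(a, b):
    # 1 minus the mean absolute difference of normalized attribute vectors
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def build_nonlocal_graph(patterns, threshold=0.9):
    """Connect any two elements whose attribute patterns are similar enough,
    regardless of physical adjacency (hence 'non-local')."""
    norm = normalize(patterns)
    ids = list(norm)
    return {(u, v)
            for i, u in enumerate(ids) for v in ids[i + 1:]
            if similarity(norm[u], norm[v]) >= threshold}

# Hypothetical patterns: [mean travel time (s), CO emission (g)] per element.
patterns = {
    "A": [30.0, 5.0],
    "B": [29.0, 5.1],    # similar pattern to A, so A-B get connected
    "C": [120.0, 1.0],   # dissimilar, stays unconnected
}
print(build_nonlocal_graph(patterns))  # → {('A', 'B')}
```

Elements linked this way can then share their learned attribute information, which is how the augmented (non-local) knowledge reaches agents outside the immediate neighborhood.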






How to Cite

L. C. Bazzan, A., U. Gobbi, H., & D. dos Santos, G. (2024). Using Non-Local Connections to Augment Knowledge and Efficiency in Multiagent Reinforcement Learning: an Application to Route Choice. Journal of Information and Data Management, 15(1), 186–195.



Best Papers of KDMiLe 2022 - Extended Papers