Solutions for Heterogeneous Data in Federated Learning through Model Similarity and Client Clustering

Authors

  • Gabriel Talasso, UNICAMP
  • Leandro Villas, UNICAMP

DOI:

https://doi.org/10.5753/reic.2024.4649

Abstract

The rise of mobile devices and growing privacy concerns have posed significant challenges in distributed artificial intelligence. In this scenario, Federated Learning (FL) emerges as a promising method in which learning models are trained collaboratively and privately. However, FL also faces challenges in model convergence, optimization, and communication overhead due to data and device heterogeneity. In this context, this work reports two solutions developed to address this problem: 1) NeuralMatch, a tool capable of identifying similarities between clients using only their models, and 2) FedSCCS, a complete solution that builds on these principles to create multiple models through client clustering. Both solutions have proven efficient and effective in extensive experiments.
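
The referenced papers describe the full methods; purely as a rough illustration of the idea summarized above (clients grouped by the similarity of their model parameters, with one federated model trained per group), the Python sketch below clusters clients with Ward-linkage hierarchical clustering (Ward, 1963). It is not the authors' implementation: the function names, the use of Euclidean distance over flattened weights, and the fixed number of clusters are all illustrative assumptions.

# Minimal sketch (assumption, not the authors' code): group FL clients by the
# similarity of their model parameters, then train one model per group.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist


def flatten_model(weights):
    """Concatenate a client's layer weights into a single 1-D vector."""
    return np.concatenate([np.asarray(w).ravel() for w in weights])


def cluster_clients(client_weights, num_clusters=2):
    """Cluster clients using only their model parameters (no raw data shared).

    client_weights: list of per-client weight lists, all from the same
    architecture, so the flattened vectors have equal length.
    Returns one cluster label per client.
    """
    # Build a (num_clients, num_params) matrix of flattened client models.
    models = np.stack([flatten_model(w) for w in client_weights])
    # Pairwise Euclidean distances between client models as a similarity proxy.
    distances = pdist(models, metric="euclidean")
    # Hierarchical clustering with Ward linkage (Ward, 1963).
    tree = linkage(distances, method="ward")
    return fcluster(tree, t=num_clusters, criterion="maxclust")


# Each resulting cluster would then run its own aggregation round
# (e.g., one FedAvg model per cluster), yielding multiple specialized models.

Because only model parameters are compared, clients never expose raw data, which is consistent with the privacy premise of FL stated in the abstract.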

References

Abdulrahman, S., Tout, H., Ould-Slimane, H., Mourad, A., Talhi, C., and Guizani, M. (2021). A survey on federated learning: The journey from centralized to distributed on-site learning and beyond. IEEE Internet of Things Journal, 8(7):5476–5497.

Beutel, D. J., Topal, T., Mathur, A., Qiu, X., Fernandez-Marques, J., Gao, Y., Sani, L., Li, K. H., Parcollet, T., de Gusmão, P. P. B., et al. (2020). Flower: A friendly federated learning research framework. arXiv preprint arXiv:2007.14390.

Cho, Y. J., Wang, J., and Joshi, G. (2022). Towards understanding biased client selection in federated learning. In International Conference on Artificial Intelligence and Statistics, pages 10351–10375. PMLR.

Deng, Y., Lyu, F., Ren, J., Wu, H., Zhou, Y., Zhang, Y., and Shen, X. (2022). Auction: Automated and quality-aware client selection framework for efficient federated learning. IEEE Transactions on Parallel and Distributed Systems, 33(8):1996–2009.

Dennis, D. K., Li, T., and Smith, V. (2021). Heterogeneity for the win: One-shot federated clustering. In Meila, M. and Zhang, T., editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 2611–2620. PMLR.

Imteaj, A., Thakker, U., Wang, S., Li, J., and Amini, M. H. (2022). A survey on federated learning for resource-constrained iot devices. IEEE Internet of Things Journal, 9(1):1–24.

Kornblith, S., Norouzi, M., Lee, H., and Hinton, G. (2019). Similarity of neural network representations revisited. In International Conference on Machine Learning, pages 3519–3529. PMLR.

Lim, W. Y. B., Luong, N. C., Hoang, D. T., Jiao, Y., Liang, Y.-C., Yang, Q., Niyato, D., and Miao, C. (2020). Federated learning in mobile edge networks: A comprehensive survey. IEEE Communications Surveys & Tutorials, 22(3):2031–2063.

Liu, B., Ding, M., Shaham, S., Rahayu, W., Farokhi, F., and Lin, Z. (2021). When machine learning meets privacy: A survey and outlook. ACM Computing Surveys (CSUR), 54(2):1–36.

McMahan, B., Moore, E., Ramage, D., Hampson, S., and y Arcas, B. A. (2017). Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics, pages 1273–1282. PMLR.

Morafah, M., Vahidian, S., Wang, W., and Lin, B. (2023). FLIS: Clustered federated learning via inference similarity for non-iid data distribution. IEEE Open Journal of the Computer Society, 4:109–120.

Nishio, T. and Yonetani, R. (2019). Client selection for federated learning with heterogeneous resources in mobile edge. In ICC 2019 - 2019 IEEE International Conference on Communications (ICC), pages 1–7.

Palihawadana, C., Wiratunga, N., Wijekoon, A., and Kalutarage, H. (2022). Fedsim: Similarity guided model aggregation for federated learning. Neurocomputing, 483:432–445.

Ribero, M., Vikalo, H., and de Veciana, G. (2023). Federated learning under intermittent client availability and time-varying communication constraints. IEEE Journal of Selected Topics in Signal Processing, 17(1):98–111.

Sattler, F., Müller, K.-R., and Samek, W. (2021). Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints. IEEE Transactions on Neural Networks and Learning Systems, 32(8):3710–3722.

Sozinov, K., Vlassov, V., and Girdzijauskas, S. (2018). Human activity recognition using federated learning. In 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), pages 1103–1111.

Talasso, G., de Souza, A. M., Bittencourt, L. F., Cerqueira, E., Loureiro, A., and Villas, L. (2024). FedSCCS: Hierarchical clustering with multiple models for federated learning. In 2024 IEEE International Conference on Communications (ICC): SAC Cloud Computing, Networking and Storage Track (IEEE ICC'24 - SAC-02 CCNS track), Denver, USA.

Talasso, G., Souza, A., and Villas, L. (2023). NeuralMatch: Identificando a similaridade de clientes baseado em modelos no aprendizado federado [NeuralMatch: Identifying client similarity based on models in federated learning]. In Anais Estendidos do XLI Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos, pages 176–183, Porto Alegre, RS, Brasil. SBC.

Wang, H., Kaplan, Z., Niu, D., and Li, B. (2020). Optimizing federated learning on non-iid data with reinforcement learning. In IEEE INFOCOM 2020 - IEEE Conference on Computer Communications, pages 1698–1707.

Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301):236–244.

Published

2024-06-28

How to Cite

Talasso, G., & Villas, L. (2024). Solutions for Heterogeneous Data in Federated Learning through Model Similarity and Client Clustering. Electronic Journal of Undergraduate Research on Computing, 22(1), 61–70. https://doi.org/10.5753/reic.2024.4649

Issue

Vol. 22 No. 1 (2024)

Section

Full Papers