Federated Learning Hyper-Parameter Tuning for Edge Computing

*Xueying Zhang, Lei Fu, Huanle Zhang and Xin Liu*

## **Abstract**

Edge computing is widely recognized as a crucial technology for the next generation of communication networks and has garnered significant interest from both industry and academia. Compared to other offloading models such as cloud computing, it provides faster data processing, enhanced security, and lower cost by leveraging the proximity of edge servers to end devices. This proximity reduces the distance between the data source and the server, mitigating some of the privacy concerns associated with data transfer. However, raw data in typical edge computing scenarios still needs to be sent to the edge server, which can lead to data leakage and privacy breaches. Federated Learning (FL) is a distributed model training paradigm that preserves end devices' data privacy. Therefore, it is crucial to incorporate FL into edge computing to protect data privacy. Unfortunately, the high training overhead of FL makes it impractical for edge computing. In this study, we propose to facilitate the integration of FL and edge computing by optimizing FL hyper-parameters, which can significantly reduce FL's training overhead and make it more affordable for edge computing.

**Keywords:** edge computing, federated learning, hyper-parameter tuning, system overhead, internet of things

#### **1. Introduction**

As machine learning (ML) and hardware manufacturing technologies continue to advance, training and deploying ML models have become increasingly ubiquitous in our daily lives, from smart-home voice assistants to widely deployed camera surveillance systems. Edge computing is growing increasingly popular due to its advantages, such as fast data processing and analysis, security, and low cost [1]. By placing edge servers near the end devices, which is the fundamental principle of edge computing, the boundary of an edge computing system remains constrained and manageable.

However, even with the shorter distance between the end device and the edge server, typical edge computing systems still suffer from a significant data privacy issue, as user data is frequently transmitted from the end device to the edge server for training a centralized ML model.

Federated Learning (FL) [2] is a distributed model training paradigm that has been utilized in various applications, including mobile keyboard prediction and speech recognition for mobile devices and IoT. It is naturally suited for edge computing since raw data is kept on the end devices. **Figure 1** illustrates the combination of FL and edge computing in training a distributed model. First, the model parameters are transferred from the edge server to the end devices. Each end device then trains the model locally and uploads the updated model parameters back to the edge server. At the end of the iteration, the edge server aggregates the received model parameters and updates the global model. This procedure is repeated until training converges or a predetermined number of rounds is reached.
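The four-step training loop described above can be sketched in code. The following is a minimal illustration of one synchronous FL round with sample-size-weighted averaging (FedAvg-style aggregation); the least-squares "model" and all function names are our own illustrative stand-ins, not part of any FL framework.

```python
import numpy as np

def local_train(weights, data, lr=0.1, epochs=1):
    """Illustrative local update: gradient steps on a least-squares
    objective, standing in for a real model's local training."""
    w = weights.copy()
    X, y = data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # MSE gradient
        w -= lr * grad
    return w

def federated_round(global_weights, client_data):
    """One FL round: broadcast, local training, upload, aggregation."""
    # Steps 1-2: server broadcasts the model; each client trains locally.
    client_weights = [local_train(global_weights, d) for d in client_data]
    # Steps 3-4: clients upload; the server averages the updates,
    # weighting each client by its number of local samples.
    sizes = np.array([len(d[1]) for d in client_data], dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * (sizes / sizes.sum())[:, None]).sum(axis=0)

# Synthetic setup: three clients share the same underlying model.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(100):          # repeat rounds until the budget is spent
    w = federated_round(w, clients)
print(np.round(w, 2))         # converges toward true_w
```

Note that the server never sees `X` or `y`; only model parameters cross the network, which is the privacy property that motivates combining FL with edge computing.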

#### **Figure 1.**

*An illustration of combining FL with edge computing. The model training process incorporates four steps: model parameter download from the edge server to the end devices, local training on the end devices, model parameter upload from the end devices to the edge server, and model aggregation on the edge server.*

Unfortunately, FL training incurs significant system overhead, making it difficult for edge computing systems equipped with FL to operate without appropriate acceleration or optimization. Therefore, we propose integrating FL hyper-parameter tuning into edge computing to reduce the system overhead of FL training and make it more feasible. The FL tuning algorithm should focus on optimizing the four essential system overheads:

• *Computation Time (CompT)*. It is the time the end devices spend on local model training.

• *Transmission Time (TransT)*. It is the time spent transmitting model parameters between the clients and the server.

• *Computation Load (CompL)*. It is the total amount of computation performed on the end devices. A high computing load is beyond the reach of some low-profile devices (e.g., IoT nodes) with few computing resources.

• *Transmission Load (TransL)*. It is the total data size transmitted between the clients and the server. When data transfer is costly, the benefits of reducing the total amount of data transferred can be considerable.
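One way to make these four overheads concrete is to account for them per client and per round. The sketch below assumes synchronous FL, where each round's duration is bounded by its slowest participating client while the loads accumulate across all clients; the `RoundStats` structure, field names, and units are our own illustrative assumptions, not definitions from this chapter.

```python
from dataclasses import dataclass

@dataclass
class RoundStats:
    """Per-round accounting for one client (illustrative units)."""
    compute_seconds: float   # local training time
    transmit_seconds: float  # parameter download + upload time
    flops: float             # computation performed locally
    bytes_moved: float       # data size downloaded + uploaded

def system_overheads(rounds):
    """Aggregate (CompT, TransT, CompL, TransL) over a training run.

    `rounds[r]` holds RoundStats for every client in round r. Under
    synchronous FL, per-round time is set by the slowest client
    (max), while computation and transmission loads are summed.
    """
    comp_t  = sum(max(c.compute_seconds  for c in r) for r in rounds)
    trans_t = sum(max(c.transmit_seconds for c in r) for r in rounds)
    comp_l  = sum(c.flops       for r in rounds for c in r)
    trans_l = sum(c.bytes_moved for r in rounds for c in r)
    return comp_t, trans_t, comp_l, trans_l

# Two rounds: two clients participate in the first, one in the second.
run = [
    [RoundStats(2.0, 0.5, 1e9, 4e6), RoundStats(3.0, 0.75, 1.5e9, 4e6)],
    [RoundStats(2.5, 0.25, 1.2e9, 4e6)],
]
# CompT = 5.5 s, TransT = 1.0 s, CompL = 3.7e9 FLOPs, TransL = 1.2e7 bytes
print(system_overheads(run))
```

A tuner that changes hyper-parameters such as the number of local epochs or participating clients would shift these four numbers in different directions, which is why they must be tracked separately.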

Different application scenarios have distinct preferences for training parameters in terms of CompT, TransT, CompL, and TransL. For example, (1) detecting attacks and anomalies in computer networks, as shown in Ref. [3], requires quick adaptation to malicious traffic and is therefore time-sensitive (CompT and TransT); (2) smart home control systems for indoor environment automation [4], such as HVAC, have limited computation capabilities and therefore prioritize computation efficiency (CompT and CompL); (3) traffic monitoring systems for vehicles [5] rely on cellular communications and therefore emphasize communication efficiency (TransT and TransL); (4) precision agriculture based on IoT sensing [6] does not require urgent response but necessitates energy-efficient solutions, with emphasis on CompL and TransL; (5) healthcare systems, like fall detection for elderly individuals [7], require both quick response time and small energy consumption, and therefore prioritize all four training parameters (CompT, TransT, CompL, and TransL); and (6) human stampede detection/prevention systems, as discussed in [8], need efficient systems for time, computation, and communication.
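These scenario-specific preferences can be expressed, for illustration, as weights that scalarize the four overheads into a single tuning objective. The scenario names and weight values below are our own illustrative choices, not prescriptions from the cited works.

```python
# Weights on (CompT, TransT, CompL, TransL) -- illustrative only.
SCENARIO_WEIGHTS = {
    "network_anomaly_detection": (1.0, 1.0, 0.0, 0.0),  # time-sensitive
    "smart_home_control":        (1.0, 0.0, 1.0, 0.0),  # compute-limited
    "vehicle_traffic_monitor":   (0.0, 1.0, 0.0, 1.0),  # comms-limited
    "precision_agriculture":     (0.0, 0.0, 1.0, 1.0),  # energy-limited
    "fall_detection":            (1.0, 1.0, 1.0, 1.0),  # all four matter
}

def tuning_cost(overheads, scenario):
    """Scalarize (CompT, TransT, CompL, TransL) for one scenario,
    so a hyper-parameter tuner can minimize a single number."""
    weights = SCENARIO_WEIGHTS[scenario]
    return sum(w * o for w, o in zip(weights, overheads))

# Normalized overheads of one candidate hyper-parameter setting.
candidate = (0.6, 0.3, 0.8, 0.2)
print(tuning_cost(candidate, "smart_home_control"))  # ≈ 1.4 (CompT + CompL)
```

In practice the overheads would first be normalized to comparable scales; the weighted sum is only the simplest way to encode the preferences listed above.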

In this chapter, we explore the problem of supporting FL in edge computing from the perspective of FL hyper-parameter tuning. FL hyper-parameters significantly affect the system overhead of FL training, and thus optimizing them is of great value for resource-constrained edge computing. We organize this chapter as follows. Section 2 provides related work on edge computing and FL hyper-parameter tuning. Section 3 explains the challenges of supporting FL in edge computing, and Section 4 presents some preliminary results. Last, Section 5 concludes this chapter.
