**2. Related work**

In this section, we review related work on edge computing and FL hyper-parameter tuning.

#### **2.1 Edge computing**

With the rapid expansion of IoT, more smart devices are connected to the Internet, producing significant amounts of data. Sending this device-generated data to a centralized data center or the cloud causes bandwidth and latency problems. As a result, typical cloud computing models suffer from high bandwidth usage, slow response times, weak security, and poor privacy. Moreover, the growing volume of data puts more strain on servers and drives up operating costs.

Edge computing solutions have emerged because traditional cloud computing can no longer meet the diverse data-processing demands of today's intelligent society. In the simplest terms, edge computing is a network technology that processes data collected from an endpoint directly on a local device, or in a network close to where the data is generated, without sending the data to a cloud-based processing facility. Its core idea is to move computing closer to the source of the data [9].

Edge computing has several advantages:

1. **Low latency**: Since edge computing is closer to the data source, data storage and computation can be performed on the edge node, eliminating much of the intermediate data transmission. Service providers can therefore process user requests in real time, and users experience low-latency services.
2. **Low bandwidth**: Because the data to be processed need not be uploaded to a cloud computing centre, edge computing consumes little network bandwidth, reducing the bandwidth load and significantly lowering the energy consumption of intelligent devices at the edge of the network.
3. **Privacy**: Since edge nodes are only responsible for tasks within their own scope and do not upload data to the cloud, risks in network transmission are avoided. Even if one edge node suffers a data breach as a result of a network attack, the other edge nodes are unaffected. Edge computing thus substantially improves data security.

However, although edge computing protects user data privacy better than traditional cloud computing, users inevitably upload some or all of their personal information to edge servers and to core infrastructure such as edge data centers or cloud data centers. These infrastructures may be managed by third-party suppliers, such as mobile network operators, who may not be trusted. Data is exposed to security issues such as leakage and loss during transmission, and personal private data may be used illegally by application providers. Thus, the security of outsourced data remains a fundamental problem of edge computing data security [10].

#### **2.2 FL hyper-parameter tuning**

The area of Hyper-Parameter Optimization (HPO) has received a lot of attention [11]. The hyper-parameters of machine learning models are optimized using a variety of classical HPO techniques, such as Bayesian Optimization (BO) [12], successive halving [13], and hyperband [14]. These cannot, however, be directly applied to FL due to FL's unique hyper-parameters and different training paradigms. For example, FL has specific client-side and server-side aggregation methods that need to be optimized, and the data remains on end devices rather than being centralized on a server.
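To make the classical techniques concrete, the following is a minimal, self-contained sketch of successive halving [13] on a toy problem. The candidate configurations, the surrogate scoring function, and all numeric values are illustrative assumptions, not taken from any of the cited works.

```python
import random

def successive_halving(configs, budget_per_round, evaluate, eta=2):
    """Successive halving: evaluate all configs on a small budget,
    keep the best 1/eta fraction, and repeat with a larger budget
    until a single configuration remains."""
    budget = budget_per_round
    while len(configs) > 1:
        scores = [(evaluate(c, budget), c) for c in configs]
        scores.sort(key=lambda s: s[0], reverse=True)  # higher = better
        configs = [c for _, c in scores[: max(1, len(configs) // eta)]]
        budget *= eta
    return configs[0]

# Toy surrogate (an assumption for illustration): the "score" of a
# learning rate peaks near 0.1, and a larger budget reduces noise.
def score(lr, budget):
    return -abs(lr - 0.1) + random.gauss(0, 0.05 / budget)

random.seed(0)
best = successive_halving([0.001, 0.01, 0.1, 0.5, 1.0],
                          budget_per_round=1, evaluate=score)
print(best)
```

Note how the procedure assumes all candidates can be paused and resumed on a central evaluator with increasing budget; in FL, each evaluation is a costly round of distributed training on private client data, which is one reason such methods do not transfer directly.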


#### **Table 1.**

*Related work on FL hyper-parameter optimization. We indicate whether (1) the work can run in an online, single-trial manner and (2) the work targets the system overheads of FL training.*

#### *Federated Learning Hyper-Parameter Tuning for Edge Computing DOI: http://dx.doi.org/10.5772/intechopen.110747*

Designing HPO algorithms for FL is an emerging area of research. Several past studies have touched the field of FL HPO. **Table 1** provides an overview of notable methods, indicating whether they can operate in a single-trial, online manner and whether they address system overheads in FL training.

For instance, BO has been combined with FL to strengthen client privacy [17] and to improve individual client models [15]; Zhiyuan et al. used particle swarm optimization (PSO) to accelerate the exploration of FL hyper-parameters [16], but their approach supports neither single-trial tuning nor system overheads. Several methods use reinforcement learning to tune FL hyper-parameters [18, 19], at the cost of additional complexity and reduced versatility. FedEx is a comprehensive framework that applies weight-sharing neural architecture search (NAS) techniques to optimize the round-to-accuracy of FL, improving over the baseline by a few percentage points [20]. FLoRA selects global hyper-parameters by identifying those that perform well on local clients [21]. A benchmark suite for federated hyper-parameter optimization has been created [23], but its efficacy has not yet been evaluated. FedTune proposes a basic framework for tuning FL hyper-parameters according to the specific requirements of an application [22].

There are two reasons why existing approaches do not apply to federated learning in edge computing. First, measures such as computation time CompT (in seconds), transmission time TransT (in seconds), computation load CompL (in FLOPs), and transmission load TransL (in bytes) are not directly comparable, and incorporating these heterogeneous system factors into HPO is challenging. Second, hyper-parameter tuning must occur simultaneously with FL training: the model cannot be revisited and retrained for each candidate configuration until the final accuracy is reached, as that would substantially increase system overhead.
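One common way to make such heterogeneous metrics comparable is to normalize each against a baseline measurement and combine them with application-specific preference weights. The sketch below illustrates this idea; the function name, the weighting formula, and all numbers are our own illustrative assumptions, not the actual algorithm of FedTune or any cited work.

```python
def system_score(metrics, baseline, weights):
    """Combine heterogeneous system metrics (seconds, FLOPs, bytes)
    into one dimensionless score: normalize each metric against a
    baseline run, then apply preference weights that sum to 1.
    Lower is better; a score below 1.0 improves on the baseline."""
    return sum(weights[k] * metrics[k] / baseline[k] for k in weights)

# Hypothetical measurements: a candidate hyper-parameter set vs. the default.
baseline  = {"CompT": 120.0, "TransT": 45.0, "CompL": 8.0e9, "TransL": 2.0e6}
candidate = {"CompT": 100.0, "TransT": 60.0, "CompL": 7.0e9, "TransL": 1.5e6}
# An application that prioritizes computation time over transmission size.
weights   = {"CompT": 0.4, "TransT": 0.3, "CompL": 0.2, "TransL": 0.1}

print(round(system_score(candidate, baseline, weights), 3))  # → 0.983
```

Here the candidate trades longer transmission time for less computation and smaller payloads; under these particular weights it scores slightly better than the baseline, showing how the choice of weights, not the raw metrics, decides the trade-off.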
