**4. Fine-tuning networks**

In the previous section, we categorized the main methods for conducting domain adaptation with deep neural networks and gave a high-level overview. In this section, we first discuss the details of the four approaches to fine-tuning networks listed in **Table 1**.

#### **4.1 Label criterion**

The most basic approach to domain adaptation is to fine-tune a pretrained network with labeled data from the target domain. Hence, we assume that the labels in the target dataset are available, so that a supervised learning approach can be used to adjust the weights/parameters of the network. Based on the definition of the task, the target task *T<sub>t</sub>* under the label criterion approach is

$$\mathcal{T}\_t = \mathcal{L}(\mathbf{Y}\_t, \hat{\mathbf{Y}}\_t) = \mathcal{L}(\mathbf{Y}\_t, \mathcal{F}\_t(\mathbf{X}\_t; \Theta)) \tag{3}$$

where $\mathcal{L}$ denotes a loss function, such as the cross-entropy loss $\mathcal{L}(\mathbf{Y}, \hat{\mathbf{Y}}) = -\mathbf{Y}\log\hat{\mathbf{Y}} - (1 - \mathbf{Y})\log(1 - \hat{\mathbf{Y}})$, which is commonly used in many works. Note that Θ is the set of parameters, which is normally initialized with the weights of the pre-trained model.
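As a concrete illustration, the binary cross-entropy loss above can be sketched in a few lines of NumPy (a minimal sketch; the function name and the clipping constant are our own choices, not from the original text):

```python
import numpy as np

def binary_cross_entropy(y, y_hat):
    """L(Y, Y_hat) = -Y log(Y_hat) - (1 - Y) log(1 - Y_hat), averaged over samples."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    eps = 1e-12  # clip predictions to avoid log(0)
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return float(np.mean(-y * np.log(y_hat) - (1.0 - y) * np.log(1.0 - y_hat)))
```

During fine-tuning, this loss is computed on target-domain labels $\mathbf{Y}_t$ and predictions $\mathcal{F}_t(\mathbf{X}_t; \Theta)$, and its gradient drives the update of Θ.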

As discussed in Section 3.1, one question is how many layers of the neural network should be frozen. In general, two main factors influence the fine-tuning procedure: the size of the target dataset and its similarity to the source domain. Based on these two factors, some common rules of thumb are introduced in [13]. One typical work is [14], in which a unified supervised method for deep domain adaptation is proposed. Another problem is what to do if there are no labels in the target dataset. In that case, an unsupervised learning method must be applied to the target dataset, for example via domain confusion.
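The effect of freezing can be sketched with a toy update rule (purely illustrative; the list-of-matrices network representation and the function name are hypothetical): parameters of the first `frozen_up_to` layers are excluded from the gradient step, so only the later, task-specific layers adapt to the target domain.

```python
import numpy as np

def sgd_step(weights, grads, frozen_up_to, lr=0.1):
    """Apply one SGD update, skipping the first `frozen_up_to` layers.

    `weights` and `grads` are parallel lists of per-layer weight arrays;
    frozen layers are returned unchanged.
    """
    return [w if i < frozen_up_to else w - lr * g
            for i, (w, g) in enumerate(zip(weights, grads))]
```

In practice, deep learning frameworks achieve the same effect by disabling gradient computation for the frozen parameters rather than by filtering the update manually.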
