
*A Deep Learning-Based Aesthetic Surgery Recommendation System. DOI: http://dx.doi.org/10.5772/intechopen.86411*

*Advanced Analytics and Artificial Intelligence Applications*

countries leads to a shortage of high-skilled labor in almost all industrial sectors. This chapter therefore proposes a deep learning-based aesthetic surgery recommendation system, aiming to preserve the valuable know-how of experienced doctors for consulting patients on aesthetic surgery. Moreover, the continuous learning capability of the AI model also facilitates self-updating with newly fashionable know-how in this field, given a set of rich training data.

Although aesthetic surgery can be performed on all areas of the head, neck, and body, this chapter focuses on facial areas. We take the most popular treatments for facial areas, rejuvenation and eye double-fold surgery, into consideration. In order to build a deep learning system capable of predicting the perfection of aesthetic surgery, we collected an in-house training dataset composed of pairs of images capturing the eye area of the same person before and after aesthetic surgery. It is assumed that the beauty of facial areas after surgery is perfect, that is, the know-how of the aesthetic surgeon is embedded in these pairs of images.

In order to preserve the know-how of experienced aesthetic surgeons, we propose to train a deep neural network on these pairs of images in our in-house dataset. Among the various neural network architectures proposed in the literature, convolutional neural networks (CNNs) have demonstrated outstanding performance in image recognition [1]. That work marked the first time a large, deep CNN, the AlexNet model, achieved record-breaking results on a highly challenging image recognition dataset, with a margin of more than 10% over the second-best entry, which relied on handcrafted features. Even though the performance of AlexNet is still far from that of the inferotemporal pathway of the human visual system, it paved the way for successor models such as Inception [2], VGG [3], and ResNet [4]. Convolutional layers learn from data to extract a rich set of features for a variety of purposes, such as image classification and recognition [4], visual tracking [5], face recognition [6], object detection [7], person reidentification [8], etc. The power of CNNs comes from a learning mechanism in which the weights of the convolutional filters are adjusted to fit the labels. The generalization of a CNN is guaranteed by the availability of a huge dataset, which yields outstanding performance on unseen data.

However, our in-house dataset is not large enough to guarantee the generalization of a CNN for this task. Therefore, we propose to use a convolutional autoencoder neural network to overcome the limitation of our small dataset. The network is first trained in a layer-wise mechanism to reconstruct the input images at the output layer. This training mechanism is completely unsupervised. After the convolutional autoencoder is trained, the decoder part is truncated. Only the encoder part is kept, and it is concatenated with fully connected layers. The whole network is then trained on images and their labels, before and after surgery. The weights of the encoder part are kept intact because the encoder has already learned the key features of the training image set. As a result, our proposed model achieves 88.9 and 93.1% accuracy on rejuvenation treatment and eye double-fold surgery, respectively.

The rest of this chapter is organized as follows. Section 2 describes our contribution against the backdrop of related work. The proposed method is presented in detail in Section 3. Finally, we conclude the chapter and delineate future work in Section 4.

**2. Related work**

The number of aesthetic surgeries, particularly on facial areas, has surged drastically in recent years, and this trend may spread even further in the next few years due to the falling average cost of such treatments and the growing desire for beautification. Numerous research works have been proposed in the literature, especially in the computer vision community, to address the challenges posed by aesthetic surgery. These works generally fall into three categories, namely, skin quality inspection, face recognition after surgery, and surgery planning and recommendation.

Aesthetic surgery on facial areas, which corrects facial feature anomalies and improves beauty, generally alters the original facial information. This poses a significant challenge for face recognition algorithms. The majority of methods proposed in the literature have focused on advances in handcrafted features. Richa et al. [9] investigated the effects of aesthetic surgery on face recognition. Amal et al. [10] proposed a face recognition system based on LBP and GIST descriptors to address this problem. Maria et al. [11] combined two methods, face recognition against occlusions and expression variations (FARO) [12] and face analysis for commercial entities (FACE) [13], with a split-face architecture that processes each face region as a separate biometric in order to deal with the effects of plastic surgery. For a comprehensive survey of face recognition algorithms against variations due to aesthetic surgery, please refer to [14].
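
As a rough illustration of the handcrafted descriptors used in this line of work, the sketch below computes a basic 3x3 LBP histogram in plain NumPy and compares two face crops by histogram distance. It is a simplified stand-in, not the exact LBP/GIST pipeline of [10]; all names and parameters are illustrative.

```python
import numpy as np

def lbp_histogram(gray):
    """Histogram of basic 3x3 local binary patterns: a simplified
    illustration of the LBP descriptor family, not the exact
    variant used in [10]."""
    g = gray.astype(np.float64)
    center = g[1:-1, 1:-1]
    # Eight neighbors, ordered clockwise from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int32)
    h, w = g.shape
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / hist.sum()  # normalized, so crop sizes can differ

# Two face crops are then matched by a histogram distance (L1 here).
face_a = np.random.default_rng(0).integers(0, 256, (64, 64))
face_b = np.random.default_rng(1).integers(0, 256, (64, 64))
dist = float(np.abs(lbp_histogram(face_a) - lbp_histogram(face_b)).sum())
```

Because the histogram discards pixel positions, such descriptors tolerate some local appearance change, which is why they were a natural first attempt at post-surgery matching.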

Skin quality inspection and assessment is also a potential application of computer vision and deep learning methods. By assessing skin quality, such methods can help the aesthetic surgeon recommend certain kinds of operation to enhance the beauty of the face. Surface roughness, wrinkle depth, wrinkle volume, and epidermal thickness of the skin have been quantitatively computed by applying deep learning methods to images captured by optical coherence tomography [15]. Facial skin has been classified into patches, namely, normal, spots, and wrinkles, using convolutional neural networks [16]. Batool and Chellappa [17] proposed a method to model wrinkles as texture features or curvilinear objects, so-called aging skin texture, for facial aging analysis. They reviewed image features commonly used to capture the intensity gradients caused by facial wrinkles, such as the Laplacian of Gaussian, the Hessian filter, steerable filter banks, Gabor filter banks, active appearance models, and local binary patterns.
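
As a small illustration of one of the reviewed filters, the sketch below applies a Laplacian of Gaussian to a synthetic skin patch to highlight a wrinkle-like groove. The synthetic patch, the sigma value, and the thresholding rule are illustrative assumptions, not the procedure of [17].

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

rng = np.random.default_rng(0)

# A synthetic "skin patch": bright, mildly noisy, with a dark
# curvilinear groove standing in for a wrinkle.
patch = np.full((64, 64), 180.0) + rng.normal(0.0, 2.0, (64, 64))
patch[30:32, 8:56] -= 40.0  # the groove

# The Laplacian of Gaussian responds strongly to such valley
# structures; sigma is chosen near the expected wrinkle width.
response = gaussian_laplace(patch, sigma=1.5)

# A crude wrinkle map: pixels with an unusually strong response.
wrinkle_map = response > response.mean() + 2.0 * response.std()
```

The other filters in the list (Hessian, steerable, Gabor) play the same role of amplifying oriented intensity gradients before a later stage decides which responses are actual wrinkles.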

In the last category, aesthetic surgery planning and recommendation, facial beauty prediction is the first step in assessing whether or not a face should undergo aesthetic surgery. Yikui et al. [18] described BeautyNet, in which a multiscale CNN is employed to obtain deep features characterizing facial beauty and is combined with a transfer learning strategy to alleviate overfitting and achieve robust performance on unconstrained faces. Lu et al. [19] transferred rich deep features from a VGG16 model, pretrained on a face verification task, to Bayesian ridge regression algorithms for predicting facial beauty.
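
The feature-transfer pipeline of [19] can be sketched roughly as follows, with random vectors standing in for the VGG16 deep features and synthetic scores standing in for human beauty ratings; only the shape of the pipeline, not the data, reflects the original work.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(0)

# Stand-ins for deep features: in [19] these are activations from a
# VGG16 network pretrained on face verification. Random vectors are
# used here purely to show the shape of the pipeline.
n_faces, feat_dim = 200, 128
deep_features = rng.normal(size=(n_faces, feat_dim))

# Stand-in beauty scores (in the original work, human ratings).
true_weights = rng.normal(size=feat_dim)
beauty_scores = deep_features @ true_weights + rng.normal(0.0, 0.1, n_faces)

# Bayesian ridge regression maps each feature vector to a beauty
# score and also yields a predictive uncertainty per face.
model = BayesianRidge().fit(deep_features[:150], beauty_scores[:150])
pred, pred_std = model.predict(deep_features[150:], return_std=True)
```

The appeal of the Bayesian regressor here is the per-prediction uncertainty, which matters when a score will influence a surgical recommendation.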

Going beyond facial beauty prediction, Arakawa and Nomoto [20] removed wrinkles and spots while preserving natural skin roughness by using a bank of nonlinear filters for facial beautification. In [21], eighty-four facial landmark points are represented as a vector of 234 normalized lengths, which is compared with the vectors of beautiful faces to suggest how to warp the triangulation of the original face toward the beautiful ones. Bottino et al. [22] presented a quantitative approach that automatically recommends effective, patient-specific improvements of facial attractiveness by comparing the patient's face with a large database of attractive faces. Simulations are performed by applying the features of similar attractive faces to the patient's face with a suitable morphing of the facial shape.
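
The landmark-length representation in [21] can be sketched as follows. The specific set of 234 lengths is not reproduced here, so an illustrative subset of landmark pairs, synthetic landmark coordinates, and a toy database of attractive faces are used instead; everything below is an assumption except the counts taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_landmarks = 84  # as in [21]

# An illustrative subset of landmark pairs; [21] uses a specific
# set of 234 normalized lengths that is not reproduced here.
idx = range(0, n_landmarks, 4)
pairs = [(i, j) for i in idx for j in idx if i < j]

def length_vector(landmarks, pairs, scale_pair=(0, 1)):
    """Distances between landmark pairs, normalized by a reference
    length (here the first two landmarks) for scale invariance."""
    def d(i, j):
        return float(np.linalg.norm(landmarks[i] - landmarks[j]))
    ref = d(*scale_pair)
    return np.array([d(i, j) for i, j in pairs]) / ref

patient = rng.normal(size=(n_landmarks, 2))            # synthetic face
attractive_db = rng.normal(size=(50, n_landmarks, 2))  # toy database

# The most similar attractive face is the one with the nearest
# length vector; its triangulation would guide the warp suggestion.
v = length_vector(patient, pairs)
db_vecs = np.array([length_vector(f, pairs) for f in attractive_db])
nearest = int(np.argmin(np.linalg.norm(db_vecs - v, axis=1)))
```

Normalizing every length by a fixed reference pair makes the vector invariant to image scale, so faces photographed at different distances remain comparable.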

Our research differs from the above related works in two senses. Firstly, a convolutional autoencoder is employed to learn rich features characterizing both unattractive and beautiful faces in an unsupervised manner, rather than under supervised learning [18, 19]. The learned features are more discriminative than handcrafted features [9–13, 15–17]. Secondly, the proposed deep learning framework facilitates a holistic approach to identify which kinds of facial treatment should be performed to enhance attractiveness, rather than merely predicting a beauty score [20–22].
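
The two-stage scheme summarized here (unsupervised autoencoder pretraining, then a frozen encoder with a small supervised head) can be sketched as below. The layer sizes, optimizer settings, and end-to-end (rather than layer-wise) pretraining are illustrative assumptions, not the exact configuration of the proposed system.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
    nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(16, 8, 2, stride=2), nn.ReLU(),    # 8 -> 16
    nn.ConvTranspose2d(8, 1, 2, stride=2), nn.Sigmoid(),  # 16 -> 32
)

# Stage 1: unsupervised pretraining, reconstructing the input.
autoencoder = nn.Sequential(encoder, decoder)
images = torch.rand(16, 1, 32, 32)  # stand-in for eye-area crops
opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
for _ in range(5):
    opt.zero_grad()
    loss = nn.functional.mse_loss(autoencoder(images), images)
    loss.backward()
    opt.step()

# Stage 2: truncate the decoder, freeze the encoder weights, and
# train a small classifier head on before/after-surgery labels.
for p in encoder.parameters():
    p.requires_grad = False
classifier = nn.Sequential(encoder, nn.Flatten(), nn.Linear(16 * 8 * 8, 2))
labels = torch.randint(0, 2, (16,))  # 0 = before, 1 = after surgery
opt = torch.optim.Adam(classifier[2].parameters(), lr=1e-3)
for _ in range(5):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(classifier(images), labels)
    loss.backward()
    opt.step()
```

Freezing the encoder mirrors the design choice above: the reconstruction task has already forced it to capture the key features of the small dataset, so only the head needs supervised fitting.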
