**2. Methods**

set to a known one. It differs from Euclidean distance in that it takes into account the correlations of the data set and is scale-invariant, i.e. not dependent on the scale of measurements. In our study, matching with the well-known Euclidean distance outperformed matching with the Mahalanobis distance, so we chose Euclidean distance as the matching technique.

Hammond [7] pointed out that, in terms of future technological support, two-dimensional (2D) or three-dimensional (3D) models of facial morphology are showing potential in syndrome delineation and discrimination, in analyzing individual dysmorphology, and in contributing to multi-disciplinary and multi-species studies of genotype–phenotype correlations. Our study is an example that substantiates this potential. We have developed a real-time computer system that can locate and track a patient's head and then recognize the patient by comparing characteristics of the face with those of trained individuals with classified dysmorphic diseases. To support feature extraction, to help inexperienced practitioners in the diagnosis process and to support experts in their decisions, we built an application that we refer to as the "Facial Genotype-Phenotype Diagnostic Decision Support System (FaceGP DDSS) in Dysmorphology". To date, no complete solution has been proposed that allows the automatic diagnosis of dysmorphic diseases from raw data (live camera, video or frontal photographs) without human intervention. The FaceGP DDSS aims not only to reduce the on-site expertise required, but also to eliminate the time-consuming catalog search that practitioners and geneticists perform to diagnose facial dysmorphic diseases among approximately 4,700 known dysmorphic diseases<sup>3</sup>. It does so automatically, with no intervention from the user such as preprocessing of images. The FaceGP DDSS methodology can be implemented easily at any site. To train the system, reference images or live reference patients having specific dysmorphic diseases are used as a guide for identifying the facial phenotypes (the outward physical manifestation of the genotypes). Digital facial image processing methods are employed to reveal facial features with disorders indicating a dysmorphic genotype-phenotype interrelation.

A great number of genetic disorders that present a characteristic pattern of facial anomalies can be classified by analyzing specific features (eigenfaces) with the aid of facial image processing methods such as PCA. Distance measures such as Euclidean and Mahalanobis are used in matching to relate the input image to the trained images. Image enhancement methods such as histogram equalization and median filtering are applied to detected images to capture better features and compensate for lighting differences. This study proposes a novel, robust and cost-effective composite computer-assisted method that merges several methods to characterize the facial dysmorphic phenotype associated with a genotype; in particular, the method relies primarily on face image capture (acquisition from a camera, video or frontal face images) and manipulation to help medical professionals diagnose syndromes efficiently.

70 Decision Support System

<sup>3</sup> Many new dysmorphic diseases are described each year. See the London Dysmorphology Database (http://www.lmdatabases.com).

The FaceGP DDSS methodology has been implemented in the C++ programming language using the OpenCV<sup>4</sup> library. The application can run on any current computer and does not require much CPU or memory during processing, thanks to the simple implementation of PCA. The methodology comprises four main modules, namely *face detection and image acquisition, image processing, training and diagnosis/recognition*, and these main modules are divided into several sub-modules as illustrated in Figure 1. The functions of these modules are explained in detail in the following subsections.

**Figure 1.** Overall architecture of the methodology: the system consists of four main modules, namely face detection and image acquisition, image processing, training and diagnosis/recognition. These modules are divided into several sub-modules that are described in the corresponding sections.

<sup>4</sup> The OpenCV (Open Source Computer Vision) library is available at http://opencv.willowgarage.com/wiki/.

#### **2.1. Face Detection and Image acquisition**

Patient images can be acquired from several sources: softcopy images stored on disk, hardcopy pictures, or real-time images captured from patients by a camera attached to the computer. Images can be captured automatically with an ordinary high-resolution digital camera (e.g. 8 megapixels) mounted facing the patient, regardless of the distance, provided that a frontal face is detected (to prevent motion artifacts). Natural movements of the body at rest (e.g. breathing) and a person's inability to sit motionless cause no problem, because the camera detects and acquires images in a short period of time. Capture is triggered automatically by the system and takes 0.2 s per image, excluding the time during which no frontal face is found. Because capture is nearly instantaneous, the approach may be more suitable for imaging children with mental retardation, who may be unable to hold a pose for long and may be uncooperative. Images that do not include a proper frontal face are not captured, and the application stays idle during this time, for example while the head is turned sideways, up or down. Frontal faces are the essential components of our data preparation and model building; in other words, images are captured only when the application detects an acceptable frontal face. The application is able to capture, in real time, the number of frontal images required to train the system for subsequent analyses.

Diagnostic Decision Support System in Dysmorphology

http://dx.doi.org/10.5772/51118

#### **2.2. Image Processing**

#### *2.2.1. Capturing faces for a specific case*

Images of patients are acquired for a specific syndrome either to train or to diagnose, depending on the function triggered by the user. The functions of the implementation are explained in the next section. If the diagnosis function has been triggered, the system activates it as soon as it detects images of patients. Otherwise, it waits for the user to activate the training function, which ensures that all required frontal face images for a specific syndrome have been captured.

#### *2.2.2. Conversion to grayscale and cropping of images*

After the images of a patient are acquired, they are converted to grayscale and cropped to include just the face, relying upon the face outline, forehead, two eyes, cheeks, chin and mouth. This stage is completed by normalizing the image to a size of 120x90 pixels. This phase, depicted in Figure 2, ensures that all face images are positioned identically with a proper rotation. No background removal algorithm is implemented, because the cropped face contains almost no background and the rest of the image is removed before feature extraction.

**Figure 2.** Delineation of cropping a dysmorphic face.
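The grayscale conversion and the crop-and-normalize step described in this subsection can be sketched in plain C++ as follows. This is only an illustration under stated assumptions, not the chapter's OpenCV-based implementation: the `GrayImage` type, the function names, the BT.601 luma weights and the nearest-neighbour resampling are all assumptions, and the face rectangle is taken as given (the face detector is not shown).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: 8-bit grayscale image stored row by row.
struct GrayImage {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> pixels;  // size == width * height
};

// Convert interleaved 8-bit RGB data to grayscale.
// Assumption: ITU-R BT.601 luma weights (0.299, 0.587, 0.114).
GrayImage toGrayscale(const std::vector<std::uint8_t>& rgb,
                      std::size_t width, std::size_t height) {
    assert(rgb.size() == width * height * 3);
    GrayImage out{width, height, std::vector<std::uint8_t>(width * height)};
    for (std::size_t i = 0; i < width * height; ++i) {
        const unsigned r = rgb[3 * i], g = rgb[3 * i + 1], b = rgb[3 * i + 2];
        // Integer form of the weighted sum, rounded to the nearest value.
        out.pixels[i] = static_cast<std::uint8_t>(
            (299 * r + 587 * g + 114 * b + 500) / 1000);
    }
    return out;
}

// Crop the rectangle [x0, x0 + w) x [y0, y0 + h) and rescale it to
// outW x outH pixels with nearest-neighbour sampling.
GrayImage cropAndResize(const GrayImage& src,
                        std::size_t x0, std::size_t y0,
                        std::size_t w, std::size_t h,
                        std::size_t outW, std::size_t outH) {
    assert(w > 0 && h > 0 && outW > 0 && outH > 0);
    assert(x0 + w <= src.width && y0 + h <= src.height);
    GrayImage out{outW, outH, std::vector<std::uint8_t>(outW * outH)};
    for (std::size_t y = 0; y < outH; ++y) {
        const std::size_t sy = y0 + y * h / outH;  // source row
        for (std::size_t x = 0; x < outW; ++x) {
            const std::size_t sx = x0 + x * w / outW;  // source column
            out.pixels[y * outW + x] = src.pixels[sy * src.width + sx];
        }
    }
    return out;
}
```

Calling `cropAndResize(gray, x0, y0, w, h, 120, 90)` on a detected face rectangle yields the 10800-pixel image used in the later stages; whether 120 is the width or the height is not stated in the text, so the argument order here is a guess.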

#### *2.2.3. Enhancement of face images*

The face images are standardized by employing two essential image enhancement methods, namely histogram equalization and median filtering, which reduce illumination variations so that better features can be captured. Histogram equalization is known to improve feature extraction [14]. It is applied to face images that are too dark or too bright in order to enhance image quality and improve diagnostic performance; by widening the contrast range of the image, it makes facial features more apparent. Median filtering is very effective at enhancing images, especially those obtained from a camera, without losing information [15]. No manual preprocessing of frontal images is performed.
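The two enhancement steps can be illustrated with a minimal standard-library C++ sketch. It is not the chapter's OpenCV code: the function names, the 3x3 window size and the border handling (edge pixels are replicated) are assumptions made for the example, and the text does not say in which order the two filters are applied.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Histogram equalization of an 8-bit grayscale image (flat pixel vector):
// each intensity is remapped through the cumulative distribution function
// (CDF) so that the occupied intensity range is stretched to 0..255.
std::vector<std::uint8_t> equalizeHistogram(const std::vector<std::uint8_t>& img) {
    assert(!img.empty());
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : img) ++hist[p];

    std::array<std::size_t, 256> cdf{};
    std::size_t running = 0;
    std::size_t cdfMin = 0;  // CDF at the lowest intensity that occurs
    for (std::size_t v = 0; v < 256; ++v) {
        running += hist[v];
        cdf[v] = running;
        if (cdfMin == 0 && hist[v] != 0) cdfMin = running;
    }

    const std::size_t denom = img.size() - cdfMin;
    if (denom == 0) return img;  // constant image: nothing to stretch

    std::vector<std::uint8_t> out(img.size());
    for (std::size_t i = 0; i < img.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(
            ((cdf[img[i]] - cdfMin) * 255 + denom / 2) / denom);
    }
    return out;
}

// 3x3 median filter; coordinates outside the image are clamped to the border.
std::vector<std::uint8_t> medianFilter3x3(const std::vector<std::uint8_t>& img,
                                          std::size_t width, std::size_t height) {
    assert(width > 0 && height > 0 && img.size() == width * height);
    const long w = static_cast<long>(width);
    const long h = static_cast<long>(height);
    std::vector<std::uint8_t> out(img.size());
    for (long y = 0; y < h; ++y) {
        for (long x = 0; x < w; ++x) {
            std::array<std::uint8_t, 9> window{};
            std::size_t k = 0;
            for (long dy = -1; dy <= 1; ++dy) {
                for (long dx = -1; dx <= 1; ++dx) {
                    const long yy = std::min(std::max(y + dy, 0L), h - 1);
                    const long xx = std::min(std::max(x + dx, 0L), w - 1);
                    window[k++] = img[static_cast<std::size_t>(yy * w + xx)];
                }
            }
            // The median is the 5th smallest of the 9 window values.
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            out[static_cast<std::size_t>(y * w + x)] = window[4];
        }
    }
    return out;
}
```

The median filter removes isolated outlier pixels while keeping edges comparatively sharp, which is one reason it is commonly preferred over simple averaging for camera noise.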

#### *2.2.4. Feature extraction*

Facial features are extracted in this sub-module. After automatic image preprocessing has been applied to the raw image data, feature extraction is performed on the normalized face image to reveal the key features that are used to classify and recognize dysmorphic diseases. The FaceGP DDSS methodology is designed for modeling and analyzing large sets of face images. Whichever method is used, the most important problem in face recognition is reducing the dimensionality in order to ease computational complexity. We have not used craniofacial landmarks or feature graph extraction<sup>5</sup>. Instead, we performed principal component analysis (PCA), which is a very effective method for classifying face images while reducing computational complexity.

<sup>5</sup> Extracting feature graphs is based on interpolation of the basic parts of a face such as eyes, nose, mouth, and chin. In the method, with the help of deformable templates and extensive mathematics, key information from the basic parts of a face is gathered and then converted into a feature vector. L. Yullie and S. Cohen played a great role in adapting deformable templates to contour extraction of face images [16].

PCA, which evaluates the entire face, is used to extract the features of dysmorphic faces because of its simplicity, learning capability and speed. PCA computes the eigenvectors of the covariance matrix of the set of face images, treating each image as a vector in a high-dimensional space. It classifies the face images by projecting them onto the low-dimensional space spanned by the eigenvectors obtained from the variance of the face images. These eigenvectors are a set of features that characterize the variation between face images. In other words, the features of the images are obtained by looking for the maximum deviation of each image from the mean image. Each image location contributes more or less to each eigenvector, so that it is possible to display these eigenvectors as a sort of ghostly face image, which is called an "eigenface". Eigenfaces can be viewed as a sort of map of the variations between faces. PCA reduces the dimensionality of the dataset: a face of 8-bit intensity values can be represented with PCA by an ordered sequence of numbers in a single vector. This is a huge data compaction, in our case reducing the representation of a face from a 10800-dimensional space (120x90 pixels, one intensity value each) down to one short vector per image. In return, each face can be regenerated approximately from its eigenface weights as a linear weighted sum of the PCA modes. The most relevant information contained in a dysmorphic face image is extracted by PCA<sup>6</sup>. Briefly, after sample dysmorphic images have been captured and normalized by the automatic image processing methods, eigenfaces are generated and saved. An eigenface keeps values that make a dysmorphic image unique.
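To make the projection and regeneration steps concrete, the following C++ sketch computes a mean face, projects a face onto a set of eigenfaces and rebuilds it from the resulting weights. It assumes that the eigenfaces have already been computed and are orthonormal (the eigen-decomposition itself is omitted), and all names are illustrative rather than taken from the FaceGP DDSS code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;  // one flattened image or one eigenface

// Mean of a set of equally sized face vectors.
Vec meanFace(const std::vector<Vec>& faces) {
    assert(!faces.empty());
    Vec mean(faces[0].size(), 0.0);
    for (const Vec& f : faces) {
        assert(f.size() == mean.size());
        for (std::size_t i = 0; i < f.size(); ++i) mean[i] += f[i];
    }
    for (double& m : mean) m /= static_cast<double>(faces.size());
    return mean;
}

// Weights of a face in "face space": w_k = e_k . (face - mean).
// Assumption: the eigenfaces e_k are orthonormal.
Vec project(const Vec& face, const Vec& mean, const std::vector<Vec>& eigenfaces) {
    assert(face.size() == mean.size());
    Vec weights;
    weights.reserve(eigenfaces.size());
    for (const Vec& e : eigenfaces) {
        assert(e.size() == face.size());
        double w = 0.0;
        for (std::size_t i = 0; i < face.size(); ++i) w += e[i] * (face[i] - mean[i]);
        weights.push_back(w);
    }
    return weights;
}

// Approximate regeneration of a face: mean + sum_k w_k * e_k.
Vec reconstruct(const Vec& weights, const Vec& mean, const std::vector<Vec>& eigenfaces) {
    assert(weights.size() == eigenfaces.size());
    Vec face = mean;
    for (std::size_t k = 0; k < eigenfaces.size(); ++k) {
        assert(eigenfaces[k].size() == face.size());
        for (std::size_t i = 0; i < face.size(); ++i) face[i] += weights[k] * eigenfaces[k][i];
    }
    return face;
}
```

For a 120x90 image each `Vec` holding a face has 10800 entries, whereas the weight vector returned by `project` has only as many entries as there are eigenfaces, which is the compaction the paragraph above refers to.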

#### **2.3. Training**

The eigenfaces, eigenvalues and average image generated by PCA are stored in an XML file as Haar-like features<sup>7</sup> together with their diagnosis labels. The methodology can store and read data in XML format. In our study, the number of generated eigenfaces (n-1) is almost equal to the number of input images (n). Classifiers are trained with the features extracted by the feature extraction module. As new dysmorphic diseases are trained, the eigenfaces and eigenvalues are recalculated.

Users can easily add new dysmorphic diseases from their own archives, which contain several sample images of diseases not yet defined in the system. With the utilities provided by the system, diseases can be added one at a time or several at once.

#### **2.4. Diagnosing/Recognition**

The trained classifiers are employed for prediction in this sub-module. The diagnosing/recognition process is a pattern recognition task. Predicting the diagnosis of a patient requires detecting a frontal face from a camera or a file, normalizing and processing the face with the techniques described in Section 2.2, and extracting facial features for comparison with the trained classifiers.

<sup>6</sup> Details, especially formulas about PCA, can be found in Akalin's study [15] and Calvo's article [17].

<sup>7</sup> Haar-like features encode the existence of oriented contrasts between regions in the image. A set of these features can be used to encode the contrasts exhibited by a human face and their spatial relationships. Haar-like features are computed similarly to the coefficients in Haar wavelet transforms. Interested readers can refer to Viola's study for more information about Haar-like features [18].

A captured and processed dysmorphic face image is transformed into its eigenface components, and these components are then compared with those of predefined labeled classes. There are a number of algorithms in the literature that can compare faces to look for a match. A simple and intuitively appealing way to compare an individual face with two sets of faces is nearest-neighbor classification, which calculates how close they are in terms of the Euclidean distance. Once a face has been detected and extracted from an image, it is ready to be compared against the known (trained) syndromes to see if there is a similarity. The captured face image is compared one by one with all the syndromes trained in the database to make sure that all similar faces above the threshold value are found in terms of the Euclidean distance<sup>8</sup>. Euclidean distance comparison for image recognition works in much the same way: it tries to capture how similar or different a test object is from the training objects in terms of the 50 or so mode values, that is, how close the face surface is to the average face surface of each set. Whichever of the average faces is closest determines the classification of the individual. In other words, during the recognition stage, when a new image is input to the system, the mean image is subtracted and the result is projected into the face space. The best match is the identity that minimizes the Euclidean distance. For example, for a particular syndrome, the features obtained from PCA are compared with the trained faces in the database using a Euclidean distance comparison to calculate how close they are to the features of the faces in the database. The similar faces above the threshold value supplied by the user are displayed to the user as probable diagnoses.

Confidence values can be defined as the resemblance, or degree of proximity, between two sets of eigenface values: those obtained from the trained diagnostic images and those of the image to be diagnosed. These values are used for assessing the reliability of the diagnostic inference proposed by the system.
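The nearest-neighbor matching described above can be sketched in a few lines of C++. The `TrainedFace` record and the function names are hypothetical and only serve the illustration; the weight vectors are assumed to be the eigenface components of the probe image and of the trained images.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical record for one trained image: its label and its
// eigenface weights.
struct TrainedFace {
    std::string syndrome;
    std::vector<double> weights;
};

// Euclidean distance between two weight vectors of equal length.
double euclideanDistance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Index of the trained face closest to the probe (nearest neighbour).
// The database must not be empty.
std::size_t nearestNeighbour(const std::vector<double>& probe,
                             const std::vector<TrainedFace>& database) {
    assert(!database.empty());
    std::size_t best = 0;
    double bestDistance = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < database.size(); ++i) {
        const double d = euclideanDistance(probe, database[i].weights);
        if (d < bestDistance) {
            bestDistance = d;
            best = i;
        }
    }
    return best;
}
```

Because the comparison runs on short weight vectors rather than on 10800-pixel images, a linear scan over all trained faces is usually fast enough for a database of this size.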

Diagnosis is where the system compares the given patient's face features with all the trained face features in the database and gives a ranked list of possible matches whose confidence values are above the threshold value. After an undefined dysmorphic face has been compared with the recorded, diagnostically classified dysmorphic faces, our system presents the probabilities of similarity to the user in decreasing order, rather than the "known" or "unknown" outputs of face recognition systems. That is to say, our system is not a face recognition system; rather, it measures how similar a disease is to the diseases classified in the database, ruling diseases in or out according to the threshold value supplied by the user. This cost-effective diagnosis methodology could then help to determine subsequent investigations, including more appropriate genetic testing, and possibly even avoid or delay the need to undertake some of the more expensive genetic tests.
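A ranked, thresholded list of this kind could be produced as in the sketch below. The text does not give the formula that turns a Euclidean distance into a confidence value, so the mapping `1 / (1 + distance)` used here is purely an assumption chosen so that closer matches score higher; the type and function names are likewise hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Candidate {
    std::string syndrome;
    double distance;  // Euclidean distance to the probe image
};

struct Diagnosis {
    std::string syndrome;
    double confidence;
};

// Hypothetical distance-to-confidence mapping (not from the chapter):
// 1.0 for a perfect match, approaching 0 as the distance grows.
double confidenceFromDistance(double distance) {
    assert(distance >= 0.0);
    return 1.0 / (1.0 + distance);
}

// Keep the candidates whose confidence reaches the user-supplied threshold
// and return them in decreasing order of confidence.
std::vector<Diagnosis> rankDiagnoses(const std::vector<Candidate>& candidates,
                                     double threshold) {
    std::vector<Diagnosis> ranked;
    for (const Candidate& c : candidates) {
        const double confidence = confidenceFromDistance(c.distance);
        if (confidence >= threshold) ranked.push_back({c.syndrome, confidence});
    }
    std::sort(ranked.begin(), ranked.end(),
              [](const Diagnosis& a, const Diagnosis& b) {
                  return a.confidence > b.confidence;
              });
    return ranked;
}
```

Raising the threshold shortens the list and lowering it lengthens the list, so the same function covers both ruling diseases in and ruling them out.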

#### **2.5. Interface and the Functions of the Methodology**

A screenshot of the implementation is presented in Figure 3. The main utilities of the implementation are explained in the following subsections.

<sup>8</sup> Interested readers may consult Calvo's article [17] for more information about the Euclidean distance formulations and implementation for the comparison of eigenvalues of a test image to the trained labeled images.

#### *2.5.7. Change threshold:*

The user can specify the threshold value to rule in or rule out diseases during the identification process. The greater the threshold value, the fewer probable diagnoses are proposed; conversely, the smaller the threshold value, the more diagnoses are revealed to the user together with their confidence (probability or proximity) values.

#### *2.5.8. Identify from a directory:*

Several images in a directory can be chosen at once and compared to the labeled syndromes stored in the database. A video that has dysmorphic patients can be chosen to diagnose probable diseases as well.

#### *2.5.9. Identify from a camera:*

A patient can be captured from a live camera mounted to the computer to diagnose a syndrome stored in the database.

#### **2.6. Evaluation of the Methodology**

A case study has been carried out to evaluate the methodology. We have analyzed 2D frontal face pictures of patients, each being affected with one of the syndromes. The scope and the design of the case study are presented in the following subsections. The findings of the study are presented in the next section, results.

**Figure 3.** An example of the diagnosing process via camera. Four messages are displayed: the name of the probable diagnosis (e.g. Fragile X), the degree of proximity to that diagnosis (e.g. 0.345319), the threshold value entered by the user (e.g. 0.30), and a message stating whether a probable disease was found ("recognized successfully") or not ("unknown disease"). The messages change if other diseases are found above the threshold value, so that they are also revealed on the screen. The total time required to search through 7 trained diagnoses comprising 34 eigenfaces and to find the nearest diagnoses with respect to the threshold value is a few seconds.

#### *2.5.1. Detect images from image file:*

This utility detects faces from the images displayed on the screen where images are brought from the chosen directory automatically. Detection stops and training process begins when the user clicks the utility "train captured images". The name of the disease is entered by the user.

#### *2.5.2. Detect images from camera:*

This utility detects faces on a live camera from a patient or patients who are diagnosed with the same dysmorphic disease. Detection stops and the training process begins when the user clicks the utility "train captured images". The user supplies the name of the syndrome.

#### *2.5.3. Detect images from video:*

Face images of a patient or patients who are diagnosed with the same dysmorphic disease are detected from one or several videos. Detection stops at the end of the video, and the training process begins when the user clicks the utility "train captured images". The user supplies the name of the syndrome.

#### *2.5.4. Train captured images:*

Detected face images from either a directory or a live camera are processed to be classified into the database.

#### *2.5.5. Train from directory:*

The user chooses a directory that contains several images of a specified disease and enters the name of that syndrome. All necessary algorithms are then run automatically to train the disease. With this utility, diseases are trained one at a time.

#### *2.5.6. Train all database:*

All labeled diseases in a directory are processed automatically by the "face detection and image acquisition" and "image processing" modules, in that order. The training module then trains the processed, cropped frontal face images into a database without user intervention. In other words, all modules in Figure 3 are employed automatically. While training the datasets, the system takes the names of the directories that hold the dysmorphic images as the names of the syndromes. The user only has to specify the folder where the datasets are.
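The directory-as-label convention described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `collect_labeled_images` and the list of accepted image suffixes are assumptions made for illustration, and the face detection, image processing and training steps are left out.

```python
from pathlib import Path

# Assumed set of image file types; the source does not list the accepted formats.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def collect_labeled_images(dataset_root):
    """Map each subdirectory name (used as the syndrome label) to its image files.

    Hypothetical helper: only the labelling convention is sketched here.
    """
    labeled = {}
    for entry in sorted(Path(dataset_root).iterdir()):
        if not entry.is_dir():
            continue  # files placed directly in the root carry no label
        images = sorted(
            p for p in entry.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
        if images:  # skip directories that contain no usable images
            labeled[entry.name] = images
    return labeled
```

With a root folder that holds one subfolder per syndrome (for example `Sotos/` and `fragile X/`), the returned dictionary has one key per syndrome name, so the user only has to point at the root folder.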

#### *2.5.7. Change threshold:*

The user can specify the threshold value used to rule diseases in or out during the identification process. The greater the threshold value, the fewer probable diagnoses are proposed; conversely, the lower the threshold value, the more diagnoses are revealed to the user, together with their confidence (probability or proximity) values.
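The effect of the threshold can be sketched as a simple filter over confidence values. This is an illustrative sketch only: the function names are hypothetical, the strict "above the threshold" comparison is an assumption, and all confidence values except the Fragile X value quoted in the Figure 3 caption are invented for the example.

```python
def rule_in(confidences, threshold):
    """Return (syndrome, confidence) pairs strictly above the threshold, best first."""
    kept = [(name, value) for name, value in confidences.items() if value > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def status_message(candidates):
    """Mirror the two status messages named in the Figure 3 caption."""
    return "recognized successfully" if candidates else "unknown disease"


# 0.345319 is the example value from the Figure 3 caption; the others are hypothetical.
confidences = {"Fragile X": 0.345319, "Sotos": 0.31, "Williams-Beuren": 0.12}

for threshold in (0.30, 0.33, 0.50):
    candidates = rule_in(confidences, threshold)
    print(threshold, candidates, status_message(candidates))
```

Raising the threshold from 0.30 to 0.33 drops the second candidate, and at 0.50 no candidate remains, which corresponds to the "unknown disease" message.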

#### *2.5.8. Identify from a directory:*

Several images in a directory can be chosen at once and compared to the labeled syndromes stored in the database. A video showing dysmorphic patients can also be chosen to diagnose probable diseases.

#### *2.5.9. Identify from a camera:*

A patient can be captured from a live camera mounted on the computer to diagnose a syndrome stored in the database.

#### **2.6. Evaluation of the Methodology**

A case study was carried out to evaluate the methodology. We analyzed 2D frontal face pictures of patients, each affected by one of the syndromes. The scope and design of the case study are presented in the following subsections. The findings of the study are presented in the next section, Results.

#### *2.6.1. Study sample*

The study sample was composed of 7 syndromes comprising 35 individual frontal faces (5 for each) of patients taken from Boehringer's study [2]. These diseases, depicted in Figure 4, are microdeletion 22q11.2, Cornelia de Lange, fragile X, Mucopolysaccharidosis III, Smith–Lemli–Opitz, Sotos and Williams–Beuren. The patients for each syndrome are more or less the same age. The sample includes 10 females and 25 males; both genders appear in 5 syndromes, namely microdeletion 22q11.2 (3 F: females, 2 M: males), Cornelia de Lange (2F, 3M), Smith–Lemli–Opitz (2F, 3M), Sotos (2F, 3M) and Williams–Beuren (1F, 4M), while fragile X and Mucopolysaccharidosis III are represented by males only (5M each).


Diagnostic Decision Support System in Dysmorphology

http://dx.doi.org/10.5772/51118


**Figure 4.** Seven syndromes: each row comprises one syndrome; microdeletion 22q11.2 (3 F: Females, 2 M: Males), Cornelia de Lange (2F, 3M), fragile X (5M), Mucopolysaccharidosis III (5M), Smith–Lemli–Opitz (2F, 3M), Sotos (2F, 3M) and Williams–Beuren (1F, 4M).

#### *2.6.2. Study design*


The 7 syndromes were trained with the previously mentioned utility "Train all database". The system built a training set for the 7 syndromes, with 34 eigenfaces for 35 faces, in less than one minute. The mean face and all 33 eigenfaces are displayed in Figure 5 and Figure 6 respectively. The first four images of each syndrome were used as a testing set to measure the recognition/diagnosis success. All these test images were put in a directory, and each test image was compared to the other 34 trained images in the database with the utility "identify from directory" to measure how close it is to them in the vector space. The whole recognition process, in terms of pairwise comparison, took about 1 minute. This utility produces a table of confidence values and proposes diagnoses for the values above the threshold value entered by the user. In our case study, we aimed to find the probable diagnoses in the sense of rule-in 1, rule-in 2 and rule-in 3 diagnoses respectively by adjusting the threshold value for each person. The success rates of these rule-in-n observations were then obtained.
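For orientation, the standard eigenface (PCA) formulation behind the mean face, the eigenfaces and the Euclidean matching can be written as follows. The notation is an assumption introduced here ($x_i$ for the $i$-th vectorized training face, $M$ for the number of training faces, $u_k$ for the $k$-th eigenface), and the source does not state how the displayed confidence value is derived from the distance, so that step is deliberately left out.

$$
\bar{x} = \frac{1}{M}\sum_{i=1}^{M} x_i, \qquad
C = \frac{1}{M}\sum_{i=1}^{M} (x_i - \bar{x})(x_i - \bar{x})^{\top}, \qquad
C\,u_k = \lambda_k u_k
$$

$$
w_k(x) = u_k^{\top}(x - \bar{x}), \qquad
d(x, x_j) = \lVert w(x) - w(x_j) \rVert_2
$$

Because the centered faces $x_i - \bar{x}$ sum to zero, $C$ has rank at most $M - 1$; with $M = 35$ training faces this gives at most 34 eigenfaces with nonzero eigenvalue, which agrees with the 34 eigenfaces reported for 35 faces.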

**Figure 5.** Mean face for 7 dysmorphic diseases generated by the system.

**Figure 6.** Eigenfaces of 7 dysmorphic diseases that comprise 35 frontal face images.

### **3. Results**

The threshold value entered by the user increases or decreases the possible number of diagnoses: the lower the threshold value, the more diagnoses are proposed; conversely, the higher the threshold value, the fewer diagnoses are left for the medical professionals to examine. The pairwise analysis of 28 patients in 7 different syndromes is presented in Table 1 in terms of confidence values. Each column corresponds to an individual patient's one-to-one comparison to the other patients in the vector space of eigenfaces. The greater a value in a column, the greater the probable proximity to the corresponding row, whose syndrome indicates the probable diagnosis. The threshold value for each column is adjusted to rule in one, two and three diagnoses respectively. The results of the three diagnoses for each tested patient are presented in Table 2, in which the gray cells correspond to the right diagnoses. Ruling in one, two and three diagnoses yields 21, 26 (5 more than ruling in one) and 28 (2 more than ruling in two) correct diagnoses among all 28 tested patients respectively, which corresponds to success rates of 75, 94 and 100 percent. The diagnostic success rates are depicted in Figure 7.
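The rule-in-n reading of one column of Table 1 and the resulting success rates can be sketched as below. This is a hedged illustration rather than the authors' procedure: the function names are hypothetical, counting only distinct syndromes among the top candidates is an assumption, and the column values are invented apart from the 0.464 quoted in the Table 2 caption. Note that the ratio 26/28 evaluates to about 92.9 percent, whereas the text reports 94 percent for ruling in two diagnoses.

```python
def top_n_syndromes(column, n):
    """Return up to n distinct syndrome labels ranked by their best confidence.

    `column` is a list of (syndrome_label, confidence) pairs for one tested
    patient compared against the trained images (one column of Table 1).
    """
    best = {}
    for label, value in column:
        if label not in best or value > best[label]:
            best[label] = value
    ranked = sorted(best, key=best.get, reverse=True)
    return ranked[:n]


def success_rates(correct_counts, total):
    """Convert counts of correct diagnoses into percentages of all tested patients."""
    return [100.0 * count / total for count in correct_counts]


# Hypothetical column; only 0.464 comes from the Table 2 caption.
column = [("1", 0.464), ("1", 0.20), ("3", 0.41), ("5", 0.30), ("3", 0.10)]
print(top_n_syndromes(column, 2))

# Correct diagnoses reported for ruling in one, two and three diagnoses.
rates = success_rates([21, 26, 28], 28)
print([round(rate, 1) for rate in rates])
```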


**Table 1.** Pairwise comparison: each column corresponds to an individual patient's one-to-one comparison to other patients in the vector space of eigenfaces in terms of proximity. The greater a value in a column, the greater the probable proximity to the corresponding row, whose syndrome indicates the probable diagnosis. The first column and the first row give the syndromes (e.g. 1 corresponds to microdeletion 22q11.2). The second column and the second row give the number of the image within the syndrome specified in the first column or first row.


**Table 2.** Ruling in one, two and three diagnoses regarding the greatest values in columns of Table 1: the grey cells correspond to the right diagnosis. Ruling in one, two and three diagnoses yields 21, 26 and 28 correct diagnoses with success rates of 75, 94 and 100 percent among all 28 tested patients respectively. For instance, the value, 0.464, is observed from the cell where the syndrome name is coded as 1 and the frontal face is coded as 4. This indicates that the frontal face which is coded as 1 in the syndrome list of 1 is the most proximate with a confidence value of 0.464 that is the greatest value in that column.

**Figure 7.** Graphical representation of success rates with respect to ruling in one, two and three diagnoses concerning the values in Table 2. Ruling in one, two and three diagnoses yields 21, 26 (including 5 more) and 28 (including 2 more) correct diagnoses among all 28 tested patients respectively. This designates the success rates of 75, 94 and 100 percent respectively as well.

### **4. Discussion**

#### **4.1. Main findings**

The FaceGP DDSS methodology has success rates of 75%, 94% and 100% for ruling in one, two and three diagnoses respectively. The results show that the FaceGP DDSS methodology is able to make a biometric identification among syndromes successfully and efficiently based on the features of the patients' frontal faces, even though it has been tested with a limited number of 7 syndromes. The methodology can ease correct diagnosis among many syndromes, provided that it is trained with those syndromes. One specific result of the study is that all test frontal faces of three syndromes (microdeletion 22q11.2, fragile X and Williams–Beuren) are correctly diagnosed when ruling in one diagnosis. Ruling in two diagnoses covers 5 syndromes correctly, whereas ruling in three covers all seven. From these results we conclude that the implemented methodology can, in particular, guide medical professionals toward the correct cyto- and/or molecular genetic analysis, which is the appropriate route of investigation for confirming a diagnosis with known genetic causes, by ruling in probable diseases. Even with adequate knowledge, there remains the problem of reconciling sometimes imprecise descriptions of dysmorphic features in the literature with a personal and potentially subjective examination of an individual patient [7].

Preliminary results indicate that the application can be trained with many syndromes in several minutes and that syndrome recognition can be performed in a few seconds, either from an attached camera or from a file. We expect that the performance of the system will not degrade as the number of syndromes grows, owing to the computational efficiency of PCA. Moreover, the implementation makes it easy to diagnose and to train multiple syndromes effectively and efficiently, which is what will attract medical professionals most. During the training process, the larger the training dataset per syndrome, the better the syndrome recognition, thanks to the pairwise comparison in our study, which accounts for different characteristics even within the same syndrome; examples are Down syndrome, which has three subtypes<sup>9</sup>, and cri du chat syndrome, which has several variations in its features<sup>10</sup>. Our implementation may be used to assist in diagnosing and defining facial phenotypes associated with syndromes in different ethnic groups and in different age groups, provided that such cases are included in the training process of the specific syndrome.

#### **4.2. Comparison to other published studies**

Hammond [7] has pointed out that there are well-documented approaches to recording craniofacial dysmorphology in a more objective fashion. Moreover, Hammond [7] has asserted that international experts in dysmorphology are currently developing standardized terminology to address issues of imprecision and inconsistency. Most genome-wide association studies to date have focused on a limited number of specific diseases or traits to test whether a computer can classify syndromes and then recognize them when compared to new cases. There have been successful discrimination studies using images of children with a limited number of dysmorphic syndromes [2, 5, 6, 16]. Robust composite computer-assisted decision support systems need to be established to support the decisions of not only practitioners but also geneticists across thousands of dysmorphic diseases. The FaceGP DDSS methodology aims to serve several needs of medical professionals who work in dysmorphology, and this study contributes to the medical environment in several respects. One is that it supports general practitioners or pediatricians in rural areas, who are not trained geneticists, and eases the work of geneticists across the thousands of dysmorphic diseases defined in catalogs and databases. Another is that an unlimited number of dysmorphic diseases can be trained and handled simultaneously in the diagnosing process with the FaceGP DDSS methodology. By contrast, other similar studies, which handle a fixed number of up to 10 syndromes, are implemented to reveal the potential rather than to be an application that serves the everyday needs of medical professionals. The right diagnosis and consequently the right treatment can influence the progression of a disease, especially in terms of removing the effects of environmental factors<sup>11</sup>. For instance, when a patient with a syndrome is given a hormone supplement, the patient may get better. The FaceGP DDSS methodology is designed primarily for investigators who wish to diagnose their patients with dysmorphic diseases quickly, effectively and successfully. Furthermore, it aims to support scientists who do not have expertise in the particular domain of dysmorphology in their studies. A new understanding of ruling in/ruling out diseases in our methodology can be very helpful for geneticists who wish to employ cyto- and/or molecular genetic analysis to confirm probable diagnoses for their cases. Moreover, the application is ready to be used with a user-friendly interface when implemented at any site. Our methodology, noticeably different from others, does not require any manual intervention or preprocessing of images by users; rather, all necessary algorithms have been embedded into the methodology.

<sup>9</sup> Chromosome 21 can be affected in three main ways, leading to the three main subtypes of Down's syndrome: full trisomy 21 Down's syndrome, mosaicism Down's syndrome and translocation Down's syndrome.

<sup>10</sup> Wilkins's findings delineate the variation in the clinical and karyotypic features of cri du chat syndrome [20].

<sup>11</sup> The complex interplay of genetic and environmental factors is a significant confounding factor in various human genetic approaches, including genome-wide and candidate gene association studies as well as linkage analysis [21].

Medical professionals may construct their own databases from their own dysmorphic facial findings, thus incorporating those findings into the examination, and may add their special cases to a previously constructed database. In addition, these databases can later be shared in a web environment and used separately in a plug-and-play fashion; furthermore, valuable databases can be merged into a single database, after evaluation by an expert group, and made available to the scientific community. This opens the door to cross-study analyses of not only the primary genotype-phenotype dysmorphological diseases (prevalence < 1/25,000) but also secondary genotype-phenotype syndromes (prevalence < 1/25,000, maybe 1/100,000). However, the probable specified genotype-phenotype diseases should be trained before the application is presented to medical professionals, in order to gain their support.

Users can benefit from the methodology through a user-friendly interface that requires no manual intervention; the need for manual intervention may cause users to avoid using any system. These findings refute the notion, presented in several studies, that manual steps cannot be excluded entirely from any dysmorphic facial analysis software that intends to extract as much information as possible.

#### **4.3. Limitations of the study**

There are several concerns to keep in mind when carrying out these kinds of studies. Some genetic dysmorphic diseases could not be brought into the scientific literature, for two reasons in particular. One is that many cases are lost before birth, nobody has any incentive to investigate the reason, and eventually no one knows the cause. The other, and the most important, is that families facing nonspecific cases do not permit geneticists to investigate them, even though the investigation would greatly help the families in their future decisions<sup>12</sup>; consequently, babies or fetuses are buried with their genetic secrets in accordance with the ethical rules. On the other hand, in practice, some very rare genetic dysmorphic diseases (e.g. prevalence 1/100,000) are not globally defined and are stored in local Electronic Medical Records (EMRs) or research databases across different medical institutions without any common data structure or representational format. These data elements need to be made available to the genetics community on behalf of the current and future healthy population. Our ability to fully understand the genetic basis of common diseases is significantly hindered by the inability to precisely specify phenotypes; in particular, identifying and extracting phenotypes at large varies greatly between medical specialties and institutions, and lacks the systematization and throughput of large-scale genotyping efforts [22]. Beyond these clinical aspects, dysmorphology has contributed much to the current understanding of the genetic basis of human development [1]. Moreover, imprecise and nonstandardized nomenclature, especially of facial features, poses a major difficulty for communication between clinical geneticists [2]. It has to be noted that neither 2D nor 3D methods have direct applicability in clinical practice yet, as the number of specified syndromes is still very small [2].

<sup>12</sup> Such as having a dysmorphic baby (the recurrence risk: the possibility that the problem would happen again in another pregnancy) and, beyond that, early diagnosis, disease prevention, patient management, or even adjunctive therapies to be developed.



**5. Conclusion**

In terms of future technological support, two-dimensional (2D) or three-dimensional (3D) models of facial morphology are showing potential in syndrome delineation and discrimination, in analyzing individual dysmorphology, and in contributing to multi-disciplinary and multi-species studies of genotype–phenotype correlations [7]. Our study is an example that substantiates this potential. We describe a new approach to syndrome identification that merges several algorithms. The algorithms themselves are not novel; they have been used in many studies so far. However, we combined the most essential ones into a robust composite system that serves the everyday needs of the medical professionals who work in dysmorphology.

The preliminary results indicate that computer-based diagnostic decision support systems such as the one we have established might be very helpful in assisting medical professionals with genotype-phenotype dysmorphic diagnosis. The study reveals that differences between facial regions such as facial landmarks, eyebrows, hair, lips, and chins make it possible to predict the diagnosis of syndromes. The system may help medical professionals in several respects. Some of these are:

**•** To support medical professionals who do not have expertise in the particular domain of dysmorphology, such as general practitioners or pediatricians in rural areas,

**•** To support geneticists across thousands of dysmorphic diseases, most of which are very rare and difficult to memorize,

**•** Generally, to support investigators who wish to diagnose their patients with dysmorphic diseases quickly, effectively and successfully,

**•** To support investigators who strive to expand their studies by adding their own cases to the system, whose database can be broadened dynamically and easily, allowing them to keep and handle their data more efficiently,

**•** To guide geneticists toward the correct cytogenetic and/or molecular genetic analysis, i.e. the appropriate route of investigation, in order to confirm a diagnosis with known genetic causes by ruling in probable diseases,

**•** No manual preprocessing of data, which may deter users from using any system, is required.

As Boehringer [2] emphasizes, database support with respect to facial traits is currently too limited for studies such as ours to yield better applications. Distinctive dysmorphic frontal faces specific to dysmorphic genotype-phenotype diseases are needed to train the system. In our study, only a very limited number of dysmorphic genetic diseases have so far been trained, using frontal faces, for the subsequent recognition process. There are several genetic databases, such as eMERGE (Electronic Medical Records and Genomics), PhenX (Consensus Measures for Phenotypes and Exposures), the Dysmorphology Database in the Oxford Medical Databases (OMD), and OMIM. Of these, OMD is more appropriate for our study because it is better prepared to reveal genotype-phenotype associations in terms of images and the taxonomy of dysmorphology, although it contains a very limited number of frontal faces per syndrome. One of the reasons we work on frontal 2D image analysis is that this prominent database (OMD), which we aim to include in the study, contains 2D genetic dysmorphic images rather than 3D videos in which a number of frames are captured and recorded. Moreover, most geneticists studying dysmorphic diseases keep 2D images of their patients in their databases. The main drawback of the majority of 3D face recognition approaches is that all elements of the system must be well calibrated and synchronized to acquire accurate 3D data (texture and depth maps) [19]. By contrast, 2D dysmorphic data associated with genotypes are easier for investigators to collect and analyze. Although capturing 3D information results in a richer data set and allows for excellent visualization, despite the difficulties in obtaining the technology and in 3D detection mentioned by Kau [13]<sup>13</sup>, 2D analysis has several advantages in practical use: the equipment is cheap and easy to handle [2]. Conventional and digital two-dimensional (2D) photography offer rapid and easy capture of facial images. 3D analysis of syndromes would surely give better results, as shown in Hammond's study [6]. However, the lack of available 3D data makes any 3D methodology impractical for the near future.

Of course, recognition of face shape does not imply a diagnosis. A diagnosis is made by an appropriately trained clinician backed up, whenever possible, by genetic testing. For some dysmorphic syndromes there is no definitive genetic test and a clinical diagnosis has to suffice. For others, for example Noonan syndrome, a number of important genes may have been identified, but mutations for those genes may not be found in some children for whom there is a compelling clinical diagnosis [3].

<sup>13</sup> Due to inherent faults in technology and the distortion of light, none of the 3D imaging systems is accurate over the full field of view. Furthermore, all systems suffer from a potential for patient movement and alterations of facial expression between the multiple views needed to construct a 3D model of the face.

No mask was applied to remove the background of the cropped faces in our study. Employing a background mask, which simply provides a face-shaped region, would minimize the effect of background changes and improve diagnostic success; this is left as an improvement for a further study. Furthermore, for 7 syndromes we could not obtain the raw material, and the photos we used for them are not in good condition in terms of appearance. Moreover, the patients are not of the same age and sex within each syndrome. The results would be better if better images were used and if the patients in the study were of similar age and sex. The more faces bearing the characteristics of a syndrome are included in the training, the better the recognition of that syndrome will be.
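As an illustration of the background masking discussed above, the following is a minimal sketch and not part of the FaceGP DDSS implementation. It assumes that a cropped face is a NumPy array and that an ellipse centred on the crop is an acceptable approximation of a "face-shaped region"; the function names, the ellipse proportions and the fill value are all hypothetical choices.

```python
import numpy as np


def elliptical_face_mask(height, width, axes_scale=(0.5, 0.4)):
    """Return a boolean mask that is True inside an ellipse centred on the crop.

    axes_scale gives the vertical and horizontal semi-axes as fractions of the
    crop height and width (assumed values, not taken from the study).
    """
    rows, cols = np.ogrid[:height, :width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    ry, rx = axes_scale[0] * height, axes_scale[1] * width
    return ((rows - cy) / ry) ** 2 + ((cols - cx) / rx) ** 2 <= 1.0


def apply_background_mask(face, mask, fill=0):
    """Keep pixels inside the mask and replace everything else with `fill`."""
    masked = np.full_like(face, fill)
    masked[mask] = face[mask]
    return masked


# Usage on a synthetic 100 x 80 grayscale crop (stand-in for a real face image).
face = np.full((100, 80), 200, dtype=np.uint8)
mask = elliptical_face_mask(*face.shape)
masked = apply_background_mask(face, mask)
```

With such a mask, the corner pixels, which typically show background rather than face, are set to a constant value before feature extraction, so that changes in the background no longer contribute to the distance between two faces.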
