2.1.4. Classification and evaluation

Shape features

Table 1. Shape features.

| Feature | Definition |
| --- | --- |
| Circularity | $\text{Circularity} = \dfrac{4\pi \cdot \text{ROIArea}}{\text{ROIPerimeter}^2}$ |
| Compactness | $\text{Compactness} = \dfrac{\text{ROIPerimeter}^2}{\text{ROIArea}}$ |
| Extend | $\text{Extend} = \dfrac{\text{ROIArea}}{\text{BoxArea}}$ |
| Ellipse_ratio | $\text{Ellipse\_ratio} = \dfrac{\text{ROIArea}}{\text{EllipseArea}}$ |
| Solidity | $\text{Solidity} = \dfrac{\text{ROIArea}}{\text{ConvexHullArea}}$ |

Table 2. Intensity and texture features.

Intensity features:

| Feature | Definition |
| --- | --- |
| Mean | $\mu = \frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N} P_{RIO}(x,y)$, where $P_{RIO}$ is the intensity pixel value at coordinates $x$ and $y$ |
| Standard deviation | $\sigma = \sqrt{\frac{1}{MN-1}\sum_{x=1}^{M}\sum_{y=1}^{N}\left[P_{RIO}(x,y)-\mu\right]^2}$ |
| Variance | $\sigma^2$ |
| Coefficient of variation | $\dfrac{\sigma}{\mu}$ |
| Skewness | $\frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N}\left[\frac{P_{RIO}(x,y)-\mu}{\sigma}\right]^3$ |
| Kurtosis | $\frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N}\left[\frac{P_{RIO}(x,y)-\mu}{\sigma}\right]^4 - 3$ |

Texture features:

| Feature | Definition |
| --- | --- |
| Energy | $\sum_{x=1}^{M}\sum_{y=1}^{N} P_{RIO}(x,y)^2$ |
| Contrast | $\sum_{x=1}^{M}\sum_{y=1}^{N} (x-y)^2\, P_{RIO}(x,y)$ |
| Correlation | $\sum_{x=1}^{M}\sum_{y=1}^{N} \frac{(x-\mu_x)(y-\mu_y)\, P_{RIO}(x,y)}{\sigma_x \sigma_y}$, where $\mu_x$, $\mu_y$, $\sigma_x$, and $\sigma_y$ are the mean values and the standard deviations of $P_{xRIO}$ and $P_{yRIO}$, respectively |
| Homogeneity | $\sum_{x=1}^{M}\sum_{y=1}^{N} \frac{P_{RIO}(x,y)}{1+\lvert x-y\rvert}$ |
| Entropy | $-\sum_{x=1}^{M}\sum_{y=1}^{N} P_{RIO}(x,y)\,\log\left[P_{RIO}(x,y)\right]$ |

172 Advanced Applications for Artificial Neural Networks
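As an illustration, the circularity measure from Table 1 and the intensity features from Table 2 can be computed directly from their definitions. The sketch below is a minimal, hypothetical implementation (the function names and the toy 2×2 image are our own), assuming `img` is a 2-D list of pixel intensities:

```python
import math

def circularity(roi_area, roi_perimeter):
    """Circularity from Table 1: 4*pi*Area / Perimeter^2 (equals 1.0 for a circle)."""
    return 4 * math.pi * roi_area / roi_perimeter ** 2

def intensity_features(img):
    """Intensity features from Table 2 for a 2-D list of pixel values P_RIO(x, y)."""
    m, n = len(img), len(img[0])
    flat = [p for row in img for p in row]
    mu = sum(flat) / (m * n)                               # mean
    var = sum((p - mu) ** 2 for p in flat) / (m * n - 1)   # variance (MN - 1 denominator)
    sigma = math.sqrt(var)                                 # standard deviation
    return {
        "mean": mu,
        "variance": var,
        "std": sigma,
        "cv": sigma / mu,                                  # coefficient of variation
        "skewness": sum(((p - mu) / sigma) ** 3 for p in flat) / (m * n),
        "kurtosis": sum(((p - mu) / sigma) ** 4 for p in flat) / (m * n) - 3,
    }
```

For a circle of radius $r$, the area is $\pi r^2$ and the perimeter $2\pi r$, so circularity evaluates to exactly 1; elongated or irregular ROIs score lower.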

For the automatic classification of BC on DIM, a GRANN was used to separate malignant from benign tumors [25]. GRANN falls into the category of probabilistic neural networks (PNN) [26–30]. It is a one-step-only learning architecture that can solve any function approximation problem: the learning process is equivalent to finding a surface in a multidimensional space that best fits the training data. During training, the network simply stores the training data and later uses it for predictions, which makes it very useful for performing predictions and comparing system performance in practice. The GRANN architecture has no training parameters other than a smoothing factor (σ) that is applied after the network is trained; the choice of this factor is very important [26–30].
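The one-step learning just described — store the training data, then weight each stored target by a Gaussian kernel controlled by the smoothing factor σ — can be sketched as follows. This is a generic illustration of the idea, not the authors' implementation; the function name and data are hypothetical:

```python
import math

def grnn_predict(train_x, train_y, query, sigma):
    """Kernel-weighted average of the stored targets: 'training' is memorization,
    and sigma is the smoothing factor chosen after the data are stored."""
    weights = []
    for xi in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(xi, query))  # squared Euclidean distance
        weights.append(math.exp(-d2 / (2 * sigma ** 2)))   # Gaussian kernel weight
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)
```

For a benign/malignant task the stored targets can be coded 0/1 and the output thresholded at 0.5; a small σ makes the prediction follow the nearest stored example, while a large σ averages over many examples.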

In this research, as shown in Figure 8, a GRANN was trained and tested using a data set of 361 mammograms extracted from the BCDR public database. For each mammogram, 35 image descriptors were calculated with an automated computer tool specifically designed for this purpose. These image features were used to train the neural network to classify benign and malignant BC for decision making in BCD.

As can be seen in Figure 8, the image features were used as input data, and the malignant (cancerous) and benign (noncancerous) instances were used as output data. To train the network, the dataset was randomly divided into two subsets: one with about 80% of the instances for training and another with the remaining roughly 20% for testing.
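The random 80/20 split can be reproduced in a few lines. Note that 80% of 361 instances gives 289 for training and 72 for testing, consistent with the population of 72 test lesions reported later; the seed value below is our own arbitrary choice:

```python
import random

def split_dataset(instances, train_frac=0.8, seed=42):
    """Randomly divide the instances into training and testing subsets."""
    idx = list(range(len(instances)))
    random.Random(seed).shuffle(idx)        # deterministic shuffle for reproducibility
    cut = round(train_frac * len(idx))
    train = [instances[i] for i in idx[:cut]]
    test = [instances[i] for i in idx[cut:]]
    return train, test
```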

Figure 8. Training of GRANN for BCD. (a) Breast image, (b) segmentation, (c) image descriptors, (d) network training, and (e) BCD.

After 2000 network trainings, a smoothing factor equal to 1e−4 was calculated. This value was used for training the neural network, reaching an accuracy of 95.83%. The results obtained in this work show that GRANN is a promising and robust system for BCD. The performance of the trained GRANN was evaluated using four performance measures: accuracy, sensitivity, specificity, and precision. These measures are defined in terms of four decisions: true positive (TP), true negative (TN), false positive (FP), and false negative (FN). A TP decision occurs when a malignant instance is predicted correctly; a TN decision occurs when a benign instance is predicted correctly; an FP decision occurs when a benign instance is predicted as malignant; and an FN decision occurs when a malignant instance is predicted as benign.

Accuracy can be calculated as:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

Sensitivity can be calculated as:

$$Sensitivity(recall) = \frac{TP}{TP + FN} \tag{2}$$

Specificity can be calculated as:

$$Specificity = \frac{TN}{TN + FP} \tag{3}$$

Precision can be calculated as:

$$Precision = \frac{TP}{TP + FP} \tag{4}$$
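Equations (1)–(4) translate directly into code. The counts in the usage example are made-up values for illustration only, not the study's confusion matrix:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity (recall), specificity, and precision per Eqs. (1)-(4)."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),   # Eq. (1)
        "sensitivity": tp / (tp + fn),                 # Eq. (2)
        "specificity": tn / (tn + fp),                 # Eq. (3)
        "precision": tp / (tp + fp),                   # Eq. (4)
    }

# Hypothetical counts: classification_metrics(tp=8, tn=5, fp=2, fn=1)
```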

The confusion matrix for the data set, shown in Table 3, was computed, and its values were substituted into the above equations to obtain accuracy, sensitivity, specificity, and precision. Table 3 shows the classification results of BC.

Table 3. Confusion matrix.

The confusion matrix is a table that allows visualizing the performance of an algorithm, usually in supervised learning. In this case, it is a two-class classifier: the actual (expected) classes versus those obtained by the system (predictions). Each column of the matrix represents the cases that the system (here, the neural network) predicted, while the rows represent the expected values.

The diagonal indicates the successes achieved by the system: the trained neural network reached 91.7% + 4.2% = 95.8% accuracy, correctly predicting 66 + 3 = 69 lesions out of a population of 72, which corresponds to an error of 4.2% (three misclassified lesions out of 72).
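The figures quoted above check out arithmetically:

```python
correct = 66 + 3               # diagonal of the confusion matrix
total = 72                     # lesions in the test population
accuracy = correct / total     # 69/72 = 0.9583... (95.83%)
error = (total - correct) / total  # 3/72 = 0.0416... (about 4.2%)
```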

With both the biomarkers obtained from mammograms and the trained GRANN, a CADx technology system, shown in Figure 9, is being created to be used at a second stage in collaboration with GHZ and MML.

Figure 9. CADx system being developed based on DIP, KDD, and AI methodologies.

The computer-aided diagnosis system consists of two main stages. The first detects suspicious regions with high sensitivity and presents the results to the radiologist, aiming to reduce false positives. This stage begins with a preprocessing algorithm based on advanced DIP techniques, designed to reduce the noise acquired in the image and to enhance it, followed by a segmentation process that extracts the ROIs with high suspicion of showing signs of cancer. Using the information obtained in the segmentation process, the classification into positive or negative predictions of BC is obtained through a GRANN. After the development of this CADx computer tool is concluded, it is planned to be used in real workplaces such as GHZ, validating the predictions obtained by the neural network against the predictions made by specialized oncologists, so that it can serve as an aid in early breast cancer diagnosis.
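The two-stage flow just described — preprocessing, ROI segmentation, then GRANN classification — can be expressed as a generic pipeline skeleton. Everything here is a hypothetical sketch: the stage functions are injected as parameters, since the actual DIP, segmentation, and network algorithms are not reproduced here.

```python
def cadx_pipeline(image, preprocess, segment, extract_features, classify):
    """Run the CADx stages in order; return one prediction per segmented ROI."""
    clean = preprocess(image)      # noise reduction / enhancement (advanced DIP)
    rois = segment(clean)          # ROIs with high suspicion of signs of cancer
    return [classify(extract_features(roi)) for roi in rois]
```

With the stages stubbed out (for example, a simple threshold standing in for the GRANN), the skeleton runs end to end and each ROI receives a benign/malignant label.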

The aim of developing this CADx system for Mexican patients is to expand the knowledge in the BCD database by including mammogram image information, clinical data, risk factors, biopsy results, genomic information, etc., as shown in Figure 9 at the bottom of the main window, with the goal of obtaining more information on the resulting diagnostics, such as degree of malignancy, time of presence, etc.
