2.1.3. Feature extraction: shape, intensity, and texture

The next step in a conventional CAD system is feature extraction from the ROI. Feature extraction can be defined as the process of inferring and quantifying the parameters that characterize the object under study, and it supports the analysis of the ROI. It is possible to quantify the shape, texture, size, border, and other tissue parameters that can contribute to the diagnosis and to the detection of a cancer risk factor. As shown in Figure 7, in this work shape, intensity, and texture features were extracted in order to create a biomarker for BCD using a CADx system that uses AI technology.

The image features of all the Digital Image Mammography (DIM) images in the BCDR database were extracted and used to build a biomarker to train an ANN. The BCDR digital images are available in RGB and in gray level, digitized in JPEG format with a depth of 24 and 8 bits per pixel, respectively, and a resolution of 3328 × 4084 pixels. The RGB mammograms show the region marked in red by a radiologist to delimit the anomaly found.

170 Advanced Applications for Artificial Neural Networks

Figure 7. Image features extracted.

Image processing is the most important area in which feature extraction is applied. There, mathematical algorithms are used to detect and isolate desired portions, shapes, or features of digitized images or video streams. Feature extraction is particularly important in pattern recognition and character recognition.

Feature extraction measures the physical parameters visible in a segmented region of an image. Its aim is to find a compact mathematical representation of the important image information for solving a computational task. In BCD, these features help to determine the kind of tumor detected in a mammogram. The choice of features has a crucial influence on the accuracy of classification, the time needed for classification, the number of examples needed for learning, and the cost of performing classification. In breast abnormalities, benign and malignant masses on a mammogram can be distinguished by their shape, texture, and intensity in the image.

In this research, an automated computer tool was designed to calculate shape, intensity, and texture features from the ROIs extracted from BCDR mammograms. The shape features of a DIM use the pixels inside the ROI and on its border. These descriptors, shown in Table 1, only have a valid meaning in binary or logical images, and some simple shape features describe a ratio between geometrical figures, for example extent, ellipse\_ratio, and solidity. The most common shape features are the area and perimeter of the region, but they are meaningful only when the ROI size is invariant. However, the area and perimeter can be combined into size-independent relations such as circularity and compactness.
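As a rough illustration of how such ratios can be derived from a binary ROI mask, the following Python sketch computes the area, perimeter, extent, circularity, and compactness. The function name `shape_features`, the nested-list mask, the edge-counting perimeter estimate, and the specific formulas used for circularity ($4\pi A / P^2$) and compactness ($P^2 / A$) are assumptions made for illustration; the definitions in Table 1 may differ.

```python
import math

# 4-connected neighbour offsets used to find exposed pixel edges.
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def shape_features(mask):
    """Compute simple shape descriptors of a binary mask (list of rows of 0/1).

    Assumed definitions (not taken from Table 1):
      area        = number of foreground pixels
      perimeter   = number of pixel edges between foreground and background
      extent      = area / bounding-box area
      circularity = 4 * pi * area / perimeter**2
      compactness = perimeter**2 / area
    """
    rows, cols = len(mask), len(mask[0])
    coords = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    if not coords:
        raise ValueError("mask contains no foreground pixels")

    def inside(r, c):
        # True only for in-bounds foreground pixels.
        return 0 <= r < rows and 0 <= c < cols and bool(mask[r][c])

    area = len(coords)
    perimeter = sum(
        1
        for (r, c) in coords
        for (dr, dc) in NEIGHBOURS
        if not inside(r + dr, c + dc)
    )

    height = max(r for r, _ in coords) - min(r for r, _ in coords) + 1
    width = max(c for _, c in coords) - min(c for _, c in coords) + 1

    return {
        "area": area,
        "perimeter": perimeter,
        "extent": area / (height * width),
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        "compactness": perimeter ** 2 / area,
    }


# Usage: a 3 x 3 square region inside a 5 x 5 mask.
square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
features = shape_features(square)
```

Because circularity and compactness are ratios of area and perimeter, they stay the same when the region is uniformly rescaled, which is why they remain usable when the ROI size varies. Solidity is omitted here because it requires a convex hull.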

Table 1. Shape features.

The intensity features, shown in Table 2, use the intensity histogram of the ROI to obtain information that describes the image; i.e., they apply probability and statistics to the pixel values in the image. The mean is the average intensity level. The standard deviation quantifies the amount of variation in the set of intensity levels. The variance refers to the variation of the intensities around the mean value. The coefficient of variation is a standardized measure of the dispersion of the values. Finally, the skewness and kurtosis describe the shape of the histogram: the skewness measures its asymmetry and the kurtosis its peakedness.

| Type | Feature | Definition |
| --- | --- | --- |
| Intensity | Mean | $\overline{\mu} = \frac{1}{MN} \sum_{x=1}^{M} \sum_{y=1}^{N} P_{\text{ROI}}(x, y)$ |
| Intensity | Standard deviation | $\sigma = \sqrt{\frac{1}{MN-1} \sum_{x=1}^{M} \sum_{y=1}^{N} \left\lvert P_{\text{ROI}}(x, y) - \overline{\mu} \right\rvert^{2}}$ |
| Intensity | Variance | $\text{Variance} = \sigma^{2}$ |
| Intensity | Coefficient of variation | $\text{Coefficient of variation} = \frac{\sigma}{\overline{\mu}}$ |
| Intensity | Skewness | $\text{Skewness} = \frac{1}{MN} \sum_{x=1}^{M} \sum_{y=1}^{N} \left( \frac{P_{\text{ROI}}(x, y) - \overline{\mu}}{\sigma} \right)^{3}$ |
| Intensity | Kurtosis | $\text{Kurtosis} = \left\{ \frac{1}{MN} \sum_{x=1}^{M} \sum_{y=1}^{N} \left[ \frac{P_{\text{ROI}}(x, y) - \overline{\mu}}{\sigma} \right]^{4} \right\} - 3$ |
| Texture | Energy | $\text{Energy} = \sum_{x=1}^{M} \sum_{y=1}^{N} P_{\text{ROI}}(x, y)^{2}$ |
| Texture | Contrast | $\text{Contrast} = \sum_{x=1}^{M} \sum_{y=1}^{N} (x - y)^{2} \, P_{\text{ROI}}(x, y)$ |
| Texture | Correlation | $\text{Correlation} = \sum_{x=1}^{M} \sum_{y=1}^{N} \frac{\left(x - \overline{\mu_x}\right)\left(y - \overline{\mu_y}\right) P_{\text{ROI}}(x, y)}{\sigma_x \sigma_y}$ |
| Texture | Homogeneity | $\text{Homogeneity} = \sum_{x=1}^{M} \sum_{y=1}^{N} \frac{P_{\text{ROI}}(x, y)}{1 + \lvert x - y \rvert}$ |
| Texture | Entropy | $\text{Entropy} = -\sum_{x=1}^{M} \sum_{y=1}^{N} P_{\text{ROI}}(x, y) \log \left[ P_{\text{ROI}}(x, y) \right]$ |

Here $P_{\text{ROI}}(x, y)$ is the intensity pixel value at the coordinates $x$ and $y$, and $\overline{\mu_x}$, $\overline{\mu_y}$, $\sigma_x$, and $\sigma_y$ are the mean values and the standard deviations of $P_{x,\text{ROI}}$ and $P_{y,\text{ROI}}$, respectively.

Table 2. Intensity and texture features.
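The intensity statistics described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool used in this research: the function name `intensity_features` and the nested-list ROI are hypothetical, the sample standard deviation (divisor $MN - 1$) is assumed, and the variance is taken as the square of the standard deviation.

```python
import math


def intensity_features(roi):
    """First-order intensity statistics of a gray-level ROI (list of rows).

    Assumes the ROI has at least two pixels and is not perfectly uniform,
    since the standardized moments divide by the standard deviation.
    """
    pixels = [float(p) for row in roi for p in row]
    n = len(pixels)
    if n < 2:
        raise ValueError("ROI needs at least two pixels")

    mean = sum(pixels) / n
    # Sample variance with divisor (MN - 1).
    variance = sum((p - mean) ** 2 for p in pixels) / (n - 1)
    std = math.sqrt(variance)
    if std == 0.0:
        raise ValueError("ROI is uniform; skewness and kurtosis are undefined")

    # Third and fourth standardized moments; kurtosis is reported as excess
    # kurtosis (3 is subtracted, so a normal distribution gives 0).
    skewness = sum(((p - mean) / std) ** 3 for p in pixels) / n
    kurtosis = sum(((p - mean) / std) ** 4 for p in pixels) / n - 3.0

    return {
        "mean": mean,
        "std": std,
        "variance": variance,
        # The coefficient of variation is undefined for a zero mean.
        "coefficient_of_variation": std / mean if mean != 0.0 else float("nan"),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Usage: a tiny 2 x 2 ROI with symmetric intensities, so the skewness is zero.
stats = intensity_features([[1, 2], [3, 4]])
```

A symmetric histogram gives a skewness near zero, while a negative excess kurtosis, as in this flat four-value example, indicates a histogram with lighter tails than a normal distribution.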

The texture features, also listed in Table 2, describe the roughness of an image. They attempt to capture the intensity fluctuations between groups of neighboring pixels, to which the human eye is very sensitive. In this research, the energy, contrast, correlation, homogeneity, and entropy were used. The energy is a measure of the textural uniformity of an image. The contrast refers to the difference in luminance within the ROI. The correlation measures the dependence of gray levels on those of neighboring pixels. The homogeneity measures the similarity of the values in the ROI. The entropy measures the disorder of the pixel values of an image.
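The five texture measures can be sketched as follows. The source does not state how the matrix $P$ is built, so this example assumes a common practice: $P$ is a normalized gray-level co-occurrence matrix counted over horizontally adjacent pixel pairs. The helper names `cooccurrence` and `texture_features`, the one-pixel horizontal offset, and the homogeneity denominator $1 + |x - y|$ are all assumptions for illustration.

```python
import math


def cooccurrence(roi, levels):
    """Count pairs (a, b) where b is the pixel immediately right of a.

    `roi` is a list of rows of integer gray levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    for row in roi:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
    return counts


def texture_features(matrix):
    """Energy, contrast, correlation, homogeneity, and entropy of a matrix.

    The matrix is normalized to sum to 1 before the features are computed.
    """
    total = float(sum(sum(row) for row in matrix))
    if total == 0.0:
        raise ValueError("matrix has no counts")
    p = [[v / total for v in row] for row in matrix]
    m, n = len(p), len(p[0])
    cells = [(x, y, p[x][y]) for x in range(m) for y in range(n)]

    # Means and standard deviations of the row and column marginals.
    mu_x = sum(x * v for x, _, v in cells)
    mu_y = sum(y * v for _, y, v in cells)
    sd_x = math.sqrt(sum((x - mu_x) ** 2 * v for x, _, v in cells))
    sd_y = math.sqrt(sum((y - mu_y) ** 2 * v for _, y, v in cells))

    covariance = sum((x - mu_x) * (y - mu_y) * v for x, y, v in cells)
    # Correlation is undefined when either marginal has zero spread.
    correlation = covariance / (sd_x * sd_y) if sd_x * sd_y > 0 else float("nan")

    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((x - y) ** 2 * v for x, y, v in cells),
        "correlation": correlation,
        "homogeneity": sum(v / (1 + abs(x - y)) for x, y, v in cells),
        # Zero entries contribute nothing to the entropy (0 * log 0 := 0).
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


# Usage: a 2 x 3 ROI with two gray levels (0 and 1).
glcm = cooccurrence([[0, 0, 1], [0, 1, 1]], levels=2)
texture = texture_features(glcm)
```

Zero-based indices are used here instead of the $1 \ldots M$ range in Table 2; contrast, correlation, and homogeneity depend only on index differences, so the shift does not change their values.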

As mentioned before, medical diagnosis is an important and complicated task that must be carried out accurately and efficiently. At present, new techniques based on data mining, KDD, and AI are being used in healthcare, mainly to predict various diseases and to assist doctors in their clinical diagnostic decisions. One area where this effort has been most felt is the diagnosis of breast cancer in women. However, the absence of any fully effective and efficient method of BCD has led researchers to develop automated computational systems. In this research, automated CADx technology based on ANN is being developed as a decision-making tool in the field of BCD.
