**3. Computer-aided detection**

our previous work [11, 12]. Feature selection is important because the selected features carry the information used to train the system to identify specific patterns. The pixels are rich in qualitative abstractions, or values, of the input. The second step analyzes these features to detect and classify possible patterns or abnormalities. The final step applies an ML algorithm to determine the most suitable model to represent the behavior or pattern of the data [13].

60 Breast Cancer and Surgery

Various machine learning algorithms are now used to develop high-performance medical image processing systems, such as computer-aided detection (CADe) systems, which detect clinically significant objects in medical images, and computer-aided diagnosis (CADx) systems, which quantify the malignancy of manually or automatically detected clinical objects [14]. A CADe system for masses detects the suspicious regions in the mammogram, tries to reduce the false positives, and finally classifies each region as mass or nonmass. In CADx for masses, most researchers use a region of interest (ROI) that contains the mass as the input. The CADx system then tries to classify the ROI as benign or malignant and gives the appropriate recommendation, either biopsy or follow-up screening [15]. Recent studies have shown that CAD systems, when used as an aid, improve radiologists' accuracy in detecting breast cancer and in pathology decisions [1, 7, 16].

It is worthwhile to distinguish ML from traditional computer-aided detection (CAD) algorithms. Traditional CAD algorithms are mathematical models that identify the presence or absence of image features known to be associated with a disease state; a microcalcification on a mammogram is one example. In traditional CAD, the developer identifies a feature explicitly, and the algorithm attempts to determine the presence or absence of that feature within a set of images. In contrast, ML techniques focus on a particular labeled outcome (e.g., ductal adenocarcinoma), and in the process of training, clusters of nodes evolve into algorithms for identifying features. The power and promise of the ML approach over traditional CAD is that useful features can exist

**Figure 1.** CADe vs. CADx. Source: Sampat et al. [7].

Digital medical image recognition (DMIR) might offer a promising solution. DMIR is considered an essential aspect of artificial intelligence. DMIR techniques aim to extract specific information from medical images to assist doctors in diagnosing certain diseases and following their progress. Many image processing techniques have been utilized in DMIR, such as segmentation, object detection, and classification. DMIR is concerned with numerous imaging modalities in the field of diagnosis, including computed tomography (CT), digital mammography, magnetic resonance imaging (MRI), and microscopic histopathological images [16, 17].

Depending on the type of breast tissue, a breast mass appears different in a mammogram. While it appears as a solid block in a dense breast, it appears as a roundish pie in a fatty breast. The mass may be alone or accompanied by microcalcifications [1]. In some cases, healthy breasts are also diagnosed as suspicious of cancer by the radiologist, and unfortunately, an unnecessary biopsy is performed on them. Because masses in breast cancer can take many forms, detecting and localizing them is important. In general, localizing the mass is important in computer-aided detection, where the system searches for the location of the mass in the mammogram image and segments it. Cheng et al. [18] and Ref. [1] examine the most important approaches used for mass segmentation in mammograms.

Image segmentation using thresholding is the simplest way to isolate an object from its background when the image has a distinct gray-level distribution. Thresholding separates the regions by treating pixels with gray levels below a specific value, called the threshold, as background and pixels with gray levels above the threshold as the object, or vice versa. Identifying the threshold value is the key point in this algorithm: the more representative the threshold, the more accurate the object extraction. In most cases, the image histogram is used to identify the threshold value.

The mass localization method is discussed in this chapter. This section is based on our previous work on an SVM rejection model for breast cancer. The method is a rejection model based on the SVM algorithm that reduces the false positives (FP) in the output of the Chan-Vese segmentation algorithm, which is initialized by the MCWS algorithm.
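As a brief illustration of the histogram-based thresholding mentioned earlier in this section, the following minimal sketch picks a threshold with Otsu's criterion (maximizing the between-class variance of the gray-level histogram). Otsu's criterion is only one common way of deriving a threshold from a histogram and is an assumption here, not necessarily the rule used in the work described; the function names and the tiny 3 × 3 test image are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Pick the gray level that maximizes the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # pixels with gray level <= t (background candidates)
    sum_bg = 0  # sum of their gray levels
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0:
            continue  # no background pixels yet
        if w_fg == 0:
            break     # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def segment(image, threshold):
    """Label pixels above the threshold as object (1), the rest as background (0)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Hypothetical toy image: dark background with a bright region on the right.
image = [
    [10, 12, 200],
    [11, 210, 205],
    [9, 13, 198],
]
t = otsu_threshold([p for row in image for p in row])
mask = segment(image, t)
```

On an image with two well-separated gray-level groups, any threshold between the groups gives the same mask; on real mammograms the histogram is rarely this clean, which is why the chapter relies on further segmentation steps.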


Machine Learning Methods for Breast Cancer Diagnostic http://dx.doi.org/10.5772/intechopen.79446 63

**Figure 2.** The process of SVM rejection model.

Abnormal findings on screening mammograms lead to recall for further assessment, which includes additional imaging procedures and, if considered necessary, fine needle aspiration cytology, core needle biopsy, or surgical biopsy. Women who are recalled for further assessment without a breast cancer being diagnosed are considered to have had a false-positive (FP) screening result. FP results are a concern in mammographic screening because they might cause distress, anxiety, and other psychological problems for the women [19, 20]. They also imply additional hospital visits and diagnostic tests, as well as additional costs [21, 22]. The rates of FP screening results depend on the screening performance and organization, such as the screening interval, single versus double reading, participation patterns, the sensitivity of the radiologists' performance, equipment, and characteristics of the screening population [22–26]. From the image segmentation perspective, an FP is an over-segmentation result in which a noncancerous pixel is segmented as a cancer pixel. The FP rate is considered a challenge in localizing masses in mammogram images. Hence, in this section, a rejection model based on SVM is proposed.

The goal of the SVM-based rejection model is to reduce the FP rate when segmenting mammograms with the Chan-Vese method, which is initialized by the MCWS algorithm. The MCWS algorithm is used for the initial segmentation of a mammogram image. The segmentation is subsequently refined through the Chan-Vese method, followed by the development of the proposed SVM rejection model with different window sizes and its application to eliminate incorrectly segmented nodules. The SVM rejection model consists of three important stages: (i) initial segmentation, (ii) segmentation using Chan-Vese, and (iii) refined segmentation using the SVM rejection model. First, the source image is cropped to remove any unnecessary parts. Because digital mammogram images have high dimensionality, the image is then resized to speed up the subsequent processes. Second, after the pre-processing stage is complete, the SVM rejection model is built to reduce the FP rate. Presegmentation and postsegmentation enhancement for the Chan-Vese level set algorithm is then proposed to localize masses in the mammogram. The key to achieving a good segmentation result with Chan-Vese is the initial contour. Instead of obtaining the initial contour from an expert, the MCWS algorithm is used here to obtain the initial contour and to eliminate noise. This makes the proposed method fully automated and reduces the time spent on manual intervention. Lastly, the Chan-Vese active contour-based algorithm is used to localize the mass in the mammogram. Chan-Vese can find and maximize the convergence ranges, as well as handle topological changes, which helps it perform well in image segmentation.

The support vector machine (SVM) is a machine learning algorithm introduced by Cortes and Vapnik [15] at AT&T Bell Laboratories to address two-group classification problems. Its underlying working principle is to search for the optimal hyperplane that separates the positive class (+1) from the negative class (−1). In this context, the two classes are the nodules and the nonnodules of breast images, and the provided training data are used by the SVM to build a model that predicts the target values of the test data. In this work, the radial basis function (RBF) kernel is employed with the SVM. Two parameters, C and γ, must be chosen well for the RBF kernel to yield an accurate nodule/nonnodule classification. The SVM rejection model has three phases: extracting teacher images, training, and testing, as shown in **Figure 2**. A grid search on the training data is used as a straightforward way to find the best parameters, and the reason for using the grid search instead of other

search algorithms is its short computation time. Additionally, the grid search can be easily parallelized because each (C, γ) evaluation is independent. The search space used in this research is {2⁻⁵, 2¹⁰}. It is important to note that this study divided the data set into two parts, one of which is treated as unknown. The prediction accuracy obtained on the unknown part reflects the classification performance on an independent data set. This procedure is known as cross validation. In v-fold cross validation, the training set is divided into v subsets of equal size. Each subset in turn is tested using the classifier trained on the remaining subsets, so that each instance of the training set is predicted exactly once. The cross-validation accuracy is therefore the percentage of data that are correctly classified.

The training data (teacher images) for the rejection model were manually extracted from the mammogram images by analyzing the false positives (FP) and true positives (TP) of the Chan-Vese segmentation result. After the teacher images were extracted, they were resized using the same factor as the original image. Next, each teacher image was resized to the window size, which determines the number of inputs to the SVM rejection model. In the experiments, window sizes of (7 × 7), (9 × 9), (11 × 11), and (13 × 13) were considered. After that, the image was converted to a vector and written to the training data file. This file contained two variables, x and y. The first variable, x, is a matrix containing rows of window pixel values for the teacher images. Each row represents one image, the length of the rows depends on the window size, and the number of rows equals the number of teacher images. The other variable, y, is a vector containing the class of each image: "1" for nodule images or "0" for nonnodule images.

Before the SVM rejection model was trained, the training data were used to obtain the best values for the parameters C and γ. As previously mentioned, the grid search was used as a straightforward search on the training data to obtain these values. Cross validation was also applied to split the training data 10-fold into training and testing parts. The C and γ values that gave the best accuracy were chosen, and the SVM rejection model was built using the selected C and γ values and the training data set.
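The parameter selection described above can be sketched as a grid search scored by v-fold cross validation. The sketch below is dependency-free and therefore uses a hypothetical stand-in classifier (a class-mean RBF similarity rule that ignores C) in place of a real RBF-kernel SVM; in practice an SVM library would be plugged in through `make_classifier`. The exponent range −5 to 10 is an assumed reading of the search space quoted in the text, and all function names are illustrative.

```python
import math
from itertools import product


def k_fold_indices(n_samples, v):
    """Split indices 0..n_samples-1 into v folds of (nearly) equal size."""
    return [list(range(start, n_samples, v)) for start in range(v)]


def cross_val_accuracy(train_and_predict, x, y, v=10):
    """Fraction of samples classified correctly; each sample is predicted once."""
    correct = 0
    for test_idx in k_fold_indices(len(x), v):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        predictions = train_and_predict(
            [x[i] for i in train_idx],
            [y[i] for i in train_idx],
            [x[i] for i in test_idx],
        )
        correct += sum(p == y[i] for p, i in zip(predictions, test_idx))
    return correct / len(x)


def grid_search(make_classifier, x, y, exponents=range(-5, 11), v=10):
    """Try every (C, gamma) = (2**a, 2**b) pair and keep the most accurate one."""
    best_c, best_gamma, best_acc = None, None, -1.0
    for c_exp, g_exp in product(exponents, repeat=2):
        c, gamma = 2.0 ** c_exp, 2.0 ** g_exp
        acc = cross_val_accuracy(make_classifier(c, gamma), x, y, v)
        if acc > best_acc:
            best_c, best_gamma, best_acc = c, gamma, acc
    return best_c, best_gamma, best_acc


def rbf(a, b, gamma):
    """RBF kernel value exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def make_standin_classifier(c, gamma):
    """Hypothetical stand-in for an RBF SVM; C is accepted but not used."""
    def train_and_predict(x_train, y_train, x_test):
        predictions = []
        for sample in x_test:
            score = {0: 0.0, 1: 0.0}
            count = {0: 0, 1: 0}
            for xi, yi in zip(x_train, y_train):
                score[yi] += rbf(sample, xi, gamma)
                count[yi] += 1
            # Predict the class with the larger mean kernel similarity.
            predictions.append(
                max((0, 1), key=lambda k: score[k] / max(count[k], 1))
            )
        return predictions
    return train_and_predict


# Hypothetical one-feature data: class 0 near 0.0, class 1 near 1.0.
x = [[0.0], [0.1], [0.2], [0.3], [0.4], [1.0], [1.1], [1.2], [1.3], [1.4]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
best_c, best_gamma, best_acc = grid_search(make_standin_classifier, x, y, v=5)
```

Because every grid point is scored independently, the loop in `grid_search` is the part that could be distributed across workers, which is the parallelization property noted in the text.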

Based on the model in **Figure 2**, each row in the training data (*x*<sub>i</sub>) represents an observation, and each column represents a feature. Each class label (*y*<sub>i</sub>) gives the class of the corresponding row in the training data.
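The layout of the training variables x and y can be illustrated with a small sketch. It assumes the teacher images have already been resized to the chosen window size and are given as square lists of pixel rows; the resizing step itself is not shown, and the helper names are hypothetical.

```python
def window_to_row(window):
    """Flatten a square window of pixel values into one row vector (row-major)."""
    size = len(window)
    if any(len(row) != size for row in window):
        raise ValueError("window must be square")
    return [pixel for row in window for pixel in row]


def build_training_data(nodule_windows, nonnodule_windows):
    """Return (x, y): one row per teacher image and its class label."""
    x, y = [], []
    for window in nodule_windows:
        x.append(window_to_row(window))
        y.append(1)  # class "1": nodule
    for window in nonnodule_windows:
        x.append(window_to_row(window))
        y.append(0)  # class "0": nonnodule
    return x, y


# Hypothetical 7 x 7 teacher windows: one bright (nodule), one dark (nonnodule).
bright = [[200] * 7 for _ in range(7)]
dark = [[20] * 7 for _ in range(7)]
x, y = build_training_data([bright], [dark])
```

With a (7 × 7) window each row of x has 49 entries; a (13 × 13) window gives 169, so the window size directly sets the number of SVM inputs.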

| Ground truth (actual) | **Result (predicted): nodule pixel** | **Result (predicted): nonnodule pixel** |
| --- | --- | --- |
| Nodule pixel | TP | FN |
| Nonnodule pixel | FP | TN |

**Table 2.** Confusion matrix.

**Figure 3.** Hierarchy of UKMMC data set.

#### **3.1. Results and evaluation**

About 170 mammogram images from 109 patients were collected from the UKM Medical Centre (UKMMC). **Table 1** and **Figure 3** show the training and testing data used in the experiment. The teacher images, extracted from the training data set based on the segmentation result, comprised 35 nodule images and 35 nonnodule images. The SVM rejection model was run 10 times with a standard deviation of 0.0001, and the results showed the effectiveness of the rejection model when compared with the ground truth. As mentioned earlier, the grid search was used as a straightforward search on the training data to determine the best parameters C and γ. **Table 2** shows values of C, γ using various window sizes (7 × 7), (9 × 9), (11 × 11), and (13 × 13).

| | **Training data: nodule** | **Training data: nonnodule** | **Testing data: nodule** | **Testing data: nonnodule** |
| --- | --- | --- | --- | --- |
| Number of images | 11 | 17 | 46 | 96 |
| Total number of images | 28 | | 142 | |

**Table 1.** Data set for training and testing.

Accuracy denotes the proportion of correct results and can be calculated as shown in Eqs. (1)–(7), where TP is the number of true positives, TN the number of true negatives, FP the number of false positives (type 1 error), and FN the number of false negatives (type 2 error). In mass localization, the confusion matrix in **Table 2** records the correctly segmented nodule and nonnodule pixels together with the missegmented ones. TP and TN are the correctly localized nodule and nonnodule pixels, respectively, while FP is a nonnodule pixel incorrectly segmented as a nodule and FN is a nodule pixel incorrectly segmented as a nonnodule.

Specificity, also known as the TN rate, represents the ability of the method to identify nonnodule pixels and avoid false positives.

Sensitivity, also known as the TP rate or recall, represents the ability to identify nodule pixels and avoid false negatives.

The FP rate is the proportion of nonnodule pixels that are segmented as nodule, that is, over-segmented pixels. The FN rate is the proportion of nodule pixels that are segmented as nonnodule, that is, missed pixels.

$$\text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{FP} + \text{TN} + \text{FN}} \tag{1}$$

$$\text{Specificity (SP)} = \frac{\text{TN}}{\text{TN} + \text{FP}} \tag{2}$$

$$\text{Sensitivity (SN)} = \frac{\text{TP}}{\text{TP} + \text{FN}} \tag{3}$$

$$\text{FP rate} = \frac{\text{FP}}{\text{FP} + \text{TN}} \tag{4}$$

$$\text{FN rate} = \frac{\text{FN}}{\text{FN} + \text{TP}} \tag{5}$$

$$\text{Negative Rate Matrix (NRM)} = \frac{\text{FP rate} + \text{FN rate}}{2} \tag{6}$$
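The measures in Eqs. (1)–(6) can be computed from a pixel-wise comparison of a binary segmentation with its ground truth. The following is a minimal sketch using the standard definitions; it assumes both images are same-sized binary masks (1 for nodule, 0 for nonnodule) and that each class occurs at least once in the ground truth, so no denominator is zero. The function names and the toy masks are hypothetical.

```python
def confusion_counts(predicted, truth):
    """Count TP, FP, TN, FN by comparing two binary masks pixel by pixel."""
    tp = fp = tn = fn = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # nodule pixel segmented as nodule
            elif p and not t:
                fp += 1      # nonnodule pixel segmented as nodule
            elif t:
                fn += 1      # nodule pixel segmented as nonnodule
            else:
                tn += 1      # nonnodule pixel segmented as nonnodule
    return tp, fp, tn, fn


def evaluate(tp, fp, tn, fn):
    """Return the measures of Eqs. (1)-(6) as a dictionary."""
    fp_rate = fp / (fp + tn)
    fn_rate = fn / (fn + tp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "specificity": tn / (tn + fp),
        "sensitivity": tp / (tp + fn),
        "fp_rate": fp_rate,
        "fn_rate": fn_rate,
        "nrm": (fp_rate + fn_rate) / 2,
    }


# Hypothetical 2 x 3 masks for illustration.
predicted = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
counts = confusion_counts(predicted, truth)
scores = evaluate(*counts)
```

Note that specificity equals 1 − FP rate and sensitivity equals 1 − FN rate, so a rejection step that lowers the FP rate raises the specificity by the same amount.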

The NRM shows the mismatch between the predicted results and the actual ground truth. Our method was evaluated by comparing the segmented images with the ground truth. To show the effectiveness of the method, a comparison was made before and after applying the rejection model, as shown in **Figure 4**. This process was performed first by comparing each pixel in the resulting image with the corresponding pixel in the ground truth image. Then, objective evaluation was carried out by calculating the confusion matrix as in **Table 2**, based on the prediction result and the actual ground truth. **Table 3** and **Figure 4** show the quantitative analysis of the results and samples of the results. The effectiveness of our method is demonstrated by comparing the results before and after using the rejection model. **Table 3** shows that the FP rate of the rejection model decreases as the window size increases, whereas the specificity increases with the window size.

**Figure 4.** Result before and after using SVM model. (a1, b1, c1, and d1) Original nonnodule and nodule images; (a2, b2, c2, and d2) segmentation result without using SVM rejection model; (a3, b3, c3, and d3) segmentation result after reducing the FP rate using SVM rejection model; (a4, b4, c4, and d4) ground truth images; (a5, b5, c5, and d5) binary segmentation result without using SVM rejection model; (a6, b6, c6, and d6) binary segmentation result after reducing the FP rate using SVM rejection model; (a7, b7, c7, and d7) ground truth images.

**Table 3.** Quantitative analysis.

This section discussed reducing the FP rate using SVM machine learning. The SVM rejection model was built to reduce the FP rate after segmentation. Our method has three steps in the segmentation phase: first, MCWS was used to obtain the initial contour by segmenting the mammogram image. Then, the output of MCWS was used as the initial contour for the Chan-Vese algorithm, which refines the segmentation result. Finally, the rejection model based on SVM was used to reduce the FP rate. The SVM rejection model itself has three steps in the following sequence: extracting teacher images, training the rejection model, and testing the model. The MCWS algorithm can be credited with overcoming the challenges associated with the Chan-Vese algorithm: with a good initialization generated by MCWS, the Chan-Vese algorithm becomes more autonomous and converges faster.

Nevertheless, the reliance of mammogram segmentation on the divergence and convergence of the intensity values of the image pixels is the limiting factor of this algorithm. It tends to segment outlier components as part of the contour, which increases the FP rate of the selected contour pixels. To overcome this issue, the SVM rejection model is geared toward reducing the FP rate. A t-test was performed to compare the means of two samples, namely the accuracy before and after using the rejection model with the best window size, (13 × 13). The hypothesized mean difference was set to 0 (the null hypothesis); that is, it was assumed that the result is the same with or without the rejection model. The significance level alpha was set to 0.05. If the P value is less than alpha, the null hypothesis is rejected, and there is a difference between the means of the two samples. The t-test result shows that the improvement from the proposed method is statistically significant (P = 0.00001 < 0.05). Furthermore, the proposed rejection model showed a small standard deviation (0.0001), which indicates stable performance. In general, the proposed method offers an alternative decision-making aid and can assist the medical expert by giving a second opinion for more precise nodule detection. Hence, it reduces the FP rate that causes over-segmentation.

it from decaying. Then, the tissue is sectioned into very thin slices (e.g., 2–15 μm) using a microtome. The slices are then arranged on a glass slide before being stained. The tissue is stained using certain pigments to reveal the tissue components (e.g., lumen, nuclei, cytoplasm, and stroma). This helps the pathologist to view the individual tissue components more clearly. This procedure is called cells marker. Pathologists use different staining methods depending on the diagnostic process at hand. Among the common staining types, the hematoxylin and eosin (H&E) combination is the most popular for diagnosis and grading. After staining the tissue slide, the pathologist evaluates it using a microscope, as in UKMMC, or through a digital scanner that produces digital pathology images. In UKMMC, a specific type of microscope (Olympus BX50 microscope) is used for the diagnosis [16]. This microscope has a camera to capture images of the region of interest. The next subsection will explain the image acquisition steps involved in the creation of the prostate and breast cancer data sets required for this study. Subsequent subsections will present a brief overview of the devices required for the image acquisition and the image acquisition flow.

**4.2. Image acquisition devices**

In this study, prostate histological images were captured from tissue slides. All the images were viewed using an Olympus BX50 microscope (Olympus Corporation, Japan), and images were captured using a DP72 digital camera (Olympus Corporation) and cellSens Life Science imaging software, version 1.6 (Olympus Corporation) [16]. The sensitivity of the illumination source and the camera's intensity were kept constant. The microscopes were adjusted manually to form clear magnified images, and the cameras were controlled through desktop computers to capture color digital images. Before image acquisition, the pathologists in UKMMC selected the ROIs under the microscope. However, this requires substantial time and effort from pathologists, and more importantly, a subjective choice of the ROIs could introduce biases into the database and harm the generalizability of the developed computer CAD system.

**4.3. Image acquisition work flow**

Prior to acquiring the images, the microscope components, such as the light condenser, diffusing screen, and objective lens, were properly cleaned to remove any dust in the light path, which might badly affect the clarity of the acquired image. The focal plane was adjusted manually for clear images and was readjusted before every new image was taken. A light condenser was used to increase the light intensity for high-resolution image acquisition. To acquire an image from an ROI, the pathologist in UKMMC first reviewed the tissue section at a low magnification (e.g., 1× or 4×) to locate the ROI at the center of the image's field of view [16]. Usually, fine tuning is needed at a higher magnification (40×) to ensure that a region with a typical Gleason pattern in the ROI is selected. The focal plane was then adjusted to produce a sharp image, and the light intensity was tuned so that the largest pixel value was slightly lower than the upper limit of the pixel's dynamic range. When all those adjustments were satisfactory, a still image was captured and saved onto the desktop computer as a color RGB digital image with a (tiff) extension. This process was repeated for all images that were captured for breast pathologists.
