72 Breast Cancer and Surgery

using nonlinear mapping Φ. The product rule, Eq. (8), combines the prediction outputs of all four base classifiers to produce the final decision of the proposed ensemble framework. The product rule is preferred in an ensemble when the posterior probabilities of the single classifiers are correctly estimated [16]. The final prediction *class*(*x*) for the test image *x*, based on the product rule, is computed using Eq. (8):

$$\mathit{class}(x) = \max_{j=1}^{c=2} \prod_{t=1}^{4} p_j^t(x) \tag{8}$$

Here *p<sub>j</sub><sup>t</sup>*(*x*) denotes the posterior probability that base classifier *t* (of four) assigns to class *j* (of *c* = 2) for the test image *x*.

**4.6. Results and evaluation**

In the ensemble framework, the feature selection and classification stages are executed 50 times for each classification task. In each run, the data set of each base classifier (i.e., tissue component) is normalized and then randomly divided into 50% training and 50% testing, as per [16]. In each run of the ensemble framework, similar numbers of selected features are used with all base classifiers. The base classifiers use an SVM with a radial basis function (RBF) kernel, while SVM-RFE uses a linear SVM. To deploy the RBF kernel, one needs to set appropriate values of the cost penalty, *C*, and gamma, *γ*. Grid search is one of the most common methods for identifying suitable values of *C* and *γ* [1, 16]. The SVM implementation is provided by the LibSVM toolbox [1, 16], and *C* and *γ* are estimated using a grid search with separate internal threefold cross-validations on the training data set only, with values drawn from {2<sup>−20</sup>, 2<sup>20</sup>}. On this data set, we address the low vs. high grade classification task, which is the best-known task in state-of-the-art breast cancer analyses [1]. The results on this data set are shown in **Table 4**: the proposed ensemble framework effectively classifies low vs. high grade breast images. The AUC for low vs. high grade averaged 90.7%, higher than both the naïve approach (89.9%) and the typical CAD (89.8%). Moreover, the proposed method was far superior to the structure-based method: with the proposed ensemble CAD, classification performance in terms of AUC improves substantially, by 15%, relative to the structure-based method. The results in **Figure 5** show that the ensemble framework was significantly more accurate (90.8%) than each individual tissue component for low vs. high grade classification of breast histopathology images. This framework has also been

| Classification task (Breast UKM data set) | Measure | Proposed ensemble framework | Naïve approach | Typical CAD [22] | Significance of ensemble with naïve | Significance of ensemble with typical CAD [22] |
|---|---|---|---|---|---|---|
| Low vs. high grade | AUC | 90.7 ± 5.0 | 89.9 ± 4.8 | 89.8 ± 3.9 | — | — |
| | Accuracy | 90.8 ± 5.0 | 89.9 ± 4.8 | 89.8 ± 3.9 | — | — |
| | Sensitivity | 87.11 ± 8.4 | 87.1 ± 8.8 | 88.5 ± 7.7 | — | — |
| | Specificity | 94.3 ± 5.3 | 92.7 ± 6.3 | 91.1 ± 6.9 | — | — |

**Table 4.** The performance (%) of the proposed ensemble framework on the breast histopathology image data set.

**5. Discussion and conclusion**

This chapter discusses how machine learning, particularly SVM, can improve the detection and diagnosis of breast cancer. SVM is currently one of the most powerful machine learning techniques able to model the human understanding of classifying data. It can find relationships in the data and segregate the data accordingly. Using pixel values in mammogram images, SVM helps to improve the mass detection and segmentation of the Chan-Vese algorithm by correctly classifying the false-positive pixels. As a result, a sharper mass is detected, with a better estimate of its shape and size, so the radiologist can give a better diagnosis and biopsy location. Images of cell structure or tissue texture from the biopsy sample are then examined by the pathologist. These pathology slides are analyzed under the pathologist's sharp eyes to locate and identify any abnormal pattern of tissue texture or architecture. The process is tiring and subject to the pathologist's experience in interpreting the tissue condition; thus, inter-observer and intra-observer variations exist. However, the proposed SVM algorithm can identify the different tissue components and model the pattern of relationships between these components spatially and statistically. The model is then used to grade any new pathology slide according to the modified Bloom-Richardson grading, based on what the SVMs have learned from previous examples. The technique helps radiologists and pathologists reduce their workload by automating decision making, especially for common and mundane cases, so that they have more time to spend on special or rare cases. The learning curve for young apprentices can be steeper. Automated grading of breast cancer helps to reduce the inter- and intra-observer variation among pathologists. It should be noted that, in our work, we did not use mammogram and pathology data from the same patients, owing to some limitations. However, in the future it is possible to use data from the same patients. Through automatic decision making, we are able to create a platform that integrates a diagnostic reporting system supporting both specialties and, therefore, improves the overall quality of patient care (**Figure 7**).

**Figure 7.** Single vs. ensemble classification results for low vs. high grade.
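The accuracy, sensitivity, and specificity used to compare single and ensemble classifiers can be derived from confusion counts. The following is only a minimal sketch, not the chapter's implementation: it assumes that high grade is treated as the positive class, and the function name `binary_metrics` and the label lists are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for two-class labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    # Confusion counts; `positive` marks the class treated as positive
    # (assumed here to be high grade).
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of positive (high grade) images recovered.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Specificity: fraction of negative (low grade) images recovered.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical ground truth and predictions (0 = low grade, 1 = high grade).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

In a repeated-run protocol, such values would be computed once per run and then summarized across runs.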

**References**

Machine Learning Methods for Breast Cancer Diagnostic http://dx.doi.org/10.5772/intechopen.79446 75

[1] Ashwaq Q, Siti Norul Huda SA, Shahnorbanun S, Rizuana IH, Fuad I. An accurate rejection model for false positive reduction of mass localisation in mammogram. Pertanika Journal of Science and Technology. 2017;**25**(S):49-62

[2] The American Cancer Society. How Common Is Breast Cancer? 2017. Available from: https://www.cancer.org/cancer/breast-cancer/about/how-common-is-breast-cancer.html [Accessed: 22-01-2018]

[3] James S, Denise RA, Dena E, Silvana L, Ossama T, Dean WW. Integrating pathology and radiology disciplines: An emerging opportunity? BMC Medicine. 2012;**10**:100

[4] Ebert J, Xu Y, Smith G, Shen Y, Jiang J, Buchholz T, Hunt K, Black D, Giordano GW, Yang W, Shen C, Elting L, Smith B. Surgeon influence on use of needle biopsy in patient with breast cancer: A national medicare study. Journal of Clinical Oncology. 2014;**32**(21):2206-2216

[5] Adepoju L, Qu W, Kazan V, Nazzal M, Williams M, Sferra J. The evaluation of national time trends, quality of care and factors affecting the use of minimally invasive breast biopsy and open biopsy for diagnosis breast lesions. American Journal of Surgery. 2014;**208**(3):382-390

[6] Wan T, Cao J, Chen J, Qin Z. Automated grading of breast cancer histopathology using cascaded ensemble with combination of multi-level image features. Journal of Neurocomputing. 2017;**229**(C):34-44

[7] Sampat MP, Markey MK, Bovik AC. Computer-aided detection and diagnosis in mammography. Handbook of Image and Video Processing. 2005;**2**(1):1195-1217

[8] Anju J. Machine learning techniques for medical diagnosis: A review. In: 2nd International Conference on Science, Technology and Management. New Delhi; 27 September 2015

[9] Jordan MI, Mitchell TM. Machine learning: Trends, perspectives, and prospects. Science. 2015;**349**:255-260

[10] Marc K, Luciano MP, Ross WF, Geis JR. Implementing machine learning in radiology practice and research. American Journal of Roentgenology. 2017;**208**(4):754-760

[11] Afzan A, Khairuddin O. Computerized breast cancer diagnosis with Genetic Algorithm and Neural Network. In: Proceedings of the 3rd International Conference on Artificial Intelligence and Engineering Technology (ICAIET), Universiti Malaysia Sabah; 22-24 November 2006. pp. 533-538

[12] Shahnorbanun S, Albashish D, Azizi A, Nordashima AS, Suria HMP. Absolute cosine-based SVM-RFE feature selection method for prostate histopathological grading. Artificial Intelligence in Medicine. 2018;**87**:78-90

[13] Azizi A. Supervised Learning Algorithms for Visual Object Categorization. Netherlands: Universiteit Utrecht; 2010. ISBN: 978-90-393-5440-7

However, combining the features of these tissue components resulted in dense feature vectors, which suffer from overfitting. An ensemble learning framework, which allows prediction using several training subsets, could help mitigate this problem. These different subsets are clearly shown in the proposed ensemble framework. The results indicate that the proposed ensemble framework significantly outperformed the typical CAD, the naïve approach, and the structure-based method.
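To make the combination step of the ensemble concrete, the product rule of Eq. (8) can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' code: the name `product_rule` and the posterior values are hypothetical, each base classifier is assumed to output one posterior probability per class, and the "max" in Eq. (8) is read as selecting the class whose product of posteriors is largest.

```python
from math import prod


def product_rule(posteriors):
    """Combine base-classifier posteriors for one image with the product rule.

    posteriors: list with one entry per base classifier; each entry lists that
    classifier's posterior probabilities over the classes.
    Returns (index of the winning class, list of per-class products).
    """
    if not posteriors:
        raise ValueError("at least one base classifier is required")
    n_classes = len(posteriors[0])
    if any(len(p) != n_classes for p in posteriors):
        raise ValueError("every classifier must score the same classes")
    # Per class j: multiply the posteriors p_j^t(x) over all classifiers t.
    scores = [prod(p[j] for p in posteriors) for j in range(n_classes)]
    # Assumed reading of Eq. (8): pick the class with the largest product.
    best = max(range(n_classes), key=scores.__getitem__)
    return best, scores


# Hypothetical posteriors [low grade, high grade] from four base classifiers,
# one per tissue component.
example = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8], [0.55, 0.45]]
label, scores = product_rule(example)
```

Because the per-class scores are products, a single base classifier that assigns a posterior near zero to a class effectively vetoes it, which is one reason the rule depends on well-estimated posteriors.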
