**6. Experimental phase**

The WCE images used in this study for the development and assessment of the proposed approach were drawn from six patients with ulcerous diseases, such as unexplained ulceration, ulceration from NSAIDs, ulcerative colitis and Crohn's disease, who underwent a WCE examination at the NIMTS Gastroenterology Clinic in Athens, Greece. The examinations were conducted with the Pillcam SB (Given Imaging) WCE system. Rapid Reader 6.0 software (Given Imaging) was employed to export the images from the video sequence. The collected dataset consists of 87 ulcer and 87 normal images. An example of the two categories is given in Fig. 11. The images were obtained by manual segmentation of the initial, complete WCE images. Two gastroenterologists reviewed the endoscopic video and manually isolated regions of interest (ROI), such as the ones shown in Fig. 11, according to their expertise and upon agreement. It must be highlighted that the 87 ulcer images were obtained from 87 different events (ulcer regions) to achieve the lowest possible similarity. Furthermore, the normal images include both simple and confusing healthy tissue (folds, villi, bubbles etc.) in order to make the discrimination task more demanding. The ROI for the normal images varies from 110x110 to 220x220 pixels, whereas the crop area of the ulcer images depends on the size, shape and position of the ulcer. The variety in ROI sizes does not affect the tissue discrimination procedure, since the feature vectors extracted from the images, rather than the images themselves, are utilized as the basis for comparison.

Fig. 11. An example of ulcerous (left) and normal (right) region of interest.

**7. Results and discussion**

The performance of the proposed AR-DLac scheme is evaluated through the experimental results derived from the application of the introduced approach to the dataset described in §6. To this end, results from every individual IMF analysis as well as results from both AR-DLac implementation scenarios (i.e., R-case and NR-case) are presented in this section.

**7.1 Individual IMFs**

Table 1 tabulates the classification performance achieved by the texture information extracted from each individual IMF (FVindiv=FVi, where FVi is the feature vector constructed from IMFi as described in §5.3.3) for all classifiers. The highest classification rates obtained for each IMF are noted in bold. Entries in the format %±% give the mean accuracy, sensitivity and specificity ± standard deviation.

The low classification rates for IMFs 1-2 indicate that they discriminate poorly between ulcer and normal tissue. The classification accuracy of IMF1 ranges from 51.9% to 61.3%, while that of IMF2 varies from 53.3% to 62.0%. The sensitivity is even lower (by up to 29.1 percentage points) for the majority of classifiers (i.e., LDA, QDA and SVM). This performance implies that the texture information in IMFs 1 and 2 is not suitable for ulcer detection. This behaviour is consistent with the concept of noise "contamination" of IMFs 1-2 and validates the noise reduction procedure described in §5.2 and illustrated in Fig. 8. IMFs 1-2 contain the high-frequency components of the image (i.e., the noise) and should therefore be discarded. In contrast, the classification accuracy of IMFs 3-8 is 24.8% to 35.5% higher in relative terms than the best accuracy of IMFs 1-2 (62.0%). IMF3 and IMF5 exhibit the lowest (77.4%) and highest (84%) accuracy, respectively. Despite these higher classification rates, the performance of individual IMFs indicates that the texture information residing in a single image component is inadequate for efficient ulcer detection. Additionally, the low classification sensitivity (<76%) suggests extensive misidentification of ulcer regions as healthy. It should be highlighted that IMFs 5, 6 and 8, which are the most commonly selected IMFs in the IMF selection procedure (Fig. 10), deliver the three highest classification accuracy rates among the IMFs. This agreement supports the validity of the IMF selection procedure.
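The improvement quoted for IMFs 3-8 can be read either as a relative gain or as a difference in percentage points. The short sketch below checks one reading that reproduces the quoted 24.8% and 35.5%, assuming (this is an interpretation, not stated explicitly in the text) that the reference value is the best accuracy of IMFs 1-2, i.e. 62.0%. The function names are illustrative only.

```python
# Accuracies (%) quoted in the text for the individual IMFs.
best_imf12_acc = 62.0   # best accuracy among IMFs 1-2 (assumed reference)
imf3_acc = 77.4         # lowest accuracy among IMFs 3-8
imf5_acc = 84.0         # highest accuracy among IMFs 3-8


def relative_gain(new, ref):
    """Relative improvement of `new` over `ref`, in percent."""
    return 100.0 * (new - ref) / ref


def point_gain(new, ref):
    """Absolute improvement of `new` over `ref`, in percentage points."""
    return new - ref


for name, acc in (("IMF3", imf3_acc), ("IMF5", imf5_acc)):
    print(f"{name}: +{relative_gain(acc, best_imf12_acc):.1f}% relative, "
          f"+{point_gain(acc, best_imf12_acc):.1f} percentage points")
```

Under this assumption the relative gains come out at 24.8% and 35.5%, which correspond to 15.4 and 22.0 percentage points, respectively.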

As far as the classification algorithms are concerned, the results in Table 1 imply that SVM and QDA are the most efficient classifiers. SVM achieves the best performance for the majority of IMFs (1-4, 8) due to its more advanced nature. However, the capabilities of QDA should not be underestimated, since it exhibits 4.6, 5.7 and 4.8 percentage points higher classification accuracy than SVM for IMFs 5, 6 and 7, respectively. LDA also proves competent, delivering slightly inferior performance. Finally, MD is the least appropriate classifier for our approach. The extremely high classification sensitivity (up to 99.2% for IMF6) in conjunction with the extremely low classification specificity (down to 15.6% for IMF6) denotes overfitting to ulcer texture information.
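To make the three reported indices concrete, the following minimal sketch computes accuracy, sensitivity and specificity from confusion-matrix counts, taking ulcer as the positive class. The counts are hypothetical, chosen only to mimic a classifier that labels almost every image as ulcer on an 87 + 87 image split; they are not taken from Table 1, and the function name is an assumption of this example.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) in percent.

    tp/fn: ulcer images classified correctly / missed.
    tn/fp: normal images classified correctly / flagged as ulcer.
    """
    sensitivity = 100.0 * tp / (tp + fn)   # share of ulcer images detected
    specificity = 100.0 * tn / (tn + fp)   # share of normal images recognised
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts (NOT results from the chapter): 87 ulcer, 87 normal images.
acc, sens, spec = classification_rates(tp=86, fn=1, tn=14, fp=73)
print(f"acc={acc:.1f}%  sens={sens:.1f}%  spec={spec:.1f}%")
```

With such counts the sensitivity is close to 100% while the specificity stays below 20%, so the accuracy is only slightly above chance. This is the pattern described above for the MD classifier.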


Table 1. Mean classification accuracy, sensitivity and specificity (%) for each individual IMF (FVi, i=1 to 8), for all classifiers (LDA, QDA, MD, SVM).

Enhanced Ulcer Recognition from Capsule Endoscopic Images Using Texture Analysis 205

**8. Overall perspective and future work**

The aforementioned experimental results highlight the potential of the proposed scheme towards ulcer and healthy intestinal tissue discrimination. The optimum image components (IMFs) that contain the majority of texture information include IMFs 5, 6 and 8. Individual IMFs score up to 84% classification accuracy, while their exploitation as a group enhances the detection rate up to 91.2%. Going a step further, the refined image reconstruction process achieves 96.7% successful tissue identification. When compared with other approaches, the proposed scheme appears to be more effective, exhibiting higher classification ability while using a smaller feature vector. One of the most efficient ulcer detection approaches (Kodogiannis *et al.*, 2007b), used as a comparison baseline, employs 54 features (instead of 8) and results in 94.5% accuracy when applied to our dataset (Charisis *et al.*, 2011).

In spite of the promising performance of the proposed scheme, some issues still need to be addressed before its use in a future computer-aided diagnosis system. Firstly, the number of ulcer and normal images should be increased in order to secure maximum diversity between the ulcer cases, develop more robust algorithms and obtain more accurate conclusions. Secondly, automatic segmentation of the regions of interest (ROI) is mandatory. In our approach, WCE images are manually cropped to the ROI. A potential solution is to divide each WCE image (576x576) into small patches (64x64) and choose the patches that depict mucosa. These patches contain almost all possible ROI by covering the most valid area of the original WCE image. Finally, the computational cost of the proposed techniques should be revisited. In this work, the introduction of a real-time application was not our objective. In this context, the computational cost of the current unoptimized MATLAB code is 9.3 seconds per ROI, considering an average size of 145x145 pixels. BEEMD analysis consumes 83.4% of this time, while DLac analysis and classification absorb 16.2% and 0.4%, respectively. When focusing on real-time application, dedicated hardware and programming languages, more efficient implementation algorithms (for BEEMD and DLac) and multithreaded programming should be considered.

**9. Conclusion**

Wireless capsule endoscopy (WCE) is a novel, non-invasive form of endoscopy that has started a new era for the visual inspection of the entire small bowel. A WCE system consists of a pill-shaped, wireless capsule that the patient swallows. The capsule, propelled by the natural bowel movements, captures and transmits images from the internal mucous membranes, along its journey through the digestive tract. Despite the revolution WCE has introduced, there are several limitations that pose serious questions about the competency of WCE compared to probe gastroscopy and colonoscopy. Some of the major challenges include camera speed/quality, power supply, controllable manoeuvring and interventional capabilities. Significant research is being conducted in this direction, various approaches have been proposed and the first achievements have emerged. A new capsule equipped

**7.2 Reconstruction (R-case) – Non Reconstruction (NR-case) scenarios**

The proposed AR-DLac scheme, thoroughly described in §5.3, selects the optimum IMFs. In the reconstruction scenario (R-case), the selected IMFs and the residue are used to reconstruct a new image from which the FV is extracted, while in the non-reconstruction scenario (NR-case) the FVs from the selected IMFs are concatenated in order to form the final FV. The results of both scenarios are tabulated in Table 2, for all classifiers. The best classification rates for each implementation scenario are noted in bold. Entries in the format %±% give the mean accuracy, sensitivity and specificity ± standard deviation.
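The difference between the two scenarios can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the chapter's implementation: `extract_features` is a hypothetical stand-in that returns 8 simple intensity statistics (it is not DLac), the reconstruction is taken to be the plain sum of the selected IMFs and the residue (the usual EMD identity), the IMFs are random arrays rather than BEEMD output, and IMFs 5, 6 and 8 are used only as an example of a selected subset.

```python
import numpy as np

N_FEATURES = 8  # features per extracted FV, as stated in the chapter


def extract_features(image):
    """Hypothetical stand-in descriptor (NOT DLac): 8 intensity percentiles."""
    flat = np.asarray(image, dtype=float).ravel()
    return np.percentile(flat, np.linspace(5, 95, N_FEATURES))


def fv_r_case(imfs, residue, selected):
    """R-case: rebuild one image from selected IMFs + residue, extract one FV."""
    reconstructed = residue + sum(imfs[i] for i in selected)
    return extract_features(reconstructed)


def fv_nr_case(imfs, selected):
    """NR-case: extract one FV per selected IMF and concatenate them."""
    return np.concatenate([extract_features(imfs[i]) for i in selected])


# Synthetic data standing in for a decomposed ROI (8 IMFs and a residue).
rng = np.random.default_rng(0)
imfs = rng.normal(size=(8, 32, 32))
residue = rng.normal(size=(32, 32))
selected = [4, 5, 7]  # zero-based indices of IMFs 5, 6 and 8 (example choice)

fv_r = fv_r_case(imfs, residue, selected)
fv_nr = fv_nr_case(imfs, selected)
print(fv_r.shape, fv_nr.shape)
```

With three selected IMFs, the R-case vector has 8 entries and the NR-case vector has 24. The sketch also makes visible that the residue enters only the R-case vector.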


Table 2. Mean classification rates (%) for both R-case and NR-case scenarios, for all classifiers (LDA, QDA, MD, SVM).

The most efficient performance of the proposed AR-DLac scheme is delivered in the R-case scenario, where the classification accuracy reaches 96.7% with the SVM classifier. However, the difference between the most (SVM) and the least (LDA) efficient classifier is only 1.2 percentage points in terms of accuracy, implying that the features extracted by the proposed analysis are quite robust and perform well regardless of the classification algorithm engaged. It is also remarkable that the high accuracy rate is accompanied by high and relatively consistent rates for both sensitivity (96.5%) and specificity (96.9%). These results indicate that the AR-DLac scheme recognises ulcer and normal data equally well, without exhibiting bias towards a specific pattern. This behaviour applies to all examined classifiers, since the difference between sensitivity and specificity does not exceed 2.6 percentage points.

The classification results of the NR-case suggest inferior performance of the AR-DLac scheme in the non-reconstruction scenario. The most effective classifier for NR-case is QDA, delivering 91.2% accuracy, 86.2% sensitivity and 95.9% specificity. These rates are 5.5, 10.3 and 1.0 percentage points lower, respectively, than the best rates of R-case. Even the worst scenario for R-case (LDA classifier) achieves 4.3 percentage points higher accuracy than QDA in NR-case. Moreover, the balance between sensitivity and specificity deteriorates, with the gap ranging from 4.4 (for SVM) to 40 (for MD) percentage points. MD, as in the individual IMF case, fails to recognize the normal images correctly, since the specificity rate is 58.3%. The divergence in performance between R-case and NR-case indicates that the image recomposed from the optimal IMFs and the residue represents the intestinal texture information more efficiently than the individual components of the image. This may be explained by the fact that in NR-case the trend of the image (residue) is ignored. Moreover, the fact that FVNR includes 3 IMFs x 8 features = 24 features in total (three times the size of FVR) may also affect the classification performance.

The comparison of the results in Tables 1 and 2 demonstrates the inability of a single IMF to separate ulcer from normal images. R-case and NR-case deliver 12.7 and 7.2 percentage points higher classification accuracy, respectively, than the most effective IMF (i.e., IMF5). The superiority of NR-case over every individual IMF suggests that the utilization of a group of individual IMFs is more fruitful than standalone IMF exploitation.
