**2. Data acquisition**

The diagnosis of plant diseases is usually based on visible symptoms. When a leaf is infected, its appearance changes significantly, and each disease usually produces discernible color and texture symptoms on the leaf, so plant diseases can be diagnosed from these characteristics. However, farmers mainly rely on their own experience and senses to diagnose plant diseases, and because their background knowledge is limited, the diagnosis is often ambiguous. Most tea trees in China are planted in mountainous areas that are large and difficult to survey in the field, and the transportation and infrastructure conditions in these places are limited, so relying on agricultural experts to diagnose tea leaf diseases is both time-consuming and costly. Moreover, the expert must have experience and knowledge in various disciplines and must understand all the symptoms of a disease and the causes of its diversity. At the same time, because China's agricultural population is relatively large and the number of experts engaged in agricultural services is extremely limited, it is necessary to establish a system that can diagnose tea leaf diseases in a timely and accurate manner.

The current diagnostic methods for plant diseases mainly include microscopic identification, molecular biology techniques, and spectroscopic techniques. The first method is time-consuming and subjective; even experienced plant pathologists may misjudge, leading to inaccurate conclusions. The latter two methods are currently considered more accurate, but their main disadvantages are the high labor intensity and the requirement of specific instruments.

With the rapid development of intelligent agriculture and precision agriculture, machine learning methods and computer image processing technologies have been applied to the identification of plant diseases [2, 3], providing a new way to detect plant diseases that can help farmers and researchers identify disease types quickly and accurately. The general approach based on machine learning and image processing is first to manually design and extract disease image features: global features, such as color features [4], shape features [5], texture features [6], or combinations of two or more of these [7–11], and local features, extracted with the scale-invariant feature transform (SIFT), speeded-up robust features (SURF), dense scale-invariant feature transform (dense SIFT), and pyramid histograms of visual words (PHOW) [12–14]. The extracted features are then classified with classifiers such as artificial neural networks [15, 16] and support vector machines [17, 18]. Because traditional machine learning relies on manually extracted features, the resulting recognition system is not fully automated.

At present, most research on tea using computer vision focuses on tea quality detection [19], tea species identification [20], and tea leaf disease information query and management based on expert systems [21]. Because an expert system has limited knowledge and needs to be updated and maintained regularly, it is of limited use to technicians who are not computer professionals. Some studies identify tea diseases from hyperspectral [22] or infrared thermal images [1]. These methods are easy to operate and highly accurate, but the instruments are too expensive for widespread adoption.

In recent years, the popularity of the Internet has led to explosive growth of Internet data, and the performance of computers and smartphones has continued to improve. These factors are the main reasons for the widespread attention that deep learning has received. Deep learning refers to the process of learning from sample data through a certain training method to obtain a deep network structure containing multiple levels [23]. Deep learning is a branch of machine learning. In essence it is also a neural network, but one with more than one hidden layer, which makes it an extension of artificial neural networks. The "neural network" is a component of deep learning.

*Advances in Forest Management under Global Change*


The existing online databases, such as ImageNet, PlantVillage, and CIFAR-10, do not contain sufficient tea leaf disease images, and some studies have collected disease photos only in indoor or controlled environments. These factors limit recognition systems that are designed to identify diseases under natural light conditions, so a new disease dataset is constructed in this paper.

**Figure 1.**

*Typical example images of tea leaf diseases used in this manuscript. (1) Red leaf spot (*Phyllosticta theicola Petch*). (2) Algal leaf spot (*Cephaleuros virescens Kunze*). (3) Bird's-eye spot (*Cercospora theae Breda de Haan*). (4) Gray blight (*Pestalotiopsis theae Steyaert*). (5) White spot (*Phyllosticta theaefolia Hara*). (6) Anthracnose (*Gloeosporium theae-sinensis Miyake*). (7) Brown blight (*Colletotrichum camelliae Massee*).*

Tea leaf disease images were all captured with a Canon PowerShot G12 camera under natural light in tea gardens in Chibi and Yichang, Hubei Province. The images were taken about 20 cm directly above the leaves in autofocus mode at a resolution of 4000 × 3000 pixels. A total of 3810 disease images covering 7 diseases were collected, and all of them were identified by plant pathologists. The identification criteria for the tea leaf diseases followed previously described identification schemes [27, 28]. To meet the requirements of the model algorithm and reduce the computational complexity of the network, all disease images were resized to 256 × 256 pixels and 750 × 750 pixels, respectively. **Figure 1** shows the types of tea leaf diseases used in this experiment. Data augmentation was applied to the diseases with fewer images so that the number of images is balanced across the seven diseases. Data augmentation also improves the generalization ability of the classifier, which benefits network training. Three different methods were used to alter the image input and improve classification (**Figure 2**). A total of 7905 tea leaf disease images were obtained after augmentation (**Table 1**). The 80/20 ratio of training/test data is a commonly used ratio in neural network applications. In addition, a 10% subset of the test dataset was used to validate the dataset [29].
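The augmentation operations listed in the caption of Figure 2 (horizontal and vertical flips, 180° and 90° rotations, random crops) can be sketched with NumPy as follows. This is only an illustrative sketch, not the code used in this study: the function name `augment`, the crop fraction `crop_frac`, and the number of crops `n_crops` are hypothetical, since the source does not state the crop size or the tooling.

```python
import numpy as np


def augment(image, rng, crop_frac=0.8, n_crops=3):
    """Return augmented variants of an H x W x C image array.

    crop_frac and n_crops are assumed values for illustration only.
    """
    h, w = image.shape[:2]
    crop_h, crop_w = int(h * crop_frac), int(w * crop_frac)
    variants = {
        "flip_horizontal": image[:, ::-1],
        "flip_vertical": image[::-1, :],
        "rotate_180": np.rot90(image, k=2),
        "rotate_right_90": np.rot90(image, k=-1),  # clockwise
        "rotate_left_90": np.rot90(image, k=1),    # counter-clockwise
    }
    for i in range(n_crops):
        # Pick a random top-left corner so the crop stays inside the image.
        top = int(rng.integers(0, h - crop_h + 1))
        left = int(rng.integers(0, w - crop_w + 1))
        variants[f"random_crop_{i}"] = image[top:top + crop_h, left:left + crop_w]
    return variants


if __name__ == "__main__":
    # A random array stands in for a resized 256 x 256 RGB leaf photo.
    rng = np.random.default_rng(0)
    leaf = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
    for name, variant in augment(leaf, rng).items():
        print(name, variant.shape)
```

In this sketch the random crops are smaller than the input, so they would have to be resized back to the network input size before training; that step is omitted here.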

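**Table 1.**

*Tea leaf disease dataset in this manuscript.*

| Class | Training | Validation | Testing |
|---|---|---|---|
| (1) White spot | 941 | 118 | 117 |
| (2) Bird's-eye spot | 955 | 120 | 119 |
| (3) Red leaf spot | 890 | 111 | 111 |
| (4) Gray blight | 893 | 112 | 111 |
| (5) Anthracnose | 880 | 110 | 110 |
| (6) Brown blight | 920 | 115 | 115 |
| (7) Algal leaf spot | 846 | 106 | 105 |
| Total | 6325 | 792 | 788 |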
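The per-class counts in Table 1 are consistent with a split of roughly 80% training, 10% validation, and 10% testing within each class. The sketch below shows one rounding rule that reproduces those counts from the class totals (the row sums of Table 1). The rule itself, the helper names `split_sizes` and `split_indices`, and the random seed are assumptions for illustration; the source does not describe the exact splitting procedure.

```python
import random

# Images per class after augmentation (row sums of Table 1).
CLASS_TOTALS = {
    "White spot": 1176,
    "Bird's-eye spot": 1194,
    "Red leaf spot": 1112,
    "Gray blight": 1116,
    "Anthracnose": 1100,
    "Brown blight": 1150,
    "Algal leaf spot": 1057,
}


def split_sizes(n):
    """Return (train, validation, test) sizes for n images.

    Assumed rule: 80% for training (rounded to nearest), the remainder
    halved, with any odd image going to the validation set.
    """
    train = (8 * n + 5) // 10
    rest = n - train
    val = (rest + 1) // 2
    return train, val, rest - val


def split_indices(n, seed=0):
    """Shuffle the indices 0..n-1 and cut them into three disjoint lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    train, val, _ = split_sizes(n)
    return indices[:train], indices[train:train + val], indices[train + val:]


if __name__ == "__main__":
    for name, total in CLASS_TOTALS.items():
        print(name, split_sizes(total))
```

Splitting each class separately, as here, keeps the class balance produced by augmentation the same in all three subsets.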

*Automatic Recognition of Tea Diseases Based on Deep Learning*

*DOI: http://dx.doi.org/10.5772/intechopen.91953*

**3. Tea leaf disease identification based on BOVW model**

A traditional machine learning algorithm has a shallow architecture that contains one or two nonlinear transformation layers. It can automatically learn the underlying laws in the data and use the learned rules to make predictions. In the field of computer vision, many models are realized by manually designing and extracting the visual characteristics of the image in advance: the image content is converted into a quantitatively computable description, which is then processed by the shallow structure model.

**3.1 Image visual feature**

The extraction and selection of visual features is an important means of transforming image content into a quantitatively computable description. Visual features mainly include global features and local features. Global features refer to the overall attributes of the entire image, mainly color, texture, and shape features, which can be observed directly by eye. Global features are pixel-level shallow features with good stability, real-time performance, and simple, easy-to-implement algorithms. However, their shortcomings are high feature dimensionality, a large amount of computation, and sensitivity to changes in image scale, lighting, and perspective. Local features are extracted from local areas of the image, including corners, lines, edges, and areas with special attributes. Local features are distinctive and robust to changes in lighting, rotation, perspective, and scale, and they also offer low dimensionality and easy implementation.

The scale-invariant feature transform (SIFT) is a local feature descriptor proposed by David G. Lowe in 1999 [30]. The SIFT descriptor is invariant to image rotation, translation, and scaling, and remains stable to a degree under affine transformation, perspective and brightness changes, and noise. It can also be combined with other algorithms to form new optimized algorithms, thereby increasing the operation speed.
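To make the idea of a local descriptor concrete, the sketch below builds a simplified, SIFT-style 128-dimensional descriptor from a single 16 × 16 grayscale patch: gradient orientations are accumulated into 8-bin histograms over a 4 × 4 grid of cells, and the result is normalized, clipped at 0.2, and renormalized. This is not the full SIFT algorithm. It omits keypoint detection, scale-space construction, orientation assignment, Gaussian weighting, and histogram interpolation, and the function name `sift_like_descriptor` is hypothetical.

```python
import numpy as np


def sift_like_descriptor(patch):
    """Simplified SIFT-style descriptor for a 16 x 16 grayscale patch."""
    patch = np.asarray(patch, dtype=np.float64)
    if patch.shape != (16, 16):
        raise ValueError("expected a 16 x 16 patch")
    grad_y, grad_x = np.gradient(patch)           # finite differences
    magnitude = np.hypot(grad_x, grad_y)
    angle = np.arctan2(grad_y, grad_x) % (2 * np.pi)
    bins = np.floor(angle / (2 * np.pi) * 8).astype(int) % 8
    cells = np.zeros((4, 4, 8))
    for r in range(16):
        for c in range(16):
            # Each pixel votes for its orientation bin in its 4 x 4 cell.
            cells[r // 4, c // 4, bins[r, c]] += magnitude[r, c]
    vec = cells.ravel()
    norm = np.linalg.norm(vec)
    if norm == 0:
        return vec                                 # flat patch, no gradients
    vec = np.minimum(vec / norm, 0.2)              # damp dominant gradients
    return vec / np.linalg.norm(vec)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.integers(0, 200, size=(16, 16)).astype(float)
    original = sift_like_descriptor(patch)
    brighter = sift_like_descriptor(patch + 50.0)
    print(original.shape, np.allclose(original, brighter))
```

Because the descriptor is built from gradients, adding a constant brightness offset to the patch leaves it unchanged, which is one source of the robustness to brightness changes mentioned above.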


**Figure 2.**

*Examples of data augmentation used for red leaf spot images. (a) Initial; (b) flip horizontal; (c) flip vertical; (d) rotated 180°; (e–g) randomly cropped; (h) right-rotate 90°; (i) left-rotate 90°.*

