## **4. AI learning algorithms and model types**

There are three main types of learning: *supervised*, which learns known patterns; *unsupervised*, which learns unknown or hidden patterns; and *reinforcement*, which learns rules or actions in data to arrive at a pattern or decision process and can be value-, policy-, or model-based in how it optimizes its solution to a given complex problem. Classification and regression problems are supervised; clustering and anomaly detection are unsupervised. Learning algorithms differ according to the problem and their ability to be trained on different types and amounts of data without being overfitted. Overfitting occurs when a statistical model fits too closely against its training data because it memorizes the noise. Deep double descent is the phenomenon whereby performance improves, then worsens as the model begins to overfit, and then finally improves again with increasing model size, data size, or training time. Essentially, there is a given level of complexity at which models are more prone to overfitting, but if enough complexity is captured in the model, the larger the model and data, the better. Learning can be sequential, in which one part of a task is learnt before the next, or incremental, in which an algorithm learns from scratch and gradually obtains more knowledge with an increasing number of training inputs or examples, adjusting the weight of an observation based on the last classification.

How algorithms are trained on data differs as well. Bagging (i.e., bootstrap aggregating) generates additional data for training a model by resampling a given dataset through repeated recombination to produce multiple sets of the original data. Learning can also be ensemble-based (e.g., stacking), combining several base models in order to produce one optimal predictive model.
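As an illustration, bootstrap resampling and majority-vote aggregation can be sketched in a few lines of Python; the function names here are illustrative, not taken from any particular library:

```python
import random

def bootstrap_samples(data, n_sets, seed=0):
    """Resample a dataset with replacement to produce multiple
    training sets of the original size (bootstrap aggregating)."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in range(len(data))]
            for _ in range(n_sets)]

def bagged_predict(models, x):
    """Aggregate the base models' predictions by majority vote."""
    votes = [m(x) for m in models]
    return max(set(votes), key=votes.count)

data = list(range(10))
sets = bootstrap_samples(data, n_sets=3)
# Each resampled set has the original size but may repeat or omit items.
assert all(len(s) == len(data) for s in sets)
```

In practice each base model would be trained on one bootstrap set, and their votes combined as above.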
#### *Artificial Intelligence and Big Data Analytics in Vineyards: A Review DOI: http://dx.doi.org/10.5772/intechopen.99862*

Bagging is suitable for high-variance, low-bias problems; boosting is suitable for low-variance, high-bias problems; and stacking combines different models that each learn some part of a problem in order to solve the whole space of a complex problem. Popular ML algorithms differ in terms of how they find solutions and partition a given problem space. A Support Vector Machine (SVM) uses hyperplane partitioning, Random Forest (RF) uses tree-based ensemble partitioning, and Gradient Boosting (GB) uses an ensemble of weak decision-tree predictors. AdaBoost, or Adaptive Boosting, assigns higher weights to incorrectly classified data, and Stochastic Gradient Boosting uses statistical bootstrapping of data to generate samples for implementing boosting. XGBoost is a boosting algorithm that benefits from 'regularization', which penalizes various parts of the algorithm to improve its performance by reducing overfitting.
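The AdaBoost reweighting idea can be sketched as a single round of updates; the function name and the boolean misclassification flags below are assumptions for the example:

```python
import math

def adaboost_reweight(weights, misclassified, error):
    """One AdaBoost round: incorrectly classified samples receive higher
    weight. `error` is the weak learner's weighted error rate (0 < error < 0.5)."""
    alpha = 0.5 * math.log((1 - error) / error)   # the learner's vote weight
    new = [w * math.exp(alpha if miss else -alpha)
           for w, miss in zip(weights, misclassified)]
    total = sum(new)                              # renormalize to sum to 1
    return [w / total for w in new], alpha

weights = [0.25, 0.25, 0.25, 0.25]
new_w, alpha = adaboost_reweight(weights, [True, False, False, False],
                                 error=0.25)
# The misclassified sample's weight rises relative to the others.
assert new_w[0] > new_w[1]
```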

ANNs comprise a collection of connected units or nodes called artificial neurons, aggregated into different layers, which transmit and process signals between their connections (edges). The signal of a given node is prescribed by a mathematical 'activation' function. Signals travel from a first 'input' layer, through one or more intermediate or 'hidden' layers, to an 'output' layer. Nodes in the hidden layers have values that are unknown and are determined mathematically from their input and output signals as a network learns. Different layers may perform different transformations on their inputs. Connections can exist between nodes in different layers or between nodes within a given layer. Feedforward neural networks (FNNs) are a type of ANN having no memory, whereby signals only move in one direction from the input through to the output layer, never being processed by a node more than once. An extreme learning machine (ELM) is an FNN with one or more hidden layers whose nodes can signal randomly, never update, or inherit previous signals, without requiring any tuning of the parameters of its node activation functions or of the weight values that alter the strength of how inputs are connected within the network. A wide range of different DL model structures have evolved from FNNs. Recurrent neural networks (RNNs) are FNNs with memory, whose nodes process signals in loops/feedbacks/cycles, considering current inputs as well as what the network has learned from previous inputs. Long short-term memory (LSTM) networks are a type of RNN that uses special units that include a 'memory cell' that maintains information in memory for longer periods of time. Convolutional neural networks (CNNs) have several layers whose nodes are sparsely connected (i.e., not fully connected), a flexibility that is particularly useful for image recognition and object classification.
A CNN typically comprises four types of layers, namely, the convolution layer, rectifier (ReLU) layer, pooling layer, and fully connected layers. Every layer has its own functionality, performing feature extraction and discovering hidden patterns in input data. RNNs can use sequential information, while CNNs cannot.
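The one-directional signal flow of an FNN can be sketched as a forward pass through fully connected (dense) layers with a ReLU activation; the helper names and weight values below are illustrative, not from any particular framework:

```python
def relu(x):
    """Rectifier activation: negative signals are zeroed out."""
    return [max(0.0, v) for v in x]

def dense(x, weights, bias):
    """Fully connected layer: each output node sums its weighted inputs
    plus a bias. weights[i][j] connects input node i to output node j."""
    return [sum(xi * wij for xi, wij in zip(x, col)) + b
            for col, b in zip(zip(*weights), bias)]

def feedforward(x, layers):
    """Signals move one direction only: input -> hidden layer(s) -> output,
    never processed by a node more than once."""
    for weights, bias in layers:
        x = relu(dense(x, weights, bias))
    return x

# A single hidden transformation on a 2-node input (weights are arbitrary).
out = feedforward([1.0, 2.0], [([[0.5, -1.0], [0.25, 1.0]], [0.0, 0.0])])
```

A CNN replaces the dense layer with sparse, weight-sharing convolution filters, but the layer-by-layer forward pass has the same shape.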

Restricted Boltzmann machines (RBMs) consist of a two-layer network of fully connected nodes with both forward and backward connections (i.e., a cycle) that can share weights (i.e., bidirectional). This two-layer network was originally designed to better determine good starting weights (i.e., pretraining) for FNNs. A deep belief network (DBN) consists of sequentially connected RBMs, comprising multiple hidden layers, with connections between hidden units in separate layers. Deep Q-learning networks (DQLNs) use reinforcement learning to make a sequence of decisions through trial and error within an interactive environment involving 'agents' that have 'states' that change, learn, and adapt over time. Q-learning is a specific form of reinforcement learning (i.e., value-based learning) that is model-free, i.e., it does not require a model of the environment. It learns the expected values of future rewards for actions of agents that are in a given state with a given 'value'. It uses q-learning (i.e., learning from delayed rewards) based on Bellman's equation, which decomposes the value of an agent's state into an immediate reward plus the value of a cumulative set of successor states, weighted by a discount factor that determines the importance of future rewards. Bayesian learning (or belief) networks (BLNs) are stochastic or probabilistic network models that involve 'priors'. 'Prior' is short for 'prior probability distribution': the probability distribution that expresses one's beliefs about an uncertain quantity before data or further evidence is taken into account. BLNs are used to represent spatial or temporal dependence (represented by conditional probability distribution functions) between multiple stochastic variables (i.e., nodes), describing how the variables depend on each other in terms of cause-and-effect or causality (i.e., connections or arcs between nodes). Variables can be discrete or continuous. BLNs can be prepared by experts or learned from data, then used for inference to estimate the probabilities of causal or subsequent events. Copula Bayesian networks (CBNs) use a tailored mathematical function called a copula that provides an efficient way to represent and compute the joint probability represented by such networks, along with how its variables depend on each other.

#### **Figure 1.**

*Overview of the interactions of major climate, biotic, and abiotic drivers, stressors, and risks within vineyards.*
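The Bellman-style Q-learning update described above can be sketched in tabular, model-free form; the dictionary layout and parameter values are illustrative:

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Bellman-style update: move Q(state, action) toward the immediate
    reward plus the discounted value of the best successor action.
    `gamma` is the discount factor weighting future rewards."""
    best_next = max(q[next_state].values()) if q.get(next_state) else 0.0
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]

# Toy table: two states, one action each (values are illustrative).
q = {"s0": {"a": 0.0}, "s1": {"a": 1.0}}
q_update(q, "s0", "a", reward=1.0, next_state="s1", gamma=0.5)
```

With `gamma` near 0 the agent values only immediate rewards; near 1 it weighs distant future rewards almost as heavily.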

New methods and frameworks to use and integrate BD and AI for complex problem-solving and enhanced decision making will, very likely, be needed to support sustainable vitiviniculture. Such approaches will need to consider complex interactions between climate, biotic, and abiotic drivers, stressors, and risks within vineyards, influencing grape and wine production, and value-chain resiliency and sustainability (**Figure 1**).

#### **5. AI use-cases and knowledge gaps**

Structured data is highly organized and easily understood by machines, whereas unstructured data is often categorized as qualitative data that cannot be processed and analyzed using conventional tools and methods; it includes text, video files, audio files, mobile activity, social media posts, and satellite imagery. BD can also include vague and imprecise information, qualitative data, and rule-based logic. An expert system (ES) is a computer program, model, or algorithm that uses AI to simulate the judgment and behavior of a human or an organization with expert knowledge and experience in a particular domain or field. Supervision of AI algorithms by human experts is termed human-in-the-loop (HITL), whereby a model requires human interaction and intervention and is not fully automated or self-reliant. AI in winemaking based on an ES approach was explored in 2000 [16], with limited research since on ES and the closely associated fuzzy inference systems (FIS) in vitiviniculture. Fuzzy theory and FIS represent vagueness and imprecise information, often used in decision-making, in a mathematical way using


fuzzy sets and rule-based logic. Several leading examples are noteworthy. An ES for automated forecasting of optimal grape ripeness dates using data gathered from a vineyard wireless sensor network (WSN) has been developed and tested, but it uses the Holt method (exponential adaptive forecasting for trended data) instead of ML or DL models/algorithms [17]. Also, an FIS that automates the classification of grape quality at harvest for grape growers has been developed and tested [18]. An ES for evaluating the sustainability of vineyards based on their management, called Vigneto, uses a fuzzy logic indicator [19]. A decision support system called FGRAPEDBN, which uses fuzzy logic and expert knowledge, is able to predict grape berry maturity. Berry maturity is measured as sugar concentration, which increases rapidly, and acidity concentration, which decreases along with pH levels as berries mature. This ES attains high predictive accuracy (i.e., a root-mean-square error (RMSE) of 7 g/l (i.e., 0.44 g/l or 0.11 g/kg)) [20]. The coupling of ES with AI (i.e., ML and DL models/algorithms) in viticulture, or agriculture in general, is still unexplored and in its infancy. Also, expert systems generally have no ability to learn decision rules, so they could also benefit from being informed by AI/ML analytics and predictive insights.
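For reference, the Holt method mentioned above (exponential smoothing with an explicit trend term) can be sketched as follows; the smoothing parameters and ripeness values are arbitrary illustrative choices:

```python
def holt_forecast(y, h, alpha=0.8, beta=0.2):
    """Holt's linear-trend exponential smoothing: maintain a smoothed
    level and trend over the series, then extrapolate h steps ahead.
    alpha smooths the level; beta smooths the trend."""
    level, trend = y[0], y[1] - y[0]
    for t in range(1, len(y)):
        prev_level = level
        level = alpha * y[t] + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + h * trend

# A steadily increasing ripeness signal forecast two steps ahead
# (units and values arbitrary).
forecast = holt_forecast([180.0, 185.0, 191.0, 196.0], h=2)
```

Unlike an ML/DL model, the method has no learned parameters beyond the two smoothing constants, which is precisely the limitation noted in [17].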

A wide array of applications and use-cases of AI in vitiviniculture are evident, as summarized in **Table 1**. This shows that there is substantial interest, applied expertise, and future potential in developing such approaches to help mitigate and adapt to climate change, address inter-related risks, and enhance decision-making and foresight. Current AI work is, however, concentrated heavily on grapevine yield prediction and grape variety classification using the pattern recognition, detection, counting, and clustering of grape berries and bunches in imagery collected by observers, unmanned aerial vehicles (UAVs), and/or robots. Such imagery differs with vineyard environmental conditions and grape variety, which alter illumination, occlusion, color, and contrast in images. Existing research limitations and challenges point to the need for robotics and mobile sensing platforms, the combination or fusion of both fine-scale hyperspectral and coarser-scale multispectral imagery data, as well as spatially-distributed sampling within vineyards to better measure and assess micro-climate variability linked with meso- and macro-climate and landscape suitability requirements that are changing with climate change.

Suitability requirements for vineyards would benefit from other AI/ML techniques to explore geospatial data and cross-validate geographical locations determined from CNN models applied to identify vineyards in satellite data. A wide range of different models for disease and pest control (i.e., a hybrid BLN, CNN, RF, GB) have been applied, and these multiple AI approaches could be coupled to provide a fully-integrated solution for processing field imagery, conducting data mining and analytics, and forecasting disease risk in vineyards. Vineyard management is already exploring decision-rule applications via case-based reasoning and sequential methods of AI, but in isolation, and such work could greatly benefit from being coupled together to accelerate advancement. This would enable these methods to be tested on a broader set of vineyard data and to better identify best management practices, rather than following a more incremental, siloed approach. Much more work is needed to explore the opportunities and potential of BD and AI for vineyard biotic and abiotic factors and stress. Only a handful of studies have explored the use of satellite remote-sensing (i.e., Earth Observation or EO) data for detecting and mapping water and heat stress, yet large amounts of data for training and validating AI models now exist from EO data centers and providers. This could help to validate whether satellite indices can reliably detect and map stress variability in vineyards and what data fusion and satellite indices perform best, and to port such BD and capabilities to support stakeholders' proactive decision-making ahead of extreme weather


#### **Table 1.**

*Showcase of AI/ML in vitiviniculture (partial set from the review).*

*Refer to abbreviation list for models/algorithms.*


impacts like heatwaves. Most work on wine aroma and sensory profiling still employs traditional statistical techniques and clustering, with limited work on global optimization (GO). While decision tools already exist in the market to track the wine preferences of consumers, they could be better informed by AI analysis and prediction that links more objective, scientific data on new varieties, wine constituents, alternative wine blends, and new wine grown in newly established vineyards in more suitable areas as climate change shifts grape and wine suitability. The application of BD and AI in traceability, authenticity, and protection likewise still relies on more traditional statistical methods. This is surprising and was not expected before conducting this review, as this area involves large extents of the value-chain and major business risk. Here, government could play a vital role in co-designing and pilot-testing new solutions alongside experts in BD and AI, as developing broad-based solutions in this area likely requires broad collaboration, multidisciplinary expertise, substantial BD collection and sharing, and industry-wide involvement, adoption, and deployment.
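As one concrete example of the satellite indices discussed above for detecting water and heat stress, the widely used Normalized Difference Vegetation Index (NDVI) contrasts near-infrared and red reflectance; whether such indices reliably map vineyard stress is exactly the kind of question EO training data could test. A minimal sketch, with illustrative reflectance values:

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index per pixel:
    (NIR - Red) / (NIR + Red), bounded in [-1, 1]."""
    return [(n - r) / (n + r) if (n + r) else 0.0
            for n, r in zip(nir, red)]

# A vigorous canopy reflects strongly in the near-infrared; a stressed
# or sparse canopy much less (reflectance values are illustrative).
vigorous, stressed = ndvi([0.50, 0.25], [0.08, 0.20])
```

In an AI workflow, per-pixel index values like these would become input features alongside weather and field-sensor data.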

## **6. Proposed BD and AI framework**

An existing ontology framework called the Agri-Food Experiment Ontology (AFEO) has been developed to guide the integration of data in a way that provides researchers with the information necessary to address extended research questions [63]. It contains 136 concepts spanning viticulture practices, wine-making products, and operations. It utilizes the Resource Description Framework (RDF), a standard model for data representation, interchange, and metadata processing, to represent these data in a standard format. Based on this review, an analytical framework is proposed (**Figure 2**) that integrates BD analytics and AI prediction as part of a BD value-chain, using expert knowledge for HITL intervention and guidance.
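RDF represents data as subject-predicate-object triples; a toy sketch of this model follows (the vineyard terms below are invented for illustration and are not actual AFEO concepts):

```python
# Toy subject-predicate-object triples loosely styled on an agri-food
# vocabulary; the terms are illustrative, not actual AFEO concepts.
TRIPLES = [
    ("vineyard_A", "hasPractice", "drip_irrigation"),
    ("vineyard_A", "grows", "Pinot_Noir"),
    ("Pinot_Noir", "usedIn", "wine_batch_12"),
]

def query(s=None, p=None, o=None, triples=TRIPLES):
    """Match triples against a pattern; None acts as a wildcard,
    mimicking a basic RDF graph-pattern query."""
    return [t for t in triples
            if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])]

grown = query(p="grows")   # every (subject, "grows", object) triple
```

Real RDF tooling adds namespaced URIs and a query language (SPARQL) on top of exactly this triple-pattern idea.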

BD is distributed across different remote-sensing platforms (e.g., drone and satellite), across vineyards (e.g., networks of AI and climate-smart vineyards), within vineyards (e.g., field sensor networks), and across data centers and providers (e.g., long-term climate stations and weather monitoring networks providing both historical climate and near-real-time weather station data). Using a distributed cloud approach, an application of cloud computing technology, BD can be interconnected with public and private applications served from varied geographical locations for preprocessing quality control, data quality checks, model identification (i.e., variable selection, quantile classification), indicator model benchmarking, and the development of risk forecast models using AI. An ES comprising conditional decision rules provides traditional and expert knowledge, while informing AI model training and validation. An AI model then also learns by selecting rules from the master ES ruleset, adjusting and updating rules as it learns. In this way, the framework is agile and scalable to address a wide range of stakeholder needs along the value-chain. This includes life-cycle assessment (LCA), providing data to support monitoring and tracking of vineyard sustainability indicators, and providing forecasts (i.e., foresight) to better anticipate future impacts, gaining additional lead time to mitigate and safeguard operations, and deciding between different possible actions and interventions in response to climate change risks (i.e., irrigation needs and limitations, disease outbreaks, extreme weather events) for more informed vineyard management scheduling and planning. Weather and climate data, transformed into the tailored information and knowledge that vineyard stakeholders and users require, are provided through customized Climate Information Services (CIS) that help to drive forecasts of relevant vineyard indicators. This could integrate sub-seasonal and seasonal forecasting, alongside longer-term, downscaled inter-annual and decadal scenario projections. The quantification of risk (i.e., levels and associated uncertainties) is essential to determine an appropriate response.

#### **Figure 2.**

*Depiction of a vineyard BD value-chain that incorporates diverse, distributed vineyard data alongside an expert system. This system integrates traditional, cultural perspectives, knowledge, and reasoning of grape growers, viticulture specialists, and other wine industry stakeholders.*
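The master ES ruleset described above can be pictured as condition-action pairs that an AI layer could later reweight, adjust, or prune; a toy sketch with invented rule names and thresholds:

```python
# Hypothetical condition-action rules with invented thresholds; in the
# proposed framework an AI model would select, adjust, and update such
# rules as forecasts are verified against outcomes.
RULES = [
    ("fungicide_alert",
     lambda obs: obs["humidity"] > 85 and obs["temp_c"] > 20),
    ("schedule_irrigation",
     lambda obs: obs["soil_moisture"] < 0.15),
]

def evaluate(obs, rules=RULES):
    """Fire every rule whose condition holds for the observation."""
    return [name for name, cond in rules if cond(obs)]

actions = evaluate({"humidity": 90, "temp_c": 24, "soil_moisture": 0.10})
```

The HITL element enters when experts author or veto rules, while model training decides which rules earn weight in the final forecast.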
With an approach that can be scaled up to the entire vitiviniculture value-chain, the adoption of BD and AI can be accelerated. This would enable all stakeholders to co-learn and collaborate on evidence-based and model-tested design tactics and strategies. Such an approach can ensure mitigation and adaptation actions and interventions are enabling, rather than inhibiting, maximizing perceived benefits and organizational readiness while minimizing external pressures [64].
