
#### **Chapter 6**

## Recent Advancements in Commercial Integer Optimization Solvers for Business Intelligence Applications

*Cheng Seong Khor*

### **Abstract**

The chapter focuses on the recent advancements in commercial integer optimization solvers as exemplified by the CPLEX software package particularly but not limited to mixed-integer linear programming (MILP) models applied to business intelligence applications. We provide background on the main underlying algorithmic method of branch-and-cut, which is based on the established optimization solution methods of branch-and-bound and cutting planes. The chapter also covers heuristic-based algorithms, which include preprocessing and probing strategies as well as the more advanced methods of local or neighborhood search for polishing solutions toward enhanced use in practical settings. Emphasis is given to both theory and implementation of the methods available. Other considerations are offered on parallelization, solution pools, and tuning tools, culminating with some concluding remarks on computational performance vis-à-vis business intelligence applications with a view toward perspective for future work in this area.

**Keywords:** integer programming, valid inequalities, local branching, relaxation induced neighborhood search (RINS), evolutionary algorithms, solution polishing

#### **1. Introduction**

The ongoing drive toward Industrial Revolution 4.0, particularly to take advantage of big data analytics, has significantly impacted business intelligence applications spanning various areas including resource assessment, corporate development, and advanced technology research and development (R&D) [1]. A key enabler supporting the transformation to digitalization is optimization technology, which encompasses the established methodologies of linear and nonlinear programming with extensions to discrete or integer programming. This chapter focuses on recent advancements in commercial optimization solvers, notably the industry-leading software package IBM ILOG CPLEX [2], as applied to variants of integer programming problems, particularly mixed-integer linear programming (MILP) models.

This chapter aims to highlight the growing and maturing capability of integer optimization, especially in the last decade or so, in addressing, solving, analyzing, and eliciting insights from practical business intelligence applications. With rapid developments in the realm of big data analytics as spurred by Industrial Revolution 4.0, advancement in optimization technology, including integer optimization, is imperative to support if not spearhead the changes at the forefront of the transformation taking place. The rest of the chapter is organized as follows. Section 2 gives an overview of the present role of integer optimization in business intelligence applications. Major solution methods and algorithms with certain enhanced features typically available in standard integer optimization solvers are detailed in Section 3, including those intended to exploit model formulations. Section 4 describes and discusses several real-world use cases on practical business intelligence applications that illustrate the applicability and strengths of integer optimization solvers. Finally, concluding remarks on the salient features of standard integer optimization solvers for business intelligence applications are offered, including perspectives for future research directions.

#### **2. Overview of integer optimization in business intelligence applications**

Numerous business intelligence applications can be posed as mathematical programming problems that can be handled by commercial optimization solvers such as CPLEX, Gurobi [3], or KNITRO [4]. The problems can be formulated as models that include linear programming (LP), mixed-integer linear programming (MILP), quadratic programming (QP), mixed-integer quadratic programming (MIQP), quadratically-constrained programming, and mixed-integer quadratically-constrained programming. Such solvers are also used in tandem with other appropriate optimization solvers to handle other mainly nonlinear problems such as mixed-integer nonlinear programming (MINLP) models or in general, mixed-integer programs (MIP) [5].

#### **2.1 Computational performance of commercial integer optimization solvers**

The actual computational performance of a commercial optimizer (or optimization package) such as CPLEX results from a combination of improvements in several aspects. These include LP solvers with capabilities and features covering preprocessing, linear algebra for sparse systems, solution methods (primal or dual simplex and barrier), and techniques to overcome degeneracy and numerical difficulties [6]. Equally important is the use of cutting planes as valid inequalities in solving problems, which bridges the gap from theory to practice [7]. Further improvement involves applying heuristics, including node heuristics (e.g., local branching, guided dives) and relaxation-induced neighborhood search; invoking evolutionary algorithms for solution polishing; and implementing parallelization for efficient computation [8].

#### **Figure 1.** *Historical background of IBM ILOG CPLEX integer optimization solver.*

*Recent Advancements in Commercial Integer Optimization Solvers for Business Intelligence… DOI: http://dx.doi.org/10.5772/intechopen.93416*


**Table 1.**

*Software release history of IBM ILOG CPLEX integer optimization solver.*

#### **2.2 A commercial success story: CPLEX integer optimization solver**

CPLEX is a state-of-the-art commercial integer optimization solver currently marketed by IBM. It represents an early commercial success story of an optimization package with various acquisitions and a spin-off solver (called Gurobi) which is now a success story of its own. **Figure 1** presents brief historical facts of CPLEX while **Table 1** summarizes the software release history.

#### **3. Solution methods and algorithms**

#### **3.1 Integer optimization algorithms**

A suite of algorithms is available in various integer optimization solvers to exploit the underlying problem structure of a business intelligence application towards achieving efficiency and accuracy. **Table 2** summarizes the typical main algorithms employed by CPLEX according to the problem type identified, together with remarks on the enhancements provided to increase computational performance [9].

#### **3.2 Branch-and-bound**

A general structure of a mixed-integer program is given by:

minimize

$$\mathbf{c}^T \mathbf{x} \tag{1}$$

subject to

$$A\mathbf{x} = \mathbf{b} \tag{2}$$

$$\mathbf{l} \le \mathbf{x} \le \mathbf{u} \tag{3}$$

$$\text{some } x_j \text{ are integers.} \tag{4}$$

Branch-and-bound is a base algorithm to solve MIP which uses LP as a subroutine [10]. The key strategies of a branch-and-bound procedure involve splitting (i.e., branching) the solution space into disjoint subspaces, bounding the objective function values for all solutions in the subspaces, and pruning or fathoming nodes of branches that cannot yield better solutions. Although it is provably exponential in time, tricks are available to accelerate its search, which mostly apply to a subset of models with a suite of algorithms available.
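The split/bound/prune loop can be sketched in a few lines of Python for a small 0-1 knapsack instance (a hypothetical maximization example, not CPLEX's implementation; the greedy fractional fill plays the role of the LP relaxation bound):

```python
# Minimal branch-and-bound for a 0-1 knapsack (maximization),
# using the fractional (LP) relaxation as the bounding step.
def lp_bound(values, weights, capacity, fixed):
    """Greedy fractional relaxation given a partial fixing {index: 0/1}."""
    value, cap = 0.0, capacity
    for j, x in fixed.items():
        if x == 1:
            value += values[j]
            cap -= weights[j]
    if cap < 0:
        return float("-inf"), None  # infeasible subproblem
    free = [j for j in range(len(values)) if j not in fixed]
    free.sort(key=lambda j: values[j] / weights[j], reverse=True)
    frac_var = None
    for j in free:
        if weights[j] <= cap:
            value += values[j]
            cap -= weights[j]
        else:
            value += values[j] * cap / weights[j]  # fractional fill
            frac_var = j  # natural branching candidate
            break
    return value, frac_var

def branch_and_bound(values, weights, capacity):
    best = 0.0
    stack = [dict()]          # each node is a partial fixing of the binaries
    while stack:
        fixed = stack.pop()   # depth-first node selection
        bound, frac_var = lp_bound(values, weights, capacity, fixed)
        if bound <= best:
            continue          # prune: this subspace cannot beat the incumbent
        if frac_var is None:
            best = bound      # relaxation is integral -> new incumbent
            continue
        # branch: split into the x_j = 0 and x_j = 1 subproblems
        stack.append({**fixed, frac_var: 0})
        stack.append({**fixed, frac_var: 1})
    return best

print(branch_and_bound([60, 100, 120], [10, 20, 30], 50))  # -> 220.0
```

The pruning test `bound <= best` is exactly the bounding strategy described above: a subspace whose relaxation bound cannot improve on the incumbent is fathomed without further branching.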

The branching strategies are performed on the integer variables and comprise two main steps: (1) choose an integer variable $x_j$ as a branching variable; (2) split the problem into two submodels, $x_j \le i$ or $x_j \ge i + 1$, where for the special case of binary variables the problem becomes $x_j = 0$ or $x_j = 1$.

The bounding problem, given by the continuous (LP) relaxation, determines a lower bound $z_{IP}^L$ on the objective function value of the original MIP problem and can be stated as follows: minimize $\mathbf{c}^T\mathbf{x}$ $(= z_{IP}^L)$ subject to $A\mathbf{x} = \mathbf{b}$ and $\mathbf{l} \le \mathbf{x} \le \mathbf{u}$ (simple bounds), with the restriction that some $x_j$ are integers relaxed. By dropping the integrality restriction, the continuous relaxation yields an optimal objective value $z_{IP}^L$, which is a lower bound on the objective function value of the original MIP problem. There are two useful properties of continuous relaxation: (1) if its solution satisfies the integrality restrictions, there is no need to further explore the subspace; (2) it offers natural branching candidates, namely the integer variables with fractional values in a relaxation solution.

Key steps in the branch-and-bound procedure are summarized in **Figure 2**. As described in **Figure 2**, node selection in step 1 involves a tradeoff between achieving feasibility and optimality. The options available for node selection include depth first, breadth first, best first, limited discrepancy, and best estimate. When exploring nodes deep in a search tree, one is more likely to find integer feasible solutions but also to explore nodes that would have been pruned by feasible solutions found later. The method called plunging (combined with those aforementioned) always chooses a child node of the previously explored node.

In step 2, the node relaxation step is ideally suited to the dual simplex method. It involves only a small change from the parent relaxation solution (at the root node) and gives a new bound on the branching variable while maintaining dual feasibility of the previous basis. Thus, the new solution is likely to be close to the previous basis.


#### **Table 2.**

*Algorithms available in IBM ILOG CPLEX integer optimization solver.*


**Figure 2.**

*Key steps in the branch-and-bound procedure [9].*

Typically, a few dual simplex iterations are sufficient to restore optimality, and the cost per node is quite small. The subsequent step 3 entails generating cutting planes as needed to obtain a continuous (LP) relaxation solution.

Step 4 involves variable fixing using reduced costs. If the following condition, as given by Eq. (5), holds at a branch-and-bound node:

$$z\_{LP} + \left| D\_j \right| \geq z^\* \tag{5}$$

where $z_{LP}$ = objective value of the LP relaxation solution at the root node, $z^*$ = objective value of an incumbent (i.e., the best known integer feasible solution), and $D_j$ = reduced cost (the marginal cost of releasing a variable from its bound), then we apply the strategy of fixing $x_j$ to its current value in this subtree of the search. The goal here, as described by step 5, is to obtain integer feasible solutions that are similar to the relaxation solution.
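The fixing test of Eq. (5) amounts to a one-line comparison per variable; the sketch below applies it to made-up node data (the relaxation value, incumbent value, and reduced costs are illustrative):

```python
# Reduced-cost variable fixing at a branch-and-bound node (minimization):
# if z_LP + |D_j| >= z_star, releasing x_j from its bound cannot produce a
# solution better than the incumbent, so x_j can be fixed in this subtree.
def fixable_variables(z_lp, z_star, reduced_costs):
    """Return indices of variables that may be fixed at their current bound.
    reduced_costs maps variable index -> reduced cost D_j at the node."""
    return sorted(j for j, d in reduced_costs.items()
                  if z_lp + abs(d) >= z_star)

# Hypothetical node: relaxation value 90, incumbent 100.
fixed = fixable_variables(z_lp=90.0, z_star=100.0,
                          reduced_costs={1: 4.0, 2: 12.5, 3: 10.0})
print(fixed)  # variables 2 and 3 satisfy Eq. (5) -> [2, 3]
```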

Selecting an appropriate branching variable can significantly affect the search tree size, which is emphasized in the subsequent step 6. In this regard, the guiding principle is to make the important decisions (as modeled by the integer branching variables) early, being aware of the impact of both branching directions. To illustrate using a factory-building problem, such a decision involves whether to build a factory at all, while the decision on the number of lines to be placed in the factory can be made later. In general, we can predict the impact of a branch by considering variables that are furthest from their bounds, which indicate maximum infeasibility. Thus, the impact of each branching candidate can be measured to allow strong branching to be performed, e.g., by using historical information such as pseudo-costs.
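The maximum-infeasibility rule mentioned above can be sketched as follows (the relaxation values and variable indices are illustrative; real solvers combine this with pseudo-costs and strong branching):

```python
# Branching-variable selection sketch: pick the integer variable whose
# relaxation value is furthest from an integer (maximum infeasibility).
def most_fractional(x_frac, integer_vars):
    """x_frac maps variable index -> nonnegative relaxation value."""
    def frac_dist(j):
        f = x_frac[j] - int(x_frac[j])   # fractional part
        return min(f, 1 - f)             # distance to nearest integer
    return max(integer_vars, key=frac_dist)

x = {0: 1.0, 1: 2.3, 2: 0.5, 3: 4.9}
print(most_fractional(x, [0, 1, 2, 3]))  # x_2 sits at 0.5, furthest -> 2
```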

Finally, in step 7, the main idea in propagating implications logically is to tentatively fix binary variables to possible values during tree exploration and deduce the values implied for other binary variables. Bound strengthening is used to tighten variable bounds.

Practical considerations render implementing branch-and-bound unsuitable for large-scale problems, chiefly because the number of iterations grows exponentially with the number of variables. Therefore, in practice, a commercial business intelligence solver such as CPLEX uses a branch-and-cut procedure as a modification, which applies model reformulation by using presolve strategies and adding cutting planes (or cuts), as shown in **Figure 3**, with possible enhancements in practice around the root node computations [11].

#### **3.3 Presolve and cutting planes**

The original MIP formulation can be improved by tightening it with fewer constraints and variables, thus entailing a smaller data handling requirement (yet with the same solution quality). A tighter formulation also leads to a smaller difference between the space of feasible continuous solutions and that of feasible integer solutions, hence relying less on branching to refine the continuous relaxation computation. Two techniques are used: (1) presolve, which combines preprocessing and probing strategies [12, 13]; and (2) cutting planes [14].

Presolve generates a new, tighter, improved model without increasing its size, independently of the relaxation solution. Preprocessing aims to identify infeasibility and redundancy while improving bounds (e.g., through rounding), whereas probing improves coefficients by fixing binary variable values and checking their logical implications. In both cases, we achieve a tighter model reformulation using similar steps of adding or replacing constraints that maintain the same integer solutions but admit fewer continuous relaxation solutions. Adding a single constraint can produce an exponential number of tighter constraints; such tighter constraints dominate the existing constraints without creating a larger problem. Note that a reformulation's solution differs from that of a relaxation.
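One representative preprocessing step is bound strengthening on a single constraint row. A minimal sketch (the constraint and bounds below are invented for illustration; production presolvers iterate such rules across the whole matrix):

```python
# Bound strengthening for one row sum(a_j * x_j) <= b:
# the activity of the other variables at their most favorable bounds
# implies a tighter upper bound on each x_j with a_j > 0.
def strengthen_upper_bounds(a, b, lb, ub):
    """Tighten ub[j] for a <=-row, assuming all bounds are finite."""
    new_ub = list(ub)
    for j, aj in enumerate(a):
        if aj <= 0:
            continue
        # minimum activity contributed by all other variables
        min_rest = sum(a[k] * (lb[k] if a[k] > 0 else ub[k])
                       for k in range(len(a)) if k != j)
        implied = (b - min_rest) / aj
        new_ub[j] = min(new_ub[j], implied)
    return new_ub

# Hypothetical row: 2*x0 + 3*x1 <= 10 with 0 <= x0, x1 <= 10
# implies x0 <= 5 and x1 <= 10/3.
print(strengthen_upper_bounds([2, 3], 10, [0, 0], [10, 10]))
```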

In contrast, we add a cutting plane (or valid inequality) to an existing model (typically the presolve-reformulated model) to remove a relaxation solution—this feature constitutes an important difference between the two techniques. Therefore, cutting planes introduce tighter constraints that cut off a particular relaxation solution and, in so doing, achieve focused growth in model size.
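As one concrete illustration (a textbook cut family, not CPLEX's internal separation routines), a cover inequality for a knapsack-type row can be separated greedily from a fractional relaxation point:

```python
# A cover inequality for a knapsack row sum(a_j * x_j) <= b over binaries:
# if a subset C satisfies sum_{j in C} a_j > b, then
# sum_{j in C} x_j <= |C| - 1 is valid for all integer solutions.
def cover_cut(a, b, x_frac):
    """Greedily build a cover from the fractional point and report whether
    the resulting inequality cuts that point off."""
    order = sorted(range(len(a)), key=lambda j: x_frac[j], reverse=True)
    cover, weight = [], 0
    for j in order:
        cover.append(j)
        weight += a[j]
        if weight > b:
            break
    if weight <= b:
        return None  # no cover exists among these items
    lhs = sum(x_frac[j] for j in cover)
    violated = lhs > len(cover) - 1 + 1e-9
    return cover, violated

# Hypothetical row 3*x0 + 4*x1 + 5*x2 <= 6 at fractional point (1, 0.75, 0):
# the cover {0, 1} gives x0 + x1 <= 1, which the point violates.
print(cover_cut([3, 4, 5], 6, [1.0, 0.75, 0.0]))  # -> ([0, 1], True)
```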


Presolve can reduce model sizes substantially (a reduction by up to 5 times is not uncommon) or runtimes (similarly, by up to 10 times). On the other hand, cutting planes are available in numerous varieties, with many valid types applicable to a particular model. Thus, we need to identify the relevant ones, which serve to cut off appealing relaxation solutions. There is a need to strike a balance in terms of how many cuts to generate for a relaxation solution. Since we need to cut off a relaxation solution only once, and it is expensive to resolve to obtain a new relaxation solution for each cut added, we conduct multiple rounds of cutting plane generation while limiting the number of cuts per round in view of the increased model size [15].

#### **3.4 Heuristics**

Heuristics for solving MIP aim to produce good, possibly feasible, solutions quickly without relying on branching to satisfy user demands for a problem. Thus, heuristics avoid exploring unproductive subtrees (in a branch-and-cut scheme) while exploring parts of the tree that a solver typically would not. In doing so, heuristics help to prove optimality explicitly by pruning nodes more efficiently as well as implicitly by supplying integer solutions [16].

Heuristics can be classified into two classes, as available in a solver like CPLEX: (1) plunging (diving) heuristics, and (2) local improvement heuristics, which explore interesting neighborhoods around potential solutions using search strategies such as local branching, relaxation induced neighborhood search (RINS), guided dives, and evolutionary algorithms for solution polishing. Plunging heuristics maintain linear feasibility while trying to achieve integer feasibility, whereas local improvement heuristics operate conversely [17]. A typical strategy for heuristics applied at the root node involves the sequence shown in **Figure 4**.

Some considerations in applying plunging heuristics include tradeoffs of how many variables to fix per computation round and in what order. While it is computationally cheaper to fix all variables at once rather than a few at a time, the LP relaxation solutions obtained in the latter approach (not needed in the former) can guide later choices (e.g., on variable values and reduced costs). Variations in the variable fixing order can be useful for diversification. On the other hand, a high-level structure of local improvement heuristics involves choosing integer values for all the integer variables, which typically produces linear infeasibility; iterating over the integer variables; and applying infeasibility metrics [16].

The effectiveness of heuristics is evidenced by the fact that feasible solutions are found for most models before branch-and-bound is performed. Approximately 10% improvement in computational time to proven optimality has been reported [16]. Furthermore, heuristics often find solutions not obtained by branching.

#### **3.5 Combined local search and heuristics**

A combination of local search and heuristics offers a powerful optimization framework to solve difficult MIP or combinatorial optimization problems. Examples of local search methods include simulated annealing, tabu search, and genetic algorithms. Local search methods consist of the key strategies of neighborhood (i.e., considering a set of solutions in the vicinity of the current solution), intensification (i.e., a temporary focus on part of the solution space), and diversification (i.e., a mechanism to change focus occasionally). In applying local search to MIP, neighborhoods are generally based on the problem structure, e.g., nodes and edges in graphs, with no high-level structural information available in arbitrary MIP models [16]. A question that arises is how we can generate and explore an interesting neighborhood given an incumbent solution. In this regard, two methods are available, namely local branching [18] and relaxation induced neighborhood search (RINS) [19].

**Figure 4.** *Heuristics at root node.*
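For instance, local branching defines the neighborhood by a Hamming-distance constraint around the incumbent; a minimal membership check (with toy incumbent data) looks like this:

```python
# Local branching neighborhood around a binary incumbent x_bar:
# the Hamming-distance constraint
#   sum_{j: x_bar_j = 0} x_j + sum_{j: x_bar_j = 1} (1 - x_j) <= k
# restricts the search to solutions that flip at most k variables.
def in_local_branch_neighborhood(x, x_bar, k):
    distance = sum(1 for xj, bj in zip(x, x_bar) if xj != bj)
    return distance <= k

x_bar = [1, 0, 1, 1, 0]  # hypothetical incumbent
print(in_local_branch_neighborhood([1, 1, 1, 1, 0], x_bar, k=2))  # 1 flip  -> True
print(in_local_branch_neighborhood([0, 1, 0, 1, 1], x_bar, k=2))  # 4 flips -> False
```

In a solver, the distance expression is added as a linear constraint to the model, so the MIP machinery itself explores the neighborhood.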

#### **3.6 Parallelization**

Parallelization is available in an integer optimization solver such as CPLEX, encompassing the MIP solution engine, the barrier algorithm, and concurrent optimization techniques for solving LP and QP problems. In the case of CPLEX, concurrent optimization involves launching several optimizers to solve the same problem; the process stops when the first optimizer reaches a solution. Within a branch-and-bound scheme, parallelization involves solving the root node and subsequent nodes, as well as strong branching, in parallel [20].
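The concurrent, first-finisher-wins idea can be sketched with Python's standard library (the two greedy "strategies" below are stand-ins invented for illustration, not CPLEX's actual optimizers):

```python
# Concurrent-optimization sketch: race several strategies on the same
# problem and accept whichever finishes first.
import concurrent.futures

def greedy_knapsack(values, weights, capacity, key):
    """Greedy 0-1 knapsack under a given item ordering."""
    total, cap = 0, capacity
    for j in sorted(range(len(values)), key=key, reverse=True):
        if weights[j] <= cap:
            total += values[j]
            cap -= weights[j]
    return total

v, w, c = [60, 100, 120], [10, 20, 30], 50
by_value = lambda j: v[j]         # strategy 1: highest value first
by_ratio = lambda j: v[j] / w[j]  # strategy 2: best value/weight first

with concurrent.futures.ThreadPoolExecutor() as pool:
    futures = [pool.submit(greedy_knapsack, v, w, c, k)
               for k in (by_value, by_ratio)]
    # first-finisher-wins: take whichever future completes first
    winner = next(concurrent.futures.as_completed(futures))
    print(winner.result())
```

Which strategy wins the race (and hence which objective value is printed) depends on scheduling, mirroring the performance variability of opportunistic parallelization noted in the conclusions.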

#### **3.7 Solution pools**

The motivation to consider solution pools lies in the value of having more than one solution in the face of inaccurate data, approximations in model formulations, or the inability of a model to capture the full essence of a problem. Thus, solution pools aim to generate and keep multiple solutions by using various options and tools that involve collecting solutions within a given percentage of the optimal solution, or solutions with diverse properties. However, difficulty is noted in implementing solution pools with the strategy of rolling horizon decompositions [17].
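The within-a-gap pool filter can be sketched as follows (exhaustive enumeration of a tiny 0-1 knapsack stands in for the solver's pool machinery; the instance and gap are illustrative):

```python
# Solution-pool sketch: keep every feasible solution whose objective lies
# within a relative gap of the best one found.
from itertools import product

def solution_pool(values, weights, capacity, gap=0.10):
    feasible = []
    for x in product((0, 1), repeat=len(values)):
        weight = sum(wj for xj, wj in zip(x, weights) if xj)
        if weight <= capacity:
            value = sum(vj for xj, vj in zip(x, values) if xj)
            feasible.append((value, x))
    best = max(v for v, _ in feasible)
    # retain solutions within `gap` of the best, best first
    return sorted(((v, x) for v, x in feasible if v >= (1 - gap) * best),
                  reverse=True)

pool = solution_pool([60, 100, 120], [10, 20, 30], 50, gap=0.25)
print(pool)  # the optimum (220) plus the runner-up (180) within a 25% gap
```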

#### **3.8 Tuning tools**

As MIP solvers have multiple algorithm parameters which dictate their performance, the objective of a tuning tool is to identify solver parameter settings that improve performance for a given problem set. While the default parameter values of MIP solvers are chosen to work well for a large collection of problems, there is no such guarantee for a specific user problem [21].
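At its core, a tuning tool times the solver under candidate parameter settings on a representative problem set and reports the best combination. A toy sketch (the solver and its parameters are invented for illustration; real tuning tools also randomize seeds and enforce time limits):

```python
# Tuning-tool sketch: benchmark a solver over a parameter grid.
import itertools
import time

def benchmark(solver, param_grid, problems):
    timings = {}
    for combo in itertools.product(*param_grid.values()):
        params = dict(zip(param_grid, combo))
        start = time.perf_counter()
        for prob in problems:
            solver(prob, **params)
        timings[tuple(sorted(params.items()))] = time.perf_counter() - start
    return min(timings, key=timings.get)  # fastest setting overall

def toy_solver(n, presolve, heuristics):
    # Simulated effort: each enabled feature halves the work.
    work = n // (2 if presolve else 1) // (2 if heuristics else 1)
    sum(range(work))

best = benchmark(toy_solver,
                 {"presolve": [False, True], "heuristics": [False, True]},
                 problems=[200_000, 300_000])
print(dict(best))
```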

#### **4. Use cases**

This section presents three use cases of applying commercial integer optimization solvers to implement and improve or enhance business intelligence applications. The model formulations for the use cases are implemented on the GAMS modeling platform (and are available in the GAMS Model Library), from which the CPLEX solver is accessed.

#### **4.1 Use case 1: energy optimization**

The first use case presents a practical application of CPLEX as a standard solver for an energy business portfolio optimization problem for an electric utility company. For such an electricity distribution public service, the problem involves determining the amount of electricity to produce internally (i.e., in one's own power plant) and the amount to purchase externally (i.e., from the spot market or load-following contracts). The problem formulation leads to a medium-to-large scale MILP model with size and computational statistics as described in **Table 3**. To accelerate solution convergence, several computational options are invoked, including priority branching within a branch-and-bound procedure and multiple processing through parallelization (i.e., techniques introduced in the foregoing section). The computational results and implications discussed in the cited reference demonstrate the applicability of the solver as an effective tool for 1-day ahead planning within a real-world electricity market in Germany.

#### **4.2 Use case 2: financial optimization**

The second use case involves financial optimization of risk management with commercial implications. The problem is amenable to being posed as an integer optimization model to capture the extensive set of rules and regulations that govern the delivery and settlement of mortgage-backed securities. The availability of reliable, robust, and efficient commercial integer optimization solvers, alongside developments in computing technology, has facilitated the deployment and validation of such models, with the computational statistics summarized in **Table 4**. The advancement achieved has led optimization models, including (if not particularly) integer programs, to become essential, omnipresent tools in current financial operations, comparable to the application of operations research and management science models in the domains of manufacturing, transportation, and logistics.


#### **Table 3.**

*Model size and computational statistics for use case 1.*


#### **Table 4.**

*Model size and computational statistics for use case 2.*


#### **Table 5.**

*Model size and computational statistics for use case 3.*

#### **4.3 Use case 3: manufacturing optimization**

The third use case concerns production planning for a manufacturing facility. The application can be formulated as a standard integer optimization model of an uncapacitated lot-sizing problem. The objective function seeks to minimize production cost in meeting market demand constraints, with cost components for production, stocking, and machine setups. **Table 5** gives the model size and computational statistics for the largest problem instance solved for this use case.

#### **5. Conclusions**

Performance variability across commercial integer optimization solvers applied to business intelligence applications (such as the use cases in Section 4) occurs due to opportunistic parallelization, the use of heuristics particularly when invoking the polishing option (which involves a random seed), or simply numerical reasons. Variability may be observed in computational time, performance in terms of the number of nodes and iterations, or solution quality. A main limitation on the applicability of integer optimization solvers typically pertains to the number of integer variables that can be handled within an acceptable computational load or solution time. Therefore, it is worthwhile for future research in this area to consider further improvement in the mentioned areas [22, 23] towards achieving the acceptable performance levels that are requisite and crucial for business intelligence applications.

#### **Acknowledgements**

This work is completed partly under support from UTP-UCTS private grant no. 015MD0-037.


### **Author details**

Cheng Seong Khor1,2

1 Chemical Engineering Department, Universiti Teknologi PETRONAS, Perak Darul Ridzuan, Malaysia

2 Centre for Process Systems Engineering, Institute of Autonomous Systems, Universiti Teknologi PETRONAS, Perak Darul Ridzuan, Malaysia

\*Address all correspondence to: chengseong.khor@utp.edu.my; khorchengseong@gmail.com

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

### **References**

[1] Tsay C, Baldea M. 110th anniversary: Using data to bridge the time and length scales of process systems. Industrial & Engineering Chemistry Research. 2019;**58**(36):16696-16708

[2] IBM. IBM ILOG CPLEX Optimization Studio V12.9.0. 2020. Available from: https://www.ibm. com/support/knowledgecenter/ SSSA5P\_12.9.0/ilog.odms.studio. help/Optimization\_Studio/topics/ COS\_home.html

[3] Gurobi Optimization. Gurobi Optimizer Reference Manual. Beaverton, Oregon: Gurobi Inc.; 2020

[4] Byrd RH, Nocedal J, Waltz RA. KNITRO: An integrated package for nonlinear optimization. In: Pillo GD, Roma M, editors. Large-Scale Nonlinear Optimization. Springer; 2006. pp. 35-59

[5] Jünger M et al. 50 Years of Integer Programming 1958-2008: From the Early Years to the State-of-the-Art. Berlin Heidelberg: Springer-Verlag; 2010. p. 804

[6] Rardin RL. Optimization in Operations Research. New Jersey: Prentice-Hall; 1998

[7] Williams HP. Model Building in Mathematical Programming. 4th ed. Chichester, West Sussex, England: John Wiley & Sons; 1999

[8] Danna E. Performance variability in mixed integer programming. In: Workshop on Mixed Integer Programming 2008 (MIP 2008). New York City, NY: Columbia University; 2008

[9] Rothberg E. The CPLEX library: Mixed integer programming. In: 4th Max-Planck Advanced Course on the Foundations of Computer Science (ADFOCS 2003). Saarbrücken, Germany: Max-Planck-Institut für Informatik; 2003

[10] Land AH, Doig AG. An automatic method of solving discrete programming problems. Econometrica. 1960;**28**(3):497-520

[11] Lima RM, Grossmann IE. On the solution of nonconvex cardinality Boolean quadratic programming problems: A computational study. Computational Optimization and Applications. 2017;**66**(1):1-37

[12] Savelsbergh MWP. Preprocessing and probing techniques for mixed integer programming problems. ORSA Journal on Computing. 1994;**6**(4):445-454

[13] Wolsey LA. Integer Programming. Wiley-Interscience Series in Discrete Mathematics and Optimization. Chichester: Hoboken, NJ: Wiley; 1998. pp. 203-258

[14] Nemhauser G, Wolsey L. The theory of valid inequalities. In: Integer and Combinatorial Optimization. Hoboken, NJ: Wiley; 1988. pp. 203-258

[15] Rothberg E. The CPLEX library: Presolve and cutting planes. In: 4th Max-Planck Advanced Course on the Foundations of Computer Science (ADFOCS 2003). Saarbrücken, Germany: Max-Planck-Institut für Informatik; 2003

[16] Rothberg E. The CPLEX library: MIP heuristics. In: 4th Max-Planck Advanced Course on the Foundations of Computer Science (ADFOCS 2003). Saarbrücken, Germany: Max-Planck-Institut für Informatik; 2003

[17] Rothberg E. An evolutionary algorithm for polishing mixed integer programming solutions. INFORMS Journal on Computing. 2007;**19**(4):534-541


[18] Fischetti M, Lodi A. Local branching. Mathematical Programming. 2003;**98**(1):23-47

[19] Danna E, Rothberg E, Pape CL. Exploring relaxation induced neighborhoods to improve MIP solutions. Mathematical Programming. 2005;**102**(1):71-90

[20] Lima R. IBM ILOG CPLEX: What is inside of the box? In: Enterprise-Wide Optimization (EWO) Seminar. Pittsburgh, PA: Carnegie Mellon University; 2010

[21] IBM. CPLEX Performance Tuning for Mixed Integer Programs. 2019. Available from: https://www.ibm.com/ support/pages/cplex-performancetuning-mixed-integer-programs

[22] Bixby R et al. MIP: Theory and practice—Closing the gap. In: System Modelling and Optimization: Methods, Theory, and Applications. Boston, MA: Kluwer Academic Publishers; 2000. pp. 19-49

[23] Bixby R, Rothberg E. Progress in computational mixed integer programming—A look back from the other side of the tipping point. Annals of Operations Research. 2007;**149**(1):37-41

#### **Chapter 7**

## Recent Advances in Stock Market Prediction Using Text Mining: A Survey

*Faten Subhi Alzazah and Xiaochun Cheng*

#### **Abstract**

Market prediction offers great profit avenues and is a fundamental stimulus for most researchers in this area. To predict the market, most researchers use either technical or fundamental analysis. Technical analysis focuses on analyzing the direction of prices to predict future prices, while fundamental analysis depends on analyzing unstructured textual information like financial news and earnings reports. More and more valuable market information has now become publicly available online. This draws a picture of the significance of text mining strategies to extract significant information to analyze market behavior. While many papers have reviewed prediction techniques based on technical analysis methods, papers that concentrate on the use of text mining methods are scarce. In contrast to the other current review articles that concentrate on discussing many methods used for forecasting the stock market, this study aims to compare many machine learning (ML) and deep learning (DL) methods used for sentiment analysis to find which method could be more effective in prediction and for which types and amounts of data. The study also clarifies recent research findings and potential future directions by giving a detailed analysis of the textual data processing and future research opportunities for each reviewed study.

**Keywords:** machine learning, deep learning, natural language processing, sentiment analysis, stock market prediction

#### **1. Introduction**

Stock market prediction aims to determine the future movement of the stock value of a financial exchange. Accurate prediction of share price movements leads to higher profits for investors. Predicting how the stock market will move is one of the most challenging problems owing to the many factors involved in stock prediction, such as interest rates, politics, and economic growth, which make the stock market volatile and very hard to predict accurately. The prediction of shares offers huge opportunities for profit and is a major motivation for research in this area; knowledge of stock movements by a fraction of a second can lead to high profits [1]. Since stock investment is a major financial market activity, a lack of accurate knowledge and detailed information would lead to an inevitable loss of investment. The prediction of the stock market is a difficult task as market movements are always subject to uncertainties [2]. Stock market prediction methods are divided into two main categories: technical and fundamental analysis. Technical analysis focuses on analyzing historical stock prices to predict future stock values (i.e., it focuses on the direction of prices). On the other hand, fundamental analysis relies mostly on analyzing unstructured textual information like financial news and earnings reports. Many researchers believe that technical analysis approaches can predict stock market movement [3–5]. In general, these studies did not achieve high prediction results because they depend heavily on structured data, neglecting an important source of information: online financial news and social media sentiment. These days, more and more critical information about the stock market has become available on the Web; examples include BBC, Bloomberg, and Yahoo Finance. It is hard to manually extract useful information from these resources, which underlines the significance of text mining techniques for automatically extracting meaningful information to analyze the stock market. In this research, the most crucial past literature was reviewed, and a major contribution was made to the subject of using text mining and NLP for market prediction.

We present the findings of the selected studies to show the significantly improved performance of stock market forecasting achieved via many machine learning methods. This study also surveys recent innovative research and its potential future contributions. Comparisons and analyses of different studies in the financial domain of market prediction are made, which can help establish potential opportunities for future work. We also focus on the promising results accomplished by machine learning methods for analyzing the stock market using text mining and natural language processing (NLP) techniques.

In contrast to other current survey articles that concentrate on summarizing the many methods used for forecasting the stock market, we aim to compare the machine learning (ML) and deep learning (DL) methods used for sentiment analysis of social media and financial news articles to find which methods are more effective for prediction. **Figure 1** represents the reviewed study framework. The rest of this work is organized as follows. Section 2 reviews the background concepts needed before the detailed analysis of the literature. Section 3 illustrates the relationship between stock market prediction and text mining. Section 4 reviews the main machine learning methods used for stock market prediction based on textual resources. Section 5 explains the less frequently used algorithms for stock prediction based on text mining. Section 6 describes the text sources, time periods, and number of collected items of the reviewed works. Section 7 covers the findings, limitations, evaluation measures, and future work of the reviewed studies. Finally, Section 8 concludes this paper.

**Figure 1.** *The reviewed study framework.*

### **2. A review of background concepts**

We define the following concepts, which are important for understanding this research topic.

#### **2.1 Sentiment analysis**

Sentiment analysis uses text mining, natural language processing, and computational techniques to automatically extract sentiments from a text [6]. It aims to classify the polarity of a given text at the sentence or document level, that is, whether it reflects a positive, negative, or neutral view [7]. In the stock market prediction task, two important sources of text are used: social media (mainly Twitter data) and online financial news articles.
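At its simplest, polarity classification can be sketched as a lexicon lookup. The tiny lexicon and scoring rule below are purely illustrative assumptions for exposition, not the method of any study reviewed here:

```python
# Minimal lexicon-based polarity sketch (the word lists are toy assumptions).
POSITIVE = {"gain", "rise", "growth", "profit", "up"}
NEGATIVE = {"loss", "fall", "drop", "decline", "down"}

def classify_polarity(text: str) -> str:
    """Label a text positive/negative/neutral by counting lexicon hits."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify_polarity("shares rise on strong profit growth"))  # positive
```

Real systems replace the toy lexicon with curated financial dictionaries (such as the Loughran and McDonald dictionary discussed later) or with trained classifiers.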

#### *2.1.1 Twitter sentiment*

Twitter is a significant source of data, and many researchers have examined its relationship with stock market movements [8]. Although each tweet is restricted to 140 characters, it is believed that this information can accurately reflect public mood [9].

#### *2.1.2 Online financial news sentiment*

Financial news articles are perceived to be a more consistent and reliable source of information. Many researchers have suggested that financial news articles have a strong relationship with stock market fluctuations; therefore, analyzing financial news reports can help in predicting stock market movements [10]. In [11], the author used a unified latent space model to examine the relationship between stock prices and news article releases. The results indicate good return accuracy, which shows that news article analysis has an important impact on stock market movement.

#### **2.2 Textual data preprocessing**

Textual data need to be prepared before being used by machine learning algorithms for the sentiment analysis task; the following methods are commonly applied.

#### *2.2.1 Feature extraction*

Feature extraction, sometimes called attribute selection, aims to select the features, attributes, or pieces of text that are most relevant to the prediction task. Many methods have been used for feature selection. The most commonly used feature selection procedure for document or sentence classification is the bag-of-words (BOW) approach, recently used for market prediction by many authors [12–14]. In this model, each word in a text or document is treated as a feature, neglecting grammar and word order and preserving only word frequency. The second most popular method used recently for feature selection is Word2vec [12]. This technique learns word embeddings using a two-layer neural network: the input is a corpus of text, and the output is a set of word vectors.

Another important feature selection method is the latent Dirichlet allocation (LDA) technique, recently used for market prediction in [13]. In the LDA model, the text is viewed as probabilistic collections of terms or words, and these collections are then treated as the selected features. Other studies [12, 14] used a Skip-Gram model, which aims to predict the context words (surrounding words) for a given target word. Feature selection is a crucial step in textual data preprocessing, and many other strategies may also be used for text analysis.
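The bag-of-words idea described above, counting word occurrences while discarding grammar and order, can be sketched in a few lines. The example documents are illustrative assumptions:

```python
from collections import Counter

def bag_of_words(docs):
    """Build a shared vocabulary and represent each document as a count
    vector, ignoring grammar and word order (the bag-of-words assumption)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vectors.append([counts.get(w, 0) for w in vocab])
    return vocab, vectors

vocab, vecs = bag_of_words(["stock prices rise", "stock prices fall fall"])
print(vocab)    # ['fall', 'prices', 'rise', 'stock']
print(vecs[1])  # [2, 1, 0, 1]
```

Note that the representation preserves only word frequency: the repeated word "fall" gets a count of 2, while word order is lost entirely.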

#### *2.2.2 Feature representation*

After feature selection, every feature must be represented by a numeric value so that it can be analyzed by machine learning techniques. The most common feature representation technique is binary representation (BR), a number system that uses only two values, 0 and 1, to represent information. This technique has been exploited in market prediction studies by many authors [15–17]. The second most popular method used in text mining for financial applications is term frequency-inverse document frequency (TF-IDF), a numeric value that represents the significance of a word for a document or corpus, recently used by many authors [12, 18]. Other feature representation methods can also be used successfully in text preprocessing, and we discuss those in more detail in the following sections.
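The TF-IDF weighting mentioned above can be sketched in its textbook form, term frequency multiplied by the logarithm of inverse document frequency; production toolkits differ in smoothing and normalization, and the example documents are illustrative:

```python
import math

def tf_idf(docs):
    """Weight each term by term frequency times log(N / document frequency).
    Real implementations differ in normalization; this is the textbook form."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    df = {}  # document frequency: in how many docs does each word appear?
    for toks in tokenized:
        for w in set(toks):
            df[w] = df.get(w, 0) + 1
    weights = []
    for toks in tokenized:
        tf = {w: toks.count(w) / len(toks) for w in set(toks)}
        weights.append({w: tf[w] * math.log(n_docs / df[w]) for w in tf})
    return weights

w = tf_idf(["stock rises today", "stock falls today", "earnings beat forecast"])
# 'stock' appears in 2 of 3 docs, so its weight is lower than 'rises' (1 of 3)
print(w[0]["rises"] > w[0]["stock"])  # True
```

This is precisely why TF-IDF is preferred over raw counts for financial text: ubiquitous words like "stock" are down-weighted, while distinctive words carry more signal.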

#### **3. The relationship between stock market prediction and text mining**

Many papers study the relationship between stock price movements and market sentiment; the most relevant studies are discussed in this section.

Ref. [19] examined the ability to use sentiment polarity (positive and negative) and sentiment emotions extracted from financial news or tweets to predict market movements. For sentiment analysis, the authors collected a large dataset of the top 25 historical financial news headlines in addition to a large set of financial tweets collected from Twitter. Furthermore, they collected historical stock price data for many S&P 500 companies and used the close price as an indicator of stock movements. For evaluation, they used the Granger causality test [20], a statistical test commonly used to reveal causality in time series data, that is, to explore whether one time series can predict another. For sentiment analysis, the authors examined two machine learning methods, SVM and LSTM. The experimental results illustrated that in some cases sentiment emotions Granger-cause stock price fluctuations, but the finding was not conclusive and must be examined case by case. It was also revealed that, for some stocks, adding sentiment emotions to the machine learning market prediction model increases prediction accuracy. Comparing the two machine learning methods, SVM achieved better and more balanced results, likely because the dataset was relatively small, which favors SVM over LSTM.

Another paper [21] examined the efficiency of using sentiment analysis of microblogging sites to forecast stock price returns, volatility, and trading volume. Intraday data from two sources of information, Twitter and StockTwits, were collected over 2 years. For the evaluation, the authors used five famous stocks, namely, Amazon, Apple, Goldman Sachs, Google, and IBM. Prices were sampled every 2 min, and the sentiment data were collected over the same span of each trading day. To find the links between stock price outcomes and tweet sentiment, they applied Granger causality analysis. The experiments indicate that there is a causal link between Twitter sentiments and stock market returns, volatility, and volume. Among all five stocks, market volatility and volume seem to be more predictable than market direction or return.

#### *Recent Advances in Stock Market Prediction Using Text Mining: A Survey DOI: http://dx.doi.org/10.5772/intechopen.92253*

In [22], the author exploited a multiplex network approach to study the correlation between market movements and social media sentiments. The proposed model merges information from two sources of data: Twitter posts and market price data. The authors selected 100 of the biggest capitalized companies of the S&P 500 index for a 5-year period from May 2012 to August 2017. In their model, the financial correlation network was established by integrating two techniques: the first suggests that two stocks tend to be associated if they share common neighbors, and the second suggests that two connected stocks usually remain connected in the future. The findings demonstrated that a multiplex network approach incorporating information from both social media and financial data can forecast a causal relationship framework with high accuracy.

The authors in [23] investigated the ability of economic news to predict Taiwan stock market returns. The proposed model used text mining techniques in several steps. First, they converted the textual news into numerical values. Second, they appended the resulting numerical variables to regression models with macroeconomic attributes to examine the role of news articles in predicting stock price returns. The model also defines specific keywords, calculates the number of positive, negative, and neutral words in each news text, and converts them into three news attributes, which are then fed to the regression model. The experiments found that adding news articles reduced the root mean square error (RMSE), which shows that economic news has a crucial impact on market returns. The experiments also indicate that negative news has more influence on stock market returns than positive news articles.

The study proposed in [24] analyzed whether tweet messages could be used to predict future trends of stocks for particular companies listed on the Dow Jones stock market, focusing on 12 companies in 3 distinct and crucial economic branches: technology, services, and health care. The authors gathered the companies' market data and Twitter posts over a 70-day period for analysis. The companies of each category were chosen based on the volume of messages mentioning the company names on the StockTwits website. The study illustrates that some of the proposed ad hoc forecasting models predict the next-day direction of stock movements well for some companies, with up to 82% success, and that there is no unified method for all cases. The results also indicate that a higher tweet volume yields better prediction results. Moreover, the study demonstrated a robust correlation between tweets and the trend movements of some companies.

Overall, past studies indicate that there is a strong relationship between market movements and information published in news and social media. In all of the discussed papers, information from social media contributed to enhancing the prediction models. Evaluating event sentiment may further affect market returns and boost forecasting outcomes.

#### **4. Machine learning for market prediction**

Recently, many research studies have used machine learning with innovative text mining methods to successfully predict stock market changes; the most significant ones are discussed in this section.

#### **4.1 Support vector machines**

Support vector machines (SVMs) are a supervised machine learning model used extensively in classification and regression tasks. An SVM finds a hyperplane that divides a collection of documents into two or more classes with a maximum margin [25].

SVM was first applied to the text classification task by Joachims [26]. In his approach, the author used a limited vocabulary as the feature collection, taking a list of the most frequent words and discarding uncommon words from the feature set. Using 12,902 documents from the Reuters-21578 document collection and 20,000 medical summaries, the author compared the effectiveness of several machine learning techniques, such as SVM and Naive Bayes (NB). For both document collections, the experiments demonstrated that SVM achieves better classification results than the NB classifier.
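The maximum-margin classifier described above can, in its linear form, be trained by subgradient descent on the regularized hinge loss (the Pegasos scheme). The sketch below is a minimal illustration under assumed toy features; the step sizes, data, and omission of a bias term are simplifying assumptions, and real studies would use a library such as scikit-learn or LIBSVM:

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient descent on the regularized hinge loss
    for a linear SVM without a bias term.
    X: list of feature vectors; y: labels in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            score = sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [wj * (1 - eta * lam) for wj in w]  # shrink (regularizer)
            if y[i] * score < 1:                    # example inside the margin
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# Toy sentiment-like features: [positive-word count, negative-word count]
X = [[3, 0], [2, 1], [0, 3], [1, 2]]
y = [1, 1, -1, -1]
w = train_linear_svm(X, y)
print(predict(w, [4, 0]), predict(w, [0, 4]))  # 1 -1
```

The learned weight vector separates "mostly positive words" from "mostly negative words" texts, which is the shape of the SVM-based sentiment classifiers surveyed below.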

For stock market prediction, many research papers have used SVM for text classification and sentiment analysis. Combining textual information and historical stock prices, the research in [27] applied SVM to forecast Chinese stock direction and stock prices between the years 2008 and 2015. For text mining, the authors built a domain-specific stop word list and sentiment dictionary. The study used two kinds of input: the first includes 2,302,692 news items, whereas the other contains only stock data of the 20 largest Chinese stocks by trading volume. Support vector regression (SVR) is used to predict stock prices, and support vector classification (SVC) is exploited to predict stock direction. The results indicate that both audience numbers and news quality have a crucial impact on the stock market. Moreover, for SVC, the direction accuracy was 59.1734%, better than other works. The results also indicate that news articles have an important effect on stock market fluctuations.

Another research paper [28] introduced a stock market prediction framework. For sentiment analysis, the researchers used two financial sentiment dictionaries, namely, the Harvard IV-4 sentiment dictionary (HVD) and the Loughran and McDonald (LMD) [29] financial dictionary. The dataset consists of 5 years of historical Hong Kong Stock Exchange prices and financial news collected from January 2003 to March 2008. SVM was used for text classification. Experiments indicate that the techniques with sentiment analysis outperform a bag-of-words model in accuracy. They also revealed only a small difference between the two dictionaries: LMD accuracy was 0.5527, whereas HVD accuracy was 0.5460, which indicates that both dictionaries can be used effectively for the market prediction task.

Another paper [30] developed a model to predict stock price direction with 1-day, 2-day, and 3-day lags. The dataset contains financial news for the SZ002424 stock from September 2012 to March 2017. In order to analyze the structure of news and extract the hidden information inside the contents, the authors proposed a semantic and structural kernel (S&S kernel). The kernel is based on SVM and was evaluated on medical industry news. Experiments found that the proposed kernel can reach up to 73% accuracy when predicting the price trend with a 2-day lag, which shows that the content structure hidden in daily financial news can predict stock market movements. The results also reveal that financial news has an important influence on stock movements that typically lasts for 2–3 days.

In the work of [31], the authors used a lexicon-based approach to predict the stock market based on the feelings of Twitter users. The authors used historical stock data in addition to Twitter messages to predict DJIA and S&P 500 index movements. Twitter data were used to train a support vector machine and neural networks (NN) over 7 days. The dataset was created from a normalized set of about 755 million tweets covering 8 categories of emotions, downloaded for the period of February 13, 2013, to September 29, 2013. For sentiment analysis, a dictionary was created manually by an expert in the field. The best average accuracy was obtained by using the SVM algorithm to forecast the DJIA indicator, with an accuracy of 64.10%, whereas using NN to predict the S&P 500 achieved only 62.03% accuracy, which shows that SVM performs better than the NN algorithm for market prediction. Moreover, the results achieved by the model indicate that it is possible to increase prediction accuracy using human sentiment analysis and a lexicon-based approach.

In [15], the authors proposed a model with a user interface to predict the market movement 1 day ahead. The proposed model uses historical stock prices, technical indicators, Wikipedia company pages, and Google news. The model employs three machine learning methods to compare and select from, namely, ANN, SVM, and decision tree (DT). The model concentrates on forecasting the AAPL (Apple, NASDAQ) stock movement for the period from May 1, 2012, to June 1, 2015. For the AAPL prediction case study, the authors used SVM recursive feature elimination (RFE) to choose the most important features. RFE is applied via backward selection of predictors based on feature importance ranking. Combining many data sources, the financial expert system achieves 85% accuracy in prediction. The results indicate that incorporating data from multiple sources improves the efficiency of market prediction.

In [32], the author introduced a method to predict stock movement 1 day ahead using a manually labeled corpus. The dataset contains 16 randomly selected stocks that are commonly discussed by StockTwits users, collected from March 13, 2012, to May 25, 2012, amounting to about 100,000 posts. For text analysis, the model used SVM to analyze sentiment in StockTwits. The results demonstrate the strong performance of SVM for sentiment classification, with accuracy reaching up to 74.3%, whereas the overall accuracy for predicting up and down market changes based on the suggested model was 58.9%.

From the findings recorded in **Table 1**, it can be noted that the performance of SVM surpasses that of the approaches using neural network models discussed earlier.

#### **4.2 Deep learning**

Deep learning is derived from machine learning methods that utilize many layers of data processing for feature extraction, pattern recognition, and classification. Recently, deep learning techniques have been applied to sentiment analysis tasks and are considered effective in most cases [33].

In [34], the authors investigated whether deep learning methods can be adapted to improve the accuracy of StockTwits sentiment analysis. Several neural network variants, such as LSTM, doc2vec, and CNN, were examined to discover stock market sentiments posted on StockTwits. The results show that the convolutional neural network is one of the best deep learning methods for predicting authors' sentiment in the StockTwits dataset. Many other studies have discussed the successful use of deep learning for sentiment analysis and natural language processing tasks. The survey in [35] compares some of the different methods used in sentiment analysis tasks; its main result is the excellent performance of deep learning methods for sentiment analysis, in particular CNN and LSTM.

Another paper [36] proposed a method to predict the French stock market based on sentiment and subjectivity analysis of Twitter data. The author applied a simple feedforward neural network to analyze tweets and predict CAC40 index movements for the next day. The Twitter data collected for the period of February 27, 2013, to June 16, 2013, comprised about 25,930 tweets. In addition to Twitter data, Martin also used historical stock market prices for the CAC40 index and other stocks. The results yield a direction accuracy of 80%, which indicates that a neural network can be used successfully to predict stock market movements.


**Table 1.** *Support vector machine for stock market prediction based on text mining studies.*


#### *4.2.1 Artificial neural networks*

Artificial neural networks are a subset of deep learning technology within the larger artificial intelligence domain; they mimic the workings of the human brain and its nervous system. The simplest form of artificial neural network is the feedforward neural network, in which data pass through the input nodes toward the output node in one direction only, with the classification obtained by an activation function.
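The one-directional flow described above can be sketched as a single forward pass through one hidden layer. The weights, inputs, and the reading of the output as an "up" probability are illustrative assumptions, not values from any reviewed study:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def feedforward(x, W_hidden, b_hidden, w_out, b_out):
    """One forward pass through a single-hidden-layer network:
    input -> hidden (sigmoid) -> single sigmoid output in (0, 1)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Illustrative hand-set weights: 2 inputs (e.g., a sentiment score and a
# lagged return), 2 hidden units, 1 output read as P(price goes up).
W_hidden = [[1.0, -0.5], [0.5, 1.0]]
b_hidden = [0.0, -0.2]
w_out = [1.2, -0.8]
b_out = 0.1
p_up = feedforward([0.6, 0.3], W_hidden, b_hidden, w_out, b_out)
print(0.0 < p_up < 1.0)  # True
```

In the studies below, such weights are of course learned from data (typically by backpropagation) rather than set by hand.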

In [37], the authors proposed a market investment recommendation system to predict intraday stock returns. The authors tested many prediction methods to find the best-performing algorithm, using a dataset of 72 S&P 500 companies for evaluation. Using both historical market data and financial news, the authors applied the modeling technique several times to select the best model: first a feedforward neural network algorithm, then a stepwise logistic regression (SLR), and finally decision trees with a genetic algorithm (GA) proposed by [38]. The best result was obtained by the neural network prediction technique, which indicates that the NN algorithm is profitable for any initial investment. The results also confirm that combining market data with financial news can predict market movements with better accuracy.

In [39], the authors predicted stock market movements based on sentiment analysis of comments and tweets extracted from Twitter and StockTwits, two famous social media sites. User comments are classified into four categories: up, down, happy, and rejected. Market data for popular companies such as Apple, Microsoft, Oracle, Google, and Facebook were collected for the period of January 1, 2015, to February 22, 2016. Both market data and polarity data were fed to an artificial neural network to predict stock movements. The best prediction result was obtained for Apple, with an MSE of 0.14.

In [40], the authors adopted a two-layer RNN-GRU technique to forecast Chinese stock market movements. The model exploited sentiment analysis of news and posts on Sina Weibo (a very popular Chinese social network). The authors constructed their sentiment dictionary using user posts on the website. They also collected stock prices of the Shanghai Shenzhen 300 Stock Index (HS300) as input to the recurrent neural network (RNN) model with gated recurrent units (GRU). The experiments revealed that the news and posts on Sina Weibo can predict market movements with an MAE of 0.625 and a MAPE of 9.38.

In [13], the authors proposed a multi-source multiple instance (M-MI) model to predict stock market index movements. In the proposed framework, the authors collected data from multiple resources, namely, quantitative data of Shanghai Composite Index historical prices for each trading day, financial news data for event extraction, and social media data taken from Xueqiu (a famous trader social network in China) to explore user sentiments from user posts. The analyzed sentiments, events, and stock historical data are then given as input to the M-MI model to make the prediction. For event extraction, the authors used HanLP (a popular method for text parsing that captures the syntax of a sentence). The extracted events are used to feed Restricted Boltzmann Machines (RBMs), a generative stochastic artificial neural network. In the model, the authors also examined the importance of specific sources to the index movements by giving them specific weights. The proposed framework's prediction accuracy was about 60%, which reveals several findings. First, the integration of features from multiple resources yields a more effective prediction. Second, news events and market historical data have a more important effect on stock movements than social media sentiments. Third, news events and quantitative data have larger impacts on stock fluctuations than sentiments alone.

Recently, [41] applied a technique to forecast stock direction. The authors used sentiment analysis of news headlines in addition to historical market data of Apple stock to predict the market trend. The Hive ecosystem was used to preprocess the data, and the naive Bayes classifier was utilized to calculate the sentiment scores. With two inputs, news headline sentiment scores and historical numeric market data, a multilayer perceptron artificial neural network (ANN) is applied to forecast stock movements. In the training procedure, the authors used backpropagation, and in the output layer, they used the identity function. Moreover, the model tested two different training periods: the first method trained on a 3-year data period, and the second on a 1-year data period. The results show an accuracy of 91% for the first method and 98% for the second, which indicates that stock price forecasting is more effective over a shorter period.

More recently, [42] predicted future market trends by using both market historical prices and financial news article sentiments as input to a neural network. The authors collected historical prices of the 20 biggest companies listed in the NASDAQ100 index to predict the fluctuations of a portfolio consisting of those 20 firms' historical stock prices, with a periodicity of 15 min, obtained from the Google Finance API. For news article analysis, two feature selection approaches were adopted: the dictionary of Loughran and McDonald (2011) (L&Mc) and affective space [43]. The Loughran and McDonald dictionary, commonly used for market prediction, consists of many critical words for the classification task representing the negative, positive, and uncertain sentiments commonly found in financial news, whereas the affective space (AS) dictionary is a vector space dictionary based on the similarity and relationships between words, as in natural language processing methods. For dimensionality reduction, affective space maps each term to a 100-dimensional vector, allowing concepts to be grouped based on their semantics and relations.

The proposed model with the Loughran and McDonald dictionary proved more effective in terms of return, yielding an annualized return of 85.2%, while the use of the affective space dictionary as input to the neural network model proved more effective in obtaining high accuracy. **Table 2** summarizes the studies that used NN extensively for market prediction.

#### *4.2.2 Recurrent neural network*

A recurrent neural network is an important variant of the artificial neural network that processes data in the forward direction as usual but preserves relevant information that may need to be utilized later. In other words, every node acts as a memory cell that remembers some information from the earlier step.

A well-known variant of the RNN model is long short-term memory (LSTM), proposed by Hochreiter and Schmidhuber in 1997 [44]; it is a recurrent neural network that solves the vanishing gradient problem. LSTM can capture long dependencies in a sequence by adopting a memory unit and a gate mechanism that determine how the information stored in the memory cell is used and updated [45]. Each LSTM is a set of cells or system modules that catch and store streams of data. The cells act as a transport line that carries data from past modules to the present one. Through the use of certain gates in each cell, data can be disposed of, filtered, or added for the next cells [46].
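The gate mechanism just described can be sketched as a single scalar LSTM step; real cells operate on vectors and matrices, and the weights below are illustrative assumptions chosen only to make the gate equations concrete:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step. p holds per-gate weights (W, U, b) for the
    forget (f), input (i), and output (o) gates and the candidate cell (g)."""
    f = sigmoid(p["Wf"] * x + p["Uf"] * h_prev + p["bf"])    # what to keep
    i = sigmoid(p["Wi"] * x + p["Ui"] * h_prev + p["bi"])    # what to add
    o = sigmoid(p["Wo"] * x + p["Uo"] * h_prev + p["bo"])    # what to emit
    g = math.tanh(p["Wg"] * x + p["Ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # updated cell state (the "memory" transport line)
    h = o * math.tanh(c)     # new hidden state
    return h, c

# Illustrative weights; run a short input sequence through the cell.
p = {k: 0.5 for k in ("Wf", "Uf", "Wi", "Ui", "Wo", "Uo", "Wg", "Ug")}
p.update({"bf": 0.0, "bi": 0.0, "bo": 0.0, "bg": 0.0})
h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.2]:
    h, c = lstm_step(x, h, c, p)
print(-1.0 < h < 1.0)  # True: h = o * tanh(c) is always bounded
```

The additive update `c = f * c_prev + i * g` is exactly the "transport line" carrying past information forward, and it is what lets gradients flow over long sequences.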



**Table 2.** *Studies that used NN for market prediction based on text analysis.*

In the paper of [47], the authors adopted a method to predict stock market movements based on the bidirectional gated recurrent unit (BGRU), which is considered a variant of LSTM. The model used financial news from the Reuters and Bloomberg websites and historical stock prices to predict market fluctuations with better results. The S&P 500 stock prices and news data were collected for the period 2006–2013. The model also examined the method's performance on individual stocks from different sectors, namely, Google Inc., Walmart, and Boeing. In the proposed method, the authors used the word embedding model introduced by [48] to select the most efficient features from the collected financial news. In the word embedding model, words are encoded as vectors in a high-dimensional space, and semantic similarity between words is reflected as closeness in the vector space. The proposed model achieved an accuracy of 59.98% on the S&P 500, whereas individual stock prediction accuracy was more than 65%. The authors also examined the performance of several LSTM variants, including standard LSTM, GRU, and BGRU. The findings show that BGRU obtained the best results compared to the other LSTM variants.

However, a conventional LSTM is unable to detect which part of a sentence is most crucial for the sentiment classification task. Therefore, [49] proposed an attention mechanism capable of detecting the crucial part of a sentence related to a specific aspect and explained the architecture of attention-based LSTM in detail.

To predict stock market directional movements, [50] proposed an attention-based LSTM model (AT-LSTM) to predict the movements of the Standard & Poor's 500 index and individual companies' stock prices using financial news titles. The attention techniques were divided into two classes. The first class of attention assigns its weight to news containing sentiments positive for the stock market, such as "raise" and "growth," while the second class assigns its weight to news mentioning major companies in the S&P 500, such as "Microsoft" and "Google." The attention model is thus trained continuously to assign more attention to relevant news based on its content. The proposed method achieved more than 66% accuracy, and Walmart obtained a maximum accuracy of 72.06%. The results show that attention mechanisms can achieve good results for market prediction in specific cases.
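At its core, an attention mechanism of the kind described above turns per-item relevance scores into a normalized weighting via a softmax. The scores below are illustrative assumptions standing in for learned relevance of three news titles:

```python
import math

def attention_weights(scores):
    """Softmax-normalize relevance scores into attention weights that
    sum to 1; higher-scoring items receive more attention."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative relevance scores for three news titles; the third (say, one
# mentioning both "growth" and a major S&P 500 company) scores highest.
weights = attention_weights([0.2, 1.0, 2.5])
print(abs(sum(weights) - 1.0) < 1e-9, weights[2] > weights[1] > weights[0])
```

In a full AT-LSTM, the scores themselves are produced by a trained scoring network, and the weighted sum of the news representations is what feeds the prediction layer.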

In [51], a decision support system based on deep neural networks and transfer learning was proposed. To enhance prediction accuracy, the authors pretrain the networks on a different corpus. The main aim of the study was to recommend the best deep learning techniques for market prediction. The system uses its own corpus with a length of 139.1 million words. The authors trained the deep neural networks using the adaptive moment estimation algorithm (Adam), which can effectively handle sparse gradient problems. Transfer learning is then used to initialize the parameter weights with values that might be close to the optimized ones. In order to account for unbalanced classes in their dataset, they used balanced classification accuracy, defined as the arithmetic mean of sensitivity and specificity. They also predicted the direction of nominal returns. The results show that LSTM models surpass all traditional machine learning models based on the bag-of-words technique, specifically when transfer learning is used to pretrain the word embeddings.

Recently, [12] examined the effect of financial news articles on stock trend fluctuation, either rise or fall. Financial news articles related to the Taiwan 50 Index were collected from Google. For textual data analysis and NLP tasks, the authors used their own lexicon and then exploited LSTM to make the final prediction. The LSTM features were joined with historical data and adjusted at each step. The results show that individual stock prediction using the study's polarity lexicon was better than the benchmark model. Moreover, the proposed model reaches an accuracy of 76.32, 80.00, and 77.42% for the stocks TSMC, Hon Hai, and Formosa Petrochemical, respectively, which reveals the effectiveness of the LSTM model in market prediction based on text analysis.

Another study, [52], examined the effectiveness of the LSTM technique for predicting market movements, using market data and textual resources as input to the model. The authors analyzed user sentiments from forum posts about the CSI300 index using the naive Bayes algorithm and then used an LSTM, which contains a merge layer, a ReLU layer, and a softmax layer, to combine the investor sentiment extracted from forum posts with the historical market data. The achieved fall-or-rise trend prediction accuracy was 87.86%, outperforming other commonly used machine learning methods such as the SVM algorithm by at least 6%, which strongly indicates that LSTM can achieve better prediction results when using larger datasets. **Table 3** summarizes the recent studies that used RNN networks for stock market prediction based on text analysis.

#### *4.2.3 Convolutional neural network (CNN)*

The use of convolutional neural networks for natural language processing was first described by Collobert and Weston in [53]. A typical convolutional neural network for text is composed of multiple convolutional layers topped by a classifier. Conventional inputs for text processing are characters, phrases, paragraphs, or documents that are converted into a matrix representation, where each row of the matrix represents a token, typically a word or character [54].
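The matrix-of-token-embeddings input and a single convolutional layer can be sketched as follows; the vocabulary, embedding size, filter count, and random values are illustrative assumptions only, not taken from any of the reviewed systems.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy vocabulary with 4-dimensional word embeddings (values are arbitrary)
vocab = {"stocks": 0, "rise": 1, "on": 2, "strong": 3, "earnings": 4}
embeddings = rng.standard_normal((len(vocab), 4))

def sentence_matrix(tokens):
    """Each row of the matrix is the embedding of one token."""
    return np.stack([embeddings[vocab[t]] for t in tokens])

def conv1d_features(mat, n_filters=3, width=2):
    """Slide filters over windows of `width` consecutive token rows,
    then apply ReLU and max-pool over positions (a minimal text-CNN layer)."""
    filters = rng.standard_normal((n_filters, width * mat.shape[1]))
    windows = np.stack([mat[i:i + width].ravel()
                        for i in range(mat.shape[0] - width + 1)])
    return np.maximum(windows @ filters.T, 0).max(axis=0)

m = sentence_matrix(["stocks", "rise", "on", "strong", "earnings"])
feats = conv1d_features(m)  # one pooled feature per filter
```

In a real text-CNN, `feats` would feed a classifier that predicts the rise/fall label.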

#### **Table 3.**

*Recent studies that concentrate on RNN variants for market prediction based on text analysis.*

In [16], a framework for stock market prediction based on long-term and short-term events extracted from financial news articles about the S&P 500 index was proposed. The collected articles, originally released by Ding et al. [55], span October 2006 to November 2013. The long-term events represent events over the past month, while the short-term events represent events on the day preceding the stock price fluctuation. The proposed framework trains on the extracted events using a neural tensor network and then a convolutional neural network to predict both the short-term and the long-term impact of the extracted events on stock price fluctuations. Two ways of representing the input to the CNN were examined: the first method (WB-CNN) used word embedding as input with a convolutional neural network for prediction, and the second (EB-CNN) used event embedding as input with a convolutional neural network for prediction. The experiments achieved an accuracy of 61.73% for WB-CNN, while EB-CNN achieved 65.08%, which illustrates that the proposed model is more effective in stock market prediction than models that predicted the S&P 500 index based only on historical stock data. The model also shows that a CNN can extract the longer-term influence of financial news events better than traditional feedforward neural networks.

In [17], the authors proposed a model to predict intraday directional movements of the S&P 500 index using financial news titles and financial time series data as input. The paper compared two commonly used deep learning methods, RNN (specifically an LSTM model) and CNN, across several text representation methods: W-CNN (word embedding input with a CNN forecast model), S-CNN (sentence embedding input with a CNN), W-RCNN (word embedding input with an RCNN), S-RCNN (sentence embedding input with an RCNN), WI-RCNN (word embedding plus historical time series input with an RCNN), and SI-RCNN (sentence embedding plus historical time series input with an RCNN). Experiments on these models revealed that CNN is more effective than RNN at capturing the semantics of financial news, while RNN is more efficient at capturing context information for stock market prediction. Moreover, the results show that sentence embedding is a more effective text representation than word embedding. **Table 4** summarizes the studies that used CNN for stock market prediction based on sentiment analysis and NLP.

#### **Table 4.**

*CNN use for stock market prediction based on text mining results.*

#### **5. Other machine learning methods**

Many other machine learning methods have been used successfully, though less frequently, for market prediction applications based on text mining. Summaries of these studies are given in **Table 5**. In the study of [18], a method was proposed to predict the stock trend movements of three NASDAQ companies, namely, Yahoo Inc., Microsoft Corporation, and Facebook Inc. The model used financial news sentiment analysis together with historical stock data to predict the market with higher accuracy. The task is accomplished in two steps: first, a naive Bayes classifier categorizes news sentiment into two classes, positive or negative; second, to forecast whether the stock trend will fall or rise, the k-nearest neighbor (k-NN) algorithm is used (a simple algorithm that stores all available data instances and classifies new data by a measure of closeness to its neighbors). The results show that sentiment analysis of news alone reaches accuracies of up to 63%, while combining news sentiment with historical stock prices achieves trend prediction accuracy of up to 89.80%, which shows that adding historical stock prices to the classification model improves prediction performance.
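The k-NN classification step described above can be sketched in a few lines; the feature choice (a news sentiment score plus the previous day's return) and the training points are hypothetical, not the data used in [18].

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote of its k nearest
    training points under Euclidean distance."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Features: (news sentiment score, previous-day return); labels: "rise"/"fall"
train = [((0.9, 0.02), "rise"), ((0.8, 0.01), "rise"),
         ((0.2, -0.03), "fall"), ((0.1, -0.01), "fall"),
         ((0.7, 0.00), "rise")]
label = knn_predict(train, (0.85, 0.015))  # → "rise"
```

The majority vote over nearest neighbors is what makes k-NN robust to a single mislabeled point, at the cost of storing the full training set.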

In the work of [56], the authors suggest a method to predict the daily up and down price fluctuations of four NASDAQ tech companies: Apple (AAPL), Google (GOOG), Microsoft (MSFT), and Amazon (AMZN). The model analyzes Twitter user messages in addition to the previous three days of stock price movement. A named-entity recognition (NER) approach is constructed to identify and remove noise in the Twitter data, and a decision tree approach was used to build the classification model. The proposed model achieved its highest accuracy of 82.93% in predicting the daily up and down changes of Apple, which indicates that using named-entity recognition for noise removal of Twitter data can improve accuracy.

The research in [8] proposed a method to predict stock market movements based on two feature extraction methods, using a novel aspect-based sentiment model to improve prediction performance. The first method attempts to extract hidden topics and sentiments jointly and use them for prediction, while the aspect-based sentiment method treats every message as a list of topics and their associated sentiment values. To build the prediction model, the authors used an SVM with a linear kernel and collected data on 18 stocks for a 1-year period from July 2012 to July 2013. Exploiting the aspect-based sentiment features obtained the best result, with 54.41% average accuracy. The proposed model also proved to be 3.03% more effective than using the human sentiment method for stock movement prediction.

In [61], a method to forecast Indonesian stock movements based on Twitter sentiment analysis was introduced. Naive Bayes and random forest algorithms were used to find user sentiments toward the 13 most popular companies in Indonesia, and the linear regression technique was used to build the prediction model. The highest classification accuracy was achieved by the random forest algorithm with 60.39%, whereas the naive Bayes classifier classified tweet data with 56.50% accuracy. For price movement prediction, the proposed models can predict the upcoming rise or fall with an accuracy of 67.37% using the naive Bayes algorithm and 66.34% using the random forest classifier.

#### **Table 5.**

*Summaries of machine learning methods that were used successfully and less frequently for market prediction based on text mining.*

Other research [62] introduced a stock market prediction service framework that allows users to choose different data sources and machine learning techniques. The authors gathered all news summaries and historical prices of all the stocks for a 1-year period. Using the Hong Kong market stock dataset for evaluation, they found that metric learning-based methods can improve the prediction results. The study also shows that adding news to the historical prices for stock market prediction will be more useful on large and popular stocks.

Recently, [14] applied a numerical-based attention (NBA) method for stock market prediction from multiple sources. News headlines and numerical data were combined to predict stock prices. For evaluation, the authors collected news headlines and numerical data from two sources: the China Securities Index 300 (CSI300) and the Standard & Poor's 500 (S&P500). They used NBAa-NBAd to denote variations of the model with different textual representations. The proposed structure accomplished the best outcomes on these datasets; in particular, NBAd raised accuracy by 2.32 and 1.35% over the best baseline models on the S&P500 and CSI300, respectively.

More recently, [64] investigated the effect of the most important events from 2012 to 2016 on stock exchange prediction for four selected countries: the USA, Hong Kong, Turkey, and Pakistan. The events were categorized into local and global events for each country according to their economic effects on the country's stocks. Twitter data were gathered to find the sentiment for each of these events, with a total of eight events used across all countries. For prediction, the authors investigated linear regression, support vector regression, and a deep learning model. The results revealed that linear regression achieved the worst prediction results of the three methods, while support vector regression achieved the best. Event sentiment produced a notable improvement in the forecasting results; for example, the 2012 US election event achieved the best prediction results across all methods, which indicates that a local event in the USA has a very strong effect on stock market forecasting.

In [63], the authors predicted the Argentinian stock market using online message boards with topic discovery methods in addition to daily historical stock prices. The authors exploited the Latent Semantic Analysis (LSA) approach, which finds the latent topics in a text. The experiments were trained with multiple combinations of features selected from the online texts. The results show that the most predictive features are derived from texts containing the most relevant semantic content. Moreover, the experiments illustrate that combining LSA with ridge regression identified the structure of the texts, which later improved the prediction performance of the model.
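The LSA-plus-ridge pipeline can be sketched as a truncated SVD of a term-document matrix followed by closed-form ridge regression; the matrix dimensions and the synthetic returns below are our own assumptions for illustration, not the data of [63].

```python
import numpy as np

def lsa_topics(term_doc, k=2):
    """Truncated SVD of the term-document matrix: the top-k left singular
    directions serve as latent topics (LSA); documents are projected onto them."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    return term_doc.T @ u[:, :k]

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X'X + alpha*I)^-1 X'y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n), X.T @ y)

rng = np.random.default_rng(1)
term_doc = rng.random((6, 10))            # 6 terms x 10 message-board days
returns = rng.standard_normal(10) * 0.01  # synthetic next-day returns
X = lsa_topics(term_doc, k=2)             # topic features per day
w = ridge_fit(X, returns)
pred = X @ w                              # regularized return predictions
```

The ridge penalty `alpha` is what keeps the regression stable when the topic features are correlated, which is common with LSA projections.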

In [57], the authors proposed a model that aims to find the influence of negative terms in the financial media on investor behavior. The proposed model relies on counting negative words from a dictionary and on word counting methods to extract contextual information. The model also used a Latent Dirichlet Allocation model to derive the negative influence of financial media statements. The model combines the two inputs in an ensemble tree to categorize the effect of financial media news on stock market fluctuation. The results indicate a strong relationship between the negative sentiment derived from financial media news and a company's stock market fluctuation.

In the same year, the authors in [58] suggested an algorithm to predict the movements of 30 NASDAQ and New York Stock Exchange companies. The algorithm used NLP methods to categorize Twitter messages, and the authors then applied association rule mining to find interesting rules and associations between stock movements and the Twitter messages. About 15 million tweets were collected and stored in MongoDB, an open-source database used to store and process huge volumes of data. The suggested method represented the relationships hidden in social media as a graph with several layers, with top-, intermediate-, and bottom-layer attributes showing the relations, and it increased the dimensionality of the variables used to measure the hidden and embedded information in the Twitter messages. The results indicate the outstanding performance of using tweet message sentiment to predict stock market movements 3 days later.

In [60], the researchers exploited the multiple kernel learning method to effectively integrate data from stock special (SS) and subindustry special (SIS) news items to predict future market movements. Multiple kernel learning (MKL) applies several different kernels to learn from various sections of the data. Pairs of Gaussian, linear, and polynomial kernels were used to compare the performance of each model. For evaluation, the authors used five stocks from the S&P 500 index that belong to the managed healthcare subindustry. The results indicate that using Gaussian, linear, and polynomial kernels jointly in MKL achieves higher prediction results. The results also indicate that exploiting the two types of news increases prediction accuracy in comparison with models that use only a single news source.
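The kernel combination at the heart of MKL can be illustrated in its simplest form, a convex combination of base kernels; the kernel parameters and mixing weights below are arbitrary illustrative choices, not those tuned in [60].

```python
import math

def linear_k(x, y):
    """Linear kernel: plain dot product."""
    return sum(a * b for a, b in zip(x, y))

def poly_k(x, y, d=2, c=1.0):
    """Polynomial kernel of degree d with offset c."""
    return (linear_k(x, y) + c) ** d

def gaussian_k(x, y, gamma=0.5):
    """Gaussian (RBF) kernel."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def combined_kernel(x, y, weights=(0.3, 0.3, 0.4)):
    """Convex combination of base kernels -- the simplest MKL form.
    In full MKL the weights themselves are learned from the data."""
    w1, w2, w3 = weights
    return w1 * linear_k(x, y) + w2 * poly_k(x, y) + w3 * gaussian_k(x, y)
```

With nonnegative weights, the combination remains a valid positive semidefinite kernel, so it can be plugged into any kernel-based learner such as an SVM.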

The study in [59] combined information on historical stock prices with financial market news to enhance the forecasting accuracy of intraday trading status. For evaluation, the model used Hong Kong Stock Exchange (HKEx) tick prices; more specifically, the authors used intraday prices of 23 stocks in the Hang Seng Index (HSI) for the year 2001. Multi-kernel support vector regression (MKSVR) was used with two subkernels: one for the news items and the other for the historical stock prices. The results indicate that MKSVR outperforms benchmark models that exploit only one source of information.

The evaluation measurements vary across the reviewed works; some of the studies calculate accuracy, F-measure, or recall and precision, with accuracy being the most commonly used. Other researchers calculate the prediction error using mean absolute percent error (MAPE), mean squared error (MSE), or root mean square error (RMSE). The variance in evaluation measurements and experimental data makes an accurate comparison between different models difficult to achieve.
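The error metrics mentioned above are standard and compact; the price series used to exercise them below is a made-up example.

```python
def mse(actual, pred):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual)

def rmse(actual, pred):
    """Root mean square error: the square root of MSE."""
    return mse(actual, pred) ** 0.5

def mape(actual, pred):
    """Mean absolute percent error; assumes no actual value is zero."""
    return 100 * sum(abs((a - p) / a)
                     for a, p in zip(actual, pred)) / len(actual)

actual = [100.0, 102.0, 101.0]   # observed prices
pred = [101.0, 101.0, 103.0]     # model predictions
# mse = 2.0; rmse ≈ 1.414; mape ≈ 1.32%
```

Note that MAPE is scale-free (useful when comparing stocks with very different price levels), while MSE and RMSE penalize large errors more heavily.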

#### **6. Text sources, collection periods, and numbers of collected items in the reviewed work**

The textual data inputs come from several different sources, and the collection periods and numbers of collected items vary, as illustrated in **Table 6**.

The majority of authors have analyzed primary news websites such as Reuters and Bloomberg [16, 17, 37, 47, 50], Dow Jones [57], and Yahoo Finance [8, 18]. Most authors use financial news because it carries less noise than general news. They select either the news text or the news headline as input to their machine learning model. Recently, news titles and headlines have been specifically extracted, as they are regarded as clearer, more concise, and less noisy [14, 16, 17, 50]. Other authors have examined less formal sources of news information such as Google News [12, 15]. Still other researchers collect their textual information purely from social media websites, especially Twitter, to analyze public user sentiment and predict the market more effectively [39, 56, 61, 64].





#### **Table 6.**

*Summaries of the reviewed work text source, period, and number of collected items.*

Also, as **Table 6** illustrates, the data were collected over a variety of periods; a few papers collected data over several months, while others extracted data over periods of up to 7 years, which resulted in more sufficient data and better prediction results.

However, it can be noted that the lack of highly structured datasets containing market text data prevents researchers from combining their analysis and assessment efforts. Another problem is the imbalanced datasets used by many researchers, which bias the prediction accuracy. In the future, researchers are encouraged to develop new datasets for market forecasting based on text mining analysis.

Market-predictive text mining could become much more advanced by concentrating on a particular source of text, such as a specific social media website or news from specialized financial news websites. As mentioned in Section 3 of this survey, there is a strong relationship between behavioral economics and market fluctuations; therefore, focusing on behavioral economics studies and their impact on market movements will be a great research opportunity in the future.

#### **7. The reviewed work findings, limitations, and future work**

Developments in sentiment analysis approaches and deep learning have enabled stock market prediction systems that turn web content, tweets, and financial news into investment decision systems. Online text mining processes are evolving and have been intensively investigated using machine learning advancements, and this trend will continue to progress, especially for market prediction.

Many researchers believe that analyzing only the historical prices of the stock market is enough to predict stock market movements [3–5], while other researchers combine textual information with historical stock prices to predict the movements [8, 13, 15, 47, 62]. The major limitation of the earlier studies is that they depend heavily on either structured data (historical stock prices) or unstructured data (news articles or social media). For the researchers that used both structured and unstructured data, the major limitation for most of them is that they combined either news articles or social media with past stock prices and neglected the critical impact of combining social media, financial news, and time series market data together to improve the forecasting results.







#### **Table 7.**

*Summaries of the reviewed work findings, limitations, and future work.*

Moreover, as **Tables 2–5** indicate, the main trend in recent studies is to use deep learning methods instead of conventional machine learning to analyze stock market textual information in the news or social media, owing to the advantages DL offers over conventional machine learning. Given a sufficient amount of data and training time, DL can handle problems that conventional machine learning methods are unable to handle effectively.

Many recent studies exploit only sentiment analysis of textual data and neglect the important influence of historical stock prices, which affects their prediction accuracy; this suggests that incorporating data from multiple sources will improve market prediction effectiveness. The more data fed into the prediction model, the better the accuracy that can be achieved.

The machine learning models discussed previously show that SVM and LSTM are highly preferred by investigators because of their high accuracy in text classification and market prediction, whereas many other machine learning methods, such as k-nearest neighbors (k-NN), random forest (RF), linear regression, decision trees, and artificial neural networks (ANN), show promising results for text mining and sentiment analysis in market analysis but are used less frequently and need to be further investigated.

However, the reviewed work has some limitations. One of the main limitations is the lack of highly structured datasets containing market text data for given periods that researchers can use to integrate their analysis and assessment efforts; another problem is the imbalanced datasets used by many researchers, which bias the prediction results.

#### *E-Business - Higher Education and Intelligence Applications*

Future work should focus on predicting the movement of the stock market using structured data (past stock prices) along with textual data from different resources such as financial news and social media. Moreover, to achieve better results in predicting the stock market, the text mining procedure should improve feature selection, feature representation, and dimensionality reduction methods.

In general, many techniques can improve the prediction methods, as illustrated in **Table 7**:

- adding structural information to the prediction model;
- expanding the training period;
- using more effective, expanded lexicons;
- adding different sources of information such as financial news articles;
- increasing the number of collected news items over a longer period;
- applying deep learning models;
- upgrading the sentiment analysis task by adding more words that may affect stock movements;
- using more advanced machine learning techniques for sentiment analysis, such as Interdependent Latent Dirichlet Allocation (ILDA);
- adding historical stock prices to the dataset alongside the news and social media information; and
- considering event sentiment analysis.

#### **8. Conclusion**

Knowledge of stock movements even a fraction of a second in advance can lead to high profits for investors, which makes stock market studies a major motivation for researchers. The great advances and success of natural language processing and sentiment analysis of online news based on machine learning and deep learning have gained huge popularity recently in the financial domain, especially in market prediction models. This survey has discussed recent studies on market prediction systems based on text mining techniques, with a comprehensive clarification of the models' main limitations and future improvement methods. The survey covered major components such as text preprocessing, machine learning algorithms, evaluation mechanisms, findings, and limitations, with detailed discussion and explanation of the most successful techniques. Moreover, this review is a serious attempt to address the problem of market prediction based on the most recent text mining methods and to provide a clear view of future research directions. More extensive observation of financial markets is required in the current dynamic world, since its absence can have a detrimental effect on investments around the globe. It is therefore essential to pursue prediction models based on text mining research as a practical solution that can lead to a much greater degree of confidence in understanding market movements and making valuable investments. With the considerable amount of textual data available online, the need for specialized text mining systems for each field of market analysis is gradually growing.

This study is intended to help other researchers put the different theories in this research area into practice more easily and to make key decisions in the development of future models. The studies reviewed in this paper prove the effectiveness of text mining and sentiment analysis methods in predicting market movements. By comparing many ML methods such as SVM and decision trees with deep learning models such as LSTM and CNN, we discussed these models' limitations and future work and reported the best results obtained by each. Finally, this survey highlighted the need to improve prediction methods by adding structural information, considering event sentiment analysis, using more effective expanded lexicons, increasing the number of collected news items, expanding the training period, applying deep learning models, adding different sources of information, upgrading the sentiment analysis task by adding more words that may affect stock movements, and using unified benchmark datasets and evaluation measures.


#### **Author details**

Faten Subhi Alzazah\* and Xiaochun Cheng Department of Computer Science, Middlesex University, London, UK

\*Address all correspondence to: fatensubhi@gmail.com

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

### **References**

[1] Gupta A, Dhingra B. Stock market prediction using hidden Markov models. In: 2012 Students Conference on Engineering and Systems. IEEE; 2012. pp. 1-4

[2] Asadi S, Hadavandi E, Mehmanpazir F, Nakhostin MM. Hybridization of evolutionary Levenberg–Marquardt neural networks and data pre-processing for stock market prediction. Knowledge-Based Systems. 2012;**35**:245-258

[3] Saravanan S, Mala S. Stock market prediction system: A wavelet based approach. Applied Mathematics and Information Sciences. 2018;**12**:579-585. DOI: 10.18576/amis/120312

[4] Chung H, Shin KS. Genetic algorithm-optimized long short-term memory network for stock market prediction. Sustainability. 2018;**10**(10):3765

[5] Long W, Lu Z, Cui L. Deep learning-based feature engineering for stock price movement prediction. Knowledge-Based Systems. 2019;**164**:163-173

[6] Agarwal B, Mittal N, Bansal P, Garg S. Sentiment analysis using common-sense and context information. Computational Intelligence and Neuroscience. 2015;**2015**

[7] Rajput V, Bobde S. Stock market forecasting techniques: Literature survey. International Journal of Computer Science and Mobile Computing. 2016;**5**(6):500-506

[8] Nguyen TH, Shirai K, Velcin J. Sentiment analysis on social media for stock movement prediction. Expert Systems with Applications. 2015;**42**(24):9603-9611

[9] Sun A, Lachanski M, Fabozzi FJ. Trade the tweet: Social media text mining and sparse matrix factorization for stock market prediction. International Review of Financial Analysis. 2016;**48**:272-281

[10] Schumaker RP, Chen H. Textual analysis of stock market prediction using breaking financial news: The AZFin text system. ACM Transactions on Information Systems (TOIS). 2009;**27**(2):1-9

[11] Ming F, Wong F, Liu Z, Chiang M. Stock market prediction from WSJ: Text mining via sparse matrix factorization. In: 2014 IEEE International Conference on Data Mining. IEEE; 2014. pp. 430-439

[12] Chen MY, Liao CH, Hsieh RP. Modeling public mood and emotion: Stock market trend prediction with anticipatory computing approach. Computers in Human Behavior. 2019;**101**:402-408

[13] Zhang X, Qu S, Huang J, Fang B, Yu P. Stock market prediction via multisource multiple instance learning. IEEE Access. 2018;**6**:50720-50728

[14] Liu G, Wang X. A numerical-based attention method for stock market prediction with dual information. IEEE Access. 2018;**7**:7357-7367

[15] Weng B, Ahmed MA, Megahed FM. Stock market one-day ahead movement prediction using disparate data sources. Expert Systems with Applications. 2017;**79**:153-163

[16] Ding X, Zhang Y, Liu T, Duan J. Deep learning for event-driven stock prediction. In: Twenty-Fourth International Joint Conference on Artificial Intelligence; 2015

[17] Vargas MR, De Lima BS, Evsukoff AG. Deep learning for stock market prediction from financial news articles. In: 2017 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA). IEEE; 2017. pp. 60-65

[18] Khedr AE, Yaseen N. Predicting stock market behavior using data mining technique and news sentiment analysis. International Journal of Intelligent Systems and Applications. 2017;**9**(7):22

[19] Mudinas A, Zhang D, Levene M. Market trend prediction using sentiment analysis: Lessons learned and paths forward. 2019. arXiv preprint arXiv:1903.05440

[20] Granger CW. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society. 1969;**1**:424-438

[21] Checkley MS, Higón DA, Alles H. The hasty wisdom of the mob: How market sentiment predicts stock market behavior. Expert Systems with Applications. 2017;**77**:256-263

[22] Souza TT, Aste T. Predicting future stock market structure by combining social and financial network information. Physica A: Statistical Mechanics and its Applications. 2019;**535**:122343

[23] Wu GG, Hou TC, Lin JL. Can economic news predict Taiwan stock market returns? Asia Pacific Management Review. 2019;**24**(1):54-59

[24] Bujari A, Furini M, Laina N. On using cashtags to predict companies stock trends. In: 2017 14th IEEE Annual Consumer Communications & Networking Conference (CCNC). IEEE; 2017. pp. 25-28

[25] Dumais S, Platt J, Heckerman D, Sahami M. Inductive learning algorithms and representations for text categorization. In: Proceedings of the Seventh International Conference on Information and Knowledge Management; 1998. pp. 148-155

[26] Joachims T. Text categorization with support vector machines: Learning with many relevant features. In: European conference on machine learning. Berlin/ Heidelberg: Springer; 1998. pp. 137-142

[27] Xie Y, Jiang H. Stock market forecasting based on text mining technology: A support vector machine method. 2019. arXiv preprint arXiv:1909.12789

[28] Li X, Xie H, Chen L, Wang J, Deng X. News impact on stock price return via sentiment analysis. Knowledge-Based Systems. 2014;**69**:14-23

[29] Loughran T, McDonald B. When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. The Journal of Finance. 2011;**66**(1):35-65

[30] Long W, Song L, Tian Y. A new graphic kernel method of stock price trend prediction based on financial news semantic and structural similarity. Expert Systems with Applications. 2019;**118**:411-424

[31] Porshnev A, Redkin I, Shevchenko A. Machine learning in prediction of stock market indicators based on historical data and data from twitter sentiment analysis. In: 2013 IEEE 13th International Conference on Data Mining Workshops. IEEE; 2013. pp. 440-444

[32] Xu F, Keelj V. Collective sentiment mining of microblogs in 24-hour stock price movement prediction. In: 2014 IEEE 16th Conference on Business Informatics, Vol. 2. IEEE; 2014. pp. 60-67

[33] Uysal AK, Murphey YL. Sentiment classification: Feature selection based approaches versus deep learning. In: 2017 IEEE International Conference on Computer and Information Technology (CIT). IEEE; 2017. pp. 23-30

[34] Sohangir S, Wang D, Pomeranets A, Khoshgoftaar TM. Big data: Deep learning for financial sentiment analysis. Journal of Big Data. 2018;**5**(1):3

[35] Singhal P, Bhattacharyya P. Sentiment Analysis and Deep Learning: A Survey. Bombay: Center for Indian Language Technology, Indian Institute of Technology; 2016

[36] Martin V. Predicting the French stock market using social media analysis. In: 2013 8th International Workshop on Semantic and Social Media Adaptation and Personalization. IEEE; 2013. pp. 3-7

[37] Geva T, Zahavi J. Empirical evaluation of an automated intraday stock recommendation system incorporating both market data and textual news. Decision Support Systems. 2014;**57**:212-223

[38] Goldberg DE. Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley; 1989

[39] Khatri SK, Srivastava A. Using sentimental analysis in prediction of stock market investment. In: 2016 5th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO). IEEE; 2016. pp. 566-569

[40] Chen W, Zhang Y, Yeo CK, Lau CT, Lee BS. Stock market prediction using neural network through news on online social networks. In: 2017 International Smart Cities Conference (ISC2). IEEE; 2017. pp. 1-6

[41] Shastri M, Roy S, Mittal M. Stock price prediction using artificial neural model: An application of big data. EAI Endorsed Transactions on Scalable Information Systems. 2019;**6**(20)

[42] Picasso A, Merello S, Ma Y, Oneto L, Cambria E. Technical analysis and sentiment embeddings for market trend prediction. Expert Systems with Applications. 2019;**135**:60-70

[43] Cambria E, Fu J, Bisio F, Poria S. AffectiveSpace 2: Enabling affective intuition for concept-level sentiment analysis. In: Twenty-Ninth AAAI Conference on Artificial Intelligence; 2015

[44] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation. 1997;**9**(8):1735-1780

[45] Rao G, Huang W, Feng Z, Cong Q. LSTM with sentence representations for documentlevel sentiment classification. Neurocomputing. 2018;**308**:49-57

[46] Siami-Namini S, Namin AS. Forecasting economics and financial time series: ARIMA vs. LSTM. 2018. arXiv preprint arXiv:1803.06386

[47] Huynh HD, Dang LM, Duong D. A new model for stock price movements prediction using deep neural network. In: Proceedings of the Eighth International Symposium on Information and Communication Technology; 2017. pp. 57-62

[48] Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems; 2013. pp. 3111-3119

[49] Wang Y, Huang M, Zhu X, Zhao L. Attention-based LSTM for aspect-level sentiment classification. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing; 2016. pp. 606-615

*Recent Advances in Stock Market Prediction Using Text Mining: A Survey DOI: http://dx.doi.org/10.5772/intechopen.92253*

[50] Liu H. Leveraging financial news for stock trend prediction with attentionbased recurrent neural network. 2018. arXiv preprint arXiv:1811.06173

[51] Kraus M, Feuerriegel S. Decision support from financial disclosures with deep neural networks and transfer learning. Decision Support Systems. 2017;**104**:38-48

[52] Li J, Bu H, Wu J. Sentiment-aware stock market prediction: A deep learning method. In: 2017 International Conference on Service Systems and Service Management. IEEE; 2017. pp. 1-6

[53] Collobert R, Weston J. A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th International Conference on Machine Learning; 2008. pp. 160-167

[54] Ho CC, Baharim KN, Fatan AA, Alias MS. Deep neural networks for text: A review. In: The 6th International Conference on Computer Science and Computational Mathematics. Langkawi, Malaysia; 2017

[55] Ding X, Zhang Y, Liu T, Duan J. Using structured events to predict stock price movement: An empirical investigation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP); 2014. pp. 1415-1425

[56] Vu TT, Chang S, Ha QT, Collier N. An experiment in integrating sentiment features for tech stock prediction in twitter. In: Proceedings of the Workshop on Information Extraction and Entity Analytics on Social Media Data; 2012. pp. 23-38

[57] Moniz A, de Jong F. Classifying the influence of negative affect expressed by the financial media on investor behavior. In: Proceedings of the 5th

Information Interaction in Context Symposium; 2014. pp. 275-278

[58] Bing L, Chan KC, Ou C. Public sentiment analysis in Twitter data for prediction of a company's stock price movements. In: 2014 IEEE 11th International Conference on e-Business Engineering. IEEE; 2014. pp. 232-239

[59] Li X, Huang X, Deng X, Zhu S. Enhancing quantitative intra-day stock return prediction by integrating both market news and stock prices information. Neurocomputing. 2014;**142**:228-238

[60] Shynkevich Y, McGinnity TM, Coleman S, Belatreche A. Stock price prediction based on stock-specific and sub-industry-specific news articles. In: 2015 International Joint Conference on Neural Networks (IJCNN). IEEE; 2015. pp. 1-8

[61] Cakra YE, Trisedya BD. Stock price prediction using linear regression based on sentiment analysis. In: 2015 International Conference on Advanced Computer Science and Information Systems (ICACSIS). IEEE; 2015. pp. 147-154

[62] Ghanavati M, Wong RK, Chen F, Wang Y, Fong S. A generic service framework for stock market prediction. In: 2016 IEEE International Conference on Services Computing (SCC). IEEE; 2016. pp. 283-290

[63] Gálvez RH, Gravano A. Assessing the usefulness of online message board mining in automatic stock prediction systems. Journal of Computational Scienc. 2017;**19**:43-56

[64] Maqsood H, Mehmood I, Maqsood M, Yasir M, Afzal S, Aadil F, et al. A local and global event sentiment based efficient stock exchange forecasting using deep learning. International Journal of Information Management. 2020;**50**:432-451

#### **Chapter 8**

## Modern Business Intelligence: Big Data Analytics and Artificial Intelligence for Creating the Data-Driven Value

*Ahmed A.A. Gad-Elrab*

### **Abstract**

Currently, business intelligence (BI) systems are used extensively in many business areas where decisions are made to create value. BI is the process of extracting, analyzing, and predicting business-critical insights from available data. Traditional BI focuses on collecting, extracting, and organizing data to enable efficient and professional query processing for obtaining insights from historical data. With the advent of big data, the Internet of Things (IoT), artificial intelligence (AI), and cloud computing (CC), BI has become an even more critical and important process and has received great interest in both industry and academia. The main problem is how to use these new technologies to create data-driven value for modern BI. To address this problem, this chapter introduces and discusses the importance of big data analytics, data mining, and AI for building and enhancing modern BI. In addition, it examines the challenges and opportunities of creating value from data by establishing modern BI processes.

**Keywords:** business intelligence, big data analytics, artificial intelligence, IoT, data mining, data governance

#### **1. Introduction**

Recently, in the Fourth Industrial Revolution, a very large amount of data is created and generated by machines such as GPS devices, sensors, websites, and application systems, or by people through social media (Twitter, Facebook, Instagram, or LinkedIn) [1]. Every moment, data servers store huge amounts of data produced by organizations. This data comes from websites, social media, tracking, IoT applications, sensors, and online news articles. Moreover, advances in computing and communication technologies have facilitated collecting large volumes of heterogeneous data from multiple sources. This data consists of structured and unstructured, complex and simple information.

Currently, up to 80% of the data from which businesses derive revenue through analysis is in unstructured form [2]. By analyzing this unstructured data, which contains valuable information, organizations can improve their business processes and productivity. Such analysis is also significant for education, security, healthcare, and manufacturing.

This can be achieved through big data analytics, artificial intelligence, and data management, which together enable business intelligence.

BI comprises the technologies, tools, systems, and applications for compiling, analyzing, combining, and presenting business reports in a way that supports active business decision-making. This helps organizations to gain, understand, and control their data for further decision-making aimed at developing business processes and procedures [3]. BI can also be described as the ability of a firm to make meaningful use of the data it collects every day from its business processes and operations [4].

Business intelligence (BI) plays an important role in helping decision makers obtain insights for better, faster decisions and improved productivity. In addition, BI can enhance the effectiveness of operational rules and their impact on corporate-level decision-making, supervision, administration, budgeting, and financial reporting, which yields better strategic alternatives in dynamic business environments [5]. BI can also improve organizational performance by identifying new opportunities, revealing new business insights, highlighting potential threats, and enhancing decision-making processes, among many other benefits [6, 7].

The first issue in business is managing big data in its various formats, a serious problem because current tools are not adequate for managing such massive data volumes [8]. New challenges in data integration complexity, storage capacity, lack of governance, and analytical tools make it important to solve the big data management problems of pre-processing, processing, security, and storage. Managing the abundant data generated by heterogeneous sources for use in BI and decision-making is a complex process; nevertheless, an estimated 75% of organizations manage some form of big data. The goal of big data management is to ensure the effectiveness of big data security, storage, and analytics applications [9].

Unfortunately, research on the practical implications of using big data analytics to enhance business intelligence remains comparatively immature, because existing research models focus mainly on the benefits and challenges of business intelligence and big data. The most important issues are therefore to study the implications of big data analytics on business intelligence for data collected from various sources, and to explore future directions for further developments in the use of big data analytics for business intelligence.

The second issue in BI is determining the most appropriate data mining technique, which is one of the most critical responsibilities. The optimal technique depends on the nature of the business, the difficulties encountered, and the kind of objects involved [10]. In the data mining process, the core techniques define the character of the data retrieval options and the mining process itself; chosen well, the data mining technique will be highly productive [11]. Many data mining techniques, such as association rules, clustering, classification, decision trees, and neural networks, are highly successful and practical.

Data mining is about interpreting huge volumes of data and extracting knowledge from their various objects. For some businesses, the purposes of data mining are to identify trends, develop marketing capabilities, and predict the future based on earlier observations and current tendencies. Examining the data is also required to support investment decisions and other entrepreneurial purposes. Furthermore, data mining can be applied to recognize unusual performance, such as identifying strange behavior of agents operating certain technologies [12].
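As a concrete illustration of one of the techniques named above, the following minimal sketch mines frequent item pairs, the first step of association-rule mining. The market-basket data is a hypothetical toy example, and the snippet is only a sketch of the idea, not a production data mining implementation:

```python
from itertools import combinations
from collections import Counter

# Toy market-basket data (hypothetical): each basket is one transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
]

# Count the support of every item pair across all baskets.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep pairs occurring in at least half of all baskets (min support = 0.5).
min_support = len(baskets) / 2
frequent = {p: c for p, c in pair_counts.items() if c >= min_support}
print(sorted(frequent))
```

Frequent pairs such as (bread, butter) would then seed candidate association rules (e.g., "customers who buy bread also buy butter"), whose confidence can be checked against the same baskets.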

The third issue in BI is artificial intelligence (AI). AI is a major step in the evolution of technology that has been actively pursued since the British mathematician and code breaker Alan Turing envisioned a clear way forward in his groundbreaking 1950 paper, "Computing Machinery and Intelligence." At the time, computer technology could not keep up with Turing's ideas, but advances in computing allowed AI to become established. The Future of Humanity Institute at Oxford University published a 2018 report surveying a panel of AI researchers on timelines for strong AI. The report found a 50% chance that AI will outperform humans in all tasks within 45 years and automate all human jobs within 120 years. At the same time, AI will bring many opportunities to create new jobs. As many experts note, one of the great values of AI is removing the need to perform tedious and repetitive tasks, freeing users to focus on their main skills and values. Applying technology in many industries and businesses has long aimed at reducing human error, shrinking labor costs, and thereby increasing profit. This was true for the advancements made from the Industrial Revolution through to the birth of the computer, and it remains true in the era of AI.

In this chapter, the importance of big data analytics, data mining, and AI for building and enhancing modern BI is introduced and discussed. In addition, the challenges and opportunities of creating value from data by establishing modern BI processes are examined.

#### **2. Business intelligence (BI)**

Business intelligence (BI) can be described as an automated process for deriving models and insights from raw data that is collected from heterogeneous data sources and organized systematically to improve business operations and processes. In enterprise BI architectures, the best practice is to separate the data collection and data organization processes of the back-end architecture from the data analysis and display presented to users through the front end. In BI, processed transactions generate data that is stored in operational data sources managed by online transaction processing (OLTP) servers. From the OLTP systems, the data is extracted, transformed, and loaded into a structured repository called a data warehouse. Within the data warehouse, different query optimization techniques can be applied to speed up data analysis and analytics queries. To achieve this speed-up, subsets of the data warehouse, called data marts, are created. Traditional BI systems also use reporting mechanisms to access the transaction data stored in the data warehouse, and analyzing this transaction data can help detect patterns and predict business trends.
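The OLTP-to-warehouse flow described above can be sketched in a few lines. The snippet below is a minimal illustration using two in-memory SQLite databases; the table names and sales figures are invented for the example, and a real ETL pipeline would of course be far more elaborate:

```python
import sqlite3

# Two in-memory databases stand in for the OLTP source and the warehouse.
oltp = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

# --- OLTP side: raw transaction records --------------------------------
oltp.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
oltp.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [(1, "east", 120.0), (2, "west", 80.0), (3, "east", 50.0)])

# --- Extract & Transform: aggregate transactions per region ------------
rows = oltp.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()

# --- Load into the warehouse (a tiny 'data mart' for regional sales) ---
warehouse.execute("CREATE TABLE sales_by_region (region TEXT, total REAL)")
warehouse.executemany("INSERT INTO sales_by_region VALUES (?, ?)", rows)

print(warehouse.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region").fetchall())
# [('east', 170.0), ('west', 80.0)]
```

The aggregated `sales_by_region` table plays the role of a data mart: reports and dashboards query this pre-summarized subset rather than the raw OLTP transactions.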

Recently, the data sources of BI have expanded beyond traditional transaction data to include modern sources such as mobile devices, sensor data, and web messages sent over company intranets, as well as profiles of employees and customers. Most modern data sources are unstructured, for example, messages posted on online social networks (OSN) and data from various sensors. The main challenge is therefore how to maintain these modern data sources alongside traditional relational databases while achieving query efficiency. From the data analysis perspective, additional data means additional opportunities to discover more insights; from the analytics perspective, however, the challenges of big data remain a major problem.

Due to the increase in data, opportunities within the scope of BI have expanded: BI is no longer only a mechanism for analyzing historical data trends, but can combine data from sensors and other real-time personal information to infer insights that are not commonly available, which is called situational BI [13]. Applied to business operations, BI is called operational BI, which provides insights to these operations in real time, such as giving a call center instant feedback on the benefits of its work. In addition, analytics rules may be composed based on meta-information about the data exposed to the user, which can be considered self-service BI. These new BI approaches must be managed carefully so that enterprise compliance models and governance are not violated.

**Figure 1.** *A traditional BI system.*

The three-tier architecture of a traditional BI system is shown in **Figure 1**. It consists of three layers: 1) the presentation layer, 2) the application layer, and 3) the database layer. The main challenge with this three-tier architecture is how to fulfill service-level objectives such as minimum throughput rates and maximum response times. This is because the data storage management at the lower layers is hidden from the application layer, which makes execution times difficult to predict.

Traditional BI systems are efficient at extracting and analyzing data, but they are rigid, slow, time-consuming, and require knowledge experts for maintenance. Therefore, much research has been devoted to adding modern features that improve the three-tier architecture and establish the next generation of BI.

#### **3. Modern business intelligence (MBI)**

Traditional BI platforms mainly aim to answer the question "What happened?" by providing efficient analyses. Modern BI platforms, in contrast, answer "What is happening, what will happen, and why?", offering the ability to monitor the continuous development of an organization with fast analytics and to accomplish mission objectives using predictive analytics.

Over the past two decades, traditional business intelligence platforms have mainly succeeded in providing users with comprehensive historical reports and easy-to-use custom analysis tools. The availability of BI functionality is largely determined by the underlying data architecture, which consists of a central data storage solution such as an enterprise data warehouse (EDW). EDWs form the backbone of traditional data management platforms and usually connect vast networks of data source systems into a central data warehouse. The data in the EDW is then consolidated, refined, and pulled into different reports and dashboards to display past business information, such as weekly revenue metrics or quarterly sales.

This traditional BI provides a solid basis for such dashboards and interim reports. However, while users have gained immense value from the historical reporting capabilities of traditional platforms, more users now require data analysis technologies that give them direct access to data without depending on IT professionals. Federal agencies have highlighted several challenges associated with traditional BI solutions for analytics [13].


**Figure 2.** *Growth of BI platforms based on insights: from hindsight to insights to foresight [14].*

Integrating traditional and modern BI platforms is essential to laying the groundwork for enterprise-wide data transformation, and organizations are understandably reluctant to discard their existing IT infrastructure and start over. Data warehouses play a major role in existing data platforms, providing fully cleaned, organized, and managed data for most businesses and companies. The data warehouse gives business managers, executives, and others the ability to obtain insights from historical data with relative ease and without deep technical knowledge. Data obtained from a data warehouse is very accurate thanks to careful testing, IT cleaning, and precise knowledge of the data layers. However, the challenges of traditional BI create a demand to augment the EDW with a differently optimized architecture for fast access to ever-changing data: the Hadoop data lake.

Organizations looking to upgrade their analytics platforms are beginning to adopt the data lake concept. Data lakes store information in its raw and unfiltered form, whether structured, semi-structured, or unstructured. Unlike a standalone EDW, data lakes themselves perform little automated data cleaning and transformation, allowing data to be ingested more efficiently, but they transfer the greater part of the responsibility for preparing and analyzing data to business users.

Data lakes can offer a low-cost solution by using the Hadoop Distributed File System (HDFS) to efficiently store various types of data and analyze them in their original structure. As shown in **Figure 3**, a data lake coupled with the data warehouse defines the next generation of BI and provides an optimal basis for data analysis.

In the system shown in **Figure 3**, the EDW receives data from different source systems through the ETL (Extract, Transform, and Load) process. After the data is cleaned, transformed, and standardized, it is ready for analysis by a diverse group of users through dashboards and reports.

In parallel, a data lake collects raw data from one, several, or all source systems, and this data is ingested and made ready for discovery and analysis. The result is a broader user base that can explore and create relationships between vast amounts of varied data for individual, on-demand analyses.
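The contrast between the two stores can be sketched as follows. The JSON records and field names below are hypothetical; the snippet only illustrates the schema-on-write discipline of a warehouse versus the schema-on-read flexibility of a lake:

```python
import json

# Data lake: raw events are stored as-is; a schema is applied only at read time.
lake = [
    '{"user": "ann", "amount": 20, "note": "promo"}',
    '{"user": "bob", "amount": 35}',            # fields may vary freely
]

# Warehouse: a fixed schema is enforced when data is loaded (schema-on-write).
def load_into_warehouse(raw_records):
    table = []
    for raw in raw_records:
        rec = json.loads(raw)
        table.append((rec["user"], float(rec["amount"])))  # drop unknown fields
    return table

warehouse_table = load_into_warehouse(lake)
print(warehouse_table)          # [('ann', 20.0), ('bob', 35.0)]

# Analysts can still explore the raw lake on demand (schema-on-read):
promo_users = [json.loads(r)["user"] for r in lake
               if "promo" in json.loads(r).get("note", "")]
print(promo_users)              # ['ann']
```

The warehouse table is clean and uniform but discards the `note` field; the lake keeps every field, and the ad hoc `promo_users` query recovers information that the warehouse schema dropped.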

#### **3.1 Features of modern BI**


*Modern Business Intelligence: Big Data Analytics and Artificial Intelligence for Creating… DOI: http://dx.doi.org/10.5772/intechopen.97374*

**Figure 3.** *Data sources, data warehouse, and data analytics in modern BI [14].*

2.*Situational BI:* it combines real-time external information with enterprise data, for example, to know whether natural disasters have affected a company's contract suppliers; recognizing such events enables businesses to take appropriate measures to reduce losses [16].

3.*Self-service BI (SSBI):* it enables end users to generate analyses and analytical queries without involving the IT department. In SSBI, the application user interface must be easy to use and intuitive, so that technical knowledge of the data repository is not needed. In addition, users should be allowed to access and extend not only the data sources organized by IT but also nontraditional sources.

#### **3.2 Data architecture**


#### *E-Business - Higher Education and Intelligence Applications*

**Figure 4.** *Classification of database systems.*

In this architecture, each OLAP workload must wait until the data in the data pool is completely refreshed and visible, which causes delays. Today, to reduce this delay, operational BI systems execute OLTP transactions and short-term analytical queries together on the same DBMS, as shown in **Figure 4b**. These are called short OLAP workloads. However, long-term OLAP workloads may conflict with the many short OLTP transactions that make changes to the database, so heavy synchronization is needed to handle resource competition, which lowers the utilization of all resources.

Commercial database management systems (DBMSs) also use special techniques, such as shadow copies [17], to handle mixed workloads with lower overheads: different workloads are separated and executed on different logical versions of the data. This, however, may increase the space required, which raises infrastructure costs and requirements. Managing these mixed (OLAP and OLTP) workloads therefore remains a major challenge in current disk-based DBMSs [18].

#### **3.3 Current BI systems**




**Table 1.** *Systems of modern BI that use various methods to hold most or all of the data in main memory.*

**Table 1** lists modern BI systems that use various methods to hold most or all of the data in main memory to obtain high OLTP throughput. For example, the H-Store system runs on a distributed cluster of shared-nothing machines, with the data located entirely in main memory. H-Store can execute transaction processing at high throughput rates by removing traditional DBMS features such as buffer management, locking, and logging. The H-Store prototype was later commercialized by a startup called VoltDB [19].

• *Hybrids with an on-disk database*: Main memory has become big enough to handle most OLTP databases, but keeping everything in memory may not always be the best choice. OLTP workloads exhibit skewed access patterns in which some records are "hot" (accessed frequently) while others are "cold" (rarely or never accessed). Modern systems therefore keep hot records in memory and move the coldest records to fast secondary storage devices while preserving good performance. For example, Stoica and Ailamaki [19] suggested a way to migrate main-memory database data to cheaper and larger secondary storage. In [20], relational data structures are reorganized using access statistics for OLTP workloads in order to improve main-memory hit rates and reduce I/O. Recently, Siberia was introduced as a cold-data management framework in the Microsoft Hekaton in-memory DBMS [21]. Like [19], it does not require storing the entire database in main memory.

Hekaton focuses on how records are migrated to and from a cold store and how records in the cold store are accessed and updated in a transactionally consistent manner. Only selected tables are declared and managed in main memory by Hekaton. Experimental evaluation shows that when the cold store resides on commodity flash, Siberia incurs an acceptable throughput loss of 7–14% relative to a purely main-memory database, given suitably low cold-data access rates.
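The hot/cold migration idea can be illustrated with a small sketch. The class below is a hypothetical stand-in (an LRU policy evicting the least-recently-used records to a plain dictionary that plays the role of secondary storage), not the actual Siberia or Hekaton mechanism:

```python
from collections import OrderedDict

class HotColdStore:
    """Keep the hottest records in memory; evict the coldest to a 'disk' dict."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.hot = OrderedDict()   # in-memory store, ordered by recency
        self.cold = {}             # stand-in for cheap secondary storage

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)              # mark as most recently used
        while len(self.hot) > self.capacity:
            k, v = self.hot.popitem(last=False)  # evict least recently used
            self.cold[k] = v

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)          # record stays hot
            return self.hot[key]
        value = self.cold.pop(key)             # cold hit: migrate back to memory
        self.put(key, value)
        return value

store = HotColdStore(capacity=2)
for k in ("a", "b", "c"):
    store.put(k, k.upper())
print(sorted(store.cold))   # ['a']  -- 'a' was the coldest record
print(store.get("a"))       # 'A'    -- reading it migrates it back to memory
print(sorted(store.cold))   # ['b']  -- 'b' was evicted in its place
```

Real systems decide hot/cold membership from access statistics gathered over time rather than a strict LRU order, but the basic migration cycle is the same.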

2.*Modern features of BI systems:* There are three modern BI capabilities: operational BI, situational BI, and self-service BI. Whereas the H-Store system handles only OLTP transaction processing, a modern system called HyPer can handle mixed workloads of both OLTP and OLAP at extremely high throughput rates by using a low-overhead mechanism to create differential snapshots [22]. HyPer uses a lock-free approach that allows all OLTP transactions to be carried out sequentially or on private partitions. In parallel with OLTP processing, the HyPer system performs OLAP queries on the same consistent snapshot.
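On POSIX systems, the snapshot principle behind such designs can be illustrated with `os.fork`, which gives a child process a copy-on-write view of the parent's memory. The toy "database" below is invented for the example, and the sketch only demonstrates the principle of analyzing a consistent snapshot while transactions continue, not HyPer's actual implementation:

```python
import os

# Toy "database": account balances held entirely in main memory.
db = {f"acct{i}": 100 for i in range(5)}   # total = 500

r, w = os.pipe()
pid = os.fork()  # child inherits a copy-on-write snapshot of the parent's memory

if pid == 0:
    # Child = OLAP worker: runs analytics on the consistent snapshot.
    os.close(r)
    total = sum(db.values())
    os.write(w, str(total).encode())
    os.close(w)
    os._exit(0)
else:
    # Parent = OLTP worker: keeps processing transactions; its writes
    # do not disturb the child's snapshot.
    os.close(w)
    db["acct0"] += 999
    snapshot_total = int(os.read(r, 64))
    os.waitpid(pid, 0)
    print(snapshot_total)        # 500:  the snapshot predates the update
    print(sum(db.values()))      # 1499: the live database sees the update
```

Because the operating system copies memory pages only when one side writes to them, the analytical query pays almost nothing for its stable view, which is the essence of the low-overhead snapshot mechanism.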

Castellanos et al. [23] proposed a new platform, SIE-OBI, to alert business managers to situations that could affect their business. SIE-OBI integrates the functions required to exploit relevant fast-flowing information from the web. The authors proposed new schemes for extracting information obtained from the web and linking it with the historical data stored in the data warehouse to reveal situational patterns. The relevant information is extracted from two or more different unstructured data sources, usually one slow stream of internal text and one fast stream of external text. The platform minimizes the time and effort needed to combine slow and fast data streams that integrate structured and unstructured flows, and to analyze them in near real time.

#### **3.4 Data governance**


Next-generation BI supports near real-time insights using external information, which generates large amounts of data and data manipulation. This requires very mature data governance (DG) to provide data quality, reliability, and integrity.


**Figure 5.** *Framework of data governance as defined by DAMA International [27].*

These three characteristics are crucial for extracting accurate insights through data mining techniques. For example, self-service BI tools (such as Tableau and QlikTech) allow users to discover insights from many data sources without modeling the data environment or implementing complex ETL operations, one of the most time-consuming and difficult tasks in BI. These new features let users easily access data, get quick results, and visualize the data. To enable the evolution of next-generation BI, data governance is critical to the reliability of the discovered insights. For example, in the case of self-service BI, the fact that end users can access and process their own data reduces the reliability of BI results [32]. In data governance, useful functions for ensuring reliability include tracing data back to its source and creating records of how data is processed or transferred. However, integrating data governance into next-generation BI faces challenges, because flexible and reliable responses are required while there is an enormous amount of external data and broad public user engagement.
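Record-level lineage of the kind described can be sketched minimally as follows. The `TrackedValue` class, its fields, and the file name are invented for illustration; real governance platforms record far richer provenance metadata:

```python
# A minimal data-lineage sketch: each transformation appends a provenance
# record, so any value in a self-service BI report can be traced to its source.
class TrackedValue:
    def __init__(self, value, source):
        self.value = value
        self.lineage = [f"loaded from {source}"]

    def apply(self, fn, description):
        out = TrackedValue(fn(self.value), source=None)
        out.lineage = self.lineage + [description]   # inherit full history
        return out

v = TrackedValue(100.0, source="crm_export.csv")     # hypothetical source file
v = v.apply(lambda x: x * 1.2, "applied 20% uplift")
v = v.apply(round, "rounded to whole units")
print(v.value)      # 120
print(v.lineage)    # every step from source to report is on record
```

When a business user questions a figure in a dashboard, the lineage list answers where the number came from and how it was transformed, which is exactly the reliability function governance is meant to provide.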

3.*Data governance challenges:* Next-generation BI has two main characteristics that affect the data governance model. First, decision-making in next-generation BI should be more effective and faster despite the huge amount of data arriving in many formats from many sources. However, data from many sources is more difficult to manage and more complicated to control properly, which can also lead to ineffective decisions. When data comes from different, conflicting sources, the decision-maker must carry out further research and analysis of the data and its sources to determine what is true and accurate, or approximately so, which is a costly operation. Managing data across heterogeneous sources is therefore very important in a next-generation BI system. Second, in next-generation BI, especially self-service BI, business users participate in decision-making procedures.

In general, data governance initiatives have involved the central IT organization and many data stewards, who maintain a metadata repository for the data governance platform and a set of data management tools to deal with varied data. In advance, they standardize common data definitions for master data and reference data that are widely shared across many enterprise applications. When they receive disparate data, they match it to the predefined shared data definitions, determine its quality, decide which rules apply, and convert and merge it. In next-generation BI, however, users also select, manipulate, or merge data and name it themselves using various self-service BI tools; they may want to upload it to the database and share their view with others. Business-user participation in the data process can leave data in a mess, because the same data can be converted and combined in various ways both by data stewards in a central organization using data management tools and by business users with self-service BI tools. Consequently, metadata sharing criteria for shared data, shared data names, and shared integration rules are crucial [33].

4.*Data governance model for next-generation BI:* The design of a data governance model involves choosing between centralized and decentralized structures, and between hierarchical and cooperative ones. A centralized design assigns all decision-making authority to the central IT department, while a decentralized design assigns authority to individual business units [25].

The term big data refers to huge and complex data sets from various sources that traditional data management and application processing techniques struggle to handle. Big data is a collection of large amounts of structured or unstructured data that is processed and analyzed for informed decision-making or evaluation. These data can be drawn from various sources including browsing history, geographic location, social media, medical records, and purchasing records. Big data is made up of complicated data that would overwhelm the processing power of traditional simple database systems [34]. In [35], the author notes three main characteristics associated with big data: (1) *Volume* describes the vast amounts of data that big data involves; the amounts start in the gigabytes and grow far beyond, and a big data system should be able to handle any amount of data even with its highly anticipated growth. (2) *Variety* describes the different types of data sources used as part of a big data analytics system. Many storage formats are in use on computers all over the world: structured data such as databases, .csv files, and Excel spreadsheets, alongside video and short message service (SMS) data, while unstructured data can take the form of, for example, handwritten notes. Ideally, data from all of these sources can be used for big data analytics. (3) *Velocity* describes the speed at which data is generated, and also the speed at which the generated data is processed. With the click of a button, an online retailer can quickly view big data about a specific customer. Speed also matters for keeping data up to date in real time, allowing the system to perform at its best; real-time data generation helps organizations accelerate operations, which can save institutions a large amount of money.
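The three Vs can be illustrated with a toy ingestion sketch; the formats, records, and metrics below are hypothetical, chosen only to make volume, variety, and velocity concretely measurable:

```python
import csv
import io
import json
import time

# Toy ingestion pipeline making the three Vs measurable:
# volume (how much data), variety (how many formats), velocity (how fast).
def ingest(records):
    """Parse records arriving in mixed formats into plain dicts."""
    parsed = []
    for fmt, payload in records:
        if fmt == "json":                           # structured source
            parsed.append(json.loads(payload))
        elif fmt == "csv":                          # structured source
            parsed.append(dict(next(csv.DictReader(io.StringIO(payload)))))
        else:                                       # unstructured free text
            parsed.append({"text": payload})
    return parsed

records = [
    ("json", '{"customer": "A", "amount": 10}'),
    ("csv", "customer,amount\nB,25"),
    ("text", "customer C asked about a refund"),
]

start = time.perf_counter()
parsed = ingest(records)
elapsed = time.perf_counter() - start

volume = sum(len(payload) for _, payload in records)  # rough byte count
variety = len({fmt for fmt, _ in records})            # distinct formats
velocity = len(records) / max(elapsed, 1e-9)          # records per second
print(volume, variety, round(velocity))
```

A real pipeline would, of course, face far larger volumes and messier formats; the point here is only that all three Vs can be observed even on three records.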

Today, many companies are increasingly interested in using big data technologies to support their BI, so it has become very important to understand the practical issues learned from previous experience with BI systems. Today's BI systems sense the world and harness these data points to recommend the best possible options and forecast results accurately. As BI systems move closer to real time, the demand for near-real-time data collection, integration, processing, and visualization increases. BI systems are characterized by rich sensing opportunities, with sensors ranging from mobile phones, personal computers, and health-tracking devices to Internet of Things (IoT) technologies designed to give contextual and semantic voice to entities that previously could not contribute intelligently to key decisions. As a result, many companies are analyzing big data today.

#### *Modern Business Intelligence: Big Data Analytics and Artificial Intelligence for Creating… DOI: http://dx.doi.org/10.5772/intechopen.97374*

Big data analytics, and the machine learning techniques it relies on, are needed because data sets are often distributed; their size and privacy considerations call for distributed techniques, where data resides on platforms with different computing capabilities and networks. This diversity of applications makes big data analytics both beneficial and challenging. As an example, Walmart's servers handle more than one million customer transactions every hour, and this information is stored in databases holding more than 2.5 petabytes of data, which is 167 times the number of books in the US Library of Congress. Similarly, CERN's Large Hadron Collider produces around 15 petabytes of data annually, enough to fill over 1.7 million double-layer DVDs [36]. Big data analytics is used in education, health care, media, insurance, manufacturing, and government. Big data analytics for business intelligence and decision support systems, enabling healthcare organizations to analyze data of tremendous size, diversity, and speed, has been developed across a wide range of healthcare networks to support evidence-based decision-making and action [37]. It is therefore clear from this discussion that data management and big data analytics [38] are important in BI.
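As a back-of-the-envelope check on the DVD comparison quoted above (assuming decimal petabytes and a nominal 8.5 GB per double-layer disc):

```python
# Reproducing the scale comparison quoted above (assumptions: decimal
# petabytes, i.e. 10**15 bytes, and a nominal 8.5 GB double-layer DVD).
PETABYTE = 10**15
DVD_DOUBLE_LAYER = 8.5 * 10**9          # bytes per disc

cern_annual_bytes = 15 * PETABYTE       # LHC output per year
dvds_per_year = cern_annual_bytes / DVD_DOUBLE_LAYER
print(round(dvds_per_year))             # about 1.76 million discs
```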


The rapid development of business intelligence and analytics (BI&A) has attracted researchers' attention. The reason is that organizations can no longer rely on traditional technologies as data grows exponentially. This huge amount of data requires advanced analytical techniques to convert it into valuable information that supports organizational growth. BI&A is the contemporary methodology for extracting value from this vast amount of data, driving strategic decision-making, and forecasting and capitalizing on future opportunities.

BI&A is necessary in most organizations and has proven to be an effective support for decision-making; moreover, data and IT infrastructure are clearly influenced by the good use of BI&A practices. Nowadays, business intelligence and analytics play a vital role in most institutions and sectors because of their value and benefits. BI&A helps organizations gain a better view of their own data and thus improves fact-based decision-making. These methodologies and data analyses also help maintain competitive advantage and resolve technical and quality problems, enhancing the performance and productivity of enterprises [42, 43].

According to Abai et al. [44], BI&A helps build an integrated framework that supports accelerating organizational performance. Many factors and technological developments have shaped the past and present trends of BI&A. With the rapid development of technology, it is no longer enough to use traditional analytical techniques, and the future direction of business intelligence and analytics will expand into diverse areas. According to Chen et al. [45], the success opportunities associated with data analysis technologies have generated future interest in business intelligence and analytics. Additionally, BI&A comprises different practices and methodologies that can be applied to different sectors: health care, security, market intelligence, e-government, and others. According to Mohammed and Westbury [46], BI&A contributes to future development systems; by mapping all the facts, BI&A will soon become a key technology in developing cities, supporting the real-time information that will turn countries into smart cities.

One of the most important responsibilities in the data mining process is choosing the appropriate data extraction technique. The nature of the work and the type of problem or difficulty the organization faces provide guidance for identifying the best techniques [47]. In applying data mining techniques, there are some generalized approaches that can indicate enhanced efficiency and cost-effectiveness. The basic techniques performed in the data mining process determine the nature of the mining process and the choice of data retrieval.

Artificial intelligence (AI) represents a step in the evolution of technology that has been actively pursued since the British mathematician and code-breaker Alan Turing conceived of it as a clear way forward in his pioneering 1950 paper, "Computing Machinery and Intelligence." At the time, computer technology could not keep up with Turing's ideas, but as computing advanced, so did AI. Most of the artificial intelligence we see today is artificial narrow intelligence (ANI), meaning it can perform one well-defined task. A 2018 report by the Future of Humanity Institute at Oxford University surveyed a group of AI researchers about timelines for strong AI; it found a "50% chance of artificial intelligence outperforming humans in all tasks in 45 years and automating all human functions in 120 years." However, AI will also bring with it many opportunities to create new businesses. As many experts have pointed out, one of the great values of artificial intelligence is its ability to eliminate the need for strenuous and repetitive tasks, letting users focus instead on their core values and skills. Technology has been applied in many industries mostly to reduce human error and labor costs and thus increase profit. This was true of the progress made during the Industrial Revolution through the birth of the computer, and it remains true with the emergence of artificial intelligence.

Artificial intelligence has advanced significantly in the past few years due to a number of factors, starting with a massive increase in available computing power: training an AI model now takes days or even hours with machine learning (more on this soon). Another factor is wider access to data. You may have heard that data is the "new oil" or something similar. However, data must be processed using advanced tools such as analytics and machine learning algorithms to reveal useful information, and this processing is where AI in BI becomes an invaluable tool.

Machine learning is the engine of artificial intelligence systems. It strengthens AI models by analyzing complex data sets through a set of self-acquired rules and knowledge, as shown in **Figure 6**. A machine learning model learns from big data and from frequent human interactions so that it can provide information and answers related to the user's interests or goals. Big data refer to very large data sets


that can be mathematically analyzed to reveal patterns, trends, and correlations, especially concerning human behavior and interactions. In the space of artificial intelligence, deep learning represents a major leap forward. As we just touched on, programmers write code that directs the machine how to interpret a series of words, pictures, or commands to reach a decision and execute an order. The end user then provides the input (data), while engineers may define more specific rules for interpreting and analyzing that data. Finally, the system provides outputs (analyses) based on the given inputs and defined rules. In [48], the authors proposed a demand-forecasting model built on BI with machine learning.
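The input/output flow just described can be made concrete with a tiny demand-forecasting sketch in the same spirit: fit a linear trend to past monthly demand and extrapolate one month ahead. The data and the least-squares model here are illustrative stand-ins, not the method of the cited study.

```python
# Illustrative demand-forecasting sketch: ordinary least squares trend on
# monthly demand (data and model are stand-ins, not the cited study's).
def fit_trend(y):
    """Return (intercept, slope) of the least-squares line through y."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    slope = sum((x - x_mean) * (v - y_mean) for x, v in enumerate(y)) \
        / sum((x - x_mean) ** 2 for x in range(n))
    return y_mean - slope * x_mean, slope

demand = [100, 104, 109, 113, 118, 121]   # six months of unit sales
b0, b1 = fit_trend(demand)                # the rules the system "learns"
forecast_next = b0 + b1 * len(demand)     # extrapolate to month 7
print(round(forecast_next, 1))
```

The user supplies the input (the demand history), the fitted intercept and slope play the role of the learned rules, and the forecast is the output the BI layer would present.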

#### **3.5 Why does BI need AI?**

Does it matter whether consciousness persists in the original, or will the copy be alive anyway? For better or worse, the future comes faster than we realize. There will be no clear before or after artificial intelligence, but a slow transition over a decade or more. As we have seen with Google Glass, it is currently impossible to guess what acceptable results will look like. But how much can we trust our future assistants? Will they work for us or for unknown entities? If we do not ask the right questions now, we will get the default app: it will be free, but what will the small print include? "Good morning, John. Here's today's program. Any questions?" Perhaps it does not matter after all: using a good learning algorithm, the program will know what we need, and what we need to do, better than we can ever guess. The power of statistics will win the war against the gods, and we will lose our souls. It is known that job candidates can decisively lose their chances when, thinking that no one is watching, they behave badly toward or dismiss reception and waiting staff. Once natural language processing (NLP) and other AIs are widespread, it will not be long before the same test of manners is applied by machines. Looking toward 2050, the future of humanity lies in the transition to a civilization of the first kind; today we are Type 0. We are about to become half-gods. Most likely, we will merge with our own processing technology, and each of us will have our own virtual world to dominate with absolute control over every aspect

of it, and over the countless millions of "life"-bearing planets that we may control or merge with as well, just as video game programmers have absolute control over the worlds they create: immortal, omniscient, omnipresent, and all-powerful within our own universes. Of course, such a being could explore this universe as well, perhaps contact its creator directly and learn that we are characters in its game. Our last question will be one of morality and maturity. Will we have only one universe? Or will power drive us into madness and transform us into "invaders of the universe," penetrating the universes of others out of greed and the desire for more power? Will we be good, or evil, or both? Will we achieve wisdom and secure peaceful, harmonious coexistence with all the other demigods, or will we go to war? Or will we merge into one overwhelming force? Or will we one day tire of divinity and start the final game again, transforming ourselves into a universe that will have to evolve for billions of years before we are re-created? Maybe this is exactly what is happening.

#### **3.6 Improving BI with AI**

In this section, we explore how AI in BI elevates and improves the way an organization analyzes and interprets the lifeline of its business.




AI in BI reduces the effort of data cleansing and preparation, relieving one of analysts' biggest headaches. By setting up data automatically (one of AI's biggest time savers), you can move from making data available to working with it in minutes instead of hours or days. Future AI functionality will allow users to bring in structured and unstructured data without missing anything; this is a big change, since most of the data being created today, such as photos, videos, and audio, is unstructured. Removing barriers to effective analysis is one of the ways an advanced AI-in-BI tool helps users who are not data scientists to access and interpret their data.

5.*Gaining Competitive Advantage (GCA)*: AI now makes a critical difference between the companies it enables to succeed and those that will soon be left behind. Gartner predicts that by 2021, 75% of pre-built reports, such as those used to extract data, will be either replaced or augmented with automated insights. Robust AI-in-BI tools also provide improved accuracy for critical operational reporting. Data and analytics leaders should plan to adopt augmented analytics in their businesses as soon as the platforms' capabilities mature. Rita Sallam, vice president of Gartner Research, warned at a recent conference that data and analytics leaders should "examine the potential business impact" of the increasing reliance on predictions from augmented and automated insights and adjust their business models accordingly, or risk losing competitive advantage to those who do. AI features are already offered in BI solutions today, and the companies that adopt the technology are poised to succeed more surely than those that do not. By uncovering trends and correlations in data, proposing ways to interpret results in natural language, and providing the best format for presenting those results, AI saves time and delivers actionable insights that increase profitability and help avoid potential problems before they arise.
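The automatic data preparation mentioned above can be sketched minimally; the cleaning rules, column values, and function names here are hypothetical illustrations, not any real product's behavior:

```python
# Hypothetical automatic data preparation: normalize missing-value markers
# and coerce numeric strings so a messy column becomes query-ready.
def clean(values):
    out = []
    for v in values:
        v = v.strip()
        if not v or v.lower() in {"na", "n/a", "null"}:
            out.append(None)                           # unify missing markers
        else:
            try:
                out.append(float(v.replace(",", "")))  # coerce numerics
            except ValueError:
                out.append(v)                          # keep free text as-is
    return out

raw = [" 1,200 ", "N/A", "3.5", "refund request", ""]
cleaned = clean(raw)
print(cleaned)   # [1200.0, None, 3.5, 'refund request', None]
```

Automating even these few rules is what lets an analyst start querying a freshly loaded column immediately rather than hand-fixing it first.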

#### **4. Conclusion**

In this chapter, traditional and modern BI were reviewed in detail; BI has become a critical and important process and has received great interest in both industry and academia. Data management, data mining, and machine learning techniques are needed to extract insights from big data. By using such techniques, business intelligence yields better decision-making, cost reduction, new products and services, and a better understanding of market conditions. In addition, the importance of big data analytics, data mining, and AI for building modern BI and

enhancing it was introduced and discussed. Challenges and opportunities for creating value from data by establishing modern BI processes were also described, and we explored how AI elevates and improves the way an organization analyzes and interprets the lifeline of its business. In future work, we will study more AI tools to enhance BI processes and to solve cybersecurity problems in modern BI.

### **Acknowledgements**

This research was supported by the Department of Mathematics, Faculty of Science, Al-Azhar University, Cairo, Egypt, and partially supported by King Abdul-Aziz University, Jeddah, Saudi Arabia. I thank both for providing guidance in finishing this research. I also thank IntechOpen Limited for the opportunity to publish this research work as a book chapter in E-Business.

### **Author details**

Ahmed A.A. Gad-Elrab1,2\*

1 King Abdul-Aziz University, Jeddah, Saudi Arabia

2 Faculty of Science, Al-Azhar University, Cairo, Egypt

\*Address all correspondence to: asaadgad@azhar.edu.eg

© 2021 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**

[1] Yafooz, W. M. S., Abidin, S. Z., & Omar, N. (2011, November). Challenges and issues on online news management. In Control System, Computing and Engineering (ICCSCE), 2011 IEEE International Conference on (pp. 482-487). IEEE.

[2] https://technative.io/ unstructured-data-the-hidden-threatin-digital-business/

[3] Balachandran, B. M., & Prasad, S. (2017). Challenges and Benefits of Deploying Big Data Analytics in the Cloud for Business Intelligence. Procedia Computer Science, 112, 1112-1122.

[4] Kimble, C. and Milolidakis, G. Big Data and Business Intelligence: Debunking the Myths. Global Business and Organizational Excellence. 35, (2015), 23 – 34.

[5] Richards, G., Yeoh, W., Chong, A. Y. L., & Popovic, A. (2017). Business intelligence effectiveness and corporate performance management an empirical analysis. Journal of Computer Information Systems, 1-9.

[6] Xia B.S. and Gong P. Review of business intelligence through data analysis. Benchmarking: An International Journal. 21, (2014), 300-311.

[7] Kowalczyk M. and Buxmann P. (2014). Big Data and Information Processing in Organizational Decision Processes: A Multiple Case Study. Business & Information Systems Engineering. 5, (2014), 267-278.

[8] Yafooz, W. M. S., Abidin, S. Z., & Omar, N. (2011, November). Challenges and issues on online news management. In Control System, Computing and Engineering (ICCSCE), 2011 IEEE International Conference on (pp. 482-487). IEEE.

[9] Siddiqa, A., Hashem, I. A. T., Yaqoob, I., Marjani, M., Shamshirband, S., Gani, A., & Nasaruddin, F. (2016). A survey of big data management: Taxonomy and state-of-the-art. Journal of Network and Computer Applications, 71, 151-166.

[10] Fahad, S. A., & Alam, M. M. (2016). A modified K-means algorithm for big data clustering. International Journal of Computer Science Engineering and Technology, 6(4), 129-132.

[11] Fahad, S. A., & Yafooz, W. M. (2017). Design and Develop Semantic Textual Document Clustering Model. Journal of Computer Science and Information Technology, 5(2), 26-39. doi:10.15640/jcsit.v5n2a4 .

[12] Thuraisingham, B. (2014). Data mining technologies, techniques, tools, and trends. CRC press.

[13] A. Löser, F. Hueske, and V. Markl, "Situational business intelligence," in Business Intelligence for the Real-Time Enterprise. Springer, 2009, pp. 1-11.

[14] Deloitte report, "Modern Business Intelligence: The Path to Big Data Analytics",April 2018.

[15] A. Löser, F. Hueske, and V. Markl, "Situational business intelligence," in Business Intelligence for the Real-Time Enterprise, ser. Lecture Notes in Business Information Processing, M. Castellanos, U. Dayal, and T. Sellis, Eds. Springer Berlin Heidelberg, 2009, vol. 27, pp. 1-11. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-03422-0_1

[16] M. Castellanos, C. Gupta, S. Wang, and U. Dayal, "Leveraging web streams for contractual situational awareness in operational bi," in Proceedings of the

2010 EDBT/ICDT Workshops, ser. EDBT '10. New York, NY, USA: ACM, 2010, pp. 7:1-7:8. [Online]. Available: http://doi.acm.org/10.1145/1754239. 1754248

[17] R. Elmasri and S. B. Navathe, Fundamentals of database systems. Pearson, 2014.

[18] H. Kuno, U. Dayal, J. Wiener, K. Wilkinson, A. Ganapathi, and S. Krompass, "Managing dynamic mixed workloads for operational business intelligence," in Databases in Networked Information Systems, ser. Lecture Notes in Computer Science, S. Kikuchi, S. Sachdeva, and S. Bhalla, Eds. Springer Berlin Heidelberg, 2010, vol. 5999, pp. 11-26. [Online]. Available: http://dx.doi. org/10.1007/978-3-642-12038-1 2

[19] "Voltdb," https://voltdb.com/.

[20] R. Stoica and A. Ailamaki, "Enabling efficient os paging for mainmemory oltp databases," in Proceedings of the Ninth International Workshop on Data Management on New Hardware, ser. DaMoN '13. New York, NY, USA: ACM, 2013, pp. 7:1-7:7. [Online]. Available: http://doi.acm. org/10.1145/2485278.2485285

[21] A. Eldawy, J. Levandoski, and P.-A. Larson, "Trekking through siberia: Managing cold data in a memoryoptimized database," Proc. VLDB Endow., vol. 7, no. 11, pp. 931-942, Jul. 2014. [Online]. Available: http://dx.doi. org/10.14778/2732967.2732968

[22] A. Kemper and T. Neumann, "Hyper: A hybrid OLTP & OLAP main memory database system based on virtual memory snapshots," in Proceedings of the 2011 IEEE 27th International Conference on Data Engineering, ser. ICDE '11. Washington, DC, USA: IEEE Computer Society, 2011, pp. 195-206. [Online]. Available: http://dx.doi.org/10.1109/ ICDE.2011.5767867

[23] M. Castellanos, C. Gupta, S. Wang, U. Dayal, and M. Durazo, "A platform for situational awareness in operational BI," Decision Support Systems, vol. 52, no. 4, pp. 869-883, 2012. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S016792361100217X

[24] D. M. International, The DAMA guide to the data management body of knowledge. Technics Publications, Bradley Beach, 2009.

[25] K. Weber, B. Otto, and H. Österle, "One size does not fit all—a contingency approach to data governance," Journal of Data and Information Quality (JDIQ), vol. 1, no. 1, p. 4, 2009.

[26] V. Khatri and C. V. Brown, "Designing data governance," Communications of the ACM, vol. 53, no. 1, pp. 148-152, 2010.

[27] R. S. Seiner, "Real-world data governance bi governance and the governance of bi data," http://www. slideshare.net/Dataversity/ realworlddata-governance-bigovernance-and-the-governance-of-bidata14889552 Accessed:2015-11.

[28] D. M. Association et al., "Dama dmbok functional framework (version 3.02)," DAMA International, 2008.

[29] NASCIO, "Data governance managing information as an enterprise asset part 1 - an introduction," NASCIO Governance Series, 2009.

[30] ——, "Data governance part iii: Frameworks - structure for organizing complexity," NASCIO Governance Series, 2009.

[31] P. Aiken, M. D. Allen, B. Parker, and A. Mattia, "Measuring data


management practice maturity: a community's self-assessment," Computer, vol. 40, no. 4, pp. 42-50, 2007.

[32] B. Potter and R. Software, "Selfservice bi vs. data governance," https:// tdwi.org/articles/2015/03/17/selfservice-bi-vs-datagovernance.aspx, Mar. 17. 2015.

[33] M. Ferguson, "Is self-service bi going to drive a truck though enterprise data governance?" http://intelligent business.biz/wordpress/?p=489 Accessed:2015-10.

[34] Hung, P. C. K. Big data applications and use cases, the springer international series on applications and trends in computer science. Switzerland: Springer International Publishing AG, 2016

[35] Dave, P. What is big data - 3 vs of big data. Retrieved from SQL Authority, 2013 Blog: http://blog.sqlauthority. com/2013/10/02/big-datawhat-is-bigdata-3-vs-of-big-data-volume-velocityand-varietyday-2-of-21/.

[36] Sparks, B. H., & McCann, J. T. (2015). Factors influencing business intelligence system use in decision making and organisational performance. International Journal of Sustainable Strategic Management, 5(1), 31-54.

[37] Wang, Y., Kung, L., & Byrd, T. A. (2018). Big data analytics: Understanding its capabilities and potential benefits for healthcare organizations. Technological Forecasting and Social Change, 126, 3-13

[38] Yafooz, W. M. S., Abidin, S. Z., Omar, N., & Idrus, Z. (2013, December). Managing unstructured data in relational databases. In Systems, Process & Control (ICSPC), 2013 IEEE Conference on (pp. 198-203). IEEE.

[39] Kaisler, S., Armour, F., Espinosa, J. A., & Money, W. (2013, January). Big data: Issues and challenges moving forward. In System sciences (HICSS), 2013 46th Hawaii international conference on (pp. 995-1004). IEEE.

[40] Zhou, Z. H., Chawla, N. V., Jin, Y., & Williams, G. J. (2014). Big data opportunities and challenges: Discussions from data analytics perspectives [discussion forum]. IEEE Computational Intelligence Magazine, 9(4), 62-74.

[41] Sivarajah, U., Kamal, M. M., Irani, Z., & Weerakkody, V. (2017). Critical analysis of Big Data challenges and analytical methods. Journal of Business Research, 70, 263-286.

[42] Lautenbach, P., Johnston, K. and Adeniran-Ogundipe, T. Factors influencing business intelligence and analytics usage extent in South African organisations. S.Afr.J.Bus.Manage, 48(3): 23-33, 2017.

[43] Wang, Te-Wei, et al. "Depicting Data Quality Issues in Business Intelligence Environment Through a Metadata Framework." Applying Business Intelligence Initiatives in Healthcare and Organizational Settings, edited by Shah J. Miah and William Yeoh, IGI Global, 2019, pp. 291-304. doi:10.4018/978-1-5225-5718-0.ch016.

[44] Abai, N. H., Yahaya, J. and Deraman, A. "An integrated framework of business intelligence and analytic with performance management system. A conceptual framework." In Proceedings of the 2015 Science and Information Conference . London. pp. 452-56, 2015.

[45] Chen, H., Chiang, R. H. and Storey, V. C. Business intelligence and analytics: From big data to big impact. MIS Quarterly, 36(4): 1165-1188, 2012.

[46] Mohammed, J. and Westbury, O. Business intelligence and analytics evolution, applications, and emerging research areas. International Journal of Engineering Science and Innovative Technology (IJESIT), 4(2): 193- 200, 2015.

[47] M. A. Khan *et al*., "Effective Demand Forecasting Model Using Business Intelligence Empowered With Machine Learning," in IEEE Access, vol. 8, pp. 116013-116023, 2020, doi: 10.1109/ACCESS.2020.3003790.

[48] Fahad, S. A., & Alam, M. M. A modified K-means algorithm for big data clustering. International Journal of Computer Science Engineering and Technology, 6(4), 129-132, 2016.

*Edited by Robert M.X. Wu and Marinela Mircea*

This book provides the latest viewpoints of scientific research in the field of e-business. It is organized into three sections: "Higher Education and Digital Economy Development", "Artificial Intelligence in E-Business", and "Business Intelligence Applications". Chapters focus on China's higher education in e-commerce, digital economy development, natural language processing applications in business, Information Technology Governance, Risk and Compliance (IT GRC), business intelligence, and more.

Published in London, UK © 2021 IntechOpen
