Text Mining Methods and Personalized Web Services

#### **Chapter 1**

## Mining Numbers in Text: A Survey

*Minoru Yoshida and Kenji Kita*

#### **Abstract**

Both words and numerals are tokens found in almost all documents, but they have different properties. Relatively little attention has been paid to the numerals found in texts, and many systems have treated them in ad hoc ways: regarding them as mere strings in the same way as words, normalizing them to zeros, or simply ignoring them. The recent growth of natural language processing (NLP) research has changed this situation, and more and more attention is being paid to numeracy in documents. In this survey, we provide a quick overview of the history and recent advances of research on mining the relations between numerals and words found in text data.

**Keywords:** text mining, numeracy, survey, embedding, natural language inference

#### **1. Introduction**

Natural language processing (NLP) is a research field that aims to make machines understand the meaning of text data, which is typically a list of words. In some cases, texts are not understandable in their closed form, i.e., without understanding data other than the words. Numerals are an important form of such nonword data, not only because many documents are accompanied by related metadata such as publication dates expressed as numbers, but also because the documents themselves contain numerals such as "three people", "500 dollars", and "90 cm."

Jointly mining texts and their associated numerical metadata has many variations, and many studies have been proposed. For example, predicting the star ratings given with product review texts is a typical task in this area. Location-aware text mining can be considered as mining association rules between words and positional data (i.e., longitude and latitude). Even joint learning of texts and images can be seen as mining relations between texts and associated RGB data.

In contrast to such *grounding*-type research, studies on mining numerals explicitly written in text have received little attention. Recently, however, more and more studies have been proposed in this area, partly due to recent advances in deep neural network-based language modeling.

In this survey, we try to provide a quick overview of the history and recent advances of this research field ranging from traditional tasks like information retrieval to emerging ones such as numerical reading comprehension.

#### **2. Traditional tasks**

Firstly, we survey systems that consider the treatment of numbers in traditional tasks such as information retrieval (IR), question answering (QA), and information extraction (IE). Some of the questions or queries in these tasks require the answers to be numbers, hence requiring appropriate treatment of the numbers found in the target text.

#### **2.1 Question answering**

Question answering (QA) is the task of finding appropriate answers in text to questions that are also given in text. Because many types of questions require the answers to be numbers, e.g., 8,848 (8,849) (meters) is the answer to the question "How tall is Everest?", some existing QA systems treat numbers appropriately, typically in ad hoc, heuristic ways.

For example, IBM's PIQUANT system for TREC 2003 [1] had a sanity-checking module, which uses the Cyc knowledge base to check whether a given answer falls within valid intervals found in Cyc, e.g., rejecting "200 miles" for a question about the height of a mountain, based on the knowledge from Cyc that "mountains are between 1,000 and 30,000 high". Moriceau [2] considered the more complicated situation where several numeric answers can be extracted from different Web pages in a QA system, and proposed a way to integrate them considering the nature of numbers, such as number approximations.
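The interval-based sanity check can be illustrated with a minimal sketch; the attribute names and ranges below are invented for illustration and are not taken from Cyc:

```python
# Hypothetical sanity check in the spirit of PIQUANT's Cyc-based filter:
# reject a candidate answer whose value falls outside a plausible range
# for the question's attribute type. Ranges here are illustrative only.
PLAUSIBLE_RANGES = {
    "mountain_height_m": (300, 9000),   # assumed range, not from Cyc
    "human_height_cm": (50, 250),
}

def sanity_check(attribute: str, value: float) -> bool:
    """Return True if the value is plausible for the attribute."""
    lo, hi = PLAUSIBLE_RANGES[attribute]
    return lo <= value <= hi

# "200 miles" (about 321,868 meters) is rejected as a mountain height,
# while Everest's 8,848 meters is accepted.
assert not sanity_check("mountain_height_m", 200 * 1609.34)
assert sanity_check("mountain_height_m", 8848)
```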

#### **2.2 Information retrieval**

Similarly to QA tasks, some information retrieval (IR) systems return direct "answers" to the query. Therefore, appropriate treatment of numbers is required for some types of queries. For example, Banerjee et al. [3] introduced Quantity Consensus Queries (QCQs), whose answers are quantity intervals, such as "driving time from Paris to Nice". Their algorithm proposes and ranks intervals by considering whether the returned snippets are included in the intervals. Sarawagi and Chakrabarti [4] proposed a system to answer quantity queries on Web tables, such as "escape velocity jupiter." Their system contains modules that interpret the numbers presented in the table cells to improve accuracy.

Conversely, queries themselves can also be numbers. Yoshida et al. [5] proposed a suffix array-based text mining system enhanced with the treatment of numbers, which accepts range queries like "[1,000 - 10,000] ft".
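A simplified range query over raw text can be sketched as follows; the number format and unit handling here are deliberately minimal assumptions and do not reflect the suffix-array method of [5]:

```python
import re

def range_query(docs, lo, hi, unit):
    """Return documents that contain a number with the given unit
    inside the interval [lo, hi] (a much simplified range query)."""
    pattern = re.compile(r"([\d,]+(?:\.\d+)?)\s*" + re.escape(unit))
    hits = []
    for doc in docs:
        for m in pattern.finditer(doc):
            value = float(m.group(1).replace(",", ""))
            if lo <= value <= hi:
                hits.append(doc)
                break  # one matching number is enough for this doc
    return hits

docs = ["The peak is 5,000 ft high.", "A 300 ft hill.", "It is 20,000 ft tall."]
assert range_query(docs, 1000, 10000, "ft") == ["The peak is 5,000 ft high."]
```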

#### **2.3 Information extraction**

Information extraction (IE) is another type of system that returns answers to questions, but in this case the questions are given a priori, such as "extract all dates and places of events found in the given documents." Much of the extracted information is numerical, so special treatment of numbers often contributes to improving the performance of IE systems.

For example, Bakalov et al. [6] proposed a system that extracts numerical attributes of objects given attribute names, seed entities, and related Web pages, and properly distinguishes attributes having similar values.

**Table 1** summarizes these systems.


#### **Table 1.**

*Systems for Traditional Tasks Considering Numerals.*

#### **3. Numerical common sense acquisition**

*Numerical common sense acquisition* is the task of obtaining numerical common sense, e.g., that the heights of mountains have typical values of "1,000 - 10,000 meters". Many numerals found in text describe *attributes* of *objects*, such as "25 C" for the temperature of some city or "170 cm" for the height of some person. Obtaining such numerical common sense can contribute to improving various kinds of systems, e.g., anomaly detection or dialogue systems.

We introduce two types of tasks in this line of research. One is to directly extract the common sense knowledge, and the other is to acquire such knowledge as language model parameters.

#### **3.1 Pattern-based extraction of numerical common senses**

In this task, the input is a large collection of texts.<sup>1</sup> The output is a database of "typical values" of something.<sup>2</sup>

The typical method for this task is to use pattern matching to obtain numerals for each attribute described in the given text. For example, the value "80" can be extracted from the sentence "The size of the dog is 80 cm." using the pattern "the size of the A is # cm."
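Such a pattern can be realized as a regular expression; a minimal sketch:

```python
import re

# The pattern "the size of the A is # cm" instantiated as a regex:
# the first group captures the object name A, the second the number #.
pattern = re.compile(r"[Tt]he size of the (\w+) is (\d+(?:\.\d+)?) cm")

sentence = "The size of the dog is 80 cm."
m = pattern.search(sentence)
assert m.group(1) == "dog"
assert float(m.group(2)) == 80.0
```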

#### *3.1.1 Previous methods*

Aramaki et al. [7] proposed obtaining the physical sizes of entities by Web search with patterns like "book (\*cm x \*cm)". Bagherinezhad et al. [8] proposed combining knowledge obtained with such patterns with object detection from images to achieve more reliable object-size knowledge. Davidov and Rappoport [9] proposed a similar approach but augmented their method by obtaining terms similar to the given object using the Web and WordNet. Takamura and Tsujii [10] took a similar approach, using Web search for linguistic patterns, e.g., "the size of A", but they enhanced their patterns with more indirect clues such as WordNet relations and an n-gram corpus for explicit patterns, e.g., "A is longer than B", and implicit patterns, e.g., "put A in B", using a machine learning approach to determine their weights.

<sup>1</sup> This includes the case where the system uses Web search engines, behind which there is a huge amount of text.

<sup>2</sup> This is an attribute of an object in most cases.

Narisawa et al. [11] proposed obtaining numerical common sense by searching for numerical expressions in a Web corpus, calculating the distribution of numbers given syntactically defined contexts such as "verb=give, subj=he, ...", and predicting labels such as *small*, *normal*, or *large* for given numbers in text.

Recently, a large dataset called Distribution over Quantities (DoQ) was provided by Elazar et al. [12]. It contains ten dimensions (TIME, CURRENCY, LENGTH, AREA, VOLUME, MASS, TEMPERATURE, DURATION, SPEED, VOLTAGE) for various kinds of words, including nouns, adjectives, and verbs. They explored the co-occurrence of words and numerals in large Web data.

**Table 2** summarizes these approaches and **Table 3** shows the existing data sets.

#### **3.2 Prediction of numbers in sentences**

Some researchers have tried to acquire numerical common sense as the parameters of a language model. In this type of research, the system directly predicts numbers to fill blanks in texts, or assesses the feasibility of numbers presented in text, without explicitly collecting the above-mentioned knowledge bases.

#### *3.2.1 Task definition*

In this task, the input is a sentence or document in which the position of a numeral is masked. The system then outputs a likely value for the masked position. For example, given the sentence "my five-year-old son is [MASK] cm tall.", the system is required to answer the likely value to be filled in at the position of "[MASK]."


#### **Table 2.**

*Systems for Numerical Common Sense Acquisition.*


#### **Table 3.**

*Data Sets for Numerical Common Sense Acquisition.*

#### *Mining Numbers in Text: A Survey DOI: http://dx.doi.org/10.5772/intechopen.98540*

Because the input is a sequence of words, encoder-decoder models are applicable to this task. In particular, the BERT language model is a good match for this problem. BERT is a deep neural network model that consists of modules called *Transformers*. It is trained on a task where the input is a sequence of words with special "[MASK]" tokens, and one of its outputs is the estimated original word for the position of each "[MASK]".

#### *3.2.2 Previous approaches*

Several BERT models pretrained on huge amounts of text data are publicly available. Using such pretrained language models to predict or assess the numeracy in documents is an emerging trend. Typically, the models are enhanced with the ability to predict numbers by simply using masked language models with the numbers to be predicted replaced by [MASK] tokens, by adding numeracy inference modules to the language models, or by fine-tuning in a setting where the output is a *discretized* version of the target number.
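One simple form of discretization is to map a number to an order-of-magnitude class label; the following sketch illustrates the general idea only and is not the scheme of any particular paper:

```python
import math

def magnitude_bucket(value: float) -> int:
    """Map a positive number to an order-of-magnitude class label:
    the floor of its base-10 logarithm. A fine-tuned model would then
    predict this class instead of the exact number."""
    return int(math.floor(math.log10(value)))

assert magnitude_bucket(3) == 0      # ones
assert magnitude_bucket(80) == 1     # tens
assert magnitude_bucket(5000) == 3   # thousands
```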

Zhang et al. [13] investigated how well pretrained language models like BERT can predict (the *discretized* version of) attributes with continuous numeric values, such as MASS or PRICE, with evaluation on DoQ. Chen et al. [14] proposed the task of predicting the magnitude of hidden numerals in text and provided a large dataset called Numeracy-600K. They also reported CNN- and RNN-based models for this task. Berg-Kirkpatrick and Spokoyny [15] proposed a more advanced model using BERT and reported that it performed better than other models, including a BiGRU.

On the other hand, Lin et al. [16] considered the more difficult task of predicting the *exact* number to fill a blank in text, like "A bird usually has [MASK] legs". They reported that current pretrained models, including BERT and RoBERTa, performed poorly.

A language model that does not use an encoder-decoder architecture has also been proposed. Spithourakis and Riedel [17] proposed a language model for sequences of words and numerals, which gives probabilities for words and numerals simultaneously. For example, it gives the probability of the numeral "50,000" appearing just after the word sequence "the number of video-game consoles I have is". They introduced a probability of each token being a word or a numeral, and modeled the probability of numerals independently of that of words, using several variants including a digit-based RNN and a mixture of Gaussians.
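The mixture-of-Gaussians idea for scoring numerals can be sketched as follows; the mixture parameters are toy values, not learned ones:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a univariate Gaussian at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi))

def numeral_density(x, mixture):
    """Density of a numeral under a mixture of Gaussians, one way to
    model p(number | context) separately from the word distribution."""
    return sum(w * gaussian_pdf(x, mu, sigma) for w, mu, sigma in mixture)

# Toy mixture: console counts concentrate around small values,
# with a minor mode at very large counts.
mixture = [(0.7, 2.0, 1.0), (0.3, 50000.0, 20000.0)]
assert numeral_density(2.0, mixture) > numeral_density(1e6, mixture)
```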

**Table 4** summarizes the approaches and **Table 5** shows the dataset for this task.



#### **Table 5.**

*Number Prediction Dataset.*

#### **4. Numeracy embeddings**

*Embeddings* or *distributed representations* of words have become basic building blocks for natural language processing in recent years. Each word is represented by a high-dimensional vector (typically with 50 dimensions or more) of real values. These vectors reflect the meanings of words: words with similar meanings are represented by similar<sup>3</sup> vectors. Some researchers have investigated how numeracy itself is modeled in such pre-trained word embedding vectors.

#### **4.1 Task definition**

Embedding vectors are also assigned to numerals such as "three", "100", and "million". Popular word embeddings like word2vec do not distinguish these numerals from other words, i.e., the learning algorithms treat numerals and other words equally. Thus, it is not obvious that these word vectors appropriately reflect the meaning of numbers, such as "100 is larger than 3" or "4 is the next number after 3". *Numeracy embedding* is the task of embedding such numerals in appropriate vector representations.

#### **4.2 Investigating pre-trained word vectors**

Nowadays, word embeddings learned from huge corpora are provided by various researchers. Some researchers have investigated how, or whether, these pretrained word vectors appropriately represent numerals.

Naik et al. [18] used GloVe, FastText, and SkipGram vectors. They compared the similarity of embedding vectors for numbers using two types of tasks: one for magnitude, e.g., the vector for 4 should be more similar to that for 3 than to that for 1000000, and the other for numeration, e.g., the vector for *three* should be more similar to that for 3 than to that for *billion*. Contextualized word vectors have also been considered. Wallace et al. [19] found that pretrained language models for DROP, a numeracy-focused task mentioned in later sections, already capture numeracy, by testing whether a BiLSTM model with pretrained embeddings passes tests such as finding the maximum of a list, decoding (e.g., converting the string "five" to 5), and taking the sum of two numbers.

#### **4.3 Obtaining word vectors for numerals**

On the other hand, algorithms specialized for obtaining word vectors for numerals, beyond pre-trained word vectors, have been proposed by some researchers in recent years.

Jiang et al. [20] proposed to obtain embeddings for numbers by directly applying Skip-Gram models while taking the meaning of numbers into consideration: each number is represented by a weighted average of embeddings of numerically similar numbers. They find "prototype numbers" by clustering, and represent numbers as a weighted average of these prototypes. Sundararaman et al. [21] proposed to learn embeddings for numbers, independently from words, that reflect the distance between two numbers on the number line.

<sup>3</sup> Similarity of vectors is typically defined by the inner product or cosine similarity of the vectors.


#### **Table 6.**

*Numeracy Embedding Systems.*

*<sup>a</sup> e.g., 4 is more similar to 3 than 1000000. <sup>b</sup> e.g., 'three' is more similar to 3 than 'billion'. <sup>c</sup> A system proposed for the DROP dataset.*

**Table 6** summarizes these previous methods.
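The prototype-based idea can be sketched in a few lines; the prototype values, their vectors, and the log-scale weighting below are toy assumptions, not the learned quantities of [20]:

```python
import math

# Toy prototypes: two "prototype numbers" with hand-made 2-d embeddings.
prototypes = {
    1.0:   [1.0, 0.0],
    100.0: [0.0, 1.0],
}

def embed_number(x: float) -> list:
    """Embed a number as a weighted average of prototype embeddings,
    weighted by closeness on a log scale (an illustrative choice)."""
    weights = {p: math.exp(-abs(math.log10(x) - math.log10(p)))
               for p in prototypes}
    total = sum(weights.values())
    dim = len(next(iter(prototypes.values())))
    return [sum(weights[p] * prototypes[p][i] for p in prototypes) / total
            for i in range(dim)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# 3 ends up closer to the prototype for 1 than to the one for 100.
e3 = embed_number(3.0)
assert cosine(e3, prototypes[1.0]) > cosine(e3, prototypes[100.0])
```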

#### **5. Numerical reading comprehension and numerical textual entailment detection**

More complex tasks such as textual entailment detection and reading comprehension also require appropriate treatment of numbers to answer some of the questions. We first mention some early works on these tasks and then introduce some recent systems.

#### **5.1 Task definition**

*Textual entailment detection* is the task of determining whether a sentence (the *hypothesis*) is true given another text (the *premise*). The situation becomes more complicated if the sentences contain numerals, because numerical knowledge is required to understand their meaning. For example, we can say that the sentence "five people are in the house." is true given the premise "two men and three women are in the house.", but this requires the mathematical knowledge that two plus three equals five.

Some early systems for this task included numeracy modules. The system by Tsuboi et al. [22] for the recognizing textual entailment (RITE) task at NTCIR-9 matches temporal expressions such as "the first half of the Nth century" to the appropriate intervals. The system by Iftene and Moruz [23] implemented special rules for numbers that create intervals from expressions like "more than" or "over" for the Recognizing Textual Entailment (RTE-6) task.
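Such interval-creating rules can be sketched as follows; the qualifier list and interval conventions are simplified assumptions, not the rules of [23]:

```python
import math
import re

def to_interval(expr: str):
    """Map a quantity expression to an interval on the number line,
    a simplified version of rule-based handling of modifiers such as
    "more than" or "over"."""
    m = re.match(r"(more than|over|at least|exactly)?\s*(\d+)", expr.strip())
    qualifier, value = m.group(1), float(m.group(2))
    if qualifier in ("more than", "over", "at least"):
        return (value, math.inf)   # open upward
    return (value, value)          # treated as an exact point

assert to_interval("more than 50") == (50.0, math.inf)
assert to_interval("exactly 50") == (50.0, 50.0)
```

Entailment between two quantity expressions can then be checked by interval containment, e.g., "over 100" entails "more than 50" because (100, inf) lies inside (50, inf).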

*Reading comprehension* is a more complicated task, where the system is required to answer various types of questions.

#### **5.2 Numeracy-focused data sets**

The aforementioned studies mainly focused on the "range" of numbers, i.e., they simply treat numbers as points or distributions on the number line. However, reading comprehension tasks require incorporating more advanced numeric skills, such as addition, averaging, and maximization, into language models.

This line of research typically constructs datasets for numeracy understanding by selecting numeracy-related data from existing datasets for reading comprehension, natural language inference, or entailment. The selected data contain many questions that require understanding of, and calculation on, numbers beyond simple range- or distribution-based treatment.

Roy et al. [24] proposed the task of quantity entailment, which requires numeric reasoning. Their dataset included corpora from datasets for the Recognizing Textual Entailment (RTE) task. They also proposed a method to solve these problems using CRF-based recognition of the quantity parts of the text and rule-based recognition of entailment.

Ravichander et al. [25] proposed the EQUATE framework for quantitative reasoning in textual entailment, such as determining that "5855 lambs are black" is correct given the premise "6048 lambs are either black or white and there are 193 white ones." DROP, proposed by Dua et al. [26], requires systems to perform operations such as addition, counting, and sorting. The types of questions and answers in the DROP dataset vary widely, such as the question "Where did Charles travel to first?" given passages like "In 1517, the King sailed to Castile. ... In 1518, he traveled to Barcelona." State-of-the-art methods for reading comprehension performed poorly on these datasets (both EQUATE and DROP), and the authors concluded that more advanced methods are required for these new numeric reading comprehension tasks.

**Table 7** summarizes these datasets.

#### **5.3 Methods**

Given these datasets, more advanced models have been proposed for them. A typical approach, given recent advances in deep neural network technologies, is to use sequence-to-sequence (seq2seq) models. In seq2seq models, a sequence of words can be fed directly to the system as input, and the system returns another sequence of words as output. In particular, recent pretrained language models, including BERT, are already trained on huge amounts of text, and they can be taught to return appropriate word sequences for a given task by being trained on a relatively small set of training samples (i.e., pairs of input documents and "correct" or appropriate outputs).

Rozen et al. [27] reported that performance on existing natural language inference (NLI) datasets can be improved by augmenting them with synthetic adversarial datasets, including ones generated by rule-based replacement of the numeric expressions found in the dataset. Geva et al. [28] reported that adding synthetic numerical tasks to the BERT pretraining steps, with fine-tuning on DROP, dramatically improved the score on DROP. Ran et al. [29] proposed to inject a graph-based numerical reasoning module between the embedding and prediction modules, which outperformed existing machine reading comprehension models on the DROP dataset.


#### **Table 7.**

*Numeracy-Focused Data Sets.*


#### **Table 8.**

*Numerical Reading Comprehension Systems.*

**Table 8** summarizes these approaches.

#### **6. Solving math word problems**

Math word problems are a typical type of document that contains numerals and words extensively and requires a deep understanding of the meaning of numerals. Developing systems that automatically solve math word problems is thus a major research task in this area.

#### **6.1 Task definition**

In this task, the problem is given as a text that contains numerals, e.g., "How much would it cost to buy 12 apples at 1.1 dollars each?", and systems are required to provide a solution to the problem, e.g., 12 × 1.1 = 13.2 dollars. Recent approaches for this task typically use deep neural networks that take a sequence of words as input. These inputs are transformed through several layers and used to produce the final output. A variety of output forms have been considered by previous methods, including simple seq2seq models (i.e., the outputs are also sequences of words) and sequence-to-tree models (i.e., the outputs are trees that represent the equations used to calculate the answers).
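The sequence-to-tree output format can be illustrated by evaluating an expression tree for the apples example; the tree here is written by hand rather than predicted by a model:

```python
# A tree is either a number or a (operator, left, right) tuple.
# A sequence-to-tree model would output such a tree over the
# quantities mentioned in the problem text; evaluating the tree
# then yields the numeric answer.
def evaluate(node):
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    a, b = evaluate(left), evaluate(right)
    return {"+": a + b, "-": a - b, "*": a * b, "/": a / b}[op]

# "How much would it cost to buy 12 apples at 1.1 dollars each?"
tree = ("*", 12, 1.1)
assert abs(evaluate(tree) - 13.2) < 1e-9
```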

Sequence-to-sequence (seq2seq) is a typical approach for this task. Ling et al. [30] provided an original dataset with 100,000 samples, and proposed a method to generate *answer rationales*, which are human-readable instructions for deriving the answers, using a seq2seq model. Saxton et al. [31] investigated the ability of existing sequence-to-sequence architectures, including the Transformer, to perform mathematical reasoning (e.g., "Solve 41 + 132") with free-form texts.

Some researchers have tried to produce graphs that represent the mathematical operations needed to directly produce the answers to the questions. Amini et al. [32] provided a dataset for math word problems called MathQA. They also proposed a sequence-to-program model to solve this task. The approach by Zhang et al. [33] uses a new architecture called Graph2Tree, which uses graphs constructed from the texts independently of BiLSTM encoders. They tested their system on the MAWPS dataset [34]. Lample and Charton [35] showed that neural models can solve mathematical problems such as symbolic integration and differential equations using sequence-to-sequence approaches.

**Table 9** summarizes the systems proposed for this task so far, and **Table 10** summarizes the existing data sets.

#### *Information Systems - Intelligent Information Processing Systems, Natural Language Processing…*


#### **Table 9.**

*Systems for Math Word Problem Solving.*


**Table 10.**

*Math Word Problem Datasets.*

#### **7. Other tasks**

Yoshida et al. [36] considered the problem of estimating appropriate units for numbers found in Wikipedia tables when the units are omitted. Elazar and Goldberg [37] considered the problem of inferring the omitted head related to numerals, as in "It is worth about two million \_\_."

Chen et al. [38] proposed the numeral attachment task, which determines to which entity a number presented in text is related. They also proposed the task of numeral categorization, which is to classify numerals presented in financial text into 7 or 17 categories [39].

The task proposed by Chaganty and Liang [40] was to describe given numerals by examples, such as "\$131 million is about the cost to employ everyone in Texas over a lunch period."

#### **8. Conclusions**

The relations between numerals and words found in text data have received little attention compared to other areas of natural language processing. This paper provided an overview of this field, ranging from systems for traditional tasks such as information retrieval to relatively recent tasks like reading comprehension.

We categorized previous research into six types: traditional tasks, numerical common sense acquisition, numeracy embeddings, numerical reading comprehension, solving math word problems, and others. The first two have been studied for a relatively long time, while the remaining topics are emerging with recent advances in neural language models.

In Section 2, we introduced previous systems that have numerical modules for traditional tasks like QA, IE, and IR. In Section 3, we introduced numerical common sense acquisition, where the typical approaches are pattern-based extraction and parameter estimation for language models. In Section 4, numeracy embedding, where the goal is to assign appropriate real-valued vectors to numerals, was introduced. Section 5 introduced numerical reading comprehension and numerical entailment, which require more advanced numerical understanding of text. The task of solving math word problems, a typical type of text that contains numerals extensively, was introduced in Section 6, and Section 7 touched on other unique tasks.

The recent increase in datasets and resources focusing on numeracy will accelerate the development of systems with the ability to understand numeracy in text.

#### **Acknowledgements**

This work was supported by JSPS KAKENHI Grant Numbers JP21K12141 and JP20K12027.

#### **Author details**

Minoru Yoshida\* and Kenji Kita Tokushima University, Tokushima, Japan

\*Address all correspondence to: mino@is.tokushima-u.ac.jp

© 2021 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


### **References**

[1] John M. Prager, Jennifer Chu-Carroll, Krzysztof Czuba, Christopher A. Welty, Abraham Ittycheriah, Ruchi Mahindru: IBM's PIQUANT in TREC2003. TREC 2003: 283-292

[2] Véronique Moriceau: Numerical data integration for cooperative question-answering. Proceedings of the Workshop KRAQ'06: Knowledge and Reasoning for Language

[3] Somnath Banerjee, Soumen Chakrabarti, Ganesh Ramakrishnan: Learning to rank for quantity consensus queries. SIGIR 2009: 243-250

[4] Sunita Sarawagi, Soumen Chakrabarti: Open-domain quantity queries on web tables: annotation, response, and consensus models. KDD 2014: 711-720

[5] Minoru Yoshida, Issei Sato, Hiroshi Nakagawa, Akira Terada: Mining Numbers in Text Using Suffix Arrays and Clustering Based on Dirichlet Process Mixture Models. PAKDD (2) 2010: 230-237

[6] Anton Bakalov, Ariel Fuxman, Partha Pratim Talukdar, Soumen Chakrabarti: SCAD: collective discovery of attribute values. WWW 2011: 447-456

[7] Eiji Aramaki, Takeshi Imai, Kengo Miyo, Kazuhiko Ohe: UTH: SVM-based Semantic Relation Classification using Physical Sizes. SemEval@ACL 2007: 464-467

[8] Hessam Bagherinezhad, Hannaneh Hajishirzi, Yejin Choi, Ali Farhadi: Are Elephants Bigger than Butterflies? Reasoning about Sizes of Objects. AAAI 2016: 3449-3456

[9] Dmitry Davidov, Ari Rappoport: Extraction and Approximation of Numerical Attributes from the Web. ACL 2010: 1308-1317

[10] Hiroya Takamura, Jun'ichi Tsujii: Estimating Numerical Attributes by Bringing Together Fragmentary Clues. HLT-NAACL 2015: 1305-1310

[11] Katsuma Narisawa, Yotaro Watanabe, Junta Mizuno, Naoaki Okazaki, Kentaro Inui: Is a 204 cm Man Tall or Small ? Acquisition of Numerical Common Sense from the Web. ACL (1) 2013: 382-391

[12] Yanai Elazar, Abhijit Mahabal, Deepak Ramachandran, Tania Bedrax-Weiss, Dan Roth: How Large Are Lions? Inducing Distributions over Quantitative Attributes. ACL (1) 2019: 3973-3983

[13] Xikun Zhang, Deepak Ramachandran, Ian Tenney, Yanai Elazar, Dan Roth: Do Language Embeddings capture Scales? EMNLP (Findings) 2020: 4889-4896

[14] Chung-Chi Chen, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen: Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments. ACL (1) 2019: 6307-6313

[15] Taylor Berg-Kirkpatrick, Daniel Spokoyny: An Empirical Investigation of Contextualized Number Prediction. EMNLP (1) 2020: 4754-4764

[16] Bill Yuchen Lin, Seyeon Lee, Rahul Khanna, Xiang Ren: Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-Trained Language Models. EMNLP (1) 2020: 6862-6868

[17] Georgios P. Spithourakis, Sebastian Riedel: Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers. ACL (1) 2018: 2104-2115

[18] Aakanksha Naik, Abhilasha Ravichander, Carolyn Penstein Rosé, Eduard H. Hovy: Exploring Numeracy in Word Embeddings. ACL (1) 2019: 3374-3380

[19] Eric Wallace, Yizhong Wang, Sujian Li, Sameer Singh, Matt Gardner: Do NLP Models Know Numbers? Probing Numeracy in Embeddings. EMNLP/ IJCNLP (1) 2019: 5306-5314

[20] Chengyue Jiang, Zhonglin Nian, Kaihao Guo, Shanbo Chu, Yinggong Zhao, Libin Shen, Kewei Tu: Learning Numeral Embedding. EMNLP (Findings) 2020: 2586-2599

[21] Dhanasekar Sundararaman, Shijing Si, Vivek Subramanian, Guoyin Wang, Devamanyu Hazarika, Lawrence Carin: Methods for Numeracy-Preserving Word Embeddings. EMNLP (1) 2020: 4742-4753

[22] Yuta Tsuboi, Hiroshi Kanayama, Masaki Ohno, Yuya Unno: Syntactic Difference Based Approach for NTCIR-9 RITE Task. NTCIR 2011

[23] Adrian Iftene, Mihai Alex Moruz: UAIC Participation at RTE-6. TAC 2010

[24] Subhro Roy, Tim Vieira, Dan Roth: Reasoning about Quantities in Natural Language. Trans. Assoc. Comput. Linguistics 3: 1-13 (2015)

[25] Abhilasha Ravichander, Aakanksha Naik, Carolyn Penstein Rosé, Eduard H. Hovy: EQUATE: A Benchmark Evaluation Framework for Quantitative Reasoning in Natural Language Inference. CoNLL 2019: 349-361

[26] Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh, Matt Gardner: DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs. NAACL-HLT (1) 2019: 2368-2378

[27] Ohad Rozen, Vered Shwartz, Roee Aharoni, Ido Dagan: Diversify Your Datasets: Analyzing Generalization via Controlled Variance in Adversarial Datasets. CoNLL 2019: 196-205

[28] Mor Geva, Ankit Gupta, Jonathan Berant: Injecting Numerical Reasoning Skills into Language Models. ACL 2020: 946-958

[29] Qiu Ran, Yankai Lin, Peng Li, Jie Zhou, Zhiyuan Liu: NumNet: Machine Reading Comprehension with Numerical Reasoning. EMNLP/IJCNLP (1) 2019: 2474-2484

[30] Wang Ling, Dani Yogatama, Chris Dyer, Phil Blunsom: Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems. ACL (1) 2017: 158-167

[31] David Saxton, Edward Grefenstette, Felix Hill, Pushmeet Kohli: Analysing Mathematical Reasoning Abilities of Neural Models. ICLR (Poster) 2019

[32] Aida Amini, Saadia Gabriel, Shanchuan Lin, Rik Koncel-Kedziorski, Yejin Choi, Hannaneh Hajishirzi: MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms. NAACL-HLT (1) 2019: 2357-2367

[33] Jipeng Zhang, Lei Wang, Roy Ka-Wei Lee, Yi Bin, Yan Wang, Jie Shao, Ee-Peng Lim: Graph-to-Tree Learning for Solving Math Word Problems. ACL 2020: 3928-3937

[34] Rik Koncel-Kedziorski, Subhro Roy, Aida Amini, Nate Kushman, Hannaneh Hajishirzi: MAWPS: A Math Word Problem Repository. HLT-NAACL 2016: 1152-1157

[35] Guillaume Lample, François Charton: Deep Learning For Symbolic Mathematics. ICLR 2020

[36] Minoru Yoshida, Kazuyuki Matsumoto, Kenji Kita: Table Topic Models for Hidden Unit Estimation. AIRS 2016: 302-307

[37] Yanai Elazar, Yoav Goldberg: Where's My Head? Definition, Dataset and Models for Numeric Fused-Heads Identification and Resolution. Trans. Assoc. Comput. Linguistics 7: 519-535 (2019)

[38] Chung-Chi Chen, Hen-Hsen Huang, Hsin-Hsi Chen: Numeral Attachment with Auxiliary Tasks. SIGIR 2019: 1161- 1164

[39] Chung-Chi Chen, Hen-Hsen Huang, Yow-Ting Shiue, Hsin-Hsi Chen: Numeral Understanding in Financial Tweets for Fine-Grained Crowd-Based Forecasting. WI 2018: 136-143

[40] Arun Tejasvi Chaganty, Percy Liang: How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions. ACL (1) 2016

#### **Chapter 2**

## Towards a Personalized Web Services Composition Approach

*Sarra Abidi, Fathia Bettaher and Myriam Fakhri*

#### **Abstract**

Generally available Web services (WS) cannot meet the complex needs of users, and their adaptation to the environment remains a major problem for the design of information systems. Web services composition addresses the satisfaction of new and complex needs, such as the processes we find in most organizations; its purpose is to combine several services to meet user demand. **Satisfying a user's needs requires a dynamic and reusable environment.** In this context, user interactions are essential. In this work, we therefore define two objectives: **i)** propose a service composition approach that allows dynamic services composition in order to meet a need; **ii)** propose a personalization approach for Web services composition which allows the reuse of services while adapting to the context of each user. Our approach is based on the use of ontologies and the user profile.

**Keywords:** Web services composition, personalization, ontologies, user profile

#### **1. Introduction**

Nowadays, a large number of services are available on the Web and in different directories, where a Web service is an application made available on the Internet. These services are generally defined by their function and their inputs/outputs [1–3], which allows their reuse. However, user requirements are continually evolving, so the available services cannot meet all needs, especially the most complex ones. Services composition addresses precisely this problem. After analyzing several definitions (Fekih et al.) [4], (Shanchen et al.) [5], (Yuan et al.) [6], we retain two views of services composition. According to (Shanchen et al.) [5], who take a process view of services composition: "The composition is the process of selecting, combining, and implementing services to accomplish a given objective."

A second, more global view is that of (Fekih et al.) [4]: "The composition is then an effective way to create, run, and maintain services that depend on other services."

Based on these definitions, we believe that services composition has essentially two objectives:


We distinguish two types of composition. "Orchestration" is the process of programming a central engine which controls and calls all the services according to a predefined process; in addition, it defines the order of execution of the services [7]. "Choreography", for its part, aims to achieve a common goal among a set of Web services; the collaboration between the Web services in the collection (the parts of the composition) is described by a control flow [8].
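To make the distinction concrete, a minimal orchestration can be sketched as a central engine calling services in a predefined order. The service names and payloads below are purely illustrative assumptions; in a choreography there would be no such engine, and each service would react to its peers' messages directly.

```python
# Illustrative sketch of "orchestration": one central engine controls and
# calls every service according to a predefined process.

def book_flight(data):
    # hypothetical service: enriches the request with a flight
    return {**data, "flight": "TU123"}

def book_hotel(data):
    # hypothetical service: enriches the request with a hotel
    return {**data, "hotel": "Hotel X"}

def orchestrate(request, services):
    """Central engine: calls each service in the fixed, predefined order."""
    result = request
    for service in services:
        result = service(result)
    return result

process = [book_flight, book_hotel]  # predefined execution order
print(orchestrate({"user": "Philippe"}, process))
# {'user': 'Philippe', 'flight': 'TU123', 'hotel': 'Hotel X'}
```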

Regarding the categories of services composition, we distinguish, on the one hand, a "static composition", which uses in a fixed manner services defined in advance, unchanged and independent of the client context [9], and a "dynamic composition", which occurs at run time within the constraints required by the client [9]. A "semi-dynamic composition" combines the two previous types. On the other hand, we find a "manual composition", in which the user generates the composition by hand via a text editor, without dedicated tools. "Semi-automatic composition" is a step forward insofar as its techniques make semantic proposals to help in the selection of Web services. "Automatic composition" is the automation of the entire composition process, without any user intervention.

Given the continuous increase of heterogeneous information sources and the diversity of user requirements, information retrieval systems deliver massive results. These results bury the relevant information and disorient the user, who can no longer distinguish what is relevant from what is not.

In the literature, the term "personalization" has met with success. Consider the view of (Kostadinov) [10], who states that "the personalization of information comes from a set of individual preferences, through ordering criteria or semantic rules". Such specifications allow obtaining the desired quality level and data arrangement. In this context, the personalization of information is a major challenge for the IT industry, insofar as the relevance of the delivered information, its intelligibility, and its adaptation to the user's usages and preferences constitute key factors in the success or rejection of these systems [11]. We therefore believe that it will be very useful to incorporate personalization into Web services composition.

Section 2 presents the related work describing personalization approaches for Web services composition. Section 3 gives an overview of the proposed approach: the user profile orientation, the used ontologies, and knowledge-based personalization. Section 4 explains the construction of the user's profile. Section 5 presents the personalization of the user's request. Section 6 treats the personalization of the services composition. After that, we present an illustrative example, and we conclude with an experimental evaluation of the proposal.

#### **2. Related work**

Many existing approaches in the literature treat the concept of personalization for Web services composition. In this regard, (Fekih et al.) [4] present an approach that is both semi-automatic and semantic: on the one hand, user intervention is necessary, and the user is represented through a profile and preferences; on the other hand, the service selection relies on a semantic description based on OWL-S [12]. The authors present the service selection process in three stages: first, the expression of the query, which integrates the user profile, the latter being based on real information (name, date of birth); second, the service discovery; and finally, the validation of the results by the user, who declares whether or not he is satisfied. If the user is not satisfied, the whole process is repeated.

#### *Towards a Personalized Web Services Composition Approach DOI: http://dx.doi.org/10.5772/intechopen.97813*

(Shanchen et al.) [5] consider that a context classification is important. We then distinguish between the U-context (user context), the W-context (Web service context), and the R-context (resource context). On the one hand, the context classification allows better establishing the customization, and a consistency check can verify the status of a Web service after it has been personalized. On the other hand, the three classes are interconnected. The user is the most dynamic component: his needs, preferences, and conditions always vary. The resource is the most stable component, whose characteristics and constraints can be known in advance.

Mcheick et al. [13, 14] present an adaptation of Web services that is crucial in the face of the changes that may affect them. The approach aims to solve the Web service adaptation problem. It is based on the principle of adding two components, named "manager of appearance" and "context manager".

We should also mention the matching of ontologies based on a lexical database (WordNet) [15]. In [6], the authors believe it is useful to capture the context before using it in the selection and composition of Web services. This provides a rich and reliable representation of the captured data in the form of an ontology, relying thereafter on mathematical formulas for the semantic matching algorithms [16].

A composition of services is intended to meet the need of a user. Based on this principle, and returning to the composition approaches presented above, we note that, regardless of the proposed approach, personalization is still lacking throughout the composition process, from the user's request up to the resulting composite service. We therefore choose the user profile as the medium through which to introduce personalization.

The following section provides an overview of the proposed approach by presenting the choice of user's profile, the choice of the used ontologies, and the choice of personalization's forms.

#### **3. A general overview**

This section first presents an overview of the proposed approach. Second, it explains the choice of the user profile. Third, it presents the chosen types of ontologies. Finally, it presents the chosen forms of personalization.

This architecture provides a global perspective covering the personalization of the request and the development, management, and operation of Web services. It essentially consists of three layers. As shown in **Figure 1**, the first one presents a user profile construction process based on ontologies. The second layer is the personalization of the user's request, which consists of evaluating the request based on the constructed user profile. The last layer consists of the personalization of the composition process, also using the user profile.

#### **3.1 User profile orientation**

Currently, user profiles play a very important role in all digital environments [17]. Profile integration is one of the ways in which systems can be adapted to their users in such environments. Each information system based on services or services composition should primarily support the resources needed to cope with the required changes in system use.

The user model is a representation of information about an individual user that is essential for an adaptive system to provide the adaptation effect. From there, we define a user profile as "*the information that may be necessary to guide the personalization of the user's request*". Personalization is thus defined by a set of individual preferences specific to each user. As the profile includes data collected from users that are effective in evaluating a request, we relied on a profile technique based on ontologies: we choose to use a short-term profile constructed from the domain ontology and a long-term profile constructed from the user's ontology. We explain the process of the user profile's creation in what follows.

**Figure 1.** *Overview of the approach.*

#### **3.2 The used ontologies**

When we talk about data representation, we look for tools that can enrich and strengthen it. Since we work in a dynamic environment, we need reusability of knowledge. Besides, our approach needs a tool for indexing and information retrieval. This is why our choice falls on ontologies.

Our approach is based on two ontologies: a domain ontology and a user ontology. The domain ontology is an indispensable resource for the personalization process; indeed, beyond the context of the application, the domain is also an important factor for personalization. For the user's ontology, since the choice of data is very important insofar as it specifies the information needed to represent a user and his preferences, we rely in the first place on the multidimensional approach of (Kostadinov D) [10], and then on the approach of (Katifori) [18], which allows the passage from data about the user to the representative concepts in the user's ontology.

#### **3.3 Personalization based on knowledge**

Personalization can take many forms. One is "result filtering", often understood as eliminating unwanted data rather than looking for specific data within the same document flow. A second form is "query enrichment", where personalization is defined as learning, achieved from the preferences given by the users, used thereafter for the reformulation of the query.

Since we need both query reformulation and the search for specific data, we retain "query enrichment". Personalization is then defined as learning made from the users' preferences for the subsequent reformulation of the request. Indeed, in our approach, the query undergoes enrichment from the two ontologies (the domain ontology and the user's ontology).

#### **4. User's profile conception**

For the implementation of a user profile technique, we essentially base our work on two approaches: **(1)** the multidimensional approach (Kostadinov D) [10] and **(2)** the (Gauch S) approach [19]. This choice is explained by the fact that these two approaches offer two complementary forms of personalization. The (Kostadinov D) approach [10] aims to propose a set of open dimensions able to accommodate most of the information characterizing a profile. It is based primarily on seven dimensions: personal data, centers of interest, domain ontology, the expected quality of the delivered results, personalization, security, and confidentiality (**Figure 2**).

Moreover, since the classification, organization, and structuring of profile data are a key element of personalization, we have chosen the Gauch [19] approach, which aims to create an ontology-based user profile without user interaction, based on a classification of concepts.

To set up a personalization process, it is essential to choose the type of personalization to apply. Two main issues arise in this respect: the first deals with the dimensions of the user context, and the second addresses the choice of the personalization's form. Since we are interested in the first question, it is useful to study the user's context before proceeding to the description of its dimensions.

Since a context is composed of several dimensions [20], we distinguish a *social dimension* that describes the potential membership of the user: individual, group, or community. A *time dimension* concerns the temporal context of the need; we thus distinguish between a short-term intention and a long-term intention. The first type is related to the needs and preferences of the user during a search session, while the long-term context reflects the persistent needs and preferences of the user [21]. Finally, an *application dimension* concerns the application area.

Regarding the dimensions, we make the following choices. Personal data: this dimension is composed of a static part and a dynamic part. The static part concerns the following three sub-classes: (1) "the user's identity", (2) "demographic data", and (3) "physical description". The dynamic part concerns the sub-class "category". As for the centers of interest, a user may be interested in several concepts; indeed, within this user modeling framework, we differentiate between his various needs (**Figure 3**).

In order to highlight the user profile, we illustrate it with the following example. A profile can be defined by the following concepts: Identity, defined by the first name (Philippe) and the last name (Arno). The second concept, U-profession, specifies the user's work (nurse). U-Civil-status indicates whether the user is married or not. U-experience specifies the period of work as a nurse (five years). The concept U-Medical-service specifies that the nurse works in the urology department. U-address presents the nurse's workplace (C.H.U-de-Bordeaux). Finally, U-Grade specifies which type of nurse is involved, which is linked to the experience (**Figure 4**).

**Figure 2.** *Process of user's profile conception.*

**Figure 3.**
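The profile above could be encoded, for instance, as follows. The field names and the split into static and dynamic parts follow the dimensions discussed earlier, but the concrete schema (and the civil-status value) are our assumptions for illustration only.

```python
# A possible encoding of the example profile; not a schema from the chapter.
from dataclasses import dataclass

@dataclass
class UserProfile:
    # static part: identity and demographic data
    first_name: str
    last_name: str
    u_profession: str
    u_civil_status: str   # value assumed; the chapter only names the concept
    # dynamic part: data that evolves with use
    u_experience_years: int
    u_medical_service: str
    u_address: str
    u_grade: str = ""     # linked to the experience

philippe = UserProfile(
    first_name="Philippe", last_name="Arno",
    u_profession="nurse", u_civil_status="unspecified",
    u_experience_years=5, u_medical_service="urology",
    u_address="C.H.U-de-Bordeaux", u_grade="referring nurse",
)
print(philippe.u_profession, philippe.u_experience_years)  # nurse 5
```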

#### **5. Personalization of web services composition**

The personalization of the Web services composition process starts from the "user's request personalization" process, which is based on the constructed user profile. It allows users to express their needs by first performing a query enrichment (first enrichment) from the short-term user profile. Once this acquired data is deployed, the request undergoes a second enrichment from the data of the long-term profile, leading to a personalized query, which will be the basis of the next layer. In this way, an end-user profile based on ontologies is constructed. Finally, we should mention that if a user needs to update some data, he does so through the user profile.
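The two enrichment stages can be sketched as follows, assuming, purely for illustration, that the query and both profiles are represented as flat term lists rather than ontologies:

```python
# Sketch of the two-stage request enrichment (terms stand in for ontology
# concepts; all example values are assumptions).

def enrich(query_terms, short_term_profile, long_term_profile):
    """First enrichment from the short-term profile, then a second
    enrichment from the long-term profile, yielding a personalized query."""
    first = query_terms + [t for t in short_term_profile if t not in query_terms]
    personalized = first + [t for t in long_term_profile if t not in first]
    return personalized

query = ["prepare", "medicines"]
short_term = ["urology"]            # current search-session context
long_term = ["nurse", "medicines"]  # persistent preferences
print(enrich(query, short_term, long_term))
# ['prepare', 'medicines', 'urology', 'nurse']
```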

The services composition process is essentially based on three steps: decomposition into sub-queries, discovery and selection of services, and proposal of composition plans. In our approach we follow the same steps while adding the notion of personalization, and we choose semantic Web services. The dynamic process proceeds as follows. It starts from a personalized query and ends by invoking the services necessary to meet the expressed needs; its output is a set of services ordered for execution. The process thus starts with a decomposition and verification of the sub-queries. In fact, this is based on a comparison of the profile parameter, which has been added to the request, with the WS context parameter. The second step is a personalized discovery and selection of Web services: this phase relies on a semantic description for the discovery and relevant selection of personalized Web services (**Figure 5**).

Given that a personalized query is identified by three parameters (InpReq, OutRep, ProfReq) and a Web service (WS) is identified by (Input, Output, Context), the selection of services in a personalized way proceeds as follows:


If the comparison between the different parameters is validated, the process goes to the next step, which is "the personalized services composition".

From this validation, the similarity between the different parameters is computed using the algorithms presented below. Thus, the services are selected in a non-predefined order; they are selected and ordered dynamically to meet the needs of the user.

**Figure 5.** *Architecture of the proposed approach.*

If the comparison between the three query parameters and the WS parameters is not validated, we compare only the inputs and the profile with the context, in the same way as mentioned. If that is validated, the result is satisfactory; if not, we compare the profile with the context. If this is approved, we propose composition plans by referring to the "WordNet" ontology; if not, there is no result.

As a consequence, the new parameters that we have added (Profile and Context) determine whether the Web services composition is personalized or not. In this way, the process produces a composition of relevant personalized services in terms of choice, number, and scheduling of services. To clarify the proposed approach, we present in the following section an illustrative example of a medicines circuit.

**Algorithm 1:** Discovery module.

```
Inputs: SRq, SW
Outputs: SWF
Taux-sim ← 0
For each s in SW do
  If similarity(SRq, s) > Taux-sim then
    Taux-sim ← similarity(SRq, s)
    SWF ← s
  End if
End for
bestsw ← Better(SWF)
return SWF
```
**Algorithm 2:** Semantic similarity.

```
Inputs: SW, SRq
Outputs: Taux-sim
Taux-sim ← 0
If (EntReq = EntSW) and (SortReq = SortSW) and (ProfReq = ContxtSW) then
  Taux-sim ← 3
Else if (EntReq) included in (EntSW) and (SortSW) included in (SortReq)
        and (ProfReq) included in (ContxtSW) then
  Taux-sim ← 2
Else if (SortReq) included in (SortSW) and (ProfReq) included in (ContxtSW) then
  Taux-sim ← 1
Else
  Taux-sim ← 0
End if
```
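As a possible reading of Algorithms 1 and 2, the scoring and discovery steps can be sketched in Python. The set-based representation of inputs, outputs, and contexts, and the dictionary keys, are our assumptions; the chapter's pseudocode omits the comparison operators, which are reconstructed here.

```python
# Sketch of Algorithms 1 and 2 (representation assumed, not from the chapter).

def similarity(req, ws):
    """Algorithm 2: score a personalized query (InpReq, OutReq, ProfReq)
    against a Web service (Input, Output, Context)."""
    if req["inp"] == ws["inp"] and req["out"] == ws["out"] and req["prof"] == ws["ctx"]:
        return 3  # exact match on all three parameters
    if req["inp"] <= ws["inp"] and ws["out"] <= req["out"] and req["prof"] <= ws["ctx"]:
        return 2  # inclusion instead of equality
    if req["out"] <= ws["out"] and req["prof"] <= ws["ctx"]:
        return 1  # weaker match: outputs and profile/context only
    return 0

def discover(req, services):
    """Algorithm 1: keep the service with the highest similarity score."""
    best, best_score = None, 0
    for s in services:
        score = similarity(req, s)
        if score > best_score:
            best, best_score = s, score
    return best

req = {"inp": {"doses-list"}, "out": {"order"}, "prof": {"nurse"}}
s1 = {"inp": {"doses-list"}, "out": {"order"}, "ctx": {"nurse"}}
s2 = {"inp": {"doses-list", "date"}, "out": {"order"}, "ctx": {"pharmacist"}}
print(discover(req, [s1, s2]))  # selects s1 (score 3)
```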

#### **6. Illustrative example**

To highlight the interest of the proposed approach, we present a case study of a medicines circuit application in a health facility.

Healthcare organizations are highly dynamic working environments facing the challenge of delivering personalized services to their patients in a very cost-effective and efficient way. Many reports in the healthcare field state that there is an "absence of real progress towards applying advances in information technology to improve administrative and clinical processes" [22]. Furthermore, in healthcare organizations, the lack of personalization in contemporary enterprise information systems is considered a major obstacle to the improvement of organizational and medical treatment processes.

Let us start with the following case, in which the clinician writes a nominative prescription. The pharmacist validates the prescription after a pharmaceutical analysis. Drug doses issued to the clinical unit are unitary and nominative (ready to be administered to the patient) and respect the administration plan for the next 24 hours. The renewal of the supply for each 24-hour period is provided by the pharmacy if the prescription is still valid.

This scenario presents a situation where the actor tries to satisfy a complex need expressed as a query, and an executable composition exists to satisfy him. Indeed, as part of his mission, he must prepare, from a stock, the medicines to administer to the patients of his care unit.

This activity is usual for the nurse or pharmacist, who knows the arrangement of each patient's medicines and the preparation according to the medicines' type.

We suppose that we have a certain number of services, shown in **Table 1**, in a repository.

In the following, we present the process of medicines preparation in the case of a personalized composition and a non-personalized one.

#### **6.1 Query expression step**

To specify his purpose, the actor can formulate his need as follows. Given the following request, the preparation can be carried out by medicines type, with an arrangement per patient.

User's query: Prepare(medicines-list, doses-list, date, care-unit-list) **∧** Arrangement(care-unit-list, patient-list).

We distinguish between two different requests: a non-personalized request and a personalized one, knowing that the personalized request is enriched by a user profile containing the following information: (profession, experience) (**Table 2**).

Thus, in our approach we argue that personalization must start from the expression of the request, because the request then carries more information about the user and his environment. This makes it possible to define a more specific context for each actor.
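The two request types could be represented, for instance, as follows. This is only a sketch: the chapter gives the logical form of the query, but the dictionary structure and field names are assumptions.

```python
# Illustrative representation of the two request types (structure assumed).

non_personalized = {
    "goal": "Prepare(medicines-list, doses-list, date, care-unit-list) "
            "∧ Arrangement(care-unit-list, patient-list)",
}

personalized = {
    **non_personalized,
    # profile parameters added by the enrichment step
    "profile": {"profession": "nurse", "experience": "five years"},
}

print("profile" in non_personalized, "profile" in personalized)  # False True
```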


**Table 1.** *Selected web services.*


#### **Table 2.**

*Presentation of the two request types.*

#### **6.2 Research and services selection step**

When in doubt about the choice of services, the user resorts to automatic selection based on a repository of services and basic knowledge (namely an example of a composition plan). Thus, for the services returned by a non-personalized request, we obtain the services S1, S2, S3, S4, S5, and S6. This is explained by the fact that the actor's profession is not defined, so the system offers all services. However, for the services returned by a personalized request, based on the profile parameters, as a nurse the user has the right to perform only services S1, S2, S3, and S6. In addition, for the "Pharma verification-compatibility" service, the pharmacist has no right to verify medicines' compatibility for a patient, but rather checks the compatibility between medicines (**Table 3**).

#### **6.3 Services composition step**

The services composition result is based on a similarity measurement algorithm and a dynamic discovery between the different parameters. Starting from a non-personalized request (**Figure 6**), the system proposes the following composition plan. A nurse must first verify the authorized doses for each patient. Then, she prepares the medicines by type. After that, she attributes to each patient his prescribed medicines. The next service allows arranging the medicines according to each patient. In the Urology unit, through "PHARMA Compatibility-Verification", the system allows knowing the compatibility between Paola and Digoxine. Finally, the actor saves the order.

For the personalized query (**Figure 7**), through profile integration, the system proposes three services for a nurse. Thus, given her profession and experience, she can begin with the preparation of medicines by type. Since this actor has already worked as a referring nurse, the system proposes that she arrange the medicines for each patient. Finally, the last service is "PHARMA Edition", which allows saving the order.

**Table 3.** *Service research results for both types of queries.*

**Figure 6.** *Composition result for a non-personalized query.*

In that way, we have shown that profile integration yields a personalized request, which in turn impacts the research and selection of services so as to obtain a personalized composition.

**Figure 7.** *Composition result for a personalized query.*

#### **7. Experimentation and evaluation of the proposal**

The main objective of this experiment is to show that integrating the user profile increases the number of relevant services, so that we thereafter obtain a composition corresponding to the personalized user query. For this, we used two query types, a non-personalized query and a personalized query, where the appropriate evaluation measures are mainly based on precision and recall.

• **Recall** is the ratio of the number of relevant services found by the filter to the number of relevant services available.

**Recall** = Number of relevant selected services / (number of relevant selected services + number of relevant non-selected services).

Based on **Figure 8**, we notice that the recall for a personalized query is high compared to a non-personalized query. This is because recall is always the number of relevant selected services over all relevant services; the user will thus have access to the information he wished to have.

**Figure 8.** *Recall measurement curve.*

**Figure 9.** *Precision measurement curve.*

• **Precision** is the proportion of relevant services among the selected services.

**Precision** = Number of relevant selected services / (number of relevant selected services + number of non-relevant selected services).

As shown in **Figure 9**, the precision for a personalized query is high compared to a non-personalized one. However, from 27 services onward, the curve for the personalized query coincides with that of the non-personalized query. This is explained by the definition of precision as the number of relevant selected services over all selected services.
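Both measures can be computed directly from the service sets of the illustrative example in Section 6.2, where the non-personalized request returns S1–S6 and the relevant services for the nurse are S1, S2, S3, and S6 (the exact curve values in the figures are not reproduced here):

```python
# Precision and recall over service sets, using the Section 6.2 example.

def recall(selected, relevant):
    """Relevant selected services over all relevant services."""
    return len(selected & relevant) / len(relevant)

def precision(selected, relevant):
    """Relevant selected services over all selected services."""
    return len(selected & relevant) / len(selected)

relevant = {"S1", "S2", "S3", "S6"}                    # services a nurse may perform
selected_plain = {"S1", "S2", "S3", "S4", "S5", "S6"}  # non-personalized request
selected_perso = {"S1", "S2", "S3", "S6"}              # personalized request

print(recall(selected_perso, relevant))     # 1.0
print(precision(selected_plain, relevant))  # ≈ 0.667
print(precision(selected_perso, relevant))  # 1.0
```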

#### **8. Conclusion and perspectives**

This paper presents a dynamic approach for the personalization of Web services composition. First, the construction of a user profile from a domain ontology and a user's ontology is a key point for personalization; it leads to the construction of a personalized query, where each user may have personal data stored in a parameter named "Profile". On the one hand, these data facilitate the subsequent personalization, hence the construction of a request corresponding to the need of the user. On the other hand, the process is based on a similarity measure algorithm for the personalized discovery of Web services, which thereafter allows establishing a personalized composition.

This approach provides dynamic user modeling, not only for the query expression but also for the composition process.

We should note that a scaling test is in progress, as Web services adapted to our needs are not available; we were forced to edit them manually, which took a long time. We are nevertheless working on the scaling (we increase the number of services and observe the result).

As future work, we identify two interesting perspectives. The first is how to improve the relevance of the results in terms of selected services. The second is how to respond to business needs in dynamic contexts, by designing compositions that integrate personalization for both business process satisfaction and user satisfaction.

#### **Author details**

Sarra Abidi<sup>1</sup> \*, Fathia Bettaher<sup>2</sup> and Myriam Fakhri<sup>3</sup>

1 ESPRIT, Tunisia


\*Address all correspondence to: sarralabidi@gmail.com

© 2021 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**

[1] Abidi, S., Essafi, M., Fakhri, M. and Ben Ghezala, H. "Refined cloud services discovery based on user model including privacy-preserving access control", Knowledge Based and Intelligent Information and Engineering Systems. (2018)

[2] Amina, A., Mohammed, B. and Chenouni, D. "A phased two stages conceptual framework for web services composition", Journal of Engineering and Applied Sciences, Vol. 13, No. 7. (2018)

[3] Fekih, H., Mtibaa, S. and Bouamama, S. "User-centric web services composition approach based on swarm intelligence", IEEE 18th International Conference on High Performance Computing and Communications. (2016)

[4] Fekih, H., Mtibaa, S. and Bouamama, S. "An Efficient User-Centric Web Service Composition Based on Harmony Particle Swarm Optimization", International Journal of Web Services Research, Vol. 16, Issue 1. (2019)

[5] Shanchen, P., Qian, G., Ting, L.H., Guanguan, X. and Kaitali, L. "A Behavior Based Trustworthy Service Composition Discovery Approach in Cloud Environment", IEEE Access, Vol. 7. (2019)

[6] Yuan, Y., Zhang, W. and Zhang, X. "A Context-aware Self-adaptation Approach for Web Service Composition", 3rd International Conference on Information Systems Engineering. (2018)

[7] Peltz, C. "Web services orchestration and choreography", IEEE Computer, pages 46-52. (2003)

[8] Barros, A., Dumas, M. and Oaks, P. "Standards for web service choreography and orchestration: Status and perspectives", Proceedings of the 3rd International Conference on Business Process Management (BPM 2005), 1st International Workshop on Web Service Choreography and Orchestration for Business Process Management, Nancy, France, pages 1-15. (2005)

[9] Dustdar, S. and Schreiner, W. "A survey on web services composition", Web and Grid Services, Vol. 1, No. 1, pages 1-30. (2005)

[10] Kostadinov, D. "Personalization of information: a profile management approach and query reformulation", PhD thesis in Computer Science, University of Versailles Saint-Quentin-en-Yvelines. (2005)

[11] Garlatti, S. and Prié, Y. "Adaptation and personalization in the Semantic Web", Revue I3 Information - Interaction - Intelligence, 24 pp. (2004)

[12] Fekih, H., Mtibaa, S. and Bouamama, S. "User-centric web services composition approach based on swarm intelligence", IEEE 18th International Conference on High Performance Computing and Communications. (2016)

[13] Maamar, Z., Kouadri Mostéfaoui, S. and Mahmoud, Q.H. "Context for personalized web services", Proceedings of the 38th Hawaii International Conference on System Sciences. (2005)

[14] Hannech, A. and Mcheick, H. "Semantic web services adaptation and composition method", Department of Computer Science and Mathematics, University of Quebec at Chicoutimi (UQAC), Chicoutimi, Canada, ICIW: The Eighth International Conference on Internet and Web Applications and Services. (2013)

[15] Miller, G.A. "WordNet: a lexical database for English", Communications of the ACM, Vol. 38, pages 39-41. (1995)

[16] Hannech, A. and Mcheick, H. "Composition model of semantic web services orchestration on demand", Department of Computer Science and Mathematics, University of Quebec at Chicoutimi (UQAC), Chicoutimi, Canada, 25th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE). (2012)

[17] Brusilovsky, P. and Millán, E. "User models for adaptive hypermedia and adaptive educational systems", The Adaptive Web, page 53. (2007)

[18] Golemati, M., Halatsis, C., Katifori, A., Lepouras, G. and Vassilakis, C. "Creating an ontology for the user profile: Method and applications", Proceedings of the First IEEE International Conference on Research Challenges in Information Science (RCIS). (2007)

[19] Challam, V., Chandramouli, A. and Gauch, S. "Contextual search using ontology-based user profiles", Proceedings of RIAO, Pittsburgh, USA. (2007)

[20] Bouzeghoub, M. and Kostadinov, D. "Personalization of information: Overview of the state of the art and a flexible profile definition model", Conference on Information Retrieval and Applications (CORIA), Grenoble, France. (2005)

[21] Fuhr, N. "Information retrieval: introduction and survey", postgraduate course on information retrieval, University of Duisburg-Essen. (2000)

[22] Lenz, R. and Reichert, M. "IT support for healthcare processes – premises, challenges, perspectives", pp. 39-58. (2007)

Section 2
