**6.3 The MapReduce model architecture**

Google created MapReduce to process large quantities of unstructured or semi-structured data, such as documents and logs of web page requests, on large clusters of nodes. It produces various kinds of derived data, such as inverted indices or

<sup>1</sup> The four sources of big data, https://www.communication-web.net/2016/03/07/les-4-sources-du-big-data/

**Figure 4.**

*An example of data flow in the MapReduce big data architecture [29].*

URL access frequencies [28]. MapReduce has three main parts: the Master, the Map function, and the Reduce function. An example of this data flow is shown in **Figure 4**.
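As a rough illustration of these three parts, the following minimal single-process sketch counts URL access frequencies from request logs. It is not Google's implementation: the names `map_fn`, `reduce_fn`, and `run_mapreduce`, as well as the log line format, are hypothetical and chosen only for this example, and `run_mapreduce` merely imitates the Master's coordinating role in memory rather than on a cluster.

```python
from collections import defaultdict


def map_fn(record):
    """Map: emit one (url, 1) pair for each request log line.

    Assumes a hypothetical log format whose first field is the URL.
    """
    url = record.split()[0]
    yield (url, 1)


def reduce_fn(key, values):
    """Reduce: combine all counts that share one URL into a single value."""
    return sum(values)


def run_mapreduce(records, map_fn, reduce_fn):
    """Imitate the Master in one process: run the map calls,
    group intermediate records by key, then run the reduce calls."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Hypothetical input: three request log lines.
logs = ["/index.html 200", "/about.html 200", "/index.html 304"]
counts = run_mapreduce(logs, map_fn, reduce_fn)
print(counts)  # {'/index.html': 2, '/about.html': 1}
```

In a real deployment the map and reduce calls would run on different nodes and the grouping step would move data across the network; the sketch keeps everything in one dictionary so the data flow stays visible.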

The Master manages the Map and Reduce functions, supplies them with data and procedures, and coordinates communication between mappers and reducers. The Map function is applied to each input record and produces a list of intermediate records. The Reduce function (also known as the Reducer) is applied to each group of intermediate records that share the same key and generates a value. The MapReduce process therefore includes the following steps:

