**Abstract**

Food recommendation is an important service in daily life. To build such a system, we collected a set of food images that had been shared or reviewed on social networks on the web, which carry information about the foods people actually choose in daily life. In the field of representation learning, we propose a scalable architecture for integrating different deep neural networks (DNNs) using a reliability score for each DNN. This allows the integrated DNN to select a suitable recognition result from among those produced by the independently constructed DNNs. The frequent sets of foods extracted from the food images were then fed to the *Apriori* data mining algorithm for the food recommendation process. In this study, we evaluated the feasibility of the proposed method.

**Keywords:** food recommender, food data mining, image recognition, deep neural network, data mining algorithm

## **1. Introduction**

People are now consuming more high-energy foods, fats, and meat, and most do not eat enough fruit, vegetables, and other dietary fiber. The make-up of a diversified and balanced combination of foods varies with individual characteristics such as age, gender, lifestyle, and degree of physical activity, as well as with cultural context, locally available foods, and dietary customs. As food services grow, they play an increasingly important role in the food market and business, and the food service industry is a vital part of the economy [1–3]. The business relies on its management to control costs, keep customers happy, and ensure smooth operations on a daily basis. There are many types of food service, but the major categories are buffet and family-style service.

As an industry that serves human needs, food service is always at the forefront of innovation. Even though food safety practices have been continuously updated along with legislation, the industry still faces a number of issues related to food technologies and consumer trends. For example, a customer wants to know food information in order to choose a set of foods for the table. Foreigners who are not familiar with local foods would like to enjoy them in the style common in the country they are visiting. In addition, a food designer may be seeking new ideas for decorating food on the plate. Food recommendation is therefore an important tool to enrich our lives. It can be defined as a system that recommends items to users or customers within an environment based on their past activities.

Digital imaging has been demonstrated to estimate food information in many environments, with many advantages over other methods [4, 5]. However, deriving food information such as food type, food combination, and portion size from food images remains uncertain.

Accordingly, to achieve better food recommendation, it would be useful to analyze the foods that people actually eat in daily life. Point-of-sale (POS) data are large-scale transaction data that reflect customers' purchase tendencies [6]. However, such data are used only by the individual store and are not open to the public, so food purchase data cannot be analyzed across different stores, restaurants, canteens, and so on. *Amazon Go* is a smart store in which purchase transactions are detected by camera. Compared with a POS system, *Amazon Go* automatically manages information about the foods that people buy, including the items associated with those products and the appearance of each item according to individual preference [7]. In addition, the system can predict market expectations. Note that an *Amazon-Go-like* system faces a similar constraint in collecting big data: the databases obtained from different sources vary. Purchase transactions therefore cannot readily be integrated across different databases, as shown in **Figure 1**.

From the diagram, it appears meaningful to create a system that analyzes big data of food images from various communities, including companies, restaurants, and social network groups, and applies image recognition technology to extract people's preferences in food combination, food design, and food appearance. In the field of representation learning, there are many established models, such as the Artificial Neural Network (ANN), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN).

ANN is a broad term that encompasses any neural network model, including deep learning models. It can be either shallow or deep depending on the number of hidden layers. CNNs are designed specifically for computer vision. They differ from standard ANN layers in that they are constructed to receive and process pixel data. RNNs are the "time series version" of ANNs: they are meant to process sequences of data and form the basis of forecasting models and language models. The most common kinds of recurrent layers are LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit). They contain a series of small-scale ANNs that are able to choose which existing information to let flow through the model. That is how they establish the "memory".

However, these models may run into problems in food recognition because the number of foods is tremendous. In addition, recognition remains challenging owing to the complexity of emotional expressions arising from food variety, gender differences, cross-cultural differences, and age-related differences [8, 9]. Thus, it is hard to find a single model that can recognize all of them.

**Figure 1.**

*Limitation of integrating purchase transactions across different databases.*

*A Food Recommender Based on Frequent Sets of Food Mining Using Image Recognition*

*DOI: http://dx.doi.org/10.5772/intechopen.97186*

## **2. Related work**

Food image recognition is one of the promising applications of visual object recognition, as it helps estimate food characteristics and analyze people's everyday eating choices. Many studies have made food recognition more practical by using the convolutional neural network (CNN) model [10–12]. CNN was applied to the tasks of food detection and recognition through parameter optimization. A dataset of the most frequent food items in a publicly available food-logging system was constructed. The CNN showed significantly higher accuracy than a conventional method did. In addition, a comparison of the results of two groups of controlled trials showed that the color feature does not always help improve accuracy. The reported performance of the CNN model was 70–80% on one dataset and 60% on the multi-food dataset. Further improvement could be expected from collecting more images and optimizing the network architecture and hyper-parameters.

For example, the Deep Convolutional Neural Network (DCNN) was introduced for food recognition based on a combination of CNN-related techniques such as pre-training with the large-scale ImageNet data, fine-tuning, and activation features extracted from the pre-trained CNN [13, 14]. Another approach was based on two main steps: first, producing a food activation map on the input image (i.e., a heat map of probabilities) to generate bounding-box proposals and, second, recognizing each of the food types or food-related objects present in each bounding box [15]. Interestingly, the Max-Pooling function was applied to the data, and the features extracted by this function were used to train the network. An accuracy of 86.97% was achieved for the classes of the FOOD-101 data set [16]. It was found that image classification could be extended using prominent features that could categorize food images. Note that the feature-based approach and the multi-level classification approach (hierarchical approach) were highly effective in avoiding misclassification as the number of classes increased. However, these methods required considerable computational time.

**2.1 Concept of convolutional neural network (CNN)**

A convolutional neural network is a network that employs a mathematical operation called convolution. There are two main processes in the CNN architecture: Learning extraction and Classification [17].

**Step-1: Learning extraction**

This process executes feature extraction from images through the following three layers:

a. Convolution layer: this is the first layer to extract features from an input image. Matrix filters are multiplied with the image, producing feature maps, in order to extract features such as edge, blur, and color.
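As a minimal sketch of the convolution layer described in Section 2.1, the following pure-Python example slides a small matrix filter over an image and sums the element-wise products at each position. The image values, the vertical-edge filter, and the function name `conv2d` are illustrative assumptions and are not taken from the chapter. Like most CNN libraries, the sketch applies the filter without flipping it (strictly speaking, cross-correlation).

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Both arguments are lists of lists of numbers. The kernel is applied
    without flipping, as is common in CNN implementations.
    """
    image_h, image_w = len(image), len(image[0])
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    feature_map = []
    for r in range(image_h - kernel_h + 1):
        row = []
        for c in range(image_w - kernel_w + 1):
            # Sum of element-wise products between the filter and the image patch.
            acc = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x6 grayscale image: dark left half (0), bright right half (9).
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# Illustrative 3x3 vertical-edge filter (an assumption, not the chapter's filter).
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The feature map is zero where the image is uniform and non-zero where the filter window straddles the dark-to-bright boundary, which is the sense in which a filter "extracts" an edge. In a trained CNN the filter values are learned rather than hand-written as they are here.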
