### **6. Probabilistic frequent itemsets analysis of learning behaviors**

Based on the LB-Eclat algorithm, the probabilistic frequent itemsets of the 11 learning behavior data sets are mined to find the itemsets with high probability. The itemsets of each data set are mined under the thresholds "Support" (>0.3) and "Confidence" (>0.7), and the association degree of the rules generated from the itemsets is then verified by "Lift". If "Lift" > 1, the association degree of the relevant rules is high. In the mining results, 2-itemsets are the most numerous, as shown in **Tables 6**–**8**; the 3-itemsets and 4-itemsets are mainly formed by the intersection and combination of 2-itemsets. The higher the density of a data set, the more frequent itemsets are mined. Under the "Support" and "Confidence" constraints, some data sets, such as L1-p2 and L1-p4, yield only 2-itemsets.

The distribution of frequent 2-itemsets in **Tables 6**–**8** has the following characteristics:

1. There is a strong correlation between the components of learning behaviors, and some components even have an obvious impact on the components of learning results. Among data sets of similar density, Technology courses yield significantly more frequent itemsets than Literature courses. This shows that the learning behavior components of Technology courses are highly diverse and interact continuously and serially, which leads learners to participate in them with similar frequency. Compared with Literature courses, the components of Technology courses are more conducive to forming frequent itemsets of learning behaviors.


#### **Table 6.**

*Probabilistic frequent 2-itemsets of sparse density data sets.*

*Improved Probabilistic Frequent Itemset Analysis Strategy of Learning Behaviors Based on… DOI: http://dx.doi.org/10.5772/intechopen.97219*


#### **Table 7.**

*Mining results of probabilistic frequent 2-itemsets of moderate density data sets.*

2. For sparse density data sets, "forumng", "homepage" and "content" readily form frequent 2-itemsets with other components, which is evident across the data sets of both Literature and Technology courses. "wiki" also interacts frequently with other components in Technology courses. For moderate density and dense density data sets, the frequent 2-itemsets are similar: "forumng", "homepage", "content", "url", "quiz" and "subpage" all show strong component correlation. For Technology courses, frequent itemsets involving "dataplus", "dualpane", "wiki" and "questionnaire" occur widely and frequently.

Three indicators are used to measure the association rules derived from the frequent itemsets of learning behavior components: "Support", "Confidence" and "Lift". "Support" measures how frequently the components occur together. "Lift" > 1 indicates a positive association, and the higher the "Lift", the more valuable the association rule; "Lift" < 1 indicates a negative correlation, which is stronger the smaller the value; "Lift" = 1 means that the components are independent and have no correlation. The association rules with "Lift" > 1 and high confidence are listed in **Table 9**; these rules are the basis for tracking, adjusting and optimizing learning behaviors.
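The three indicators can be illustrated with a minimal Python sketch. It assumes deterministic transactions (each learner session is a plain set of components), which is simpler than the probabilistic setting of this chapter. The function names and the toy sessions are hypothetical; only the component names and the thresholds (Support > 0.3, Confidence > 0.7, Lift > 1) come from the text.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) of the rule antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, set(antecedent) | set(consequent))
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return s_ac, confidence, lift


def is_strong(metrics, min_support=0.3, min_confidence=0.7, min_lift=1.0):
    """Apply the thresholds quoted in the text (all strict inequalities)."""
    s, c, l = metrics
    return s > min_support and c > min_confidence and l > min_lift


# Hypothetical toy sessions, not data from the study.
sessions = [
    frozenset({"homepage", "content", "forumng"}),
    frozenset({"homepage", "content"}),
    frozenset({"homepage", "quiz"}),
    frozenset({"homepage", "content", "quiz"}),
    frozenset({"forumng"}),
]

metrics = rule_metrics(sessions, {"content"}, {"homepage"})
strong = is_strong(metrics)
```

In this toy example the rule {content} → {homepage} passes all three thresholds, because every session containing "content" also contains "homepage" while "homepage" itself does not appear in every session.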

On the whole, sparse density data sets yield fewer association rules from their probabilistic frequent itemsets, and among data sets of the same density Literature courses yield fewer association rules [22]. For the moderate density and dense density data sets of Technology courses, rules are formed among the components of learning behaviors, and some of the components produce rules with high credibility and strong relevance to the final assessment results.

#### **Table 8.**

*Probabilistic frequent 2-itemsets of dense density data sets.*

#### **Table 9.**

*Association rules generated by probabilistic frequent itemsets.*

**Table 9** shows that different data sets share common association rules among components, which indicates that these rules have strong generality. Within Literature courses or Technology courses, the association rules show some similarities but also obvious differences. For the same course in different periods, the association rules of the probabilistic frequent itemsets both overlap and differ. For {content, questionnaire, subpage, url} → {homepage} and {resource, url} → {subpage}, the "Lift" values are higher, indicating a very high association degree. According to the table, strong association rules readily form around "questionnaire", "quiz", "forumng", "homepage", "resource", "subpage", "url" and so on. "dataplus", "dualpane", "folder", "wiki" and so on have strong relevance in Technology courses. Some of the components have an obvious impact on the learning results. Extracting these association rules can greatly simplify the categories of components in **Tables 1**–**4**.

#### **Figure 11.**

*The key topology of learning behaviors based on probabilistic frequent itemsets and association rules.*

The mining of probabilistic frequent itemsets and the learning of association rules support the evaluation and recommendation of components in the construction of learning behaviors [22–24]. At the same time, effective components can be aggregated according to these association rules as learning behaviors are formed. For the components related by association rules, we can build elastic proximity relationships or timely guidance strategies and recommendation mechanisms, which can effectively guide the learning processes. In addition, according to the needs of the learning objectives, we can design association rules of probabilistic frequent itemsets from the historical data, which helps to analyze and predict feasible participation components.

Based on the data in **Tables 6**–**9**, the nodes and edges of the component interaction processes are constructed, and the key constituent units of the learning behavior data sets are generated with Gephi. **Figure 11** shows the topological structure and relationship weights of the probabilistic frequent itemsets. It involves 14 participation components, and the weight of each relationship (edge) is calculated automatically. The thickness of a line indicates the strength of the relationship, and dotted lines represent potential relationships. The key topology of learning behaviors supported by probabilistic frequent itemsets is thus constructed and extracted, and it serves as a reference for data-driven learning behavior prediction and decision making.
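One possible way to prepare such a graph for Gephi is sketched below in Python. It is an illustration under stated assumptions, not the procedure used in the study: the helper names are hypothetical, and the edge weight is assumed to be the number of rules that link two components, since the text does not specify how the weights are derived. The two sample rules are the ones quoted from **Table 9**; the CSV columns follow the `Source`, `Target`, `Type`, `Weight` layout that Gephi's spreadsheet import recognizes.

```python
import csv
import io
from collections import defaultdict


def rules_to_edges(rules):
    """Turn rules (antecedent, consequent) into undirected weighted edges.

    Assumption: the weight of an edge is the number of rules in which one
    component appears in the antecedent and the other in the consequent.
    """
    weights = defaultdict(int)
    for antecedent, consequent in rules:
        for a in antecedent:
            for c in consequent:
                if a != c:
                    weights[tuple(sorted((a, c)))] += 1
    return dict(weights)


def write_gephi_csv(edges, handle):
    """Write an edge list with the column names Gephi expects on import."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for (u, v), w in sorted(edges.items()):
        writer.writerow([u, v, "Undirected", w])


# The two rules quoted in the text from Table 9.
rules = [
    ({"content", "questionnaire", "subpage", "url"}, {"homepage"}),
    ({"resource", "url"}, {"subpage"}),
]

edges = rules_to_edges(rules)
buffer = io.StringIO()  # stands in for a file opened with newline=""
write_gephi_csv(edges, buffer)
csv_text = buffer.getvalue()
```

Other weightings, for example the "Lift" or "Confidence" of each rule, could be substituted in `rules_to_edges` if the rule metrics are available.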

### **7. Decision-making scheme for improving learning behaviors**

Studying learning behaviors through big data can help learners improve their learning processes and learning effects [25]. For the mining and association analysis of probabilistic frequent itemsets, we construct 11 data subsets of learning behaviors, with components as the basic structural feature. On the basis of the Eclat framework, the vertical data format is adopted to design and improve the data structure and analysis algorithm for learning behavior components. The indicator comparison with approximate algorithms shows that the improved algorithm is effective and feasible for analyzing the data subsets, especially the moderate density and dense density data sets. Based on the data analysis results, "Support", "Confidence" and "Lift" are taken as the measurement indicators, and the corresponding thresholds are set. The probabilistic frequent itemsets and association rules are mined, and the key topology of learning behaviors supported by the probabilistic frequent itemsets is constructed. The whole process of mining and analyzing probabilistic frequent itemsets is based on the vertical data format, which ensures the depth and breadth of the data research results for decision prediction.
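To make the vertical data format concrete, the following Python sketch shows the classic Eclat idea on deterministic data: each component is mapped to the set of session identifiers in which it occurs (its tidset), and the support of a larger itemset is obtained by intersecting tidsets. This is plain Eclat, not the LB-Eclat algorithm of this chapter, and it ignores occurrence probabilities; the function name and the toy sessions are hypothetical.

```python
def eclat(transactions, min_support):
    """Mine itemsets whose support exceeds `min_support` using tidsets."""
    n = len(transactions)

    # Vertical format: item -> set of transaction ids containing the item.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # `candidates` holds (item, tidset) pairs that are frequent
        # when appended to `prefix`.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids) / n
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids  # tidset of itemset + (other,)
                if len(shared) / n > min_support:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) / n > min_support),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


# Hypothetical toy sessions, not data from the study.
sessions = [
    {"homepage", "content", "forumng"},
    {"homepage", "content"},
    {"homepage", "quiz"},
    {"homepage", "content", "quiz"},
    {"forumng"},
]

frequent = eclat(sessions, min_support=0.3)
pairs = {k for k in frequent if len(k) == 2}
```

Because support is computed by set intersection rather than by rescanning the sessions, the search only has to keep the tidsets of the current prefix in memory, which is the main appeal of the vertical format.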

The study of learning behaviors is a specific branch of big data research, and its data characteristics differ from those of other data types. Because learning behaviors are periodic, continuous, collective and individual, the generated data may deviate from the expected data with considerable instability and discreteness. This makes data analysis and decision making very difficult, so appropriate data structures and algorithms [26, 27] must be designed to carry out multi-dimensional empirical studies of learning behaviors. From the work and results of the probabilistic frequent itemset analysis, the following decision schemes are obtained.

#### **7.1 Learning content will affect the frequent itemsets of learning behaviors**

Learning content determines learners' tendencies. The learning behavior data cover two Literature courses and two Technology courses, each corresponding to multiple learning periods. On the whole, the learning process of Technology courses is more complicated, the learning behavior components are more diverse, and the description of the online learning process is more complete and comprehensive, which produces larger data sets. Learning content affects the data density, the components and learners' actual learning processes, and thus determines the frequent itemset mining results. For example, the probabilistic frequent itemsets of the two learning periods of course L1 show that the online learning processes corresponding to the learning content have no advantage: there is no effective correlation between the components and the learning assessment results, and the benefits of the online learning mode are not obvious, so the course may be better suited to the teaching mode.

Therefore, the construction of learning behaviors depends on the learning content. Based on the frequent itemsets mined from historical data and the analysis of association rules, the learning mode of a course can be optimized in the new learning period. Based on the learning content, we guide or expand the components of learning behaviors so as to enhance learning interest.

#### **7.2 Teaching goals will affect the frequent itemsets of learning behaviors**

The same learning content in different learning periods can produce different data densities of learning behaviors, and thus different frequent itemsets. Across learning periods, the frequent itemsets and association rules obtained by the algorithm are similar, but there are also obvious differences: the components are not the same, and some data sets differ considerably. Learners in different periods have different teaching needs, which correspond to different teaching objectives. In addition, participation and traction in the learning process lead to different participation components, and the stickiness of the components differs; this determines the frequent items and thus produces different association rules, and it even affects learners' assessment methods and learning results.

Therefore, the construction of learning behaviors should take the learning periods and the actual learners into account, set teaching objectives flexibly, and design adaptive learning behavior components. During the learning processes, we should also analyze the learning behaviors in a timely manner, identify existing problems and learners' preferences, adjust the components promptly, and optimize the learning methods appropriately. We should build a real-time and effective data tracking and analysis mechanism.
