**1. Introduction**

Analyzing programming learning in order to assist and qualify a learning process from beginning to end is an onerous task in programming practice, since assisted programming practice demands time and effort in activities and assessment, especially when many exercises are applied and classes have many students. Thus, applying learning analysis that makes it possible to compare programming solutions developed by different students and to verify how a student's solutions evolve over time represents a real challenge for the evaluation of programming.

#### *Enhanced Expert Systems*

Although there are already several solutions for representing and comparing programming students' profiles [1, 2], there are few solutions for a temporal analysis of the learning of these students during a programming course.

A more recent proposal to analyze programming learning maps source code onto software metrics that quantify effort and quality of programming [3]. Through these metrics, for each programming activity, it is possible to compare students' solutions under different variables to identify classes of solutions, common learning difficulties, good programming practices and even plagiarism.

Although the proposal of [3] makes it possible to compare the student profiles of a class in each programming activity, it is laborious for a teacher to use this instrument to verify how these evaluation metrics evolve over time, that is, across the activities of a course, for each student. This type of monitoring allows the programming teacher to identify where students develop better in their learning processes and where they begin to present learning difficulties.

In order to meet this need by offering programming teachers an instrument to monitor the learning process of their students, this chapter extends the proposal of [3] by generating 3D views of student profiles mapped onto selected software metrics. These metrics characterize each student's efficiency, style and programming effort in each programming solution developed over a course.

In addition to the 3D representation for learning analysis, this system dynamically selects samples of programming solutions for a teacher to score until it finds a representative set of rubric representations to inform evaluation criteria. This functionality may later help to generate a representative set of programs to train automatic assessment systems for programming exercises.

Another feature of this system that is still in the testing phase is the prediction of students' performances in an activity based on their history of solving activities or the solutions of that same activity developed by other students.

The main contribution of this chapter is, therefore, a tool to support evaluation and decision-making in the field of programming, enabling teachers to analyze and monitor their students' learning for each programming activity under a wide range of variables and to anticipate a predictable future of poor performance.

In order to present the fundamentals and the functionalities of the proposed system, this chapter is organized as follows. Section 2 presents the related work. Section 3 describes the system architecture, with the 3D representations of profiles and the selection of rubric representations. Section 4 reports on the application of our system in a distance programming course. Section 5 concludes this work, highlighting the main results, future work and final considerations.

#### **2. Static analysis of programming**

Static analysis is an automatic assessment approach to programming learning based on analysis of code. Through static analysis, it is possible to analyze effort, complexity, efficiency and quality of programming [4–6].

The main advantages of static analysis are lower cost, less reliance on the teacher's reference solution and the possibility of offering an evaluation closer to human evaluation, although many programming teachers have prioritized dynamic analysis, which is based on the correct and efficient execution of programs. Static analysis can therefore be used in the analysis of programming codes for the following purposes:

• Effort measurement and coding complexity [4]

• Prediction of performance [7, 8]

• Programming style analysis [5, 9]

• Evaluation by software metrics [3, 10]

• Recognition of signs of plagiarism [3]

• Recognition of rubrics [11]

• Recommendation of activities [1]

• Programming proficiency analysis [12]

• Analysis of learning difficulties and good programming practices [3]
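As a rough illustration of one of the purposes listed above, complexity measurement, the sketch below estimates a McCabe-style cyclomatic complexity for Python source code using the standard `ast` module. The function name `cyclomatic_complexity` and the set of node types counted as decision points are assumptions made for this example; it is not the metric implementation used by the system described in this chapter.

```python
import ast

# Node types treated as decision points (an illustrative choice, not a standard list).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # Each extra operand of an 'and'/'or' expression adds one branch.
            complexity += len(node.values) - 1
    return complexity

# Hypothetical student submission used only to exercise the function.
SAMPLE = """
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
"""

if __name__ == "__main__":
    print(cyclomatic_complexity(SAMPLE))
```

Because the code is only parsed and never executed, the measurement works even for submissions that would fail at run time, which is the property that distinguishes static from dynamic analysis.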

*Automatic Mapping of Student 3D Profiles in Software Metrics for Temporal Analysis… DOI: http://dx.doi.org/10.5772/intechopen.81754*




Among the technological solutions already proposed for programming learning analysis based on static analysis, we highlight: the metrics of Halstead and McCabe [4, 5, 13], the evaluation of programming skills by software metrics [10], the system that recommends activities according to learning difficulties [1], the analysis of difficulties by software metrics [3], the evaluation of how programming students learn from the analysis of their programming codes [14] and the programming proficiency analysis of the SCALE system [12].

#### **2.1 The evolution of static analysis strategies of programming**

The main static analysis strategies of programming developed from the 1960s to the present day were based on software evaluation metrics, whose purpose evolved from measuring code and software quality to the educational purposes of diagnosing learning difficulties and evaluating programming skills.

In the 1970s and 1980s, software metrics were used to analyze programming codes for the purposes of estimating effort, complexity and programming style. Some of the strategies developed associated the programming process with psychological complexity in order to evaluate performance in programming, without necessarily being concerned with helping those who had more difficulties [4, 9].

From the 1990s until 2010, static analysis strategies based on metrics were applied to learning analysis. However, at a time when Intelligent Tutoring Systems (ITS) were on the rise, the aim was to represent the model or profile of a student, focusing more on his or her learning.

More recent research on programming learning analysis, in addition to seeking a better understanding of students' learning profiles, has attempted to remedy learning difficulties [3]. Other trends in programming learning analysis are proficiency assessment [12], prediction of performance [15] and the classification of profiles by learning levels [2].

#### **2.2 Related works**

The main works related to our proposal are the assessment system based on the software metrics of [3], the instruments for visualizing programming students' profiles of [16], the strategy for recognizing profiles by source code analysis metrics of [2], the feature selection model of [17], the system for recognizing rubrics with dimensionality reduction of [11] and the study of [18] involving the discovery of longitudinal patterns.

PCodigo II is an online system that automatically maps students' profiles onto software metrics to analyze programming learning [3]. In addition to mapping profiles onto 348 software metrics, PCodigo II offers massive execution, graphing of similar profiles, information visualization and plagiarism analysis.

The first applications of PCodigo II [3] in real programming exercises demonstrate the effectiveness of this system for the diagnostic assessment of programming learning. These applications showed that teachers, taking into account what the metrics say, can recognize the learning difficulties, good programming practices and classes of learning profiles of a whole class in a fast, detailed and holistic way.

The chapter of [16] presents information visualization instruments, in a multidimensional perspective, to help teachers analyze programming learning by mapping profiles onto software metrics. Through the generated visualizations, we can analyze and compare profiles under different variables to recognize learning difficulties and classes of solutions with similar characteristics.

The profile recognition strategy of [2], based on static analysis of code with metrics, aims to infer the profiles of programmers from the analysis of their Java code, classify them according to skills and continually evaluate their progress in programming practice during a course. The detected profiles are novice, advanced beginner, proficient and expert.

Some of the metrics used in this strategy are the numbers of statements, conditional and repetition control structures, data types, classes, operators and lines of code, among other code characteristics. The advantage of this strategy over our system is that it classifies and qualifies students. Our system, however, automatically selects the most appropriate metrics to evaluate each type of programming solution.

For the automatic selection of evaluation variables, we highlight the feature selection model of [17], which combines clustering techniques and an algorithm to create a feature map by selecting relevant terms in the texts of the groups of notes from a teacher's evaluation. In our proposal, the relevant characteristics, that is, the most important metrics for each programming solution, can be visualized through heat maps that compare different solutions across five or more software metrics.
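Before solutions can be compared in a heat map, metrics measured on different scales usually need to be brought to a common range. The sketch below shows one common way to do this, column-wise min-max scaling, under the assumption that each row is a solution and each column is a metric. The function name `min_max_columns` and the numeric values are hypothetical and serve only as an illustration; the chapter does not state which normalization, if any, the system applies.

```python
def min_max_columns(matrix):
    """Rescale each metric (column) to [0, 1] so that metrics become comparable."""
    columns = list(zip(*matrix))
    scaled_columns = []
    for column in columns:
        low, high = min(column), max(column)
        span = high - low
        # A constant column carries no contrast between solutions; map it to 0.0.
        scaled_columns.append([(v - low) / span if span else 0.0 for v in column])
    # Transpose back so that each row is again one solution.
    return [list(row) for row in zip(*scaled_columns)]

# Three hypothetical solutions described by five hypothetical metric values each.
solutions = [
    [60.0, 3, 0.50, 4.0, 1],
    [80.0, 7, 0.75, 6.0, 3],
    [70.0, 5, 0.25, 5.0, 2],
]

heat_map_rows = min_max_columns(solutions)
```

Each row of `heat_map_rows` can then be drawn as one line of a heat map, with darker cells marking the solutions that score highest on a given metric.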

Regarding the composition of rubrics, a strategy worth highlighting is the proposal of [11], which uses clustering and Principal Component Analysis to recognize, among the solutions developed by students, examples that represent, in a rubric scheme, the scores attributed by a teacher. This work complements these proposals by generating a ranking of sample programming solutions for a teacher to score until the best set of rubric representations, with a diversity of marks awarded, is found.
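The ranking procedure itself is not detailed at this point in the chapter. Purely as a stand-in, the sketch below orders samples by greedy farthest-point selection, a simple technique that tends to present the teacher with dissimilar solutions first. It is not the method of [11] nor necessarily the one used by the proposed system; the function name `farthest_point_ranking` and the metric vectors are hypothetical.

```python
import math

def farthest_point_ranking(vectors):
    """Order sample indices so that each next sample is far from those already chosen."""
    if not vectors:
        return []
    dims = len(vectors[0])
    centroid = [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]
    # Start with the sample closest to the centroid, i.e. a typical solution.
    first = min(range(len(vectors)), key=lambda i: math.dist(vectors[i], centroid))
    ranking = [first]
    remaining = set(range(len(vectors))) - {first}
    while remaining:
        # Pick the candidate whose nearest already-ranked sample is farthest away.
        nxt = max(
            sorted(remaining),
            key=lambda i: min(math.dist(vectors[i], vectors[j]) for j in ranking),
        )
        ranking.append(nxt)
        remaining.remove(nxt)
    return ranking

# Hypothetical metric vectors for four submissions.
samples = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [1.2, 0.0]]
order = farthest_point_ranking(samples)
```

A teacher would score the solutions in the order given by `order` and could stop as soon as the marks awarded cover the range of the rubric.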

According to [18], to understand how learning unfolds over time, it is necessary to move to a new learning perspective in which the units of analysis are separate but interrelated learning events.

Following this idea, the study of [18] investigates and validates longitudinal patterns in online participation as a measure to differentiate student performances.

The system proposed in this work, based on the study of [18], seeks to understand how programming learning unfolds and to analyze longitudinal patterns.

In this way, relative to the other works presented, we advance in the 3D representation of programming students' profiles, in the visualization of characteristics represented by software metrics over time and in the composition of rubrics from a ranking of solutions automatically selected for a teacher to score.


**3. 3D representation system of programming students' profiles**

The profile representation system presented in this chapter is an evolution of *PCodigo II*, a software system that, by means of software metrics that quantify effort and quality of programming, recognizes possible learning difficulties, good programming practices and even strong evidence of plagiarism among programs [3].

Thus our system extends the student profile representation of *PCodigo II* with a temporal dimension, selects the most relevant metrics and allows the automatic selection of representative examples from a set of source codes for the composition of a rubric representation.

**Figure 1** shows the proposed system architecture as a scheme of inputs, processing and outputs, with our system integrated with versions 1.9 and 3.x of the *Moodle* virtual learning environment.

According to **Figure 1**, for version 1.9 of *Moodle*, the system receives as input a backup of Moodle's *Compacted Classroom* (in .zip, .rar, .gz or .tgz formats). For version 3.x of *Moodle*, the system uses the *Teacher's Credentials* to access a distance programming course of *Moodle*.

The course data imported from *Moodle* are as follows: student listing, activity listing, activity grades and *Submissions*, which are the files of programming exercises. These data are then extracted by the *Extracting and Preprocessing* module, and *Submissions* containing source code written in C, C++, Java or Python are mapped to vectors whose dimensions are software metrics that quantify effort and quality of programming [3]. The submitted C programs are mapped onto 348 software metrics and the Python programs onto 42 metrics.

We call each vector representation of a student's programming solution in software metrics a *Learning State*. After generating the *Learning States* of a programming class, the system gathers these representations in a *Cognitive Matrix* for analysis and comparison of programs written by students [3].

In order to analyze solutions in a generic way, we have reduced each *Learning State* to five metrics: *Maintainability*, *Cyclomatic Complexity*, *Indentation*, *Laconism* and *Modularization*. They are described as follows:

• *Maintainability* represents the student's ability to write durable code that is adaptable to new needs.

• *Cyclomatic Complexity* informs the complexity of a programming code, measured as the number of paths of a method [Curtis et al. 1979].

• *Indentation* characterizes the instructions of a program within structures and functions.

• *Laconism* expresses the capacity to express oneself in a few words, which in programming is measured by the number of tokens per line of code.

• *Modularization* informs the capacity to organize the parts of a functional or data module.

Then, bringing together the cognitive matrices for each programming solution of a course, the system forms a *3D Representation of Learning Profiles* of a programming class. The same procedure is performed for a *Reduced Matrix*. The timeline formed by the set of *Learning States* of a student over a course is called a *Learning Profile*.
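The data structures described in this section can be sketched in a few lines of Python. The example is a minimal illustration under stated assumptions: each *Learning State* is reduced to just two stand-in values, a tokens-per-line estimate for *Laconism* and the fraction of indented lines as a crude proxy for *Indentation*. The function names, the proxy definitions and the toy submissions are hypothetical and do not reproduce the metric definitions of *PCodigo II*.

```python
import io
import tokenize

# Token types that carry no program content (layout and comments).
_SKIPPED = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
            tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER}

def laconism(source):
    """Tokens per non-blank line of code (illustrative definition)."""
    tokens = [t for t in tokenize.generate_tokens(io.StringIO(source).readline)
              if t.type not in _SKIPPED]
    lines = [ln for ln in source.splitlines() if ln.strip()]
    return len(tokens) / len(lines) if lines else 0.0

def indented_fraction(source):
    """Fraction of non-blank lines that start with whitespace (crude proxy)."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    return sum(ln[0] in " \t" for ln in lines) / len(lines) if lines else 0.0

def learning_state(source):
    """Metric vector for one submission."""
    return [laconism(source), indented_fraction(source)]

def cognitive_matrix(submissions):
    """One activity: rows are students (sorted by name), columns are metrics."""
    return [learning_state(submissions[name]) for name in sorted(submissions)]

def learning_profile(course, student):
    """Timeline of one student's Learning States across the activities."""
    return [learning_state(activity[student]) for activity in course]

# Toy course with two activities and two students (hypothetical data).
course = [
    {"ana": "x = 1\ny = x + 2\n", "bob": "print(1)\n"},
    {"ana": "for i in range(3):\n    print(i)\n", "bob": "a = 2\n"},
]

# 3D representation indexed as [activity][student][metric].
profiles_3d = [cognitive_matrix(activity) for activity in course]
```

Indexing the nested list by activity first keeps each *Cognitive Matrix* intact, while `learning_profile` slices the same data along the time axis for a single student.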

