2.1. Establishment of the body of documents

We start from the premise that every research field has a set of scientific communications that contribute to the development of the subject. To identify and analyze these communications, the following steps can be followed:


Mapping a Research Field: Analyzing the Research Fronts in an Emerging Discipline

http://dx.doi.org/10.5772/intechopen.76731

Scientometrics

Step 1. Definition of descriptors. The aim is to identify all the terms with which the subject has been described in the primary scientific literature. As expected, we start from a core term, which is generally the same as the name of the research field. With this term, all the publications whose title, abstract and keywords include the core term are identified in a comprehensive database.

#### E-learning case

• Core term: e-learning

• Data source: SCOPUS, a database that indexes mostly journals and conference proceedings [18].

The search results should be refined according to the desired degree of coverage in the analysis and the availability of access to the bibliometric data.

#### E-learning case

• Publication type: Journal and Conference Proceeding

• Document type: Article, conference paper and review

• Language: English.

• Analyzed timespan: 2012–2014. It corresponds to a period in which worldwide production in e-learning was stable, since in the previous period it was in constant growth and in the following period there was a significant decrease in production [19].

The set of publications obtained can be used in its entirety or through a statistically representative sample.

#### E-learning case

• Results: 9291

• Representative sample: 2000 (21.6%)

Then, a bibliometric analysis based on keyword co-occurrence is carried out, aimed at determining the primary descriptors that are most present in the publications, their relationships and their relevance, by means of the Visualization of Similarities (VoS) technique [20]. Additionally, secondary descriptors are included that reflect the same meaning, as a result of linguistic similarities, acronyms or abbreviations used in natural language. For example, when including the keywords of an article, an author can choose to use either the e-learning or the elearning descriptor [21].

#### E-learning case

• Keywords: 4521

• Primary descriptors: 51. E-learning, LMS, b-learning, online learning, Moodle, mlearning, ICT, learning objects, technology acceptance model, e-learning platform, adaptive learning, e-assessment, web-based learning, virtual learning environments, adult learning, informal learning, instructional design, SCORM, augmented reality, educational technology, intelligent tutoring systems, remote laboratory, simulation, learning analytics, learning environments, e-learning 2.0, teaching and learning, interactive learning environments, educational data mining, gamification, learning design, social learning, lifelong learning, metadata, MOOC, virtual classroom, labview, learning methods, personal learning environments, adaptive elearning systems, computer-based learning, information literacy, virtual learning, Blackboard, continuing education, game-based learning, interactive learning, personalized learning, recommender systems, virtual laboratories, virtual reality.

• Secondary descriptors: 13. elearning, electronic learning, Learning management system, blearning, blended learning, mlearning, mobile learning, Information and communications technologies, eassessment, electronic assessment, VLE, Massive Open Online Courses and PLE.
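The descriptor definition in Step 1 can be illustrated with a minimal Python sketch. It only counts keyword occurrences and co-occurrences after folding secondary descriptors into a primary form; it does not implement the VoS mapping itself. The `SECONDARY_TO_PRIMARY` table is a hypothetical pairing built from a few descriptors of the e-learning case, and the three sample articles are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical pairing of secondary descriptors with a primary form
# (the source lists both sets but does not state the pairing explicitly).
SECONDARY_TO_PRIMARY = {
    "elearning": "e-learning",
    "electronic learning": "e-learning",
    "learning management system": "lms",
    "blearning": "b-learning",
    "blended learning": "b-learning",
}


def normalize(keyword):
    """Lower-case a keyword and fold a secondary descriptor into its primary form."""
    key = " ".join(keyword.lower().split())
    return SECONDARY_TO_PRIMARY.get(key, key)


def cooccurrence(keyword_lists):
    """Count descriptor occurrences and unordered descriptor pairs, once per article."""
    occurrences, pairs = Counter(), Counter()
    for keywords in keyword_lists:
        terms = sorted({normalize(k) for k in keywords})
        occurrences.update(terms)
        pairs.update(combinations(terms, 2))
    return occurrences, pairs


# Invented sample data: the author keywords of three articles.
articles = [
    ["E-learning", "Moodle", "LMS"],
    ["elearning", "Learning management system"],
    ["Blended learning", "e-learning"],
]
occurrences, pairs = cooccurrence(articles)
```

Counts of this kind are what a mapping tool would take as input; the ranking of `occurrences` suggests candidate primary descriptors, while `pairs` captures their relationships.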


Step 2. Correspondence of publications and descriptors. A matrix is built with all the indexed scientific publications on one axis and the identified primary and secondary descriptors on the other. Each cell records the number of articles published by that journal or conference proceeding with that descriptor in the title, abstract and keywords fields. It is very important to use the same selection criteria described in the previous step to ensure information integrity. Then, the counts of the primary and secondary descriptors related to the same term are added, assuming that the sum reflects unique publications related to each other by the descriptors.

#### E-learning case

• Journals and conference proceedings included in the matrix: 12,923
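The correspondence matrix of Step 2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the `DESCRIPTOR_GROUPS` table and the sample records are hypothetical, and each record is assumed to already list the descriptors found in one article's title, abstract and keywords.

```python
from collections import Counter, defaultdict

# Hypothetical groups: a primary descriptor followed by its secondary variants.
DESCRIPTOR_GROUPS = {
    "e-learning": ["e-learning", "elearning", "electronic learning"],
    "lms": ["lms", "learning management system"],
}


def build_matrix(records):
    """Count, per publication and descriptor, the articles that use the descriptor.

    records: iterable of (publication, descriptors found in one article).
    """
    matrix = defaultdict(Counter)
    for publication, descriptors in records:
        matrix[publication].update(set(descriptors))  # one count per article
    return matrix


def merge_groups(matrix):
    """Add the counts of secondary descriptors to their primary descriptor."""
    return {
        publication: {
            primary: sum(counts[variant] for variant in variants)
            for primary, variants in DESCRIPTOR_GROUPS.items()
        }
        for publication, counts in matrix.items()
    }


# Invented sample data: three articles in two publications.
records = [
    ("Journal A", ["e-learning", "lms"]),
    ("Journal A", ["elearning"]),
    ("Proceedings B", ["learning management system"]),
]
merged = merge_groups(build_matrix(records))
```

As in the text, the merge simply sums the variants, so an article that used both spellings of a term would be counted twice; the method assumes this does not happen.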

Step 3. Percentage of participation in the subject (PP). It is the percentage of articles in the publication that are related to the subject during the timespan established in the initial criteria. It is obtained by taking the maximum number of articles per descriptor, bearing in mind that an article may be related to more than one descriptor.
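A minimal sketch of the PP calculation is given below. It assumes that the total number of articles of each publication in the timespan is known, which the text implies but does not state; the function names and the sample figures are hypothetical. The second helper anticipates the cut-off discussed in Step 4 and reads "exceed" as a strict comparison.

```python
def participation_percentage(descriptor_counts, total_articles):
    """PP of one publication, in percent.

    The largest per-descriptor article count is used instead of the sum,
    because one article may match several descriptors.
    """
    if total_articles <= 0:
        raise ValueError("total_articles must be positive")
    return 100.0 * max(descriptor_counts.values(), default=0) / total_articles


def select_publications(pp_by_publication, cutoff):
    """Keep publications whose PP exceeds the cut-off and report their average PP."""
    kept = {name: pp for name, pp in pp_by_publication.items() if pp > cutoff}
    average = sum(kept.values()) / len(kept) if kept else 0.0
    return kept, average


# Invented sample data: per-descriptor counts and total output of three publications.
pp = {
    "Journal A": participation_percentage({"e-learning": 60, "lms": 25}, 80),
    "Proceedings B": participation_percentage({"e-learning": 12, "moodle": 9}, 40),
    "Journal C": participation_percentage({"e-learning": 3}, 150),
}
kept, average = select_publications(pp, cutoff=25)
```

With these invented figures the PP values are 75%, 30% and 2%, so a 25% cut-off keeps the first two publications, whose average PP is 52.5%.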

#### E-learning case

Correspondence matrix description (Figure 1):

• 3,680 journals and conference proceedings do not have any publication related to any of the 64 descriptors.

• 7,801 journals and conference proceedings have a PP lower than 5%.

Figure 1. Percentage of participation (PP) of the term in journals and conferences. Source: [9].

Step 4. Cut-off point for the inclusion of publications in the analysis. The cut-off point over the PP, from which publications will be included in the categorization of the subject, must be determined. Other studies have classified publications as "pure", "hybrid" and "unrelated" with respect to a given subject [1], or have focused on determining the core set of publications [21]. However, we believe that this value should be established by combining the maximum allowed error in the subject relation of the publication with the average PP of the total set of publications. The higher the cut-off point, the greater the precision in the selection of journals, although this precision comes with a reduced volume. Conversely, a low cut-off point increases both the error in the selection and its volume. Once the cut-off point is established, all publications that exceed this threshold are considered the basic set for the analysis of the emerging subject category.

#### E-learning case

• The set of publications must maintain an average PP higher than 50%, for which the cut-off point per publication was established at 25% (coinciding with the classification of pure and hybrid publications [1]).

• The cut-off point included 11 publications that were excluded because they defined other areas of knowledge in their scope.

• 82 journals and 137 conference proceedings that meet the criteria of the methodology were identified.

Step 5. Publication set analysis. The set of selected publications is analyzed under a bibliometric approach (a) to determine whether it represents the existence of a scientific community that communicates its knowledge through these channels and (b) to recognize it as an emerging and distinctive scientific discipline that can be defined as a transversal thematic category [5]. For this, the mapping overlay technique [7] can be used, which facilitates the exploration of the knowledge bases of an emerging discipline and its evolutionary dynamics. This technique requires a base map on which to overlay a local (thematic) map and thus make comparisons. This overlay allows the discipline to be placed in the general topology of scientific knowledge and reveals whether a cluster effect occurs, which should be considered evidence of the existence of a specific disciplinary field from the point of view of the scientific communication guidelines followed by the researchers.

The degree of relation between publications is established by the normalized value produced by the combination of citations, co-citations and coupling [22, 23]. In addition, this analysis can be enriched with the distribution by clusters that visualization tools such as VOSViewer [24] perform.

#### E-learning case

• The base map is a global map of science that includes the total number of publications indexed in SCOPUS, made up of 7 clusters, which in a clockwise and broad sense can be named as follows: Social Sciences (red), Psychology (light cyan), Medicine (green), Health Sciences (purple), Life Sciences (yellow), Physical Sciences (dark cyan) and Engineering and Computer Science (blue) (Figure 2).

• The composite indicator was arranged by SCImago Journal & Country Rank<sup>5</sup>.

• The local map that is overlaid on the global map of science is the set of 219 publications selected in the previous step (Figure 3).

• There is a cluster effect that shows a high cohesion among publications, which is sufficient evidence, in terms of scientific communication, that e-learning is a distinctive scientific discipline, since there is a network of relationships and interactions that are established between the authors and scientists who share thought structures, cooperation patterns, language and forms of communication.

• The publications distribution shows a main group in Social Sciences and other small groups in Computer Science and Psychology.

<sup>5</sup> http://www.scimagojr.com
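Two of the relation measures named in Step 5, co-citation and bibliographic coupling, can be illustrated with a short sketch. It shows only the raw counts in their standard definitions; how the source normalizes and combines them with citations into a composite indicator is not specified in the text and is not reproduced here. The reference lists are invented, and the unit here is a generic "publication" with a set of cited items.

```python
from itertools import combinations


def coupling_strength(references):
    """Bibliographic coupling: number of cited items shared by two publications.

    references maps each publication to the set of items it cites.
    """
    return {
        (a, b): len(references[a] & references[b])
        for a, b in combinations(sorted(references), 2)
    }


def cocitation_counts(references):
    """Co-citation: number of citing publications whose references include both items."""
    counts = {}
    for cited in references.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


# Invented sample data: three citing publications and the items they cite.
references = {
    "P1": {"R1", "R2", "R3"},
    "P2": {"R2", "R3"},
    "P3": {"R3", "R4"},
}
coupling = coupling_strength(references)
cocited = cocitation_counts(references)
```

Here P1 and P2 share two references, so they are the most strongly coupled pair, and R2 and R3 are cited together twice, which makes them the most co-cited pair.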

Figure 2. Global map of science based on SCOPUS and SCImago Journal & Country Rank using VOSViewer with its density map setting (Source: [9]).

Figure 3. Distribution of publications related to the subject, using the mapping overlay technique with VOSViewer in its density map configuration. The color of the publication indicates the area of knowledge in which it is superimposed and its size corresponds to the percentage of participation. The size of the selected publications has been modified for visual purposes (Source: [9]).

2.2. Identification of research fronts

To identify the research fronts through the visualization of keywords in a wordcloud, it is first necessary to establish the body of publications on which the analysis will be carried out (previous section). Then, all the keywords of the publications are extracted, keeping the same filters defined in the previous stages, so that a set of structured and well-defined terms can be expected. This technique provides value only when the data are treated in a way that ensures a correct interpretation. This is done through two tasks. The first is to refine the set of terms (which can be in the order of thousands) to obtain those that are most distinct and that can be visually represented without loss of information. The refinement process may include a minimum threshold of articles published by a journal or conference proceeding, to ensure sufficient volume and regularity in the conceptual development of the subject. It can also be refined by defining the number of terms to be displayed in the wordcloud.

#### E-learning case

• Publication type: Journal and Conference Proceeding

• Document type: Article, conference paper and review

• Language: English.

• Analyzed timespan: 2012–2014.

• Minimum number of papers published by Journal/Conference Proceeding: 100

• Number of terms to display: 100

The second task is to configure the variables that determine the form of the wordcloud, among which are:

1. Keep each term at its full length. It is easy to make the mistake of splitting terms; for example, the term Information and Communication Technologies should remain as one term and not be separated into 3 or 4 parts.

2. Do not include terms in the visualization that coincide with the name of the scientific field analyzed, or places, dates, proper names, names of organizations and any others that do not contribute to the identification of research fronts.



Finally, by means of a rapid visual analysis of the generated wordcloud, the research fronts of the scientific field can be identified in a differentiated way.
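The two tasks described above, refining the set of terms and configuring what the wordcloud displays, can be sketched in a few lines. The sketch covers only the term selection, not the drawing of the wordcloud or the minimum-papers filter, and the `EXCLUDED` set and sample keywords are hypothetical.

```python
from collections import Counter

# Hypothetical exclusion list: the name of the field itself, places and dates.
EXCLUDED = {"e-learning", "elearning", "spain", "2013"}


def top_terms(keyword_lists, limit=100, excluded=EXCLUDED):
    """Return the `limit` most frequent keywords with their counts.

    Multi-word terms are kept whole: only case and spacing are normalized,
    and a term is never split into separate words.
    """
    counts = Counter()
    for keywords in keyword_lists:
        for keyword in keywords:
            term = " ".join(keyword.lower().split())
            if term and term not in excluded:
                counts[term] += 1
    return counts.most_common(limit)


# Invented sample data: the author keywords of three articles.
keywords = [
    ["E-learning", "Information and Communication Technologies", "Moodle"],
    ["information and communication technologies", "Gamification"],
    ["Moodle", "Spain", "MOOC"],
]
ranking = top_terms(keywords, limit=2)
```

The resulting list of (term, weight) pairs is the kind of input a wordcloud tool would take, with `limit` playing the role of the number of terms to display.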

#### E-learning case

Figure 4. Wordcloud of e-learning worldwide, based on data from SCImago Journal & Country Rank in a positive diagonal format (Source: Self-made).

Figure 5. Wordcloud of e-learning worldwide, based on data from SCImago Journal & Country Rank in positive and negative diagonal format (Source: Self-made).

A limitation of wordclouds that can affect the reader's interpretation is term length: a long term, or one located in a central place of the visualization, can quickly capture attention without having significant weight. However, this visualization technique is a powerful tool for abstracting relevant information from large volumes of data. In addition, it can be used to observe the main trends in other bibliometric data, for example, the journals and conferences with the greatest influence in the discipline, or the institutions and countries that contribute the most to its productivity.

3. Conclusions

This study proved that bibliometric analysis combined with visualization techniques provides sufficient elements to map an emerging discipline, in this case study, e-learning.
