(DTM), which is a generative technique extended from Latent Dirichlet Allocation (LDA). DTM captures the evolution of latent topics in a collection of documents, an aspect that the preceding model did not account for [12].

Scientometrics of Scientometrics: Mapping Historical Footprint and Emerging Technologies… http://dx.doi.org/10.5772/intechopen.77951

**3. Results**

### **3.1. Domain-level research patterns**

Citation paths at a disciplinary level are depicted in a visual representation called a dual-map overlay [6] (see **Figure 4**). The regions on the left show where the collected literature is published, while the regions on the right show where it cites from. The citing and cited literature are also called the research frontier and the knowledge base, respectively. The base map consists of journal- and conference-level citation relationships among over 10,000 venues. Major clusters are labeled with terms chosen from the titles of the venues in the corresponding clusters. First, the log-likelihood ratio (LLR) of every term is calculated from its frequency in the clusters; LLR captures how unique a term is to a cluster. Then, the top three terms, ranked by LLR in descending order, are selected to tag each cluster. Citation trajectories are colored according to the citing regions. The width of each path is proportional to the z-score-scaled citation frequency.

**Table 3** lists these trajectories in descending order of the third column, the z-score. The color of each row corresponds to the color of the path. The findings indicate that scientometrics has been largely driven by the social sciences and medicine, represented in the first column by "psychology, education, health" and "medicine, medical, clinical", respectively. Literature from the social sciences cites heavily from "psychology, education, social", "systems, computing, computer", "health, nursing, medicine", "economics, economic, political", and "molecular, biology, genetics", yielding five citation paths. Research frontiers from medicine are based on "health, nursing, medicine" and "molecular, biology, genetics", adding two further trajectories. These observations show that scientometrics is multidisciplinary and partially interdisciplinary: multidisciplinary because scientometrics research has been published in multiple disciplines, and only partially interdisciplinary because literature published in "psychology, education, health" draws on a variety of intellectual bases, while "medicine, medical, clinical" largely cites from neighboring domains.

**Table 3.** Domain-level citation trends.

| Research frontier | Knowledge base | Z-score |
|---|---|---|
| Psychology, education, health | Psychology, education, social | 8.841 |
| Psychology, education, health | Systems, computing, computer | 4.766 |
| Medicine, medical, clinical | Health, nursing, medicine | 4.052 |
| Psychology, education, health | Health, nursing, medicine | 3.313 |
| Psychology, education, health | Economics, economic, political | 2.724 |
| Psychology, education, health | Molecular, biology, genetics | 2.461 |
| Medicine, medical, clinical | Molecular, biology, genetics | 1.984 |

**Table 4.** Top 20 frequently assigned WoS categories.

| WoS category | Year | Frequency | Density |
|---|---|---|---|
| Information science & library science | 1990 | 3880 | 138.571 |
| Computer science | 1990 | 3260 | 116.429 |
| Computer science, interdisciplinary applications | 1990 | 2284 | 81.571 |
| Computer science, information systems | 1990 | 925 | 33.036 |
| Business & economics | 1992 | 653 | 25.115 |
| Management | 1992 | 374 | 14.385 |
| Engineering | 1992 | 292 | 11.231 |
| Public administration | 1992 | 199 | 7.654 |
| Planning & development | 1992 | 179 | 6.885 |
| Education & educational research | 1992 | 165 | 6.346 |
| Social sciences – other topics | 1992 | 160 | 6.154 |
| Science & technology – other topics | 1993 | 462 | 18.480 |
| Multidisciplinary sciences | 1993 | 348 | 13.920 |
| Business | 1994 | 242 | 10.083 |
| Neurosciences & neurology | 1996 | 159 | 7.227 |
| Environmental sciences & ecology | 1997 | 261 | 12.429 |
| General & internal medicine | 1999 | 145 | 7.632 |
| Surgery | 2000 | 162 | 9.000 |
| Public, environmental & occupational health | 2003 | 201 | 13.400 |
| Environmental sciences | 2006 | 189 | 15.750 |

**Figure 4.** Citation paths at a disciplinary level.
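The cluster-labeling step described in this subsection (an LLR score per term, then the top three terms per cluster) can be illustrated with a minimal sketch. The text does not state which LLR formulation was used, so the sketch assumes Dunning's G² statistic on a 2×2 contingency table (term vs. other terms, this cluster vs. all other clusters). The function names, the whitespace tokenization, and the restriction to terms over-represented in a cluster are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def llr(k11, k12, k21, k22):
    """Dunning's G^2 for a 2x2 table (assumed formulation, see lead-in).

    k11: term count in the cluster      k12: other terms in the cluster
    k21: term count elsewhere           k22: other terms elsewhere
    """
    total = k11 + k12 + k21 + k22
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22

    def part(observed, row, col):
        # A zero cell contributes nothing to the statistic.
        if observed == 0:
            return 0.0
        expected = row * col / total
        return observed * math.log(observed / expected)

    return 2.0 * (part(k11, row1, col1) + part(k12, row1, col2)
                  + part(k21, row2, col1) + part(k22, row2, col2))


def label_clusters(cluster_titles, top_n=3):
    """Return the top_n highest-LLR title terms for each cluster."""
    counts = {cid: Counter(w for title in titles for w in title.lower().split())
              for cid, titles in cluster_titles.items()}
    overall = Counter()
    for cluster_counts in counts.values():
        overall.update(cluster_counts)
    grand_total = sum(overall.values())

    labels = {}
    for cid, cluster_counts in counts.items():
        size = sum(cluster_counts.values())
        scored = []
        for term, k11 in cluster_counts.items():
            # Keep only terms that are over-represented in this cluster.
            if k11 / size <= overall[term] / grand_total:
                continue
            k21 = overall[term] - k11
            score = llr(k11, size - k11, k21, grand_total - size - k21)
            scored.append((score, term))
        # Highest score first; ties are broken alphabetically.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        labels[cid] = [term for _, term in scored[:top_n]]
    return labels
```

With invented venue titles such as "Journal of Psychology" in one cluster and "Medicine Today" in another, a term shared equally by both clusters (here "journal") is dropped, while terms concentrated in one cluster rise to the top of its label.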

We considered the assignment of WoS categories to the literature as another important indicator of domain-level thematic concentration. **Table 4** lists the 20 WoS categories most frequently assigned to the records. For each category, it shows the year in which the category was first assigned and its density, that is, the average number of assignments per year since that first year. The table is sorted in ascending order of year. Three categories have been assigned more than 2000 times: "information science & library science" (n = 3880), "computer science" (n = 3260), and "computer science, interdisciplinary applications" (n = 2284). These categories were assigned from the first year of the data set and show the greatest densities. The fourth most frequently assigned category is "computer science, information systems." This category also shows a relatively high density (33.036), given that it was first assigned in 1990. This finding suggests that literature in these four categories has had the largest influence on the emergence and development of scientific knowledge in scientometrics. In turn, research with scientific foci in the social sciences, engineering, medical and health sciences, and environmental sciences has brought a multidisciplinary perspective to the domain.
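The density column can be reproduced with a one-line calculation. The text does not give the formula explicitly; the sketch below assumes that density is the frequency divided by the number of years from the first year of assignment through 2017 (the last year of the 1990 to 2017 window), which matches the values in Table 4. The function name and the `last_year` parameter are illustrative.

```python
def density(frequency, first_year, last_year=2017):
    """Average number of assignments per year since first_year.

    Assumes an inclusive year span ending in last_year (2017 by assumption).
    """
    years_active = last_year - first_year + 1
    return frequency / years_active
```

For example, 3880 assignments since 1990 span 28 years, giving 3880 / 28 ≈ 138.571, and 189 assignments since 2006 span 12 years, giving 15.750.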

### **3.2. Trending keywords**

Keywords, supplied by authors and indexers, reflect representative concepts underlying the published literature. **Table 5** lists the 20 most frequently occurring keywords in the data set. For each keyword, it shows the year in which it first appeared and its density, that is, the average number of occurrences per year since that first year. The findings indicate that, in the beginning, "bibliometrics" and "scientometrics" focused on employing "citation analysis" to examine the "impact" of a "science". We assume that "journal" and "publication" were considered units of analysis. Another effort focused on evaluating research "performance" and "productivity" and on examining the "pattern" of scientific "collaboration." A further stream of research was interested in devising a "bibliometric indicator" such as the journal "impact factor", which led to the recent development of the widely accepted author-level metric, the "h-index."

**Figure 5** displays keyword co-occurrence in the data set. We used a technique called density visualization, produced with VOSviewer. The font size of a keyword is proportional to its occurrence frequency. The more frequently a pair of keywords co-occurs, the closer the pair lies to the red spots. The visualization includes the 484 keywords that occurred at least 18 times. As depicted, "bibliometrics" frequently co-occurred with "impact", which is consistent with the finding above. The figure also shows that devising an "impact factor" for "journal ranking" was among the important themes in scientometrics.

**Table 6** lists 20 keywords that surged during a specific period of time. Investigating keyword bursts adds temporal context for understanding the historic footprint and emerging technologies in scientometrics, which the snapshot metrics do not capture. The keywords are sorted in ascending order of the beginning year of their bursts. "Physics" is one of the keywords with the longest bursts, ending in 2010. It also has the second strongest burst when "science" is excluded. This indicates that applications of scientometrics to physics and/or knowledge transfer from physics to scientometrics were pursued intensively from the early years. The widely accepted author-level metric, the h-index, was also derived from physics. The second longest burst, beginning in 1992, belongs to "law", which also shows a relatively high burst strength. It shows that identifying laws in scientometric phenomena was among the important initiatives. "Publication output" is the keyword with the third longest and strongest burst. We argue that the evaluation of research performance and productivity was one of the key themes in the domain. The strongest burst episode, beginning in 1992, is associated with "indicator." Considering other keywords such as "stationary distribution", "model", and "informetric distribution", we argue that modeling an indicator of impact was of the greatest interest in scientometrics.
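The counts that underlie a keyword co-occurrence map such as Figure 5 can be sketched as follows. This is only an illustration of the counting step under stated assumptions: keywords are filtered by a minimum occurrence threshold (18 in the text), and each unordered keyword pair is counted once per record. It does not reproduce VOSviewer's normalization, layout, or density rendering, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records, min_occurrences=18):
    """Count keyword occurrences and pairwise co-occurrences.

    records: iterable of keyword lists, one list per publication record.
    Returns (occurrences, pairs), where pairs only involves keywords that
    occur in at least min_occurrences records.
    """
    records = [set(keywords) for keywords in records]
    occurrences = Counter(kw for keywords in records for kw in keywords)
    kept = {kw for kw, n in occurrences.items() if n >= min_occurrences}

    pairs = Counter()
    for keywords in records:
        # Sort so each unordered pair always has the same tuple form.
        present = sorted(keywords & kept)
        pairs.update(combinations(present, 2))
    return occurrences, pairs
```

In such a count, a frequent pair like "bibliometrics" and "impact" would receive a large value, which is the kind of signal the density visualization renders as proximity to a red spot.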

**Table 5.** Top 20 frequently occurring keywords.

| Keyword | Year | Frequency | Density |
|---|---|---|---|
| Science | 1991 | 1613 | 59.741 |
| Bibliometric analysis | 1991 | 871 | 32.259 |
| Journal | 1991 | 815 | 30.185 |
| Citation | 1991 | 803 | 29.741 |
| Bibliometrics | 1992 | 1914 | 73.615 |
| Impact | 1992 | 969 | 37.269 |
| Citation analysis | 1992 | 814 | 31.308 |
| Publication | 1992 | 700 | 26.923 |
| Scientometrics | 1992 | 646 | 24.846 |
| Indicator | 1992 | 596 | 22.923 |
| Performance | 1992 | 348 | 13.385 |
| Productivity | 1992 | 270 | 10.385 |
| Collaboration | 1993 | 353 | 14.120 |
| Bibliometric indicator | 1993 | 290 | 11.600 |
| Pattern | 1993 | 273 | 10.920 |
| Network | 1994 | 357 | 14.875 |
| Impact factor | 1996 | 527 | 23.955 |
| Index | 2002 | 324 | 20.250 |
| h-index | 2007 | 386 | 35.091 |
| Scopus | 2008 | 280 | 28.000 |

### **3.3. Temporal topic models**


We also analyzed other text fields, namely titles and abstracts, since they offer more informational content than keywords alone. We aimed to uncover the evolution of latent topics in the records over time. Toward that end, we removed stop words from the text, using the stop word list in Python NLTK. The text was lowercased, tokenized, and deaccented. Then, we lemmatized the tokens and extracted noun phrases by bigram indexing. Text pre-processing and topic modeling were carried out with gensim, a robust text mining toolkit in Python. **Table 7** lists the 20 topics and the 10 corresponding terms per topic. The terms are sorted in descending order of their average probabilities over the 28 years. The results show that most of the high-probability terms are unigrams.

**Figure 6** illustrates the topical trends from 1990 to 2017 using a visualization technique called a bump chart. The topics are sorted in descending order of their normalized probability in the first year. We further discuss nine prominent topics, Topics 9, 17, 7, 4, 1, 5, 11, 16, and 0, because of their relatively high probabilities. We categorized these topics into four trends: (1) rising, (2) rising-falling, (3) falling, and (4) static.

**1.** Rising topics: Topics 9, 17, 7, and 1 are consistently rising. Topic 9, which we labeled "applications of scientometrics to material sciences", has received the greatest attention over time. Topic 17, which has sharply increased, is named "publication-based scholarly communication." Topics 7 and 1 have always been among the top topics and have recently received increasing attention. We labeled them "evaluation of funded research" and "applications of scientometrics to medical education", respectively. The findings indicate that applications of scientometrics to domains other than the biomedical sciences are of increasing concern in the scientific community.

**2.** Rising-falling topics: Topics 4, 16, and 0 repeatedly rise and fall. Topic 4 can be named "literature-based research in healthcare." Topics 16 and 0 can be understood as "applications of scientometrics to biomedicine" and "literature-based research in medicine", respectively. Knowledge discovery in healthcare and the biomedical sciences has been of great interest in scientometrics. We assume that this stream of research rises and falls with changes in scientific foci.

**3.** Falling topics: Topic 5 has fallen. We labeled it "history and philosophy of scientometrics." The study of theory and practice tends to be prominent in the early years of a science. As the science matures, this kind of topic naturally moves away from interest. It has also decreased in scientometrics.

**4.** Static topics: Topic 11 has been stably distributed over time. Based on the extracted terms, Topic 11 is interpreted as "mapping intellectual structure using citation and network analysis." This is one of the canonical research themes in scientometrics, receiving consistent attention from the beginning of the domain.

**Table 6.** Top 20 keywords with the greatest intensive burstiness.

**Figure 5.** Keyword co-occurrence network (n = 484).
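The pre-processing steps described at the start of this subsection (stop word removal, lowercasing, tokenization, deaccenting, and bigram extraction) can be sketched with the Python standard library alone. This is not the NLTK and gensim pipeline the study used: the tiny `STOP_WORDS` set is a placeholder for NLTK's list, lemmatization is omitted, bigrams are merged by a simple frequency threshold rather than by gensim's phrase scoring, and all function names are hypothetical.

```python
import re
import unicodedata
from collections import Counter

# Placeholder list; the study used the stop word list shipped with NLTK.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}


def deaccent(text):
    """Strip combining accent marks, e.g. 'café' becomes 'cafe'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    """Lowercase, deaccent, split into alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", deaccent(text.lower()))
    return [tok for tok in tokens if tok not in STOP_WORDS]


def merge_bigrams(docs, min_count=2):
    """Join adjacent token pairs seen at least min_count times across docs."""
    pair_counts = Counter(pair for doc in docs for pair in zip(doc, doc[1:]))
    frequent = {pair for pair, n in pair_counts.items() if n >= min_count}

    merged_docs = []
    for doc in docs:
        merged, i = [], 0
        while i < len(doc):
            if i + 1 < len(doc) and (doc[i], doc[i + 1]) in frequent:
                merged.append(doc[i] + "_" + doc[i + 1])
                i += 2  # skip the second half of the merged pair
            else:
                merged.append(doc[i])
                i += 1
        merged_docs.append(merged)
    return merged_docs
```

The token lists produced this way are the kind of input a topic model expects; merged tokens such as `impact_factor` would correspond to the bigram terms that appear alongside unigrams in Table 7.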


**Table 7.** 20 generated topics.

**Topic 0 Topic 1 Topic 2 Topic 3** article psychology publication productivity journal education cancer faculty author nursing document publication article published Brazilian drug index number research research gender literature study descriptor result study psychiatry Korean study research theses Latin American conclusion medicine school literature woman publication aids drug year

**Topic 4 Topic 5 Topic 6 Topic 7** health science research research research history country evaluation publication scientometrics science impact public health book collaboration funding literature reception publication assessment medicine removal output policy method philosophy physics researcher result nature university project disease colleague study scientist health care sport productivity work

**Topic 8 Topic 9 Topic 10 Topic 11** performance technology research structure indicator literature field analysis research patent analysis map bibliometric indicator nanotechnology information network quality serial study mapping evaluation indexing science citation group application development data measure development data cluster data material paper database peer review core knowledge method

**Topic 12 Topic 13 Topic 14 Topic 15** study distribution information paper population model web research method data library publication country index link literature disease two use country data one online journal research theory library information period result paper search number health number internet sci water function study study

**Topic 16 Topic 17 Topic 18 Topic 19** research communication journal ecology rehabilitation bibliometrics citation species stem cell scholarly communication analysis geography neuroscience dss impact climate change credit publishing study city guideline science impact factor conservation paper library information paper knowledge study media reference biodiversity transplantation theory science tourism article impact author study capacity system subject bibliometric analysis

**Figure 6.** Topical trends.
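A bump chart plots, for each year, the rank of every topic rather than its raw probability. The ranking step can be sketched as below, assuming per-year topic probabilities are already available from the topic model. The input layout, the function name, and the per-year normalization are illustrative assumptions; the probabilities in any example are made up and do not come from the study.

```python
def bump_chart_ranks(topic_probs):
    """Rank topics within each year by normalized probability.

    topic_probs: {year: {topic_id: probability}}
    Returns {year: {topic_id: rank}}, where rank 1 is the most probable topic.
    """
    ranks = {}
    for year, probs in topic_probs.items():
        total = sum(probs.values())
        normalized = {topic: p / total for topic, p in probs.items()}
        # Highest probability first; ties are broken by topic id.
        ordered = sorted(normalized, key=lambda t: (-normalized[t], t))
        ranks[year] = {topic: pos for pos, topic in enumerate(ordered, start=1)}
    return ranks
```

Connecting each topic's rank from one year to the next yields the lines of the chart: a topic whose rank number shrinks over time is rising, and one whose rank stays put corresponds to the static trend described above.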

### **3.4. Document co-citation network**

The previous section used titles and abstracts to investigate topical trends without any bounding context. This section examines those fields in the context of document-level co-citation relationships. **Figure 7** visualizes the document co-citation network in the data set. Each node is a cited reference extracted from the reference sections of the records, and the size of a node is proportional to the cumulative number of citations it has received. Nodes with red inner circles represent articles with citation bursts. We labeled the 20 most highly cited articles in black, using the truncated form <LAST NAME> <ABBREVIATED FIRST NAME> (<YEAR>) so as to display only the first authors' names and the publication years (see the upper panel of **Figure 7**). They are cited at least 95 times locally, that is, within the data set. The color legend at the top of the display indicates that links and citations in cooler colors occurred closer to 1990, whereas those in hotter colors occurred closer to 2017. Based on the color scheme, we can keep track of the evolution of the document network. The findings show that most of the landmark articles were published relatively recently. Cumulative citations and citation bursts are also concentrated in these articles. Next, we clustered the network and labeled the clusters in blue, using LLR (see the lower panel of **Figure 7**). Clusters are numbered so that clusters containing more references receive higher rankings. To add richer context for interpreting the clustering results, we generated another visualization, called a timeline visualization (see **Figure 8**).

**Figure 7.** Document co-citation networks with truncated labels of first authors' names and publication years (upper panel) and cluster labels (lower panel) (n = 1856, e = 6127).

In **Figure 8**, we regrouped all the nodes onto multiple lines so that cluster memberships can be identified more easily. As depicted in the figure, emerging trends can be further captured by examining Clusters 1, 6, 10, 16, 17, and 18, given their sizes, recency, cumulative citations, and citation bursts. **Table 8** summarizes these clusters in terms of cluster size, three types of labels, and the mean year of the cited references, i.e., cluster age. Of the selected clusters, Cluster 1 is the largest and oldest. Together with Cluster 6, it shows that impact measurement is still among the important themes in scientometrics. The third largest and newest group of literature is Cluster 10. It indicates that the practical application of social media analytics to scientometrics is receiving the most recent attention. Other emerging topics include international collaboration (Cluster 16) and applications to medicine (Cluster 17) and to environmental sciences and policy (Cluster 18).

**Figure 8.** Timeline visualization with LLR cluster labels.
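The cluster numbering and the "cluster age" used in this subsection can be illustrated with a small sketch: clusters are ordered by the number of references they contain, and the age is the mean publication year of those references. The input layout, the function name, and the choice to start numbering at 0 are assumptions made for illustration; the text does not state the first cluster number, and the example labels and years are invented.

```python
from statistics import mean


def summarize_clusters(membership, first_id=0):
    """Number clusters by descending size and compute their mean year.

    membership: {label: [publication years of the cited references]}
    first_id: number given to the largest cluster (an assumption here).
    """
    # Largest cluster first; ties are broken alphabetically by label.
    ordered = sorted(membership.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [
        {
            "cluster": first_id + position,
            "label": label,
            "size": len(years),
            "mean_year": round(mean(years), 1),
        }
        for position, (label, years) in enumerate(ordered)
    ]
```

A summary of this shape corresponds to the size and mean-year columns of a table such as Table 8; a larger mean year marks a more recent cluster.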



**Table 8.** Cluster summary.

approaches to a variety of domains such as material sciences, medicine, and environmental sciences have received increasing attention. Conversely, the practical application of social media analytics to scientometrics is also receiving the most recent interest. Impact measurement and science mapping are among the canonical research themes that have received consistent attention from the beginning of the domain.

**5. Conclusion**

The present chapter aimed to explore the epistemological characteristics, historic areas of innovation, and emerging trends in scientometrics. We achieved this by investigating domain-level citation paths, WoS category assignment, keyword co-occurrence, temporal topic models, and document clusters. The findings indicate that the domain of scientometrics is multidisciplinary and partially interdisciplinary. The social sciences and biomedicine have both published in the field but do not yet cite each other. We argue that the maturation of scientometrics as a scientific field is still ongoing. Early studies tried to measure the impact of a science and the performance and productivity of published research. Subsequent efforts investigated laws and indicators in scientometrics and explored scientific collaboration. Recent literature is paying attention to topics such as applying scientometric approaches to different domains and bringing social media analytics into scientometrics.

The approaches of the present study offer the following advantages for investigating the intellectual structure of a science. First, we tried to make our data collection inclusive by investigating closely neighboring domains. Conventional domain analyses often cover only a fraction of the published literature. Our method provides a systematic way to explore the broader coverage of a scientific discipline. Second, we investigated the domain from multiple perspectives. Domain-level citation trajectories, subject category assignment, networks of subject categories and keywords, bursting keywords, topic models, and document co-citation networks were identified in this study. The sub-sections of the Results triangulated one another, adding richer interpretations from macro units of analysis down to micro ones. Finally, the analytical procedure and tools employed in the present work enabled us to explore time-aware research trends in the domain. In addition, one can conduct this kind of domain analysis on a topic of one's own concern as frequently as needed, without prior knowledge or experience. Thus, the proposed approaches offer relatively high reproducibility and low cost for conducting studies at a larger scale, especially in the era of mass publication.

There are several limitations to our work. First, the topic search we conducted on WoS may have missed relevant records. Vocabulary mismatch is an acknowledged challenge for keyword-based search. We may be able to overcome this drawback by employing citation indexing or iterative search query development as an alternative strategy to capture a much broader context. Second, WoS, as our source of data, may have underrepresented conference proceedings. This is also a recognized issue for disciplines such as the social sciences and the arts and humanities [13]. At the time of data retrieval, the authors' institutes subscribed only to the core collection of WoS. Thus, it was inevitable not to miss some
