**6. Conclusion and future work**

In this study, we developed a map of science, Mapping Science, based on the similarity of research content. It targets funding project descriptions and recently published articles, to which citation analysis is difficult to apply. We first improved an existing paragraph embedding technique with an entropy-based clustering method for word vectors and confirmed that the results have good face validity. We then introduced the map constructed from approximately 300,000 IEEE articles and NSF projects from 2012 to 2016, together with the clustering and layout method for articles and projects and the analytic functions provided on the map. Finally, we confirmed that the formation processes of some specific research areas can be captured as changes in network structure.

As the next step, we plan to compare our approach with citation-based methods in concrete scenarios and to incorporate patent information into the map. In addition, by overlaying domestic funding projects with NSF and Horizon2020 projects through the JST thesaurus, which has both English and Japanese notations, we will identify trends in public grants. Finally, we will try to extract metrics from chronological changes in the network structure of research areas. The Foresight and Understanding from Scientific Exposition (FUSE) program of the Intelligence Advanced Research Projects Activity (IARPA) has already conducted a study, from 2011 to 2015, on identifying emerging research areas based on several metrics obtained from several maps of science. We at JST will also use such metrics in statistical analysis and machine learning techniques to detect emerging research areas at an early stage, in support of future science and technology policies.
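To make the idea of metrics extracted from chronological changes of network structure more concrete, the following is a minimal sketch rather than the method used in this study. It assumes that the similarity links inside one research area are available as hypothetical `(year, node_a, node_b)` tuples, and it computes three simple yearly quantities (node count, edge count, and density) plus the year-over-year growth rate. The function names and the choice of metrics are illustrative assumptions.

```python
from collections import defaultdict


def yearly_structure_metrics(edges):
    """Summarize one research area's network per year.

    edges: iterable of (year, node_a, node_b) tuples (hypothetical input format).
    Returns {year: {"nodes": int, "edges": int, "density": float}}.
    """
    nodes_by_year = defaultdict(set)
    edges_by_year = defaultdict(set)
    for year, a, b in edges:
        if a == b:
            continue  # ignore self-links
        nodes_by_year[year].update((a, b))
        # Treat links as undirected and drop duplicates.
        edges_by_year[year].add(frozenset((a, b)))

    metrics = {}
    for year in sorted(nodes_by_year):
        n = len(nodes_by_year[year])
        m = len(edges_by_year[year])
        possible = n * (n - 1) / 2  # number of possible undirected links
        metrics[year] = {
            "nodes": n,
            "edges": m,
            "density": m / possible if possible else 0.0,
        }
    return metrics


def growth_rates(metrics, key="nodes"):
    """Relative change of one metric between consecutive observed years."""
    years = sorted(metrics)
    return {
        cur: (metrics[cur][key] - metrics[prev][key]) / metrics[prev][key]
        for prev, cur in zip(years, years[1:])
        if metrics[prev][key]
    }


# Toy data (assumed, not taken from the study): an area that grows
# from two linked projects in 2012 to four in 2013.
example_edges = [
    (2012, "p1", "p2"),
    (2013, "p1", "p2"),
    (2013, "p2", "p3"),
    (2013, "p1", "p3"),
    (2013, "p3", "p4"),
]
example_metrics = yearly_structure_metrics(example_edges)
example_growth = growth_rates(example_metrics, key="nodes")
```

Time series of this kind could serve as input features for the statistical analysis or machine learning step mentioned above; which metrics are actually informative for detecting emerging areas remains an open question here.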

Mapping Science Based on Research Content Similarity http://dx.doi.org/10.5772/intechopen.77067 193
