**1. Introduction**

A city is a complex ecosystem and, as such, is more than the sum of its components: each component contributes to, but does not by itself determine, the behavior of the whole [1]. The modern city is characterized by a sophisticated structure and by zones with diverse urban social functions, such as residential neighborhoods, commercial areas, and industrial areas [2]. Functional city parts enable better orientation and support people's different needs [3, 4]. Rapid urban development has led to larger cities with more complex social dynamics, which poses a great challenge for the accurate mapping of urban land use [5], a mapping that is needed, for example, to promote social equity [6].

A smart city is a platform to facilitate technological and social innovation that enhances productivity, sustainability, and livability [7]. It opens the door for research on the dynamic and automated identification of land use by social function, that is, on understanding and classifying city land according to its social function. Mapping of urban land use can be utilized for urban planning and the design of better urbanization strategies [8–10], urban air quality management [11], promotion of sustainable ecocities [12, 13], and green utilization efficiency of urban land [14]. Knowledge of the function of city parts and their management can help govern a city [15] and contribute to a better understanding of mobility patterns and interconnections between city parts, which is crucial for efficient planning decisions within cities, for example, the planning of highways. Moreover, it can serve businesses looking for the right location, advertisers choosing where to place advertisements for greater effect, and social recommendations [3].

The digital revolution has brought a great opportunity for social sciences research in cities: the emergence of enhanced computing power and of mobile phones with built-in sensors and location technologies has created an enormous amount of data for understanding and monitoring urban life [16]. Data sources such as remote sensing imagery, social media data, taxi trajectories, and mobile phone usage patterns have been utilized for cheaper and enhanced identification of social land use.

Most research in recent years has offered complex methodologies that require either the integration of several data resources of different types or substantial prior knowledge about the examined city. The motivation for this research is to offer a method that requires only sparse knowledge of the examined land and relies on an inexpensive data resource. Previous works have yet to achieve high accuracy under such conditions; therefore, research and creative solutions are needed to solve this problem. Although incorporating several data resources can improve the identification rate, in this work we aim to achieve solid land-use mapping with a simple and efficient methodology that requires a single data resource. Our main assumption is that sparse prior knowledge about the examined city's functional zones can be obtained from a local or domain expert at a low cost. We mainly rely on call detail records (CDR), an inexpensive and available data source routinely collected by telecom operators, and assume that areas of different social functions exhibit different typical cellular communication behavior [17]. For example, one can expect the communication pattern in a residential neighborhood to have different characteristics than that in an industrial area; perhaps at night and in the early morning, there will be more communication in a residential neighborhood. We utilize this behavior to identify different area categories with different functions.
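To illustrate the kind of signal this assumption refers to, the following is a minimal sketch, not the feature extraction used in this work. It assumes a hypothetical record format `(area_id, hour_of_day, n_events)` and hypothetical helper names `hourly_profiles` and `night_share`; the choice of night hours is likewise an assumption made only for the example.

```python
from collections import defaultdict

def hourly_profiles(records):
    """Aggregate hypothetical CDR tuples (area_id, hour_of_day, n_events)
    into one normalized 24-bin activity profile per area."""
    totals = defaultdict(lambda: [0.0] * 24)
    for area_id, hour, n_events in records:
        totals[area_id][hour] += n_events
    profiles = {}
    for area_id, counts in totals.items():
        s = sum(counts)
        # Normalize so that areas with different traffic volumes are comparable.
        profiles[area_id] = [c / s for c in counts] if s else counts
    return profiles

def night_share(profile, night_hours=(22, 23, 0, 1, 2, 3, 4, 5)):
    """Fraction of an area's activity that falls in the assumed night hours."""
    return sum(profile[h] for h in night_hours)

# Toy data (invented for illustration only).
records = [("A", 23, 30), ("A", 12, 10), ("B", 12, 30), ("B", 2, 10)]
profiles = hourly_profiles(records)
```

Under the assumption stated above, an area such as "A" in the toy data, with most of its activity at night, would look more residential than "B"; a real classifier would use the full profile rather than a single summary value.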

This paper presents a semi-supervised algorithm, denoted as SSK (Semi-supervised Self-labeled K-nearest neighbor), which requires only sparse prior knowledge of the examined urban area; that is, it assumes we possess only a small number of areas with known land-use labels. SSK combines the distance-weighted k-nearest neighbor (DKNN) with a self-labeled technique that enlarges the training set iteratively. We also apply a neighbor smoothing approach that offers a unique interpretation of neighbors in the context of the KNN process: in addition to considering feature-space neighbors, as in the regular KNN, we also consider geographical-space neighbors, and thus we utilize the geographical homogeneity of social functions in urban areas.
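The general idea of combining a distance-weighted KNN vote with self-labeling can be sketched as follows. This is an illustrative sketch under stated assumptions, not the SSK algorithm as specified in Section 3: the function names, the inverse-distance weighting, the confidence `threshold`, the stopping rule, and the final fallback pass are all assumptions, and neighbor smoothing is not included.

```python
import math
from collections import defaultdict

def dknn_scores(train, query, k=3, eps=1e-9):
    """Distance-weighted kNN vote: each of the k nearest labeled points
    votes for its label with weight 1/distance. Returns normalized scores."""
    nearest = sorted((math.dist(x, query), label) for x, label in train)[:k]
    votes = defaultdict(float)
    for dist, label in nearest:
        votes[label] += 1.0 / (dist + eps)
    total = sum(votes.values())
    return {label: v / total for label, v in votes.items()}

def self_labeled_dknn(labeled, unlabeled, k=3, threshold=0.8, max_iter=20):
    """Iteratively move confidently classified points into the training set."""
    train = list(labeled)   # (feature_vector, label) pairs
    assigned = {}           # index in `unlabeled` -> predicted label
    for _ in range(max_iter):
        newly = []
        for i, x in enumerate(unlabeled):
            if i in assigned:
                continue
            scores = dknn_scores(train, x, k)
            best = max(scores, key=scores.get)
            if scores[best] >= threshold:  # assumed confidence rule
                newly.append((i, best))
        if not newly:
            break
        # Enlarge the training set with this round's confident predictions.
        for i, label in newly:
            assigned[i] = label
            train.append((unlabeled[i], label))
    # Assumed fallback: label whatever remains with the final training set.
    for i, x in enumerate(unlabeled):
        if i not in assigned:
            scores = dknn_scores(train, x, k)
            assigned[i] = max(scores, key=scores.get)
    return [assigned[i] for i in range(len(unlabeled))]

# Toy usage with invented two-dimensional features.
labeled = [((0.0, 0.0), "residential"), ((10.0, 10.0), "industrial")]
unlabeled = [(0.5, 0.2), (9.5, 9.8), (1.0, 1.0), (9.0, 9.0)]
predicted = self_labeled_dknn(labeled, unlabeled)
```

Accepting only predictions above a confidence threshold is one common way to limit the propagation of labeling errors in self-labeled schemes; other acceptance rules are possible.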

The contributions of this work are as follows:


The rest of this paper is organized as follows: Section 2 presents recent developments and research on land-use mapping, Section 3 describes the methodology and SSK land-use classification algorithm, Section 4 evaluates the efficiency of SSK and compares its performance with other algorithms that require more prior knowledge about the examined area, Section 5 presents the neighbor smoothing integrated into SSK, Section 6 evaluates the usage of neighbor smoothing in SSK and discusses its merits and drawbacks, and Section 7 summarizes the work, presents conclusions, and offers directions for further research.
