**Abstract**

In recent years, technological advances, specifically new sensing and communication technologies, have created opportunities for less expensive, more dynamic, and more accurate mapping of social land use in cities. However, most research has relied on complex methodologies that integrate several data sources or require extensive prior knowledge about the examined city. We offer a methodology that requires little prior knowledge and relies mainly on call detail records (CDRs), an inexpensive and readily available source of mobile phone data. We introduce the Semi-supervised Self-labeled K-nearest neighbor (SSK) algorithm, which combines distance-weighted k-nearest neighbors (DKNN) with a self-labeled iterative technique designed for training classifiers with only a small number of labeled samples. In each iteration, the samples (small land units) whose DKNN classification is most confident are added to the training set of the next iteration. We apply neighbor smoothing to the land-use classification by considering not only feature-space neighbors, as in regular KNN, but also geographical-space neighbors, thereby leveraging the tendency of proximate land areas to share similar social land use. Based on only a few labeled examples, the SSK algorithm achieves high accuracy: 74% without neighbor smoothing and 80% with it.

**Keywords:** call detail records, classification, computational social science, k-nearest neighbors, land use, machine learning, mobile phone data, smart cities, urban computing
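
The iterative scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the parameters `k`, `batch`, and `alpha`, the inverse-distance weighting, and the way geographical votes are blended with feature-space votes are all assumptions chosen for illustration.

```python
import math

def dknn_scores(train, query, n_classes, k=3, eps=1e-9):
    """Distance-weighted kNN vote (assumed weighting: 1 / distance).

    train is a non-empty list of (vector, label) pairs; returns normalized class scores.
    """
    nearest = sorted((math.dist(vec, query), label) for vec, label in train)[:k]
    scores = [0.0] * n_classes
    for dist, label in nearest:
        scores[label] += 1.0 / (dist + eps)
    total = sum(scores)
    return [s / total for s in scores]

def self_labeled_dknn(features, labels, n_classes, k=3, batch=5,
                      coords=None, alpha=0.5):
    """Self-labeling loop: None marks an unlabeled land unit.

    Each iteration scores every unlabeled unit and moves the `batch` most
    confident predictions into the training set. If `coords` is given, votes
    from geographically nearest labeled units are blended in with weight
    `alpha` (a hypothetical stand-in for neighbor smoothing).
    Requires at least one labeled unit to start from.
    """
    labels = list(labels)  # work on a copy so the caller's list is untouched
    while None in labels:
        labeled = [i for i, lab in enumerate(labels) if lab is not None]
        candidates = []
        for i, lab in enumerate(labels):
            if lab is not None:
                continue
            scores = dknn_scores([(features[j], labels[j]) for j in labeled],
                                 features[i], n_classes, k)
            if coords is not None:
                geo = dknn_scores([(coords[j], labels[j]) for j in labeled],
                                  coords[i], n_classes, k)
                scores = [(1 - alpha) * s + alpha * g
                          for s, g in zip(scores, geo)]
            best = max(range(n_classes), key=scores.__getitem__)
            candidates.append((scores[best], i, best))
        candidates.sort(reverse=True)  # most confident predictions first
        for _, i, best in candidates[:batch]:
            labels[i] = best
    return labels

# Toy usage: six land units, two of them labeled.
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
labels = [0, None, None, 1, None, None]
predicted = self_labeled_dknn(features, labels, n_classes=2, batch=1)
```

In this sketch, a smaller `batch` makes the training set grow more cautiously, at the cost of more iterations.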
