**3. Materials and methods**

This chapter examines how the productivity of site employees changes with the leadership style of the civil engineers who supervise them. Data are therefore collected from the point of view of both the site managers and the employees. The aim is to determine which of the modern leadership types prominent in the recent literature (transactional, transformational and passive-avoidant) increases productivity when applied to construction workers with which characteristics.

Data were collected with questionnaires designed for "bidirectional evaluation" so that the targeted results could be reached realistically. In other words, the study takes into account not only the behaviors and characteristics of the leaders but also factors such as the expectations, characteristics and lifestyle of the employees.

Once the data had been obtained through questionnaire forms from construction companies in Adana province, the data mining studies began. Detailed surveys for determining the relationships were administered to civil engineers working as construction site supervisors and to the other employees on the construction sites. The bidirectional model designed to be tested in this context is shown in **Figure 1**.

Given this model, a bidirectional questionnaire was judged appropriate, as explained earlier. After the theories and methods in the relevant literature had been reviewed, the Multifactor Leadership Questionnaire (MLQ) scale developed by Bass and Avolio [8] was chosen to determine the leadership types and characteristics of the engineers, as shown in **Figure 2**.

The site chiefs were asked for a productivity rating of the employees, who also rated themselves. These items were appended to the MLQ surveys given to the engineers. The chiefs rated the productivity of their employees as low, medium or high. In the employee questionnaires, employees rated their own productivity on the same low, medium or high scale, in light of the site chief's management and leadership approach. This keeps the data supplied to the data mining methods that form the basis of the study consistent across both questionnaires.

Because the MLQ is administered in both directions, it reveals how a leader's management style is perceived both by the leaders themselves and by their employees. The 45 questions of the short form are put to the leaders in active form and to the employees in passive form. Because the same questions are directed to the people on both sides, the scale yields more reliable results. For this reason, the MLQ has been used in recent years by many researchers in different disciplines. The questionnaire used in this study does not record respondents' personal information.

**Figure 1.** Model designed to be tested.

Determination and Classification of Crew Productivity with Data Mining Methods

http://dx.doi.org/10.5772/intechopen.75504


38 Data Mining


**Figure 2.** Systematization of the related survey study and the goals planned to be achieved.

Only information such as respondents' age range, gender and working hours was collected. In this study, the revised and abbreviated 45-question scale was used instead of the 72-question version.

In line with the main axis of the study, data mining methods were applied for classification using the Weka and KEEL programs. In this sense, the study is supported by a perspective that is not often used in the sector and is intended to benefit it.

Data mining is the process of extracting previously undiscovered information from the large and varied data held in data warehouses and using it to make decisions and action plans. It searches large amounts of data for relationships and rules that allow predictions about the future. Put differently, data mining is the semiautomatic discovery of patterns, relations, changes, irregularities, rules and statistically significant structures in data. The computer is responsible for determining the relationships, rules and properties in the data. The goal is to detect previously unrecognized data patterns [9].

An effective data mining application must handle different types of data, keep the data mining algorithm efficient and scalable, deliver useful, accurate and significant results, present the discovered rules in various forms, process data in different environments and provide privacy and data security. Data mining is also regarded as one part of the knowledge discovery process. The stages of the knowledge discovery process are as follows [9]:

**1.** Data cleaning (removing noisy and inconsistent data)

**2.** Data integration (combining multiple data sources)

**3.** Data selection (determining the data relevant to the analysis to be performed)

**4.** Data transformation (transforming the data into the form used by the data mining technique)

**5.** Data mining (applying intelligent methods to capture data patterns)

**6.** Pattern evaluation (identifying interesting patterns that represent the obtained information according to some measurements)

**7.** Information presentation (presenting the mined information to the user) [10, 11].

The difficulty of extracting relevant data from the large volumes generated by the use of technology in every sector also applies to leadership. Data mining methods were therefore used in this study to obtain meaningful and useful information from otherwise meaningless heaps of data. To this end, the survey data gathered within the scope of the study were first preprocessed and then prepared in the file format required by the data mining programs. Both commercial and open source programs are available for data mining, and they offer many algorithms with which meaningful information can be extracted from the available data [9]. In this study, KEEL was used for the preprocessing steps and Weka for the classification steps.

Weka [10] is the abbreviation for Waikato Environment for Knowledge Analysis. It is Java-based data mining and machine learning software developed at the University of Waikato in New Zealand and released under the GNU General Public License. It supports preprocessing, classification, clustering, association rule mining, feature selection and visualization on data sets. Weka works with the Attribute-Relation File Format (.arff), a purpose-built format stored as plain text. The @relation, @attribute and @data statements define the file structure: @relation gives the name or general purpose of the data set, @attribute declares the attribute names that correspond to the columns of the data set, and @data marks the beginning of the raw data.

KEEL is software written in Java and developed at the University of Granada with the support of the National Science Projects Agency of Spain. KEEL is not rich in classical data mining algorithms such as clustering; instead, it includes fuzzy classifiers, artificial intelligence-based classification and rule-based clustering algorithms [12]. It is also one of the weakest packages in terms of data visualization. However, because KEEL offers more advanced preprocessing algorithms than other software, it was used in this study to preprocess the data obtained from the questionnaires; that is, the data preprocessing level, which comprises the first four steps of the knowledge discovery process, was carried out with KEEL. During this phase, the data were normalized and transformed into the required form.

**4. Findings and results**


Within the scope of the study, 102 employee questionnaires and 102 leader questionnaires were administered. The resulting leadership classifications are given in **Table 1**.





The surveys collected in this study were first brought together in an Excel file. The data stored in Excel format were then converted to .arff, the file format of Weka, so that Weka could be run on them. To this end, various preprocessing steps were carried out and the file format was converted with the help of KEEL.
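The conversion from a spreadsheet export to .arff can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the procedure carried out with KEEL in this study: the function `rows_to_arff`, the relation name `survey_demo` and the three sample rows are hypothetical, and every column is treated as a nominal attribute.

```python
from typing import List, Sequence


def rows_to_arff(relation: str, columns: Sequence[str],
                 rows: List[Sequence[str]]) -> str:
    """Serialize rows of categorical values as ARFF text.

    Each column is declared as a nominal attribute whose allowed
    values are the distinct values observed in that column.
    """
    lines = [f"@relation {relation}"]
    for idx, name in enumerate(columns):
        values = sorted({row[idx] for row in rows})
        lines.append(f"@attribute {name} {{{','.join(values)}}}")
    lines.append("@data")
    for row in rows:
        lines.append(",".join(row))
    return "\n".join(lines) + "\n"


# Hypothetical rows in the style of the survey file (not the study data).
columns = ["gorev", "motivasyon"]
rows = [["isci", "yuksek"], ["formen", "dusuk"], ["isci", "dusuk"]]
arff_text = rows_to_arff("survey_demo", columns, rows)
print(arff_text)
```

Values that contain spaces or commas would need quoting in a real .arff file; the sketch leaves that out for brevity.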

```
% Attribute names are Turkish: gorev = job role, cinsiyet = gender, yas = age,
% egitim = education, maas = salary, tecrube = experience,
% benzer_islerdeki_tecrube = experience in similar jobs,
% motivasyon = motivation (yuksek = high, dusuk = low),
% lideregore_liderliktipi = leadership type according to the leader,
% calisanagore_liderlik_tipi = leadership type according to the employee.
@relation bap_calisan_2017
@attribute gorev {boyaci,demirci,dogalgaz_doseme,formen,isci,kalipci,parke_ustasi,seramikci,sivaci,tesisatci,usta}
@attribute cinsiyet {1,2}
@attribute yas {1,2,3,4,5}
@attribute egitim {1,2,3,4,5,6}
@attribute maas {1,2,3,4}
@attribute tecrube {1,2,3,4}
@attribute benzer_islerdeki_tecrube {1,2,3,4}
@attribute …….
…………
@attribute motivasyon {yuksek,dusuk}
@attribute lideregore_liderliktipi {transformational,transactional,passive-avoidant}
@attribute calisanagore_liderlik_tipi {transformational,transactional,passive_avoidant}
@data
tesisatci,osmaniye,1,2,3,2,3,3,2,3,2,1,3,0,0,1,3,1,2,7,10,1,yuksek,transformational,transformational
sivaci,kahramanmaras,1,3,3,2,4,4,3,3,3,1,3,0,1,0,3,1,2,7,10,2,yuksek,transformational,transformational
sivaci,adana,1,2,3,1,3,4,3,3,1,1,3,0,1,0,3,1,2,7,10,2,dusuk,transformational,transformational
isci,adana,1,2,3,2,4,4,3,3,3,1,1,0,1,0,3,1,2,7,8,2,dusuk,transformational,passive_avoidant
formen,adana,1,2,3,3,3,3,2,3,2,3,4,1,0,0,3,2,2,6,8,2,yuksek,transformational,transformational
isci,adana,1,3,2,1,4,4,4,3,4,1,3,0,1,0,3,1,2,7,10,2,yuksek,transactional,transactional
kalipci,erzurum,1,3,2,1,4,4,1,3,1,2,1,0,1,0,3,2,3,7,8,1,dusuk,transformational,transactional
parke_ustasi,adana,1,3,2,1,4,4,3,3,2,1,3,0,1,0,3,2,2,6,10,2,dusuk,transactional,transformational
……..
```

**Table 2.** Categorical .arff file.

The widely used min-max normalization method was applied in this study. Min-max normalization takes the largest and smallest values of an attribute and rescales all other values relative to them, so that the smallest value becomes 0, the largest becomes 1 and every other value falls within the 0–1 range. The formulation is given in Eq. (1), where v is the value to be normalized, v′ is the value obtained as the result of normalization, and min and max with subscript A are the smallest and largest values of attribute A. In the formulation, new\_min is taken as 0 and new\_max is taken as 1.

$$v' = \frac{v - \min_{A}}{\max_{A} - \min_{A}} \left( \mathrm{new\_max}_{A} - \mathrm{new\_min}_{A} \right) + \mathrm{new\_min}_{A} \tag{1}$$
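To make Eq. (1) concrete, the sketch below applies min-max normalization to a small list of numbers. The function name `min_max_normalize` and the sample values are hypothetical and serve only as illustration; the handling of a constant attribute (all values equal) is an assumption, since Eq. (1) is undefined in that case.

```python
from typing import List


def min_max_normalize(values: List[float], new_min: float = 0.0,
                      new_max: float = 1.0) -> List[float]:
    """Rescale values to [new_min, new_max] following Eq. (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: Eq. (1) would divide by zero here, so map to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [(v - lo) * scale + new_min for v in values]


# Hypothetical attribute values (not taken from the study data).
sample = [6.0, 7.0, 8.0, 10.0]
normalized = min_max_normalize(sample)
print(normalized)  # [0.0, 0.25, 0.5, 1.0]
```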

The required content of the .arff file differs between data mining methods. Some of the data mining methods in Weka work only with numerical data, while others work with categorical data. For example, classification and clustering algorithms mostly need numerical data, whereas association rule algorithms need categorical (nominal) data [7]. Numerical data were therefore discretized with the equal frequency method, in which the range of an attribute is divided into N intervals that each hold an equal number of data points. This method was chosen because it can work with skewed data. **Table 2** shows a summary of the .arff file prepared in the scope of the study.
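The equal frequency method can be sketched as follows. This is a rank-based illustration under simplifying assumptions, not the implementation used in KEEL or Weka: the function name `equal_frequency_bins` and the sample values are hypothetical, and ties are resolved by original position.

```python
from typing import List


def equal_frequency_bins(values: List[float], n_bins: int) -> List[int]:
    """Assign each value a bin index 0..n_bins-1 with roughly equal counts.

    Values are ranked in ascending order; the value at rank r goes to
    bin r * n_bins // len(values).
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    # sorted() is stable, so equal values keep their original order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


# Hypothetical skewed sample with one large outlier.
sample = [1.0, 2.0, 2.5, 3.0, 4.0, 100.0]
print(equal_frequency_bins(sample, 3))  # [0, 0, 1, 1, 2, 2]
```

With equal-width intervals, five of these six values would fall into the first interval; ranking the values first is what keeps the bins balanced on skewed data.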


| | Transformational leadership | Transactional leadership | Passive avoidant leadership | Total number of surveys |
|---|---|---|---|---|
| According to the leader | 84 | 18 | 0 | 102 |
| According to the employee | 81 | 20 | 1 | 102 |

**Table 1.** Survey results.
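Tallies of the kind reported in Table 1 can be derived from the last two fields of each data row. The sketch below uses only the eight sample rows visible in the .arff listing, so its counts are illustrative and do not reproduce the 102-questionnaire totals; the variable names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# Last two fields of the eight sample rows shown in the .arff listing:
# (type according to the leader, type according to the employee).
pairs: List[Tuple[str, str]] = [
    ("transformational", "transformational"),
    ("transformational", "transformational"),
    ("transformational", "transformational"),
    ("transformational", "passive_avoidant"),
    ("transformational", "transformational"),
    ("transactional", "transactional"),
    ("transformational", "transactional"),
    ("transactional", "transformational"),
]

# Count each leadership type separately from the two points of view.
by_leader = Counter(leader for leader, _ in pairs)
by_employee = Counter(employee for _, employee in pairs)
# Count the rows where leader and employee name the same type.
agreement = sum(1 for leader, employee in pairs if leader == employee)

print(dict(by_leader))
print(dict(by_employee))
print(f"{agreement} of {len(pairs)} pairs agree")
```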


In this study, the leadership types that employees attribute to their leaders were classified. According to the analysis results, the most successful data mining algorithm for this classification was the random forest algorithm, with a classification accuracy of 81.3725%. Random forest is an ensemble learning algorithm: it builds more than one decision tree during the classification process in order to improve the classification result, and the individually constructed decision trees together form the decision forest. The classification results obtained are given in **Table 3**.

**Table 3.** The classification results of random forest.
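The voting step that combines the individual trees can be illustrated with a short sketch. The three "trees" below are hand-written stand-ins with hypothetical rules, not models trained on the study data, and the sketch omits the training stage in which a random forest grows each tree on a bootstrap sample with a random subset of attributes.

```python
from collections import Counter
from typing import Callable, Dict, List

Record = Dict[str, str]
Tree = Callable[[Record], str]


def forest_predict(trees: List[Tree], record: Record) -> str:
    """Return the class predicted by most trees (majority vote)."""
    votes = Counter(tree(record) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained decision trees.
def tree_a(r: Record) -> str:
    return "transformational" if r["motivasyon"] == "yuksek" else "transactional"


def tree_b(r: Record) -> str:
    return "transformational" if r["gorev"] == "formen" else "transactional"


def tree_c(r: Record) -> str:
    return "transformational"


record = {"gorev": "isci", "motivasyon": "yuksek"}
print(forest_predict([tree_a, tree_b, tree_c], record))  # transformational
```

An odd number of trees avoids ties in this two-outcome example; with more classes or an even number of trees, a tie-breaking rule would have to be chosen.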

**Author details**

Abdullah Emre Keleş<sup>1</sup> and Mümine Kaya Keleş<sup>2</sup>\*

\*Address all correspondence to: mkaya@adanabtu.edu.tr

1 Faculty of Engineering, Department of Civil Engineering, Adana Science and Technology University, Adana, Turkey

2 Faculty of Engineering, Department of Computer Engineering, Adana Science and Technology University, Adana, Turkey

**References**

[1] Balaban B. The Effect of the Culture on the Motivation of the Workers in the Turkish Construction Sector [thesis]. İstanbul: İstanbul Technical University; 2006

[2] WEKA. Machine Learning Group at the University of Waikato [Internet]. 1993. Available from: https://www.cs.waikato.ac.nz/ml/index.html [Accessed: 2017-12-05]

[3] KEEL. Knowledge Extraction based on Evolutionary Learning [Internet]. 2005. Available from: http://www.keel.es/ [Accessed: 2017-12-05]

[4] Kaya M, Keleş AE, Laptali Oral E. Construction crew productivity prediction by using data mining methods. Procedia - Social and Behavioral Sciences. 2013;**141**:1249-1253. DOI: 10.1016/j.sbspro.2014.05.215

[5] Andaç MS, Oral EL. Yapım İşlerinde Çalışan Verimliliğinin Yapay Arı Kolonisi Algoritması Kullanılarak Tahmini. In: 3. Proje ve Yapım Yönetimi Kongresi (PYYK 2014); November 06-08, 2014. Antalya. 2014. p. 111

[6] Keleş AE, Kaya M. The Analysis of the Factors Affecting the Productivity in the Wall Construction of the Using Apriori Data Mining Method. In: Proceedings of the XVI. Academic Information Conference (AB2014). Mersin; 05-07 February 2014. p. 831-836

[7] Keleş AE. İnşaat Projelerinde Şantiye Şeflerinin Liderliği ve Çalışan Motivasyonu İlişkisinin Veri Madenciliği ile Belirlenmesi [thesis]. Adana: Çukurova University Natural and Applied Sciences Institute; 2016

[8] Bass BM, Avolio BJ. Mind Garden Tools for Positive Transformation, Multifactor Leadership Questionnaire [Internet]. 1995. Available from: https://www.mindgarden.com/16-multifactor-leadership-questionnaire [Accessed: 2017-12-05]

[9] Dener M, Dörterler M, Orman A. Open Source Data Mining Programs: A Case Study on Weka. In: Proceedings of the XI. Academic Information Conference (AB09). Şanlıurfa; 11-13 February 2009. p. 11-13
