**3. Findings**

In this study, the gene variants eNOS-Intron 4a/b VNTR, eNOS-rs1799983, PER3-rs57875989, DRD2-rs1799732, COMT-rs4680, and NR3C1-rs41423247, together with the methylation status of the DRD2, COMT, and NR3C1 genes, were analyzed with decision trees, a classification algorithm, against five outcomes: criminal record history, continuum of substance use, former polysubstance abuse, suicidal behavior, and inpatient treatment. The accuracy, sensitivity, and precision of each model are presented in **Table 2**.
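As a reminder of how the three performance rates relate to one another, the following is a minimal sketch that derives them from a binary confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from Table 2 or from the authors' software.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, precision) for a binary classifier.

    tp/fp/tn/fn are the counts of true positives, false positives,
    true negatives and false negatives.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total                          # share of all correct decisions
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0    # share of real positives found
    precision = tp / (tp + fp) if (tp + fp) else 0.0      # share of predicted positives that are real
    return accuracy, sensitivity, precision


# Hypothetical counts for illustration only (not the study's results).
acc, sens, prec = classification_metrics(tp=40, fp=10, tn=30, fn=20)
print(f"accuracy={acc:.2%} sensitivity={sens:.2%} precision={prec:.2%}")
```

Reporting sensitivity and precision next to accuracy matters here because the outcome classes are not balanced (for example, 149 versus 62 individuals for suicidal behavior), so accuracy alone can look acceptable even when the minority class is poorly detected.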

### **3.1 The criminal record history**

The tree structure established to examine the effect of the eNOS-Intron 4a/b VNTR, eNOS-rs1799983, PER3-rs57875989, DRD2-rs1799732, COMT-rs4680, and NR3C1-rs41423247 gene variants and of DRD2, COMT, and NR3C1 gene methylation status on the tendency to delinquency is presented in **Figure 1**; the weight distribution of the input variables (gene variants) is presented in **Table 3**. The tendency to delinquency was evaluated by questioning the forensic history of the participant. The existence of a criminal history was interpreted as a tendency to delinquency, and its absence was interpreted as meaning that this tendency is more suppressed and has not turned into action. Therefore, the decision tree classes were established as "Criminal Record History - Yes" or "Criminal Record History - No." There are 116 individuals in the "Criminal Record History - Yes" class of the dataset, and 95 in the "Criminal Record History - No" class. The tree root is established over the NR3C1 gene, as may be seen in **Figure 1**. The presence of the CC genotype of NR3C1-rs41423247 following partial methylation of the NR3C1 gene is an effective sequence in predicting an individual's tendency to delinquency.

**Table 2.**

*Behavioral tendencies and performance rates in classification of gene variants by decision tree method.*

*The Analysis on the Effects of COMT, DRD2, PER3, eNOS, NR3C1 Functional Gene Variants… DOI: http://dx.doi.org/10.5772/intechopen.106313*

#### **Figure 1.**

*The decision tree structure constructed with 10-fold cross-validation to evaluate the effect of the gene variants on criminal record history.*


#### **Table 3.**

*Average weight values of gene variants in the criminal record history within the context of information gain.*

A review of **Table 3** reveals that the variable with the highest information gain is COMT-rs4680, with an average weight value of 0.013, and the variable with the lowest is COMT-METHYLATION, with an average weight value of 0.001. It is concluded that the COMT-rs4680 variant carries the highest information gain on delinquency.
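For readers unfamiliar with the term, the sketch below shows one common way to compute the information gain of a categorical attribute, assuming the standard Shannon-entropy definition. The source does not state which tool or exact weighting scheme produced the averaged values, so the function names and the toy genotype data are assumptions made purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on `values`."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data, invented for illustration; not drawn from the study's dataset.
genotype = ["Val/Val", "Val/Met", "Met/Met", "Val/Val", "Val/Met", "Met/Met"]
record = ["Yes", "Yes", "No", "Yes", "No", "No"]
gain = information_gain(genotype, record)
print(f"information gain = {gain:.3f} bits")
```

A gain close to zero, as in the weight tables of this chapter, means that knowing the attribute barely reduces the uncertainty about the class.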

### **3.2 The continuum of substance use**

A tree structure was established to investigate the tendency toward continuous, uninterrupted substance use in individuals with SUD. The weight distributions of the input variables (gene variants) on the output are presented in **Table 4**. The tendency toward continuous, uninterrupted use was evaluated from the participants' answers to the question "Do you use the substance intermittently?". The decision tree classes were created as "Intermittent" or "Continuous." In the dataset, 100 individuals declared intermittent substance use, whereas 111 individuals stated continuous substance use.


#### **Table 4.**

*Average weight values of gene variants in the tendency to continuous substance use within the context of information gain.*

The tree root starts at COMT-METHYLATION, split into partially methylated or unmethylated, and forms a wide branching structure. With partial methylation of COMT, the branching continues through the DRD2-rs1799732 gene variant; if unmethylated, it continues with the NR3C1-rs41423247 gene variant.

A review of **Table 4** reveals that the variable with the highest information gain is COMT-METHYLATION, with an average weight value of 0.018, and the variables with the lowest, each with an average weight value of 0.001, are DRD2-METHYLATION, NR3C1-METHYLATION, eNOS-rs1799983, and eNOS-Intron 4a/b VNTR. It is concluded that COMT-METHYLATION carries the highest information gain on continuous substance use.

### **3.3 The former polysubstance abuse**

Individuals with SUD may use more than one substance, such as alcohol, cigarettes, cannabis, heroin, cocaine, toluene, or ecstasy. This sub-assessment investigates the combined use of at least two of the following: cannabinoids, synthetic cannabinoids, ecstasy, heroin, cocaine, and toluene. Tobacco and/or alcohol use alongside any of these substances was not counted as multiple substance use, because all of the participants already use these two substances in combination with the others; including it would therefore yield no meaningful results.

The weight distributions of the input variables (gene variants) on the output are presented in **Table 5**. The decision tree classes were created as "Polysubstance Use - Yes" or "Polysubstance Use - No." In the dataset, 95 people declared combined use of at least two of the substances mentioned, and 116 declared use of one substance.

The tree root starts at COMT-rs4680 and forms a wide branching structure. The Met/Met and Val/Val genotypes branch to NR3C1-METHYLATION, and the Val/Met genotype branches to NR3C1-rs41423247. It is concluded that there is no tendency to polysubstance use when the Met/Met genotype is followed by the unmethylated NR3C1-METHYLATION branch; however, there is a tendency when the Val/Val genotype is followed by partial methylation.

#### **Table 5.**

*Average weight values of gene variants in the polysubstance use within the context of information gain.*

A review of **Table 5** reveals that the variable with the highest information gain is COMT-rs4680, with an average weight value of 0.032, and the variable with the lowest is DRD2-METHYLATION, with an average weight value of 0.001. It is concluded that the COMT-rs4680 variant carries the highest information gain on multiple substance use.

### **3.4 The suicidal behavior**

To evaluate suicidal behavior in individuals with SUD, the participants were asked whether they had attempted suicide at least once. The 149 individuals who had never attempted suicide were classified under "No Suicide Attempt," and the 62 individuals who had made at least one suicide attempt were classified under "Suicide Attempts."

The tree root starts at NR3C1-rs41423247 and forms a wide branching structure. It is concluded that suicidality is not seen in the GG genotype, while the GC and CC genotypes branch further to NR3C1-METHYLATION. There is no tendency when NR3C1-METHYLATION is unmethylated in the CC genotype; however, the same pathway shows the tendency in the GC genotype.
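The branches described above can be read as nested rules. The sketch below encodes only the paths that the text states explicitly and returns `None` for every path it leaves unspecified (for example, the partial methylation branch). The function name and the string encodings of the categories are hypothetical, and the full tree has many more leaves than this simplification shows.

```python
def suicide_tendency(rs41423247, nr3c1_methylation):
    """Partial, illustrative encoding of the described branches.

    Returns "No" or "Yes" where the text states an outcome,
    and None where the text does not give the leaf.
    """
    if rs41423247 == "GG":
        return "No"               # suicidality is not seen in the GG genotype
    if nr3c1_methylation == "unmethylated":
        if rs41423247 == "CC":
            return "No"           # CC followed by unmethylated: no tendency
        if rs41423247 == "GC":
            return "Yes"          # GC followed by unmethylated: tendency
    return None                   # outcome not stated in the text


for genotype in ("GG", "GC", "CC"):
    print(genotype, suicide_tendency(genotype, "unmethylated"))
```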

A review of **Table 6** reveals that the variable with the highest information gain is PER3-rs57875989, with an average weight value of 0.013, and the variable with the lowest is DRD2-rs1799732, with an average weight value of 0.005. It is concluded that the PER3-rs57875989 variant carries the highest information gain on suicidality in individuals with SUD.

### **3.5 The inpatient treatment**

Uzbay (2015) defined substance addiction in general as "a brain disease characterized by some behavioral disorders and the desire to take a substance continuously or periodically in order to feel the pleasurable effects of the substance, or to avoid the discomfort caused by its absence." Based on this definition, the voluntary hospitalization of an individual with SUD for treatment may be evaluated as a desire to get rid of or to avoid substance use. The participants were classified into two groups depending on their history of hospitalization. There were 158 individuals in the "History of Hospitalization - Yes" class and 53 individuals in the "History of Hospitalization - No" class. These numbers suggest that the majority of the participants tend to avoid substance addiction. The tree root starts at NR3C1-METHYLATION and forms a wide branching structure. The branching continues in both the partial methylation and unmethylated pathways through NR3C1-rs41423247. Individuals with the NR3C1-rs41423247 CC genotype on the unmethylated pathway have a tendency to avoid the substance; when the same genotype is followed on the partial methylation pathway, the tree branches to the eNOS-rs1799983 gene variant.

#### **Table 6.**

*Average weight values of gene variants in the suicidal behavior within the context of information gain.*

A review of **Table 7** reveals that the variable with the highest information gain is NR3C1-METHYLATION, with an average weight value of 0.017, and the variable with the lowest is COMT-METHYLATION, with an average weight value of 0.001. It is concluded that NR3C1-METHYLATION status provides the most information about turning the tendency to avoid or get rid of the substance into behavior.


#### **Table 7.**

*Average weight values of gene variants in the inpatient treatment within the context of information gain.*


### **3.6 Discussion and conclusion**

According to data from the Ministry of Justice, the number of people in prison for crimes related to substance addiction was 57,674 in 2018, corresponding to 21.78% of all convicts [22]. Some studies on substance addiction have found that substance users have higher rates of prison history [23]. There is a similar pattern in the dataset used in this study: 53.7% of individuals with SUD have a forensic history. Thirteen of the 28 decision leaves in the established tree structure ended with the presence of a criminal record, and 15 ended with its absence. The tree root is established over the NR3C1 gene, as seen in **Figure 1**. A recent study on convicted male individuals indicated that the NR3C1 gene is associated with violent behavior in adult males [24]. For instance, the presence of the CC genotype of NR3C1-rs41423247 following partial methylation of the NR3C1 gene is an effective sequence in predicting an individual's tendency to delinquency. The dominant variable in the tree is the COMT-rs4680 functional gene variant, which is in line with the literature [9, 25]. The success rates in **Table 2** show that the accuracy of the model is 52.68%. The limited number of individuals in the dataset and its low diversity (for example, no individuals without SUD, and individuals sharing the same genetic information as input) prevented better learning.

Substance addiction is a process that changes many systems physiologically and, through withdrawal, produces the desire to use the substance continuously [26]. Therefore, the tendency to continuous substance use is the natural, expected result of substance addiction. However, the desire to get rid of the substance can be effective in escaping this situation for a while. From this point of view, in the established tree the root at COMT-METHYLATION, split into partially methylated or unmethylated, forms a wide branching structure. With partial methylation of COMT, the branching continues through the DRD2-rs1799732 gene variant; if unmethylated, it continues with the NR3C1-rs41423247 gene variant. Forty-four decision leaves were formed on the tree, 20 of which resulted in intermittent use and 24 in continuous use. The variable with the highest information gain was COMT-METHYLATION. According to the performance rates in **Table 2**, the accuracy was 49.76%.

When the COMT-rs4680 genotypes and allele distributions were compared with clinical parameters in the statistical results of the dataset used in the study with X et al., multiple substance use was significantly lower in individuals with the Met/Met genotype than in individuals with the Val/Met and Val/Val genotypes, and significantly higher in carriers of the Val allele. Vandenbergh et al. also found that the high-activity Val allele was significantly more frequent in individuals with multiple substance use [27]. In the decision tree classification, the tree root splits on COMT-rs4680; when the Met/Met branch is followed by unmethylated NR3C1-METHYLATION, there is no tendency to multiple substance use. On the partial methylation branch of the same pathway, the sequence continues with DRD2-rs1799732. A tendency to multiple substance use appeared when the Val/Val branch of the root was followed by partial methylation of NR3C1-METHYLATION. The findings of the tree are similar to those of previous studies. Thirty-five result leaves appeared on the tree: 15 ended with a tendency, and 20 ended without one.

The variable with the highest information gain was COMT-rs4680. According to the performance rates in **Table 2**, the accuracy was 51.21%.

Studies conducted on individuals with substance use state that these individuals become suicidal because of their inability to cope with the economic difficulties caused by the substance, feelings of inadequacy, family problems, exclusion from society, mood disorders, and depression experienced during substance withdrawal. The statistical results of the dataset in previous studies (reference) showed that suicide attempts were more frequent in individuals who had started to use substances before the age of 15 years. It was suggested that individuals aged 15 years or younger have not yet completed their physical and mental development, may be easily influenced by their friends, and that the emotional and hormonal changes of adolescence may have caused these differences. The tree root starts at NR3C1-rs41423247 and forms a wide branching structure. It is concluded that suicidality is not seen in the GG genotype, while the GC and CC genotypes branch further to NR3C1-METHYLATION. There is no tendency when NR3C1-METHYLATION is unmethylated in the CC genotype; however, the same pathway shows the tendency in the GC genotype. The findings of the tree are similar to those of previous studies. Forty-three result leaves appeared on the tree. Fifteen of these leaves ended with a tendency, and 22 ended without one. The variable with the highest information gain was PER3-rs57875989. According to the performance rates in **Table 2**, the accuracy was 65%.

In the tree structure established to analyze the desire to get rid of the substance, examined through the presence or absence of an inpatient treatment history, the root at NR3C1-METHYLATION forms a wide branching structure. The branching continues in both the partial methylation and unmethylated pathways through NR3C1-rs41423247. Individuals with the NR3C1-rs41423247 CC genotype on the unmethylated pathway have a tendency to avoid the substance; when the same genotype is followed on the partial methylation pathway, the tree branches to the eNOS-rs1799983 gene variant. Twenty-eight result leaves appeared on the tree. Fifteen of these leaves ended with the decision that there was a tendency, and eight ended without one. The variable with the highest information gain was NR3C1-METHYLATION. According to the performance rates in **Table 2**, the accuracy was 70.56%.

In this study, the effects of genetic and methylation differences in the COMT, DRD2, PER3, eNOS, and NR3C1 functional gene variants on potential tendencies in individuals with SUD were analyzed with the decision tree algorithm, and their similarities with previous studies in the literature were evaluated. The tendencies were grouped in a structure suitable for binary classification under the subgroups of tendency to delinquency, tendency to continuous substance use, tendency to use multiple substances, tendency to suicide, and tendency to abandon the substance, respectively. There are no missing data in the dataset. The model used 10-fold cross-validation. This method splits a dataset of m samples into k disjoint pieces, each containing m/k samples. In each round it holds out a different piece for testing and uses the remaining k - 1 pieces for training, so the classifier is trained k times. In the last step, the classifier performance is estimated as the average of the k errors obtained.
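The cross-validation procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names and the `evaluate` callback (which would train a classifier on the training indices and return its error on the test indices) are hypothetical, and the folds are taken in order, whereas in practice the samples are usually shuffled or stratified first.

```python
def k_fold_indices(m, k):
    """Split indices 0..m-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(m, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)   # the first `extra` folds get one more sample
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(m, k, evaluate):
    """Average the error returned by `evaluate(train_idx, test_idx)` over k folds."""
    folds = k_fold_indices(m, k)
    errors = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the i-th, which is held out for testing.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        errors.append(evaluate(train_idx, test_idx))
    return sum(errors) / k


# 211 individuals and k = 10, as in this study; fold sizes differ by at most one.
folds = k_fold_indices(211, 10)
print([len(fold) for fold in folds])
```

Because 211 is not divisible by 10, the pieces cannot all contain exactly m/k samples; one fold holds 22 individuals and the other nine hold 21.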

Decision trees are a machine learning classification method that can provide effective results for binary classes. Our study is the first in the literature to examine the effects of gene variants on behavioral tendencies through machine learning methods. However, the low accuracy rates obtained in this study indicate that the dataset needs to be more diverse and comprehensive.
