**4. Applications of computational methods in AhR modeling**

The wealth of structural information on AhR described above provides an excellent opportunity to apply computer-based simulations to study the dynamics and structural organization of the various AhR domains. Such computational tools can not only yield much-needed insights into how these domains interact within the AhR machinery, but can also offer detailed answers about their interactions with other AhR partners (*e.g.* ARNT, DNA, and chaperone proteins). They can also explain how a small-molecule ligand binds to AhR and how this binding affects AhR function, conformational dynamics, or interactions with other entities. Computational tools can also suggest novel binding sites, either within the AhR structure or at the interfaces described above, that could be targeted to either stabilize these interactions (*i.e.* agonism) or block them (*i.e.* antagonism). Most importantly, computational methods including virtual screening can be used as high-throughput screening tools to identify compounds that bind to these sites and modulate AhR activity.

### **4.1 Modeling the PAS B domain**

Most *in silico* AhR screening campaigns to date have focused on the PAS B domain. The PAS B domain is also known as the ligand-binding domain (LBD), where ligands (agonists/antagonists) have been shown to bind [41]. Given that the available AhR structures (described above) lack this domain, computational methods have played a key role in studying the interactions of ligands with this important region. Towards this goal, homology modeling was used to build three-dimensional structures of this domain so that ligand binding to AhR could be studied. A homology modeling approach usually starts by identifying a template that is similar to the target domain (*i.e.*, the PAS B domain in this case). Once a template is identified, various computational methods (*e.g.* sequence alignment, threading, and loop modeling) are used to construct the three-dimensional structure of the target protein [42].
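The template-selection step can be illustrated with a minimal sketch that globally aligns a target sequence against a candidate template and reports percent identity. This is only an illustration under stated assumptions: the scoring values (`match`, `mismatch`, `gap`) are arbitrary, the two strings are hypothetical toy fragments rather than real AhR or HIF-2α sequences, and production homology modeling relies on substitution matrices and dedicated software rather than this simple scheme.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences and return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a, aln_b):
    """Percent of identical residues over the aligned (non-gap) columns."""
    columns = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return 100.0 * sum(x == y for x, y in columns) / len(columns)


# Hypothetical toy fragments, NOT real protein sequences.
target, template = "ACDE", "ADE"
aligned_target, aligned_template = needleman_wunsch(target, template)
identity = percent_identity(aligned_target, aligned_template)
```

In practice, the candidate template with the highest identity or similarity over the aligned region would be carried forward to model building.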

In many AhR studies, crystal structures of the human hypoxia-inducible factor 2α (HIF-2α) served as templates for the AhR-PAS B domain because HIF-2α has the highest sequence similarity to this domain. **Table 2** lists the reported *in silico* studies conducted over the last few years that adopted various crystal structures of HIF-2α as starting points to construct PAS B models. These studies focused on understanding the roles played by the different PAS B residues in interacting with known AhR modulators and on screening for novel AhR ligands. Docking findings from these studies indicated that the binding cavity within the AhR-LBD can accommodate ligands with maximal dimensions of 14 Å × 12 Å × 5 Å, and showed that ligand binding within AhR relies mainly on electronic properties [17]. Given this information, various computational methods, including molecular modeling, molecular docking followed by MD simulations, and binding free energy calculations, were used to provide insights into ligand interactions within the PAS B pocket [49].
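The size limit quoted above (14 Å × 12 Å × 5 Å) lends itself to a quick pre-filter. The sketch below is a rough illustration, not a method taken from the cited studies: it assumes the ligand coordinates are already expressed in a sensible frame (for example, aligned to the principal axes), uses an axis-aligned bounding box, and ignores atomic radii. The function names and the sample coordinates are hypothetical.

```python
# Cavity limits in angstroms, as quoted in the text.
CAVITY_DIMS = (14.0, 12.0, 5.0)


def bounding_box_extents(coords):
    """Return the (dx, dy, dz) extents of a list of (x, y, z) atom positions."""
    xs, ys, zs = zip(*coords)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))


def fits_cavity(coords, cavity=CAVITY_DIMS):
    """Compare sorted ligand extents with sorted cavity limits.

    Sorting both makes the check independent of which axis is longest,
    but it does not search over arbitrary rotations of the ligand.
    """
    ligand = sorted(bounding_box_extents(coords), reverse=True)
    limits = sorted(cavity, reverse=True)
    return all(size <= limit for size, limit in zip(ligand, limits))


# Hypothetical planar ligand spanning 10 x 6 x 0 angstroms.
planar_ligand = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 6.0, 0.0)]
print(fits_cavity(planar_ligand))
```

A filter of this kind can only discard obviously oversized molecules; whether a ligand actually binds still has to be assessed by docking and the follow-up simulations described in the text.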

For example, Bisson and his group established an agonist-optimized model of the human AhR-PAS B domain and then docked around five thousand chemical structures, including AhR agonists and antagonists, into it.


**Table 2.**

*Reported studies that used different crystal structures of human hypoxia-inducible factor 2α (HIF-2α).*

*Targeting the Aryl Hydrocarbon Receptor (AhR): A Review of the In-Silico Screening… DOI: http://dx.doi.org/10.5772/intechopen.99228*

Docking results were then filtered and the top five systems were subjected to long MD simulations (~60 ns) to study the conformational and dynamical changes in the generated complexes. Findings from Bisson's work revealed the importance of residues 307–329 in the PAS B domain, which were shown to be very flexible and to act as an access gate to the ligand-binding pocket. These residues can also adopt different conformations upon ligand binding and play a primary role in controlling the structural changes and the accessibility of ligands to the AhR ligand-binding pocket [41].

### **4.2 Interaction of the PAS B domain with different ligands**

With the three-dimensional structure of the PAS B domain in hand, many groups focused on studying its binding to different ligands (*e.g.,* TCDD) (shown in **Figure 5**) [50] and investigated this binding in different species. For example, TCDD studies on mouse AhR revealed a number of conserved residues that regulate the access of TCDD to the binding pocket [51–53]. These residues include Thr283, His285, Phe289, Tyr316, Ile319, Cys327, Met334, Phe345, Ala375, and Gln377, and they have also been shown to control the internal size of the binding cavity [54–56]. Similarly, the aromatic side chains of Phe289, Phe345, and Tyr316 were shown to be important in stabilizing TCDD in its best binding mode via noncovalent interactions [54].

Mutations of outer residues (e.g., Arg282, Thr311, Glu339, and Lys350) to alanine did not impact TCDD binding to AhR [57]. In the human AhR-LBD, mutation of Ala375 to Val or Leu decreases the binding affinity of TCDD and makes indirubin a less potent endogenous AhR ligand [45, 58]. Additional site-directed mutagenesis of AhR-LBD residues has been used to identify key residues that promote ligand selectivity in AhR. These models provided a clear basis for understanding the mechanism of ligand-dependent activation of AhR via its PAS B domain. In particular, the above-mentioned molecular docking and mutagenesis analyses helped in identifying and confirming the binding pocket of TCDD and other AhR modulators [52, 57, 59, 60].

**Figure 5.** *Chemical structures of the AhR ligands discussed in this study, obtained from the PubChem database.*

Examples of these models include those developed by Kim and her team, who constructed 3D models from several avian species, including chicken, albatross, and cormorant, and studied the sensitivity of multiple AhR isoforms to dioxin derivatives. All models were subjected to docking simulations with TCDD followed by MD simulations. Kim's analysis used the mean square displacement (MSD) of the MD trajectories as a stability indicator for the bound ligands. The findings revealed that Ile324 and Ser380 from chicken AhR1 exhibited the lowest MSD values compared to all AhR-LBD residues in the other avian species. The size of the binding pocket was also shown to vary among the different species. Moreover, stabilization of TCDD in the binding pocket of chicken AhR relied on the features of Ile324 and Ser380, which explained why chicken AhR is more sensitive to TCDD binding than other AhR isoforms [54, 61–63].
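To make the MSD stability indicator concrete, the following minimal sketch computes a per-frame MSD for a set of atoms. The exact definition used in the cited work is not given in this chapter, so the code assumes one common convention: squared displacement of each atom from its position in the first frame, averaged over atoms. The trajectory shown is a hypothetical two-atom example, not simulation output.

```python
def msd_per_frame(trajectory):
    """Mean square displacement of each frame relative to the first frame.

    `trajectory` is a list of frames; each frame is a list of (x, y, z)
    positions with the same atom ordering in every frame.
    """
    reference = trajectory[0]
    values = []
    for frame in trajectory:
        total = 0.0
        for (x, y, z), (x0, y0, z0) in zip(frame, reference):
            total += (x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2
        values.append(total / len(reference))
    return values


# Hypothetical two-atom, three-frame trajectory (coordinates in angstroms).
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
]
msd_values = msd_per_frame(toy_trajectory)
```

Under this convention, lower and flatter MSD curves indicate atoms that stay close to their starting positions, which is the sense in which low MSD is read as stability in the text.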

Further mutational and functional analysis studies were expanded to include AhR modulators other than TCDD. For example, Faber and her team studied indirubin binding to AhR in both mouse and human. This study revealed that the His326Tyr and Ala349Thr mutations in mouse AhR, and residues Tyr332 and Thr355 in human AhR, can increase the potency of indole compounds, particularly indirubin. Also, although indirubin and vemurafenib can fit within the same binding pocket in AhR, the two compounds showed different modes of binding [45, 47]. In another example, flutamide binds efficiently and with high affinity to residues inside the AhR-LBD in both mouse and human AhR to activate the AhR pathway [64]. It is important to note that the biological response of AhR depends on the type of bound ligand and has been shown to change based on the interaction of a given ligand with the residues forming the LBD in the PAS B domain [48, 65].

### **4.3 Virtual screening and machine learning models applied to AhR**

Over the last few decades, virtual screening has been used as a major tool in hit identification campaigns against numerous biological targets [66]. AhR is no exception, and various *in silico* screening methods have been employed to identify new AhR modulators based on the 3D models developed for the PAS B domain [13]. These methods can be classified into two major groups: ligand-based methods (*e.g.,* quantitative structure–activity relationship (QSAR)) and structure-based methods (*e.g.,* docking and MD simulations). Ligand-based (LB) methods depend on knowledge of known active/inactive molecules against a given target or disease to suggest new active chemical entities. They are typically used when no information about the 3D structure of the target is available, and include QSAR, pharmacophore modeling and machine learning algorithms. QSAR models, for example, correlate the structural and physicochemical features of known compounds with their biological mode of action to build a mathematical model, which can be used to suggest new modifications to these structures for better activity or improved biophysical/biochemical properties [67–69].
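As a small illustration of the ligand-based idea, the sketch below ranks library compounds by their similarity to a known active. Similarity searching with the Tanimoto coefficient is a widely used ligand-based technique, but it is not one of the methods named in this section, so it is shown here purely as an example. The fingerprints are hypothetical sets of "on" bit positions and do not correspond to any real compound.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints stored as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0
    return len(fp_a & fp_b) / union


def rank_by_similarity(reference_fp, library):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(reference_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical bit sets, NOT real fingerprints of any compound.
known_active = {1, 4, 7, 9}
library = {
    "candidate_A": {1, 4, 7, 10},
    "candidate_B": {2, 3},
    "candidate_C": {1, 4, 7, 9},
}
ranking = rank_by_similarity(known_active, library)
```

Compounds near the top of such a ranking would typically be prioritized for further evaluation, for instance by the structure-based methods discussed next.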

Pharmacophore modeling maps the ligand-target interactions into a set of steric and electronic features arranged in a specific 3D geometry [70]. These pharmacophore models can then be used to screen libraries of millions of available chemical structures for compounds that satisfy the pharmacophore features, which is useful for scaffold hopping and fragment-based drug design. Structure-based methods, on the other hand, require knowledge of the crystal structure of the target protein or a 3D homology model of it. Ligands from a given database can be fitted into the active site of the target protein and ranked based on their predicted binding affinities. In this context, molecular docking and molecular dynamics simulations are among the many valuable tools that can be used to predict the most probable binding mode of a given ligand within the target. Furthermore, structure-based pharmacophore models can provide more detailed insights into the interaction of a ligand with the binding site [69, 71, 72].
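The pharmacophore screening idea described above can be sketched as a simple geometric match. This is a deliberately simplified illustration: it assumes each feature is a `(type, (x, y, z))` pair, compares only inter-feature distances within a tolerance `tol` (so mirror-image arrangements are not distinguished), and uses brute-force enumeration that is only practical for a handful of features. The feature names and coordinates are hypothetical.

```python
import math
from itertools import combinations, permutations


def matches_pharmacophore(query, ligand, tol=1.0):
    """Return True if some subset of ligand features reproduces the query.

    A match requires identical feature types and all pairwise distances
    agreeing with the query distances to within `tol` angstroms.
    """
    pairs = list(combinations(range(len(query)), 2))
    for assignment in permutations(range(len(ligand)), len(query)):
        # Feature types must agree position by position.
        if any(ligand[i][0] != q[0] for i, q in zip(assignment, query)):
            continue
        if all(
            abs(math.dist(query[a][1], query[b][1])
                - math.dist(ligand[assignment[a]][1], ligand[assignment[b]][1])) <= tol
            for a, b in pairs
        ):
            return True
    return False


# Hypothetical two-feature pharmacophore and two hypothetical ligands.
query = [("donor", (0.0, 0.0, 0.0)), ("aromatic", (3.0, 0.0, 0.0))]
ligand_hit = [("aromatic", (0.0, 3.5, 0.0)), ("donor", (0.0, 0.0, 0.0)),
              ("hydrophobe", (5.0, 5.0, 5.0))]
ligand_miss = [("aromatic", (0.0, 6.0, 0.0)), ("donor", (0.0, 0.0, 0.0))]
```

Here `ligand_hit` satisfies the query because its donor and aromatic features are 3.5 Å apart, within the tolerance of the 3.0 Å query distance, whereas `ligand_miss` places them 6.0 Å apart.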

As discussed below, several AhR screening studies combined both approaches to enhance the search for possible AhR candidates [67, 73]. The plethora of accumulated physicochemical, chemical and structural data on AhR modulators has also made it possible to build reliable machine learning models, which require large datasets of chemical structures along with their interaction kinetics with AhR [74].

An example of an AhR *in silico* screening study is the one implemented by Xiao et al., who constructed 3D structures of the PAS B domain using HIF-2α as a template. Xiao et al. used this model to study the effects of ~185 polybrominated diphenyl ethers (PBDEs), classified as organic pollutants, on AhR activation. The study combined molecular docking simulations, two-dimensional quantitative structure–activity relationship (2D-QSAR) models, and three-dimensional QSAR (3D-QSAR) models to analyze local ligand interactions against a diverse set of PAS B configurations. Their results showed that bromine substitutions at the ortho- or meta-positions of PBDEs (BDE-49), as shown in **Figure 5**, had the largest effect on PBDE binding, mainly through interactions with residues Met342, Thr290, Met334, and Phe289 in the binding site of the AhR-PAS B model in mouse and zebrafish [75, 76].

In a similar approach, Rath and his team built models of the human PAS B domain: a wild-type model and two mutant models (Val381Ala and Val381Asn). Around 60 natural compounds from *Withania somnifera* were then docked into these models. Docking results were then refined using 50 ns MD simulations. Findings from Rath's study showed that withaferin A, withanolide A, withanolide B, withanolide D and withanone were effective as AhR ligands in all three models. Withanolide A, in particular, was more stable in the binding site and interacted with various residues in each model even after 50 ns of MD simulation. Withanolide A was further validated experimentally in an *in vivo* zebrafish model, where it significantly reduced CYP1A1 expression in adult zebrafish brain when administered together with a strong AhR activator, namely benzo[a]pyrene. Thus, withanolide A (see **Figure 5** and **Table 3**) neutralized benzo[a]pyrene toxicity in zebrafish brain [77].


**Table 3.**

*Activation of AhR transcription by chemical compounds that were identified by in silico screening of different chemical libraries.*

In another screening study, Mahiout et al. identified IMA-06201 and IMA-06504 as two novel AhR agonists with modes of binding similar to that of TCDD. Both compounds showed great stability in the central area of the AhR ligand-binding pocket. Furthermore, these AhR agonists were shown to be more efficient and more potent as selective AhR modulators than TCDD. To confirm this, Mahiout et al. used CYP1A1 enzyme activity as a biomarker for AhR activation and compared the efficacy and potency of IMA-06201 and IMA-06504 (see **Figure 5** and **Table 3**) to those of TCDD in the presence and absence of the AhR antagonist CH-223191 at different concentrations in rat hepatoma cell lines. Their results showed that the new compounds, IMA-06201 and IMA-06504, induced CYP1A1 activity with an efficacy similar to that of TCDD, and that CH-223191 blocked this CYP1A1 induction. Also, in an Ames test to assess the genotoxicity of the newly identified compounds, IMA-06201 and IMA-06504 did not show mutagenic effects at low concentrations [46].

Machine learning algorithms combined with QSAR have recently been used to screen for new AhR ligands. For instance, Matsuzaka used deep learning (DL) to construct machine learning models that predict AhR activators. These models benefited from input data derived from the 3D chemical structures of the compounds, and their performance was better than that of traditional machine learning models [78]. To enhance the screening of AhR ligands, Zhu established a virtual screening protocol that combined ligand-based and structure-based screening with supervised machine learning, and used it to screen around eight thousand compounds from pesticide databases for agonistic effects on AhR activity. Zhu's results identified sixteen compounds as AhR activators. These findings were validated in a zebrafish *in vivo* model, in which the compounds induced CYP1A1 levels [79].

To improve prediction accuracy, Yang et al. used machine learning algorithms to construct two-dimensional quantitative structure–activity relationship (2D-QSAR) models based on multiple linear regression (MLR) and artificial neural network (ANN) algorithms. The models were built from the pEC50 values of 60 dioxin derivatives acting as AhR activators. These models predicted the toxicity of 162 new dioxin derivatives, showing a good correlation between the compounds' chemical structures and their IC50 and EC50 values.
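The regression idea behind such QSAR models can be illustrated with a minimal sketch. It assumes the standard definition pEC50 = −log10(EC50 in mol/L) and fits a single descriptor by ordinary least squares, which is the one-descriptor special case of MLR; real models use many descriptors and proper validation. The descriptor values and EC50 values below are hypothetical and are not taken from the cited study.

```python
import math


def pec50(ec50_molar):
    """Convert an EC50 in mol/L to pEC50 = -log10(EC50)."""
    return -math.log10(ec50_molar)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line on the training data."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical descriptor values and EC50s (mol/L), for illustration only.
descriptor = [1.0, 2.0, 3.0]
activity = [pec50(v) for v in (1e-6, 1e-7, 1e-8)]
slope, intercept = fit_line(descriptor, activity)
fit_quality = r_squared(descriptor, activity, slope, intercept)
```

Once fitted, the line can be used to estimate the pEC50 of an untested compound from its descriptor value, which mirrors how such models are applied to new derivatives.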

Recently, Goya-Jorge employed various machine learning algorithms to build a set of QSAR models. These models adopted AdaBoost (AdB), random forest (RF), gradient boosting (GB), support vector machine (SVM), and multilayer perceptron (MLP) classifiers to examine the AhR agonism of around 1900 compounds from synthetic and natural sources. Around 40 compounds bearing the benzothiazole scaffold were classified as AhR agonists. *In vitro* validation of these hits showed that indole derivatives, including endogenous substances, can serve as AhR ligands [80, 81]. **Table 3** reports some of the top hits emerging from different *in silico* studies.

### **5. Current challenges in modeling AhR**

Identifying novel AhR modulators using *in silico* approaches requires establishing more comprehensive computational models of this target. These models should describe the detailed organization of the different AhR domains as well as their interactions with other protein/DNA partners. While the available crystal structures provide a glimpse of these missing pieces of information, there is still more to be done in this regard. For example, all currently available AhR crystal structures deposited in the Protein Data Bank lack two important AhR domains, namely the PAS B domain and the transactivation domain [39]. The transactivation domain is essential in AhR intercellular trafficking.

On the other hand, the PAS B domain interacts with AhR ligands, which can modulate AhR activity. While homology modeling has helped construct acceptable models for this domain, the similarity of the templates used to build the PAS B domain is very low, leaving considerable doubt about their accuracy. A crystal structure of the PAS B domain would be a great leap forward towards understanding the mode of action of AhR modulators and identifying better agonists/antagonists for this important target. Furthermore, there is a knowledge gap regarding how AhR interacts with other protein partners in the inactive state, including co-chaperones, AIP, and the protein kinase SRC. This poses an additional challenge for identifying druggable pockets at their protein–protein interfaces [7, 82]. With the evident advances in methods for obtaining experimental 3D protein structures (e.g. cryo-electron microscopy (cryo-EM)), several of these structural challenges can be expected to be solved in the near future, opening new avenues for computational science to identify new AhR modulators and to help understand the functional, structural and biological characteristics of AhR more clearly.

### **6. Executive summary**

The AhR is a ligand-activated transcription factor. It regulates the expression of various genes and plays a pathophysiological role in numerous diseases. Crystallography has been employed to resolve three crystal structures containing the bHLH and PAS A domains of human and mouse origin and to identify four protein–protein interfaces. However, all these structures lack the PAS B domain, which plays a fundamental role in ligand binding to AhR. Computational and mutational studies have revealed important residues that constitute the binding pocket within the PAS B domain. Towards identifying novel AhR modulators, several virtual screening and machine learning models were constructed based on the available structural and pharmacological properties of known AhR ligands. Computational methods are extremely fast and greatly reduce the cost and time of screening millions of compounds to find those that could interact with AhR. Recent studies employing these methods against AhR have been reviewed and discussed in this chapter. We hope the literature presented here can help advance the development of novel, selective and potent AhR modulators.

### **Author details**

Farag E.S. Mosa, Ayman O.S. El-Kadi\* and Khaled Barakat\* Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB, Canada

\*Address all correspondence to: kbarakat@ualberta.ca and aelkadi@ualberta.ca

© 2021 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
