## Meet the editors

Dr. Juber Akhtar has more than 15 years of research and teaching experience. He obtained his BPharm and MPharm from Jamia Hamdard, New Delhi, India, and Manipal University, Karnataka, India. He obtained his Ph.D. from Integral University, Lucknow, India. Dr. Akhtar was nominated as the head of the Faculty of Pharmacy at Integral University. He also served as chairman cum biological scientist for the Institutional Animal Ethics Committee (IAEC). He is a member of the Causality Assessment Committee (CAC) and the Animal House facility at Integral University. He has experience teaching abroad and was previously a professor at Buraydah College of Pharmacy and Dentistry, Kingdom of Saudi Arabia (KSA). Dr. Akhtar has more than ninety publications in reputed journals to his credit. He is an academic book editor and an editorial board member for many esteemed journals. He has supervised a dozen Ph.D. and MPharm students in research projects. Dr. Akhtar's areas of research interest include the development of nanoparticulate drug delivery systems.

Dr. Badruddeen obtained both a BPharm and MPharm from Hamdard University, Delhi, India. He obtained a Ph.D. in Pharmacology from Integral University, Lucknow, India, in 2014. He has expertise in cardiovascular, neurodegenerative, and endocrine diseases. He has experience in conducting research on the safety, toxicity, and clinical efficacy of herbal and allopathic drugs. He has many original articles in reputed journals to his credit. Dr. Badruddeen is an academic book editor and an editorial board member for many esteemed journals. He has supervised a dozen research scholars and is a member of the Indian Science Congress Association, member secretary of the Institutional Animal Ethics Committee (IAEC), coordinator of the Adverse Drug Monitoring Center (AMC), chairman of the Causality Assessment Committee (CAC), and a member of the Animal House facility, Integral University.

Dr. Mohammad Irfan Khan obtained an MPharm from Hamdard University, Delhi, India. He obtained his Ph.D. in Phytochemistry from Singhania University, Jhunjhunu, India. He has expertise in Ayurvedic medicine and nutraceuticals and has worked with some of the world's major pharmaceutical companies. He has more than 12 years of experience in research and formulation development of Ayurvedic, herbal, and nutraceutical products and has successfully developed more than 100 formulations. He has more than forty original papers in international journals to his credit. Dr. Khan has supervised many research scholars at the post-graduate and Ph.D. levels.

Dr. Mohammad Ahmad holds a specialization in Pharmacology from Jamia Hamdard, New Delhi, India. He received his Ph.D. from Integral University, Lucknow, India. Presently, he is an Assistant Professor of Pharmacology at the Faculty of Pharmacy, Integral University. He has been teaching PharmD, BPharm, and MPharm students and conducting research in the field of diabetes and cancer. From 2011 to 2014, he worked as a research assistant on a project sponsored by the Council for Science and Technology, Uttar Pradesh, India. He has published more than twenty original articles in reputed journals. He is a member of the British Society of Nanomedicine, the Indian Science Congress Association, and the International NanoScience Community, Budapest, Hungary. He is also a social nominee of the Committee for the Purpose of Control and Supervision of Experiments on Animals (CPCSEA). Dr. Ahmad is actively involved in research, and his areas of interest include pharmacology and drug and pharmaceutical regulatory affairs. He is also engaged in research pertaining to diabetes mellitus, oral delivery of anticancer agents, and chemoprevention using natural bioactive compounds.

### Contents


#### **Chapter 6**

Machine Learning and Artificial Intelligence in Therapeutics and Drug Development Life Cycle *by Subhomoi Borkotoky, Amit Joshi, Vikas Kaushik and Anupam Nath Jha*

## Preface

The development of a new drug product or biologic is a long, complex, and costly process that usually takes 10 to 12 years on average. Occasionally, extra time may be required between product development and commercialization. This book presents a comprehensive overview of drug design, highlighting the steps involved from the discovery phase to product approval.

The book is divided into four sections. Section 1 discusses drug development, highlighting analytical processes, some adverse drug reactions associated with drugs in a few conditions, and repurposing of medicaments in various ailments. Section 2 discusses the topical application of drugs. It includes some pharmaceutical nanoformulations and cosmeceuticals useful in various diseases including topical diseases. Section 3 focuses on ocular drug delivery systems and the barriers encountered during the delivery of drugs in the eyes. Section 4 explores *in silico* drug development involving computer-aided drug design, machine learning, and artificial intelligence. It also examines the computational and statistical methods used to investigate and analyze chemicals in pharmaceutical medicine.

This book is written by experts in the field and is a useful resource for students, researchers, and academicians. I am thankful to all the authors for their excellent contributions.

> **Dr. Juber Akhtar** Associate Professor, Faculty of Pharmacy, Integral University, Lucknow, India

**Badruddeen, Mohammad Irfan Khan and Mohammad Ahmad**

Integral University, Lucknow, India

### Section 1

## Investigation of Medicine Progress

#### **Chapter 1**

## Introductory Chapter: Drug Development Life Cycle

*Juber Akhtar and Badruddeen*

#### **1. Introduction**

Drug development comprises all the activities involved in transforming a compound from a drug candidate (the end product of the discovery phase) into a product approved for marketing, that is, the dosage form that will be available for sale after approval by the appropriate regulatory authorities (**Figure 1**).

#### **2. Topical formulations**

This book covers the development of topical formulations based on the nanoemulgel delivery system, which fuses two different delivery systems, and elaborates on the physical state of the drug-containing nanoemulsion. A nanoemulsion, which is a thermodynamically stable system, can be transformed into a nanogel. A formulator can thus incorporate lipophilic drugs into the system for use in treatment. This technique can overcome the poor oral bioavailability and the unpredictable pharmacokinetics and absorption of various drugs. At the same time, its non-greasy nature and easy spreadability support patient compliance. Possible applications include the treatment of acne, pimples, psoriasis, fungal infections, and inflammation caused by osteoarthritis and rheumatoid arthritis.

**Figure 1.** *A brief summary on drug development.*

#### **3. The ophthalmic preparations**

The book also explains ophthalmic preparations and the various barriers affecting drug penetration and distribution inside the eye. According to the World Health Organization, the prevalence of distance and near vision impairment is increasing. Both the anterior and posterior segments of the eye are affected by various degenerative conditions. These include age-related macular degeneration and diabetic retinopathy in the posterior segment, which can cause severe vision loss. Ocular drug delivery is challenging because the eye has a number of anatomical and physiological barriers. With this in mind, one full chapter of the book elaborates on the barriers that protect the external and internal structures of the eye from the passage of drugs. These barriers make effective pharmacotherapy very difficult to attain. Many conventional dosage forms (eye drops and ointments) cannot achieve therapeutic concentrations in the posterior region of the eye, since only an extremely small amount of drug (1/100,000) reaches the retina and choroid. Investigations into novel topically applied dosage forms that may target posterior segment diseases are underway [1, 2].

#### **4. Machine learning, artificial intelligence, and computer-aided drug design (CADD)**

Machine learning, artificial intelligence (AI), and computer-aided drug design (CADD) pose new challenges for the formulation developer. A number of computational and statistical methods for analyzing biomedical entities can be identified and put into practice so that target identification becomes easy, cost-effective, and validated. To complete the drug development process, CADD can be used to attain biochemical safety and effectiveness and to avoid toxicity. The *in silico* techniques accepted in academia, industry, and government [3, 4] may lead to significant improvements in drug design and innovation. Because a huge amount of raw (primary) data is obtained during and after biological, chemical, and pharmaceutical development, there is a need for machine learning algorithms that can be optimized and applied in the field of CADD. In this way, the efficiency of drug design and discovery processes can be improved significantly. If a formulator applies computational methods and tools during drug design, discovery, and development, accurate and reliable pre-processed data can be expected [5, 6]. Furthermore, AI approaches may be useful for pre-processing huge datasets [7], for modeling them [8, 9], and for the overall design of dosage forms.

*Introductory Chapter: Drug Development Life Cycle DOI: http://dx.doi.org/10.5772/intechopen.106480*

#### **Author details**

Juber Akhtar\* and Badruddeen Faculty of Pharmacy, Integral University, Lucknow, India

\*Address all correspondence to: jakhtar@iul.ac.in

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

### **References**

[1] Lee J, Rhee YS. Ophthalmic dosage forms for drug delivery to posterior segment. Journal of Pharmaceutical Investigation. 2022;**52**:161-173

[2] Loftsson T. Topical drug delivery to the retina: Obstacles and routes to success. Expert Opinion on Drug Delivery. 2022;**19**(1):9-21

[3] Schaduangrat N, Lampa S, Simeon S, Gleeson MP, Spjuth O, Nantasenamat C. Towards reproducible computational drug discovery. Journal of Cheminformatics. 2020;**12**:9. DOI: 10.1186/s13321-020-0408-x

[4] Yang X, Wang YF, Byrne R, Schneider G, Yang SY. Concepts of artificial intelligence for computer-assisted drug discovery. Chemical Reviews. 2019;**119**:10520-10594. DOI: 10.1021/acs.chemrev.8b00728

[5] Katsila T, Spyroulias GA, Patrinos GP, Matsoukas MT. Computational approaches in target identification and drug discovery. Computational and Structural Biotechnology Journal. 2016;**14**:177-184. DOI: 10.1016/j.csbj.2016.04.004

[6] Macalino SJY, Gosu V, Hong SH, Choi S. Role of computer-aided drug design in modern drug discovery. Archives of Pharmacal Research. 2015;**38**:1686-1701. DOI: 10.1007/s12272-015-0640-5

[7] Hu YH, Lin WC, Tsai CF, Ke SW, Chen CW. An efficient data preprocessing approach for large scale medical data mining. Technology and Health Care. 2015;**23**:153-160. DOI: 10.3233/THC-140887

[8] Car J, Sheikh A, Wicks P, Williams MS. Beyond the hype of big data and artificial intelligence: Building foundations for knowledge and wisdom. BMC Medicine. 2019;**17**:143. DOI: 10.1186/s12916-019-1382-x

[9] Saez C, Garcia-Gomez JM. Kinematics of big biomedical data to characterize temporal variability and seasonality of data repositories: Functional data analysis of data temporal evolution over non-parametric statistical manifolds. International Journal of Medical Informatics. 2018;**119**:109-124. DOI: 10.1016/j.ijmedinf.2018.09.015

### **Chapter 2**

A Method for Plotting Disease Drug Analysis and Its Complications by Combining Sources of Scientific Documents Using Deep Learning Method with Drug Repurposing: Case Study Metformin

*Zahra Rezaei and Behnaz Eslami*

#### **Abstract**

Drugs for medical purposes aim to save lives and improve quality of life. Side effects, or adverse drug reactions (ADRs), in patients are an important issue in pharmacology. To prevent adverse drug effects, clinical trials are conducted during the drug production process, but these trials are very costly and time-consuming. Various text mining methods are therefore used to identify ADRs in scientific documents and articles. Using existing articles from reference websites such as PubMed to predict an effective drug for a disease is an important way to establish the drug's effectiveness. However, the effective integration of biomedical literature and biological drug network information is one of the major challenges in identifying a new drug. In this study, we use medical text documents to train the BioBERT model so that we can use it to discover potential drugs for treating diseases. With this method, we can create a graphical network of drugs and their side effects, and we can identify effective drugs that have been used in many diseases so far and could also be used effectively in other diseases.

**Keywords:** adverse drug reactions, drug repurposing, deep learning, natural language processing, social network

#### **1. Introduction**

What makes reusing old drugs worthwhile is the cost of developing a new drug [1]. That cost reaches billions of dollars and includes analysis, testing, validation, and so on. More importantly, developing and launching a new drug may take nearly 9 to 12 years.

Drug reuse is therefore important to the pharmaceutical industry because it accelerates drug development and reduces the cost of drug production, especially for pandemic diseases such as COVID-19, and applying artificial intelligence to this process is essential.

Due to the rapid growth of scientific articles in medical research, the analysis of medical textual documents using text mining methods has become very popular. The emergence of powerful deep learning methods and their maturity in text mining have opened various avenues for different types of text analysis. The main drawback of deep learning models is that training them requires a large amount of input data, which has been a serious challenge in medical topics. Fortunately, various medical sites such as PubMed allow the use of their textual data and have contributed substantially to the development of deep learning models.

Improvements in healthcare and nutrition have generated remarkable increases in life expectancy worldwide. Although our understanding of the molecular basis of the age-related morbidities that accompany longer lives has advanced quickly, effective novel treatments are still lacking. Today, reusing drugs based on text mining of valid scientific articles is important because the pharmacokinetic and pharmacodynamic characteristics of existing drugs have already been approved and validated by the scientific community, and studying their side effects and their impact on other diseases significantly saves the time and cost of generating such data. Creating new drug profiles based on previously validated drugs is a way of bypassing the drug production cycle.

Metformin is one such drug currently being investigated for novel applications. The clinical evidence is clear that metformin is prescribed in the treatment of diabetes. The aim of this research is to investigate the effects of metformin on various diseases as reflected in PubMed documents, with a focus on the reported results of using metformin to prevent various diseases.

This chapter aims to present the results reported in the medical literature on the potential of metformin to prevent or treat different kinds of disorders.

Section 2 reviews previous research on drug reuse. Section 3 presents the proposed research model and explains the model architecture. Section 4 reports the implementation results and final outputs of the proposed method.

#### **2. Related works**

Drug reuse means using a drug to treat diseases other than the one for which it was approved (for example, new uses of a drug, extension of indications, or change of indications). It includes the development of new medical applications for previously approved drugs, as well as the use and further development of drugs that are held in drug archives. This strategy is not new, but it has gained significant momentum in the last decade as approved scientific sources on drug reuse have identified side effects.

About one-third of the approvals in recent years correspond to drug repurposing, and repurposed drugs currently generate around 25% of the annual revenue for the pharmaceutical industry [2].

*A Method for Plotting Disease Drug Analysis and Its Complications by Combining Sources… DOI: http://dx.doi.org/10.5772/intechopen.107858*

Drug reuse involves identifying new uses for existing drugs. Prominent examples of the use of these methods include sildenafil and thalidomide as a result of serendipity [3].

Graphs of drugs, genes, and diseases are created and clustering methods are developed to predict new edges between drugs and diseases [4].

In another approach, disease-gene and drug-gene relationships are extracted from medical scientific documents, indirect relationships between drugs and diseases are inferred from them, and a ranking method based on drug-target similarity is proposed to rank these relationships [5].

When new drug-target interactions (DTIs) or drug-drug interactions (DDIs) [6] are predicted with machine learning techniques, there are three steps. First, input data such as drug side effects, drug chemical structures, and disease genes are preprocessed, and training data are produced through feature extraction. Second, an appropriate machine learning algorithm is trained. Third, the predictive model is applied to obtain drug repositioning results on the test dataset. Before being fed into the machine learning model, the data are transformed into a consistent, normalized format, such as computer-readable vectors and matrices. Representation learning [7] (or feature learning) is a set of techniques for transforming raw data into something that machine learning can use effectively. It is mainly divided into supervised and unsupervised approaches and extracts the properties of the input data for downstream tasks.
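The three steps above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the drug names, side-effect features, and class labels are invented, and the nearest-centroid classifier is an arbitrary stand-in for "an appropriate machine learning algorithm". It is not the method used in the works cited.

```python
from collections import defaultdict
from math import dist

# Hypothetical toy data: side effects per drug and a made-up class label.
TRAIN = {
    "drug_a": ({"nausea", "diarrhea"}, "antidiabetic"),
    "drug_b": ({"nausea", "lactic_acidosis"}, "antidiabetic"),
    "drug_c": ({"dizziness", "cough"}, "antihypertensive"),
    "drug_d": ({"dizziness", "fatigue"}, "antihypertensive"),
}


def build_vocabulary(records):
    """Step 1a: collect every feature seen in training, in a fixed order."""
    return sorted({feat for feats, _ in records.values() for feat in feats})


def to_vector(features, vocabulary):
    """Step 1b: encode a feature set as a binary, machine-readable vector."""
    return [1.0 if term in features else 0.0 for term in vocabulary]


def train_centroids(records, vocabulary):
    """Step 2: 'training' here only averages the vectors of each class."""
    grouped = defaultdict(list)
    for feats, label in records.values():
        grouped[label].append(to_vector(feats, vocabulary))
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(features, centroids, vocabulary):
    """Step 3: assign the class whose centroid is closest (Euclidean)."""
    vector = to_vector(features, vocabulary)
    return min(centroids, key=lambda label: dist(vector, centroids[label]))


vocabulary = build_vocabulary(TRAIN)
centroids = train_centroids(TRAIN, vocabulary)
print(predict({"nausea"}, centroids, vocabulary))
```

A real repositioning model would replace the binary side-effect vectors with learned representations and the centroid rule with a trained classifier; the sketch only shows how the three steps hand data to one another.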

#### **3. Material and methods**

We searched PubMed for relevant publications using metformin as the keyword. The data for this research are documents and scientific articles written in English between 1994 and 2020. We applied the BioBERT named entity recognition (NER) method to these data.

The NER method used comprises three main phases (**Figure 1**). First, textual documents from PubMed are entered as input data. Second, a preprocessing phase improves the data. Third, the data are split into training and test sets, and NER with a deep learning algorithm, the BioBERT method, is run to extract patterns.
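The split into training and test sets in the third phase can be sketched as follows. The chapter does not state the split ratio or how documents were shuffled, so the 80/20 ratio, the fixed seed, and the placeholder abstract names are assumptions.

```python
import random


def split_train_test(documents, test_fraction=0.2, seed=0):
    """Shuffle a copy of the documents and split it into train and test lists."""
    shuffled = list(documents)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder identifiers standing in for preprocessed PubMed abstracts.
abstracts = [f"abstract_{i}" for i in range(10)]
train_docs, test_docs = split_train_test(abstracts)
print(len(train_docs), len(test_docs))
```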

#### **3.1 Data sources**

We retrieved 18,000 publications from PubMed using metformin as the keyword. The abstracts of 16,000 of them were analyzed by NER BioBERT. The search covered studies carried out between 1994 and 2020.
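One way such a search could be expressed is through the NCBI E-utilities ESearch endpoint. The chapter does not say which interface was used, so the sketch below is an assumption; it only builds the request URL for the keyword and date range given above and sends nothing over the network. The helper name `build_pubmed_query` and the `retmax` value are illustrative.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(term, first_year, last_year, retmax=100):
    """Build an ESearch URL for a keyword restricted to a publication-date range."""
    params = {
        "db": "pubmed",          # search the PubMed database
        "term": term,            # keyword, here "metformin"
        "datetype": "pdat",      # filter on publication date
        "mindate": str(first_year),
        "maxdate": str(last_year),
        "retmax": str(retmax),   # number of IDs returned per request
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_pubmed_query("metformin", 1994, 2020)
print(url)
```

Retrieving thousands of records would additionally require paging through the results and respecting NCBI's usage limits, which the sketch leaves out.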

**Figure 1.** *The workflow of the proposed model-based strategy.*

#### **3.2 Preprocessing**

The preprocessing of comments in both datasets was performed as follows:


#### **3.3 Deep classification**

BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining) is a domain-specific language model pretrained on a large-scale biomedical corpus. With this architecture, BioBERT [8] transfers knowledge from a large number of biomedical documents to biomedical text mining models with minimal modification of the architecture. Whereas BERT performs comparably to previous models, BioBERT clearly outperforms them on the biomedical text mining tasks used here: biomedical named entity recognition and biomedical clustering based on the effect of metformin.

We examined different diseases and the trend of reported metformin effects in publications over several years, based on the drug's effect on various diseases.

We investigated publications on the association between metformin and type 1 and type 2 diabetes. Separately, we explored publications on the effect of metformin on other diseases.

PubTator [9] and BEST [10] are two sources that can automatically extract compounds and proteins from PubMed or PubMed Central (PMC). However, neither can extract the combined, interactive relationships between a drug and a disease. To address this, we built a pipeline that uses NER to identify studies containing DTIs and to extract the related data. We then trained the BioBERT model on known studies containing DTIs and used this model to predict new drug studies.

Formally, consider an input sentence X = {x1, x2, …, xN}, where xi is the i-th word/token and N denotes the length of the sentence. The objective of NER is to classify each word/token in X and assign it a corresponding label y ∈ Y, where Y is a predefined list of all possible label types (e.g., CHEMICAL for drugs, DISEASE for diseases).
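The token-level formulation above can be made concrete with a small sketch. It assumes the common BIO tagging convention (B- opens an entity, I- continues it, O marks other tokens), which the chapter does not specify, and the tags are written by hand rather than produced by BioBERT.

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    entities = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that was still open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" tag, or an I- tag that does not match the open entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hand-labelled example sentence; x_i are the tokens, y_i the tags.
tokens = ["Metformin", "is", "prescribed", "for", "type", "2", "diabetes", "."]
tags = ["B-CHEMICAL", "O", "O", "O", "B-DISEASE", "I-DISEASE", "I-DISEASE", "O"]
print(decode_bio(tokens, tags))
```

In a full pipeline the `tags` list would come from the fine-tuned model's per-token predictions; the decoding step that turns tags into drug and disease mentions stays the same.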

Additionally, this structure was used after preprocessing to identify the relationship between the drug and the disease. In future research, we are going to create this graph of relationships and use the number of drug references to a chemical structure as a weight to discover drug relationships.
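A weighted relationship graph of this kind could be sketched as below. The reading of "weight" used here, the number of abstracts in which a drug and a disease are mentioned together, is only one possible interpretation of the plan described above, and the annotated abstracts are invented examples rather than study data.

```python
from collections import Counter


def build_weighted_edges(annotated_abstracts):
    """Count, for each (drug, disease) pair, the abstracts mentioning both."""
    edges = Counter()
    for drugs, diseases in annotated_abstracts:
        # Sets ensure each pair is counted once per abstract.
        for drug in set(drugs):
            for disease in set(diseases):
                edges[(drug, disease)] += 1
    return edges


# Hypothetical NER output: (drug mentions, disease mentions) per abstract.
ner_output = [
    (["metformin"], ["type 2 diabetes"]),
    (["metformin", "insulin"], ["type 2 diabetes"]),
    (["metformin"], ["polycystic ovary syndrome"]),
]
edges = build_weighted_edges(ner_output)
for (drug, disease), weight in edges.most_common():
    print(drug, disease, weight)
```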

#### **4. Result**

A few detailed studies have examined the relationship between metformin and treatment outcomes in different diseases. Moreover, preclinical reports and reliable biological pathways that clarify the molecular mechanism of metformin are known and are addressed in our research work. Nevertheless, the key open question is how effective metformin is against nondiabetic disorders.

**Figure 2.** *The word-cloud of the BioBERT model in drugs.*

**Figure 3.** *The word-cloud of the BioBERT model in disease.*

**Figure 4.** *The word-cloud of the BioBERT model in disease.*

**Figure 5.** *The word-cloud of the BioBERT model in disease.*

Metformin is the generic name of a drug that is produced and supplied under different brand names such as Metformex, Glucophage, and so on. As shown in **Figure 2**, the drugs extracted from the authoritative scientific articles belong to drug groups related to diabetes and to some other drug groups. Several classes of drugs are used to control diabetes, and the members of each class have similar functions. One of these drug classes is the biguanides. Metformin, the only member of this drug group, works in three ways:


Diabetes medications are generally prescribed to lower blood glucose. The articles, for example, refer to synthetic alternatives and antidiabetic drugs that can reduce perfusion or kidney function, exacerbate antihypertensive effects, exacerbate metabolic acidosis, and so on (**Figures 3**–**5**).

According to the NER BioBERT model, of the 16,781 articles retrieved from PubMed and analyzed in this study, 6185 papers refer to type 2 diabetes and 221 papers refer to type 1 diabetes. Type 2 diabetes is a chronic disease characterized by high levels of sugar in the blood. It is also called type 2 diabetes mellitus or adult-onset diabetes. In addition, 2388 papers used metformin in type 2 diabetes mellitus, and 1178 articles did not mention any disease at all. Therefore, if the articles identify an effect of metformin on another disease, the drug becomes a candidate for the treatment of that disease. What matters in this analysis is that it maps the disease to the drug, so that analyzing a large volume of authoritative articles can reveal uses of an approved drug in the treatment of other diseases.
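The counts above can be put into proportion with a short calculation. The figures are taken from the text; the category labels are paraphrased, and because the categories may overlap, the shares are reported individually and are not summed.

```python
TOTAL_ARTICLES = 16_781

# Article counts as reported in the text.
counts = {
    "type 2 diabetes": 6_185,
    "type 1 diabetes": 221,
    "metformin used in type 2 diabetes mellitus": 2_388,
    "no disease mentioned": 1_178,
}


def share(count, total=TOTAL_ARTICLES):
    """Return the percentage of all analyzed articles, to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(n) for label, n in counts.items()}
for label, percent in shares.items():
    print(f"{label}: {percent}%")
```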

#### **5. Conclusion**

Experimental results on drugs and diseases using advanced deep learning models such as BERT show that integrating pretrained biomedical language representation models (i.e., BERT and BioBERT) into a pipeline of information extraction methods with multitask learning can improve the ability to collect drug repurposing knowledge from PubMed.

Hitherto, clinical trials have not provided a clear answer to this question, and the role of metformin in the treatment or prevention of disease remains hypothetical. As a next step, we will extract the associations between diabetes and other relevant diseases with respect to the administration of metformin as treatment.

### **Acknowledgements**

The authors have no proprietary, financial, professional, or other personal interest of any nature in any product, service, or company. There is no conflict of interest in this study.

#### **Author details**

Zahra Rezaei<sup>1</sup>\* and Behnaz Eslami<sup>2</sup>

1 Department of Statistics and Information Technology, Institute of Judiciary, Tehran, Iran

2 Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran

\*Address all correspondence to: z.rezaei2010@gmail.com

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **References**

[1] Dickson M, Gagnon JP. The cost of new drug discovery and development. Discovery Medicine. 2009;**22**(4):172-179

[2] Naylor S, Kauppi MJ, Schonfeld JM. Therapeutic drug repurposing, repositioning and rescue part II: Business review. Drug Discovery World. 2015;**16**:57-72

[3] Liu Z, Fang H, Reagan K, Xu X, Mendrick DL, Slikker W Jr, Tong W. In silico drug repositioning: What we need to know. Drug Discovery Today. 2013;**18**:110-115

[4] Sun P, Guo J, Winnenburg R, Baumbach J. Drug repurposing by integrated literature mining and drug-gene-disease triangulation. Drug Discovery Today. 2017;**22**:615-619

[5] Yang H-T, Ju J-H, Wong Y-T, Shmulevich I, Chiang J-H. Literature-based discovery of new candidates for drug repurposing. Briefings in Bioinformatics. 2017;**18**:488-497

[6] Zhu S, Bai Q, Li L, Xu T. Drug repositioning in drug discovery of T2DM and repositioning potential of antidiabetic agents. Computational and Structural Biotechnology Journal. 2022;**20**:2839-2847

[7] Bengio Y, Courville A, Vincent P. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2013;**35**(8):1798-1828

[8] Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, et al. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2020;**36**(4):1234-1240

[9] Wei C-H, Kao H-Y, Lu Z. PubTator: A web-based text mining tool for assisting biocuration. Nucleic Acids Research. 2013;**41**:W518-W522

[10] Lee S, Kim D, Lee K, Choi J, Kim S, Jeon M, et al. BEST: Next-generation biomedical entity search tool for knowledge discovery from biomedical literature. PLoS One. 2016;**11**(10):e0164680

Section 2
