**Introduction**

**Chapter 1**

**Introductory Chapter: Making Health Care Smart**

DOI: 10.5772/intechopen.78993

Thomas F. Heston

Additional information is available at the end of the chapter

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**1. Introduction**

The age of eHealth and Smart Medicine is upon us, but what exactly does this mean? As technology advances, we are able to create electronic devices that collect and analyze data, electronic communication methods that alert health care providers immediately when adverse events arise, and electronic algorithms that help automate and speed up clinical decision-making.

A primary driver of smart medicine is wearable technology. These electronic devices enable the collection of important medical data. Combining wearable devices such as heart rate monitors, pulse oximeters, and sleep monitors with blockchain technology allows this patient information to be recorded accurately, remain immutable over time, and interact with algorithms designed to improve medical diagnosis and treatment. Wearable technology is already well developed. Blockchain technology makes it possible to make these devices interoperable with electronic medical records in a way that allows smart execution of health care protocols.

Satoshi Nakamoto set forth the initial implementation of blockchain technology in the white paper "Bitcoin: a peer-to-peer electronic cash system" in 2008 [1]. This white paper presented a method to create an Internet-based currency that did not require a trusted third-party intermediary such as a bank, government, or Federal Reserve. Instead of relying on such an intermediary, the blockchain method used computers connected to the Internet to confirm transactions in a manner that would prevent malicious hacking, cheating, or double-spending. Bitcoin was subsequently created, with the first transaction occurring in January 2009. Nakamoto's blockchain method, which serves as the foundation for bitcoin, has proven to be widely successful, with the market capitalization of bitcoin as of early 2018 equal to approximately \$150 billion USD.

The backbone for bitcoin is a simple blockchain of transactions that is immutable and secure due to a global distributed network of computer nodes (also known as miners) that confirms new transactions and secures old transactions. This distributed ledger technology works well, powering about \$2 billion USD in transactions per day, with a total number of financial transactions to date of over 300 million [2, 3]. The success of bitcoin has created a wide expansion of blockchain technology, to the point where distributed computers around the world now confirm smart contracts [4], provide cloud storage [5], and facilitate communications between small devices (e.g., wearable wrist health bands) that make up the Internet of Things [6].
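As a rough illustration of why a chain of blocks resists silent edits, the following minimal Python sketch links each record to the hash of the record before it. It is a simplification under stated assumptions: the names `Block`, `append_block`, and `is_valid`, as well as the wearable payload fields, are hypothetical, and the sketch omits mining, consensus, and networking entirely, so it should not be read as how bitcoin itself is implemented.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List

# Placeholder "previous hash" for the first block (an assumption of this sketch).
GENESIS_HASH = "0" * 64


@dataclass
class Block:
    index: int
    prev_hash: str
    payload: dict  # hypothetical record, e.g. a wearable reading

    def digest(self) -> str:
        # Serialize with sorted keys so the same content always hashes the same way.
        body = json.dumps(
            {"index": self.index, "prev_hash": self.prev_hash, "payload": self.payload},
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain: List[Block], payload: dict) -> List[Block]:
    """Return a new chain with one more block linked to the current tip."""
    prev = chain[-1].digest() if chain else GENESIS_HASH
    return chain + [Block(len(chain), prev, payload)]


def is_valid(chain: List[Block]) -> bool:
    """Check that every block points at the hash of its predecessor."""
    expected = GENESIS_HASH
    for position, block in enumerate(chain):
        if block.index != position or block.prev_hash != expected:
            return False
        expected = block.digest()
    return True


chain: List[Block] = []
for bpm in (72, 75, 140):
    chain = append_block(chain, {"device": "wrist-band", "heart_rate_bpm": bpm})

# Rewriting an earlier reading changes its hash, so the next link no longer matches.
tampered = [Block(0, GENESIS_HASH, {"device": "wrist-band", "heart_rate_bpm": 60})] + chain[1:]
print(is_valid(chain), is_valid(tampered))
```

In this toy version a change to the most recent block would go unnoticed, because nothing yet points at it. In a real network the distributed nodes, which each hold and extend the chain, are what close that gap.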

Through the integration of electronic devices with blockchain technology, the utility of wearable monitors increases tremendously [7]. By creating an immutable, trusted ledger of patient data, blockchain technology not only allows monitors to trigger human responses but also collects important physiologic information that can be analyzed later both by human doctors and by "digital doctors," i.e., smart algorithms that trigger actions based upon the input. In the blockchain world, these smart algorithms that trigger actions are called smart contracts [8].

Digital doctors can serve multiple purposes. First of all, alarms set off by existing monitors in most hospitals can be missed, for example when a busy medical ward makes the alarm harder to hear. Monitors displaying bells or popups are only effective when a human is actively monitoring the screen in a focused, non-distracted way. Digital doctors, however, act according to algorithms, which execute instantaneously. Digital doctors do not get distracted, they do not require sleep, and they have an infinite attention span.

What can these digital doctors do, and why do they require blockchain technology? First of all, digital doctors can instantly initiate codes. For example, a "code sepsis" could be initiated instantly whenever a patient's vital signs become unstable, and a "rapid response code" could be initiated instantly whenever the cardiac monitor displays a malignant arrhythmia. In some cases, these digital doctors could act without human intervention (as is already done with wearable insulin pumps and implanted cardiac defibrillators), and in other cases they could trigger initiation of a medical treatment protocol that would require physician review before being implemented.

The key to digital doctors becoming useful and effective is trustworthy, accurate, immutable, and private data. Medical care requires accurate collection of patient health data. Scans must be done properly, blood tests must be processed appropriately, and real-time monitors must be calibrated. This is where blockchain technology can really help, because it allows the prompt collection of data that is trusted and immutable. Recording data for digital doctors in a centralized database would result in a system with a single point of failure, whether the cause is an electricity failure or a human hacker. Blockchain technology, on the other hand, would make the data more interoperable by ensuring it is readily accessible to digital doctors. It would make the data more reliable through blockchain consensus mechanisms that are strongly resistant to hacking. It would also make the data easier to audit for quality improvement purposes. Finally, through the cryptography inherent in blockchain technology, patient confidentiality is prioritized [9].

Blockchain technology creates trustworthy data that is reliably stored, easily accessed, and resistant to corruption. Wearable technology such as heart rate monitors, bed monitors, and pulse oximeters collects important information that, when entered into a blockchain ledger, can be processed by digital doctors that not only can be programmed by expert physicians but can ultimately learn and improve through artificial intelligence. As we have seen in the home, the Internet of Things (IoT) has led to considerable advances in the creation of smart homes. Now, this technology is being applied to monitoring health with wrist monitors, blood glucose monitors, temperature monitors, and more. The time is right for having not only smart homes but also smart hospitals. IoT along with blockchain technology is leading the way.

#### **Author details**

Thomas F. Heston

Address all correspondence to: tom.heston@wsu.edu

Elson S. Floyd College of Medicine, Washington State University, Spokane, Washington, USA

#### **References**

[1] Nakamoto S. Bitcoin: A Peer-to-Peer Electronic Cash System [Online]. 2008. Available from: https://bitcoin.org/bitcoin.pdf [Accessed: June 7, 2018]

[2] Bitcoin Now Processes \$2 Billion Worth of Transactions per Day, A 10x Increase in 2017 [Internet]. Available from: https://www.forbes.com/sites/ktorpey/2017/11/20/bitcoinnow-processes-2-billion-worth-of-transactions-per-day-a-10x-increase-in-2017/#6cd1eed32fba [Accessed: May 21, 2018]

[3] Total Number of Transactions [Internet]. Blockchain.info. Available from: https://blockchain.info/charts/n-transactions-total?timespan=all [Accessed: May 20, 2018]

[4] Buterin V. Visions, Part 1: The Value of Blockchain Technology [Internet]. Ethereum Blog. 2015. Available from: https://blog.ethereum.org/2015/04/13/visions-part-1-the-value-of-blockchain-technology/ [Accessed: July 13, 2017]

[5] Harper C. What is Siacoin? A Beginner's Guide to Decentralized Cloud Storage [Internet]. CoinCentral. 2018. Available from: https://coincentral.com/siacoin-beginner-guide/ [Accessed: May 21, 2018]

[6] Mark. What Is IOTA? [Internet]. The Merkle. 2018. Available from: https://themerkle.com/what-is-iota-cryptocurrency/ [Accessed: May 19, 2018]

[7] CREDITS. Blockchain is the key to the Internet of Things in medicine [Internet]. 2018. Available from: https://medium.com/@credits/blockchain-is-the-key-to-the-internet-of-things-in-medicine-71668652d88b [Accessed: May 09, 2018]

[8] Shaik K. Why blockchain and IoT are best friends [Internet]. Blockchain Unleashed: IBM Blockchain Blog. 2018. Available from: https://www.ibm.com/blogs/blockchain/2018/01/why-blockchain-and-iot-are-best-friends/ [Accessed: May 08, 2018]

[9] Snell E. Why Blockchain Technology Matters for Healthcare Security [Internet]. 2016. Available from: https://healthitsecurity.com/features/why-blockchain-technology-matters-for-healthcare-security [Accessed: April 12, 2018]

**Section 2**

**Fundamentals**

6 eHealth - Making Health Care Smarter

**Chapter 2**

**Terminology Services: Standard Terminologies to Control Medical Vocabulary. "Words are Not What they Say but What they Mean"**

DOI: 10.5772/intechopen.75781

Daniel Luna, Carlos Otero, María L. Gambarte and Julia Frangella

Additional information is available at the end of the chapter

#### **Abstract**

Data entry is an obstacle to the usability of electronic health record (EHR) applications and to their acceptance by physicians, who prefer to document using "free text". Natural language is vast and very rich in detail, but at the same time it is ambiguous: it depends heavily on context and uses jargon and acronyms. Healthcare information systems should capture clinical data in a structured and preferably coded format. This is crucial for data exchange between health information systems, epidemiological analysis, quality and research, clinical decision support systems, administrative functions, etc. To address this point, numerous terminological systems for the systematic recording of clinical data have been developed. These systems interrelate the concepts of a particular domain and provide references to related terms and possible definitions and codes. The purpose of terminology services is to represent facts that happen in the real world so that they can be managed in a database. This is what enables semantic interoperability, which implies that different systems understand the information they are processing through the use of codes from clinical terminologies. Standard terminologies allow medical vocabulary to be controlled. But how do we do this? What do we need? Terminology services are a fundamental piece of health data management in the health environment.

**Keywords:** terminology server, interface vocabulary, controlled vocabularies, semantic interoperability, standard terminology

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **1. Introduction**

Recently, major healthcare stakeholders around the world have emphasized the importance of establishing electronic health records (EHR) for all health care institutions. Their goals for doing so include increasing patient safety, reducing medical errors, improving efficiency and reducing costs [1, 2]. Everyday practical data entry, presentation and document retrieval for clinical tasks must be taken into account, so that the differences between the needs of users and the capabilities of available software are addressed. Data entry is an obstacle to the adoption of EHR with structured data methods and to their acceptance by healthcare providers, who prefer to document healthcare findings, processes and outcomes using unfettered "free text" or narrative text in natural language [3]. Natural language is vast and very rich in detail but at the same time ambiguous: it depends heavily on context, it uses jargon and acronyms, and it lacks rigorous definitions.

#### **1.1. The importance of narrative**

Free text narrative formats allow physicians to share complex ideas in an efficient and effortless manner. Their use in electronic health records allows physicians to synthesize facts and to paint a full picture, rich with meaning, that can be easily interpreted by other health care providers [3]. Among physicians' motivations for recording information, the main one is their own later use of it. Many current systems that provide EHRs use template-based systems in order to capture structured data elements in databases. Structured data entry does not support the expressiveness and flexibility to which clinicians are accustomed, and it can be difficult to interpret and reconstruct meaning from structured data due to the loss of contextual information [4]. To represent medical knowledge, it is necessary to represent a patient's data from different sources, including the problem list and sometimes progress notes, procedures, the medication list, laboratory and complementary test results, social determinants of health, environmental information, people's decisions about health and medical treatments, genomics and proteomics, etc. As a result, ambiguities must be resolved and vocabulary standardized.

#### **1.2. The need of a standard codification system**

To accomplish this, EHR should capture clinical data in a structured and preferably coded format. To codify is defined as "to reduce to a code" [5]. Codes are usually numeric or alphanumeric. In order to represent facts that happen in the real world so that they can be managed in a database, the need for a standard codification system (SCS) arises. Evans et al. stated that the medical community required a "common, uniform, and comprehensive approach to the representation of medical information" [6].

This SCS should be able to capture clinical findings, index medical records, index medical literature, represent medical knowledge, etc. Whenever possible, the codification should be one-to-one: only one term should exist for a given object, and each term should describe only one object. The aim is to avoid ambiguity through polysemy or homonymy [5].

In fact, many SCS have been proposed, but their adoption has been slow and incomplete. System developers generally indicate that, while they would like to make use of standards, they cannot find one that meets all their needs. Each author who expresses a need for a controlled vocabulary does so with a particular purpose in mind, so such a vocabulary is expected to satisfy many different requirements [7, 8]. For all these reasons, there has long been a discussion regarding the use of free text versus structured text for data entry in EHR that must later be codified. Free text has the advantage of allowing health care providers to express themselves freely, but it has the disadvantage of requiring an arduous codification process before further analysis is possible. Structured text allows a quick codification process but has the disadvantages of being time consuming for the physician and of constraining expression to the level of detail of the selected entry terminology [9]. It has been suggested that the tension between clinical usability and meticulous knowledge representation may result from a fundamental conflict between the needs of humans and those of computer programs that use terminologies [3].

#### **1.3. Primary coding versus secondary coding**

Ideally, clinical data should be coded by the practitioner at the time of the consultation, which is known as primary coding, so that practitioners can use their knowledge of the patient's situation while being aware of the limitations set by the selected classification system [10]. However, there are several practical difficulties in implementing primary coding. It is time consuming for the practitioner and requires major training efforts to ensure that the same code would be chosen in the same situation by different physicians [10]. It also limits the physician's expression in the record and creates high levels of resistance to its use. Enforcing mandatory as opposed to optional modifier codes results in lower rates of incomplete coding [11]. One answer to this problem is centralized secondary coding, where a small number of trained persons codify the narrative text recorded by the physicians taking care of the patients. Centralized secondary coding by non-medical coders has proved to be reliable and can be used for coding medical problems from an electronic problem-oriented medical record [10]. As regards the coding tool, manual versus computerized, it has been demonstrated that the use of a computerized coding tool can save time and result in higher quality coding. A study comparing the two showed that manual coding takes 100% longer [11]. It is important to bear in mind that the time spent on coding may be underestimated when we look at individual coding times instead of the whole task of processing a clinical scenario [11]. It has also been demonstrated that the completeness of coding can be improved by using a computerized coding tool [11]. A further step forward is to achieve text autocoding, allowing free or narrative text entry together with dynamic interaction of the information system at the time of entering data.

As a result, the challenge consists of finding the complex balance between the freedom of free text and the benefits of structured text for data entry in EHR. To answer this need, interface terminology and a terminology server were developed. It is crucial to highlight that communication is successful only if the sender and the receiver both know the language (code) and the context. Notice the importance of the context, which must be read in an identical manner by both parties involved [5].

#### **1.4. But again, why do we need to codify in electronic health records?**

Some of the aims of coding in EHR are:


Ideally, clinical data should be coded by the practitioner at the time of the consultation, which is known as primary coding, so that practitioners can use their knowledge of the patient's situation while being aware of the limitations of the selected classification system [10]. However, there are several practical difficulties in establishing primary coding. It is time consuming for the practitioner and requires major training efforts to ensure that different physicians would choose the same code in the same situation [10]. It also limits the physician's expression in the record and creates high levels of resistance to its use. Enforcing mandatory as opposed to optional modifier codes results in lower rates of incomplete coding [11]. One answer to this problem is centralized secondary coding, where a small number of trained persons codify the narrative text recorded by the physicians taking care of the patients. Centralized secondary coding by non-medical coders has proved to be reliable and can be used for coding medical problems from an electronic problem-oriented medical record [10]. As regards the coding tool, manual versus computerized, it has been demonstrated that a computerized coding tool can save time and result in higher quality coding. A study comparing the two showed that manual coding takes 100% longer [11]. It is important to bear in mind that time spent on coding may be underestimated when we look at individual coding times instead of the whole task of processing a clinical scenario [11]. It has also been demonstrated that the completeness of coding can be improved by using a computerized coding tool [11]. A further step is text autocoding, which allows free or narrative text entry together with dynamic interaction with the information system at the time of data entry.

As a result, the challenge consists of finding the complex balance between the freedom of free text and the benefits of structured text for data entry in EHR. To answer this need, interface terminologies and terminology servers were developed. It is crucial to highlight that communication is successful only if the sender and the receiver both know the language (code) and the context. Note the importance of the context, which must be read in an identical manner by both parties involved [5].
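As a rough illustration of text autocoding, the sketch below normalizes a free-text entry and looks it up in a small table of descriptions that point to concepts. The table entries, the `LOCAL-…` identifiers and the function names are hypothetical, invented for this example; an actual terminology server would rely on much richer matching and on curated content.

```python
import unicodedata
from typing import Optional

# Hypothetical interface terminology: each description (one way of naming a
# clinical entity) points to a made-up local concept identifier.
DESCRIPTIONS = {
    "high blood pressure": "LOCAL-0001",
    "hypertension": "LOCAL-0001",
    "htn": "LOCAL-0001",
    "heart attack": "LOCAL-0002",
    "myocardial infarction": "LOCAL-0002",
}


def normalize(text: str) -> str:
    """Lowercase, strip accents and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def autocode(free_text: str) -> Optional[str]:
    """Return the concept identifier for a free-text entry, or None.

    None means no description matched: the entry stays as narrative text
    and would be left for secondary coding by a trained coder.
    """
    return DESCRIPTIONS.get(normalize(free_text))
```

In this sketch the physician keeps typing free text, and only entries that match a known description are coded at the time of data entry; everything else falls back to secondary coding.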

#### **1.4. But again, why do we need to codify in electronic health records?**

Some of the aims of coding in EHR are:

• to support health services research: coding can promote quality of care by providing a link to medical knowledge and current publications that can be used for outcome measurement.

• to enable the use of decision support programs at the point of clinical care: a computer-based EHR system might work with a diagnostic expert system to back physicians' decisions. To achieve optimal integration, the transfer of patient information from the EHR to the diagnostic expert system would need to be automated. The major barrier to doing so is the variance between the controlled vocabularies of the two systems [7, 8, 12].

• to exchange data between health information systems: here the concept of semantic interoperability arises. We define it as the ability of different systems to understand the information they are processing through the use of codes of clinical terminologies.

• for epidemiological analysis: coded data can be used by patients, physicians, researchers, quality control and management personnel, and by other administrative functions such as accounting, billing and coding personnel.

For the process of codifying, medical information systems rely on vocabularies (artifacts that describe and systematize the meanings of terms), with the common distinctions between terminologies (which provide standardized meanings), thesauri (which introduce semantic relations between groups of terms) and classifications (which introduce exhaustive partitions for statistical purposes). Some of them are used at an international level, while others have been defined according to local needs (for more information, see **Table 1**).

| Term | Definition |
| --- | --- |
| Terminology | Collections of words or phrases, called terms, aggregated in a systematic fashion to represent the conceptual information that makes up a given knowledge domain. |
| Thesaurus | List of terms created from free text inputs extracted from the clinical data repository. The terms included in the thesaurus are divided into concepts (real clinical entities) and descriptions (different ways of naming these clinical entities). The thesaurus has capabilities to reject invalid terms already flagged as not appropriate for the intended use [9]. |
| Classification System | Intended for classification of clinical conditions and procedures to support statistical data analysis across the healthcare system. |
| Clinical coding | Designating descriptions of diseases, injuries and procedures into numeric or alphanumeric designations. It involves the use of an EHR as the source for determining code assignment [19]. Primary coding: clinical data are coded by the practitioner at the time of the consultation [10]. Secondary coding: a reduced number of trained persons codify the narrative text recorded by the physicians [10]. |
| Semantic Interoperability | It refers to human interpretation of the content. There is a common comprehension among people about the meaning of the information that is being exchanged (correct interpretation is guaranteed; for this reason formal definitions of each entity, attribute, relationship, restriction and exchanged term are needed) [20]. |

**Table 1.** Introduction's definition summary.

While many terminologies have been developed, no single terminology has been accepted as a universal standard for the representation of clinical concepts. Instead, individual terminologies or components have been identified by standards organizations as candidates for specific uses [13]. The recommended terminologies include the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), Logical Observation Identifiers Names and Codes (LOINC) and the Unified Medical Language System (UMLS) [14], among many others. These will be discussed in the following sections [15, 16].

In the nineteenth century, the advancement of clinical pathology and technology changed the framework of classification, moving the emphasis from the patient's experience to phenomena determined by the physician using diagnostic procedures [17].

The diagnostic entities in medicine are changing as a consequence of expanded and revised knowledge of the functions of the human body. Techniques for differential diagnostic strategies contribute to new categories, while old labels are gradually abandoned. There is a need to acknowledge the potency of classification systems as dynamic tools for medical practice and research [17]. In line with these changes, the importance of "concept orientation" in terminology construction arose during the twentieth century. Concept orientation allows a terminology to be helpful in several situations, expressed in different languages and easily evaluated for quality [18]. The transition from Classification Systems to Reference Terminology was not only a change in institutions' choices; each of them also came to have its own defined purposes, potential functions, strengths and limitations.

In the following sections, we will briefly present Classification Systems, Reference Terminology and Interface Terminology. Finally, we will present our experience developing and implementing our Terminology Server.

#### **2. Classification systems**

A classification is "a system that arranges or organizes like or related entities" [21] (for more information, see **Table 1**). Classifications provide a useful framework for the systematic representation and codification of medical concepts. Monoaxial classifications form a hierarchy of terms based on a common root. The most commonly used example of a monoaxial hierarchical classification is the International Classification of Diseases, published by the World Health Organization; the Tenth Revision, Clinical Modification and the Procedure Coding System (ICD-10-CM/PCS) are examples of clinical classification systems. They have been designed to provide outputs in the form of reports and statistics [5, 22, 23]. Multi-axial or multifaceted classifications combine terms belonging to different classes that may themselves be organized in a hierarchy. SNOMED is an example of this type of classification [5]. Classification systems are intended for the classification of clinical conditions and procedures to support statistical data analysis across the healthcare system. Their categories are mutually exclusive and exhaustive, and they can provide standards for comparisons of health statistics at national and international levels. They have been used:

• for public health reporting,

• to improve quality of care assessment,

• to support other applications in healthcare including reimbursement,

• education,

• research,

• performance monitoring [21–23].

#### **2.1. International classification of diseases background**

Work on classification systems began in the middle of the seventeenth century with John Graunt's refinement of the late sixteenth-century classification scheme for the London Bills of Mortality [24, 25]. The International Classification of Diseases (ICD) was first adopted in Paris in 1900 [24, 25]. Architecturally, the ICD has not fundamentally changed from the sixteenth-century model of the London Bills, in that each new code is added as a new row in a single list. The United States did not adopt ICD-10 until 2015. Moreover, when ICD-10 was introduced it existed only as a printed book; it was not published in electronic format [26].

In Australia, the National Center for Classifications in Health at the University of Sydney was the first group to migrate ICD-10 into an electronic format [26].

By 2005, the WHO-FIC network, chartered by the World Health Organization (WHO) and comprising national centers for classification around the world, had created an international forum to advise on the content and evolution of WHO's Family of International Classifications (WHO-FIC), which includes ICD and the International Classification of Functioning (ICF). Currently, the WHO manages, together with WHO-FIC, a web-based electronic revision and update platform [26]. ICD-10-CM/PCS is an output system that was designed for general reporting purposes, public health surveillance, administrative performance monitoring, and reimbursement of healthcare services [22].

The ICD was developed to code death certificates, but its use was extended to include a large range of statistical reporting. ICD-10 has been used since the 1990s to collect mortality statistics around the world. The WHO defines coding as "the translation of diagnoses, procedures, co-morbidities and complications that occur over the course of a patient's encounter from medical terminology to an internationally coded syntax" [21]. According to this definition, the ICD system can be used for clinical coding and classification to enable international comparisons of mortality and morbidity statistics [27].

ICD-10-CM/PCS coding used to be performed by professional coders, who manually assigned codes to patients' diagnoses and procedures. Nowadays, coders use computer-assisted coding applications. These applications can facilitate accurate and efficient coding by automatically suggesting codes based on the clinical documentation in the EHR system. Thus, ICD-10-CM/PCS coding is semi-automated at best and requires human intervention to either assign or validate selected codes [27].

#### **2.2. Diagnosis-Related Groups (DRGs)**

With the advent of capitated payments came the inevitable need to objectively determine severity of illness, or case mix, in order to adjust capitated payments appropriately. As outlined above, traditional disease classifications such as the ICD did not include explicit severity of illness parameters; all that could be done was to infer disease severity on the basis of co-morbidity [26]. However, there may be no evidence demonstrating causality between the condition of interest and the co-morbidities. Case mix required some objective metric, and co-morbidity was it [26]. The most widespread set of measures for co-morbidity has been the Diagnosis-Related Groups (DRGs) [28]. Since their beginning, multiple versions have continued to come out, changing architecturally and combining demographics, diagnoses, and procedures into several hundred categories of care. These categories can in turn be considered to have, or not have, "complications" [26].

The 11th revision, ICD-11, is now being developed through a continuous revision process and will be finalized in 2018. For the first time, through advances in information technology, public health users, stakeholders and other interested parties can provide input to the beta version of ICD-11 using an online revision process. Peer-reviewed comments and input will be added throughout the revision period. When finalized, ICD-11 will be ready to use with EHR and information systems. WHO encourages broad participation in the 11th revision, so that the final classification meets the needs of health information users and is more comprehensive [27].

#### **2.3. ICD: strengths and limitations**

ICD's strengths include non-redundancy, the ability to manage synonyms, and explicit relationships. Non-redundancy means that each concept should be expressed in only one way; if two terms refer to the same concept, the sensitivity of the replies to database queries will be reduced. The ability to manage synonyms is important because it allows authorized intermediate terms that refer to a unique term used to encode, index, and find the useful information. Finally, explicit relationships means that the types of relationships between terms in a nomenclature are clear [5].

As regards the limitations of SCS, we can name completeness and non-ambiguity. A full description of the medical vocabulary is very hard to achieve. As for non-ambiguity, if two different types of data are stored under the same term, specificity will be affected [5].

#### **2.4. Other standard classification systems**

Currently, many classification systems exist and are maintained by responsible agencies. **Table 2** briefly names and describes some of them.




• *"Complete coverage of domain specific content"*

• *"Use of concepts rather than terms, phrases, and words"* (concept orientation)

• *"Concepts do not change with time, view, or use"* (concept consistency)

• *"Concepts must evolve with change in knowledge"*

• *"Concepts identified through nonsense identifiers"* (context free identifier)

• *"Representation of concept context consistently from multiple hierarchies"*

• *"Concepts have single explicit formal definitions"*

• *"Support for multiple levels of concept detail"*

• *"Methods, or absence of, to identify duplication, ambiguity, and synonymy"*

• *"Synonyms uniquely identified and appropriately mapped to relevant concepts"*

• *"Support for compositionality to create concepts at multiple levels of detail"*

From Cimino et al. [7].

**Table 3.** Desiderata for controlled medical vocabularies in the twenty-first century.


#### **3. Reference terminology**

According to the International Organization for Standardization (ISO), terminologies should be formal aggregations of language-independent concepts, concepts should be represented by one favored term and appropriate synonymous terms, and relationships among concepts should be explicitly represented [33, 34]. The ISO specification also stated that terminologies must define their purpose and scope, quantify the extent of their domain coverage, and provide mappings to external terminologies designed for classification and to support administrative functions [33, 34]. The ISO also highlighted the value of mapping among separate terminologies designed to meet different needs. This would allow, for example, a physician to choose a concept from a clinically oriented terminology when constructing a patient's problem list, while a mapped concept in an administrative classification (like ICD-9-CM) could be selected in an automated fashion for billing purposes [33, 34].
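The problem-list example can be sketched as a simple cross-map. This is only an illustration under stated assumptions: the `CONCEPT-…` and `ADM-…` identifiers are placeholders rather than real terminology or ICD codes, and the function name `billing_codes` is made up for this sketch.

```python
# Hypothetical cross-map from a clinically oriented terminology to an
# administrative classification. All identifiers are placeholders.
CLINICAL_TO_ADMIN = {
    "CONCEPT-1001": "ADM-A10",  # a detailed clinical concept
    "CONCEPT-1002": "ADM-A10",  # a more specific concept, same category
    "CONCEPT-2001": "ADM-B20",
}


def billing_codes(problem_list):
    """Split a problem list into mapped billing codes and unmapped concepts.

    Returns (mapped, unmapped): administrative codes in first-seen order
    without repeats, and the concepts that have no mapping yet.
    """
    mapped, unmapped = [], []
    for concept_id in problem_list:
        code = CLINICAL_TO_ADMIN.get(concept_id)
        if code is None:
            unmapped.append(concept_id)  # left for a human coder to review
        elif code not in mapped:
            mapped.append(code)          # report each category only once
    return mapped, unmapped
```

Several clinical concepts may map to the same administrative code, so the physician can record clinical detail while the billing category is derived automatically; concepts without a mapping are surfaced instead of being silently dropped.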

#### **3.1. Reference terminology: a new paradigm**

In 1998, J. Cimino summarized the work of several groups toward defining the precise attributes of a multipurpose and shareable terminology [7, 8]. He stressed the value of "concept orientation" in terminology construction. Concept orientation implies using concepts, rather than words, terms, or phrases, as the basic building blocks. It allows a terminology to be useful in several situations, represented in different languages and easily assessed for quality. For Cimino, the aim was to have a single universal clinical terminology that would cover a specialty domain's concepts completely at multiple levels of detail. Nonspecific phrases such as "not elsewhere classified" must be avoided [7, 8]. It is important to point out the need for complete and comprehensive domain coverage using non-ambiguous, non-overlapping concepts. In the absence of complete domain coverage, terminologies should integrate with other terminologies. Terminologies need to support synonymy and compositionality [35]. A "high-quality vocabulary" has been defined as one that approaches completeness, is well organized and has terms whose meanings are clear [7, 8].
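A minimal sketch of what concept orientation can look like as a data structure follows. It is an illustration only, not the design of any existing terminology: the class names, the `C1`-style identifiers and the sample content are assumptions made for this example. Each concept has a meaningless identifier, one preferred term, any number of synonyms, and possibly several parents.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: str                               # meaningless identifier
    preferred_term: str
    synonyms: set = field(default_factory=set)
    parents: set = field(default_factory=set)     # may have several parents


class Terminology:
    def __init__(self):
        self.concepts = {}
        self._term_index = {}                     # lowercased term -> concept_id

    def add(self, concept):
        """Register a concept, refusing terms that already name another one."""
        keys = {t.lower() for t in {concept.preferred_term, *concept.synonyms}}
        for key in keys:
            owner = self._term_index.get(key)
            if owner is not None and owner != concept.concept_id:
                raise ValueError(f"ambiguous term {key!r}: already names {owner}")
        for key in keys:
            self._term_index[key] = concept.concept_id
        self.concepts[concept.concept_id] = concept

    def lookup(self, term):
        """Resolve a preferred term or synonym to its concept, or None."""
        return self.concepts.get(self._term_index.get(term.lower()))

    def ancestors(self, concept_id):
        """Collect every concept reachable through parent links."""
        seen = set()
        stack = list(self.concepts[concept_id].parents)
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            parent = self.concepts.get(current)
            if parent is not None:
                stack.extend(parent.parents)
        return seen
```

Because synonyms resolve to a single concept and a term can never name two concepts, the sketch reflects the non-redundancy and non-ambiguity requirements; allowing several parents reflects placement in multiple hierarchies.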

After Cimino's Desiderata, the difference between Terminology Systems like SNOMED CT and Classification Systems like ICD-10-CM/PCS became clearer. Both coding schemes provide the data structure needed to support healthcare clinical and administrative processes. However, clinical terminology systems and clinical classification systems were originally designed to serve different purposes and different users' requirements [27]. ICD-10 is a classification system, designed as an output system for general reporting purposes such as public health surveillance, administrative monitoring, and reimbursement of healthcare services. For these reasons, a classification system can be less detailed than a clinical terminology. In contrast, SNOMED CT (**Table 3**) is a clinical terminology developed to serve as a standard data infrastructure for clinical applications; for this reason it requires a higher degree of specificity [36].

**Table 2.** Examples of standard classification systems.

| System | Description |
| --- | --- |
| ICD | The ICD is the global health information standard for mortality and morbidity statistics. It is increasingly used in clinical care and research to define diseases and study disease patterns, as well as to manage health care, monitor outcomes and allocate resources [29]. |
| DRG | Statistical system for classifying all inpatient stays into groups for the purpose of payment. The DRG classification system divides possible diagnoses into more than 20 major body systems and subdivides them into almost 500 groups. It was created for the purpose of Medicare reimbursement. The factors considered in determining the payment include the diagnosis involved and the resources necessary for treating the condition [30]. |
| LOINC | Logical Observation Identifiers Names and Codes was developed to provide a definitive standard for identifying clinical information in electronic reports. Its database provides a set of universal names and ID codes for identifying laboratory and clinical test results [31]. It aims to provide a means of uniquely identifying the information elements in the EHR. LOINC is remarkable for being the first completely open clinical terminology, making all content available without royalties or charges; this was driven by its creator, Clem McDonald [26]. |
| NANDA | Prior to 2002, "NANDA" was an acronym for the North American Nursing Diagnosis Association. In 2002, the organization officially became NANDA International, which is in charge of the definition and classification of nursing diagnoses [32]. |

#### **3.2. SNOMED CT: background**

SNOMED has been used successfully on an international basis in areas such as anatomic pathology and radiology, and it has been translated into several languages. The Systematized Nomenclature of Medicine (SNOMED) is an example of a multi-axial classification, developed by North American pathologists as an extension of the Systematized Nomenclature of Pathology (SNOP) [5].

In 1965, the College of American Pathologists (CAP) published SNOP to describe morphology and anatomy.

In 1975, CAP expanded SNOP to create SNOMED. In 1979, the most widely adopted version, SNOMED II, was published. In 2000, in collaboration with Kaiser Permanente, CAP developed a new logic-based version named SNOMED RT. In the UK during the twentieth century, Dr. James Read developed the Read Codes, which evolved under the National Health Service into Clinical Terms Version 3 (CTV3). The first version of SNOMED CT was published in January 2002, after CAP merged CTV3 and SNOMED RT. The merged product was called SNOMED Clinical Terms, shortened to SNOMED CT. SNOMED International considers SNOMED CT to be a brand name, not an acronym [37].

**Table 3.** Desiderata for controlled medical vocabularies in the twenty-first century.

• *"Complete coverage of domain specific content"*

• *"Use of concepts rather than terms, phrases, and words"* (concept orientation)

• *"Concepts do not change with time, view, or use"* (concept consistency)

• *"Concepts must evolve with change in knowledge"*

• *"Concepts identified through nonsense identifiers"* (context free identifier)

• *"Representation of concept context consistently from multiple hierarchies"*

• *"Concepts have single explicit formal definitions"*

• *"Support for multiple levels of concept detail"*

• *"Methods, or absence of, to identify duplication, ambiguity, and synonymy"*

• *"Synonyms uniquely identified and appropriately mapped to relevant concepts"*

• *"Support for compositionality to create concepts at multiple levels of detail"*

From Cimino et al. [7].

SNOMED has been translated into several languages and successfully implemented around the world in specialties such as anatomic pathology and radiology. A more recent development is the use of SNOMED as a reference terminology for health care. The version designated SNOMED RT was planned to include data on the causes and symptoms of diseases, the treatment of patients, and the outcomes of the health care process [5]. With the proposed changes, SNOMED RT can represent multiple types of hierarchies and make those types fully explicit.


**Table 4.** Definition summary.

| Term | Definition |
| --- | --- |
| SNOMED | Nomenclature created by the CAP that is evolving into an international Standards Development Organization and is currently regarded as the most advanced initiative in knowledge-based representations with clinical application. Each of the 300,000 terms included is defined using relationships with other terms, creating a powerful semantic network. The SNOMED CT data model allows continuous extension of the nomenclature by adding new terms, always following the same compositional concept representation model, called Description Logics [48] |
| Terms | Collections of words or phrases, aggregated in a systematic fashion to represent the conceptual information that makes up a given knowledge domain. Terms in a terminology generally correspond to actual events or entities and to their cognitive representations in people's minds (called concepts) [14, 43] |
| Concept | Unit of symbolic processing in a controlled vocabulary, a representation of a particular meaning. Each concept in the vocabulary has a single, coherent meaning, although its meaning might vary depending on its appearance in a context (such as a medical record). Terminologies also typically contain hierarchical organizations and other representations of linkages among concepts, such as the "is-a-type-of" relationship between "high blood pressure" and "disorder of cardiovascular system" [33, 34, 49] |
| Non-vagueness | Terms must correspond to at least one meaning [7, 8, 18] |
| Non-ambiguity | Terms must correspond to at least one unequivocal meaning and no more than one meaning, based on context. A distinction must be made between ambiguity of the meaning of a concept and ambiguity of its usage [7, 8, 18] |
| Non-redundancy | Meanings correspond to no more than one term [7, 8, 18] |
| Concept orientation | Terms must correspond to at least one meaning and no more than one meaning, and meanings correspond to no more than one term [7, 8, 18] |
| Explicit relationships | The kinds of relationships between terms in a nomenclature are clear. Is-a, is-part-of, causes, associated-with, equivalent-to and is-in are the most usual relationships [5] |
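The "is-a-type-of" linkage between "high blood pressure" and "disorder of cardiovascular system" can be pictured as a small directed graph in which every concept points to its parents. The following is a minimal sketch under stated assumptions: the `IS_A` dictionary, the extra root concept "disorder" and both function names are hypothetical and are not taken from any SNOMED CT release.

```python
# Hypothetical is-a hierarchy: each concept lists its direct parents.
# Only the first link comes from the text; the root "disorder" is an assumption.
IS_A = {
    "high blood pressure": ["disorder of cardiovascular system"],
    "disorder of cardiovascular system": ["disorder"],
    "disorder": [],
}


def ancestors(concept):
    """Return every concept reachable by following is-a links upward."""
    found = []
    pending = list(IS_A.get(concept, []))
    while pending:
        parent = pending.pop()
        if parent not in found:
            found.append(parent)
            pending.extend(IS_A.get(parent, []))
    return found


def is_subtype_of(concept, candidate):
    """True when `candidate` is a direct or indirect supertype of `concept`."""
    return candidate in ancestors(concept)
```

Because the relationship is explicit, a query for all cardiovascular disorders could also return records coded as "high blood pressure" without any string matching on the term itself.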

#### **3.3. SNOMED CT as clinical reference terminology**

A reference terminology has been defined as a set of concepts and relationships that provides a common reference point for comparing and aggregating data about the entire health care process, recorded by multiple individuals, systems, or institutions [38]. Cornet et al. defined it as "*…a system of concepts with assigned identifiers and human language terms, typically involving some kind of semantic hierarchy. Some systems may support the assignment of multiple terms, or synonyms, to a given concept…*" [26]. SNOMED CT was developed to serve as a standard data infrastructure for clinical applications, which requires a greater degree of specificity, whereas a classification system can be less detailed than a clinical terminology [22]. In fact, the two kinds of system complement each other and contribute quality data to different domains of the healthcare system [39]. Accordingly, either may be used depending on the degree of specificity required: SNOMED CT is the better choice for identifying uncommon illnesses, while ICD-10 is considered more efficient for statistical reporting, such as compiling the leading causes of mortality and morbidity [40].

To achieve domain coverage, terminology developers create new concepts by two methods: pre-coordination and post-coordination. With pre-coordination, also called enumeration, suitable levels of detail are modeled as distinct concepts derived from real-world, unrestricted usage by physicians. Generally, only clinically meaningful concepts are pre-coordinated [41]. With post-coordination, also called compositionality, complex concepts are composed from simpler ones [42]. The two methods can complement each other, with pre-coordination providing logic and complexity and post-coordination allowing expressivity and more complete domain coverage.

Terminologies that allow post-coordination are better able to represent phrases and concepts extracted from clinical documents than purely pre-coordinated terminologies [43]. Because users can both access existing concepts and dynamically compose new ones according to their needs, such terminologies may improve domain coverage. However, even with post-coordination, no terminology has yet successfully modeled the entire scope of medical knowledge.
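The difference between the two methods can be illustrated with a small sketch. It assumes a toy vocabulary in which every identifier (`C100`, `A001`, and so on) is a made-up placeholder rather than a real SNOMED CT code, and the `represent` helper is hypothetical: it returns an enumerated concept when one exists and otherwise composes an expression from simpler concepts.

```python
from dataclasses import dataclass

# Toy vocabulary. All identifiers are placeholders, not real SNOMED CT codes.
CONCEPTS = {
    "C100": "fracture of femur",  # pre-coordinated (enumerated) concept
    "C200": "fracture",
    "C300": "femur",
    "C301": "tibia",
    "A001": "finding site",       # attribute used to refine a focus concept
}


@dataclass(frozen=True)
class PostCoordinated:
    """A composed expression: a focus concept plus (attribute, value) refinements."""
    focus: str
    refinements: tuple

    def render(self):
        parts = [f"{CONCEPTS[a]} = {CONCEPTS[v]}" for a, v in self.refinements]
        return f"{CONCEPTS[self.focus]} : " + ", ".join(parts)


def represent(focus, site):
    """Prefer an existing pre-coordinated concept; otherwise compose one."""
    wanted = f"{CONCEPTS[focus]} of {CONCEPTS[site]}"
    for concept_id, label in CONCEPTS.items():
        if label == wanted:
            return concept_id
    return PostCoordinated(focus, (("A001", site),))
```

In this sketch `represent("C200", "C300")` finds the enumerated concept, while `represent("C200", "C301")` has no enumerated match and falls back to a composed expression, which is how post-coordination extends coverage without adding a new identifier.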

Because SNOMED CT provides a unified language, it may be used as a standard for communication among healthcare providers. It also strongly promotes semantic interoperability in healthcare information systems [44–46]. Its standardized logical structure and wide acceptance make it well suited to high-level information exchange at national and international levels [44–46].

SNOMED CT also includes several descriptions that can be used as an entry terminology. Finally, SNOMED CT has a standard cross-mapping model: the official distribution includes data for mapping to ICD-9 (International Classification of Diseases), and additional ICD-10 cross-map data has also been developed. These mappings give SNOMED CT its aggregate terminology features [47]. However, coding in SNOMED CT differs from conventional coding with ICD-10-CM/PCS. Coding with SNOMED CT is always automated: end users cannot view the codes assigned by the system. For this reason, software developers and EHR vendors use SNOMED CT to support communication between different applications through a SCS. In effect, SNOMED CT can be thought of as a programming language: users work with applications that apply it without knowing what is at work in the backend [27].
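The aggregate role of a cross map can be sketched as a lookup from fine-grained clinical concepts to coarser classification categories. This is an illustration only: the `CROSS_MAP` table, the `T-…` and `K-…` identifiers and the `aggregate` function are hypothetical and do not reproduce the official SNOMED CT to ICD map.

```python
from collections import Counter

# Hypothetical cross map: detailed terminology concept -> classification category.
CROSS_MAP = {
    "T-001": "K-10",  # two detailed concepts roll up to the same category
    "T-002": "K-10",
    "T-003": "K-20",
}


def aggregate(recorded_concepts):
    """Count records per classification category and collect unmapped concepts."""
    counts = Counter()
    unmapped = []
    for concept in recorded_concepts:
        category = CROSS_MAP.get(concept)
        if category is None:
            unmapped.append(concept)  # would need manual mapping by an expert
        else:
            counts[category] += 1
    return counts, unmapped
```

The clinical record keeps the detailed concept, and the coarser category is derived automatically when statistics or billing data are needed.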

#### **3.4. SNOMED CT: strengths and limitations**


SNOMED CT provides functionality in three layers: entry terminology, reference terminology and aggregate terminology. Among its strengths are completeness, non-ambiguity (each term must refer to only one concept), the ability to manage synonyms and, finally, explicit relationships (the types of relationships between terms in the nomenclature are clear) [5].

Its limitations include non-redundancy, meaning that each concept should be expressed in only one way. If two terms refer to the same concept, the sensitivity of the replies to database queries will be reduced [5]. It is also worth noting that when an institutional term cannot be represented with a standard SNOMED CT code, creating new concepts is not allowed (for more information, see **Table 4**).
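The effect of redundancy on query sensitivity can be shown with a small calculation. The sketch below assumes one reading of the statement above, namely that the same meaning has been stored under two different codes; the codes `X1`, `X2` and `Y9` and both function names are hypothetical.

```python
def sensitivity(record_codes, query_code, codes_with_same_meaning):
    """Fraction of truly relevant records that a query for one code retrieves."""
    relevant = [c for c in record_codes if c in codes_with_same_meaning]
    if not relevant:
        return 0.0
    retrieved = [c for c in relevant if c == query_code]
    return len(retrieved) / len(relevant)


def merge_redundant(record_codes, redundant_to_canonical):
    """Rewrite redundant codes to their single canonical code."""
    return [redundant_to_canonical.get(c, c) for c in record_codes]


# X1 and X2 are two hypothetical codes carrying the same clinical meaning.
records = ["X1", "X1", "X2", "Y9"]
before = sensitivity(records, "X1", {"X1", "X2"})
after = sensitivity(merge_redundant(records, {"X2": "X1"}), "X1", {"X1", "X2"})
```

With the redundant code present, a query for `X1` misses the record stored as `X2`; once the two codes are merged, the same query retrieves every relevant record.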



#### **4. Interface terminology**

Interface terminology (IT), also called colloquial, application or entry terminology, has been defined as a systematic collection of healthcare-related phrases (terms) that supports clinicians' entry of patient-related information into computer programs [42]. How does this work? When health care providers type into an EHR, the IT links free-text patient descriptors to the structured, coded internal data elements used by specific clinical computer programs. Interface terminologies also facilitate the display of computer-stored patient information to clinician-users as simple human-readable text [42]. These terminologies generally embody a rich set of flexible, user-friendly phrases displayed in the graphical or text interfaces of specific computer programs. The "entry" terminologies allow users to interact easily with concepts through common colloquial terms and synonyms. Entry terms can then map to explicitly defined concepts in a more formal terminology, such as a reference terminology, which can in turn define relationships among concepts [50]. EHRs depend on interface terminologies for successful implementation in clinical settings because such terminologies translate clinicians' own natural language expressions into the more structured representations required by application programs [42]. Interface terminologies are crucial for encouraging direct categorical data entry by physicians in the EHR. Historically, however, the efforts of terminology developers and the standards community have been oriented toward other kinds of terminologies, such as reference and administrative terminologies, rather than interface terminologies.
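The two directions of translation described here, from typed text to a coded concept and from a stored concept back to readable text, can be sketched as two lookups. The vocabulary below is a toy assumption: the concept identifiers `C-HTN` and `C-MI`, the term lists and the function names `encode` and `display` are hypothetical.

```python
# Hypothetical interface vocabulary: colloquial terms and synonyms -> concept id.
INTERFACE_TERMS = {
    "high blood pressure": "C-HTN",
    "hypertension": "C-HTN",
    "htn": "C-HTN",
    "heart attack": "C-MI",
    "myocardial infarction": "C-MI",
}

# Preferred human-readable term shown back to the clinician for each concept.
PREFERRED_TERM = {
    "C-HTN": "Hypertension",
    "C-MI": "Myocardial infarction",
}


def encode(free_text):
    """Map what the clinician typed to a coded concept, or None if unknown."""
    key = " ".join(free_text.lower().split())  # ignore case and extra spaces
    return INTERFACE_TERMS.get(key)


def display(concept_id):
    """Render a stored concept as simple human-readable text."""
    return PREFERRED_TERM[concept_id]
```

Several colloquial entries resolve to the same concept, so the stored data stay structured while clinicians keep writing in their own words.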

The aims of an interface terminology include providing an institutional vocabulary for all user interfaces, so that users interact with known terms, including local jargon and preferences, and providing concept lookup functions with loose lexical matching and options, to be used during data entry of new items in a problem list or similar user interfaces. It is also important to provide short pick-lists for more structured data entry in specific-use templates, with a short list of valid entries and different preferred terms for the same concept in different settings. The terminology should be able to accept new terms from the user when a concept or description is not represented, and to detect terms that are inappropriate because they are too general or not valid in a subset [47].

The "usability" of an interface terminology refers to the ease with which its users can accomplish their intended tasks using the terminology. It has been demonstrated that interface terminology usability correlates with the presence of attributes that enhance the efficiency of term selection and composition [51, 52]. The usability of a clinical interface terminology correlates with the presence of relevant assertional medical knowledge; adequacy of synonymy; a balance between pre-coordination and post-coordination; and mapping to terminologies having formal concept representations. An IT enhances its usability by decreasing the number of steps required for users to find or compose the terms needed for a given task [41, 53].

Synonymy refers to the number of individual terms that can correctly represent a unique concept. Synonym types may include alternate phrases, acronyms, definitional phrases and eponyms [53]. Clinical interface terminologies are specifically designed to represent the variety of common colloquial phrases in medical discourse; rich synonymy should improve the nuance with which users can express themselves when using the terminology [53].

A frequently asked question is why a terminology server (TS) should be used as the interface terminology instead of SNOMED alone. Among the reasons for our choice are the following:

• It is simpler for end users.

• When a single concept is not enough to define the information, a new one can be built using post-coordination, understood as the representation of a clinical meaning using a combination of two or more SNOMED concept identifiers.

• The thesaurus makes it possible to manage synonyms (different descriptions related to a concept), lists of valid and unrecognized terms (typing errors, etc.), validated jargon and acronyms, a list of "Not Valid" terms, local extensions of the thesaurus in a continuous learning process, and drug composition information (commercial products) [47].

• SNOMED represents pharmaceutical information as a single entity rather than as independent components: quantity of drug in the pharmaceutical presentation, measurement unit or pharmaceutical form. For clinical use, however, we need to identify the single data components.



Terminology services arose in response to all the limitations mentioned above.

#### **5. Terminology services**

Many definitions of a terminology service exist. In previous publications, we defined it as a complex system for the conceptual representation of medical knowledge, with relationships between concepts, with external representations of concepts in lists of standard terms (classifications) and with lexical tools that facilitate the search for terms [54].

A terminology server (TS) is software built around a thesaurus, or local interface vocabulary (**Figure 1**). The thesaurus is a list of terms created from free-text inputs extracted from the clinical data repository. The terms contained in the thesaurus are split into concepts (real clinical entities) and descriptions (different ways of naming clinical entities). The thesaurus is mapped to a reference vocabulary, for example SNOMED CT [9, 54]. The TS is also able to reject invalid terms that have previously been flagged as inappropriate for the intended use [9]. The TS should also provide interactive information for refining concepts, a feature achieved by using the semantic information included in SNOMED CT to navigate the subtype/supertype hierarchy [9]. Regarding the desiderata for a TS, Chute et al. [50] attempted to articulate the functional needs of a terminology server oriented toward the clinical needs of care providers using applications in an operational environment. Among the desirable characteristics of a terminology server they included: Word Normalization, Word Completion, Target Terminology Specification, Spelling Correction, Lexical Matching, Term Completion, Semantic Locality, Term Composition and Term Decomposition (**Figure 1**).
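A few of the listed characteristics (word normalization, term completion, spelling correction and the rejection of invalid terms) can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the design of any particular terminology server: the `THESAURUS` and `REJECTED` contents, the concept identifiers, the similarity cutoff of 0.8 and the status labels are all hypothetical.

```python
import difflib

# Hypothetical thesaurus: description -> concept identifier.
THESAURUS = {
    "diabetes": "C-DM",
    "diabetes mellitus": "C-DM",
    "asthma": "C-AST",
    "hypertension": "C-HTN",
}

# Hypothetical list of terms previously flagged as not valid (too general).
REJECTED = {"pain"}


def normalize(text):
    """Word normalization: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def complete(prefix):
    """Term completion: every description starting with the typed prefix."""
    start = normalize(prefix)
    return sorted(term for term in THESAURUS if term.startswith(start))


def lookup(text):
    """Return (status, concept_id) for a free-text entry."""
    term = normalize(text)
    if term in REJECTED:
        return ("rejected", None)
    if term in THESAURUS:
        return ("exact", THESAURUS[term])
    # Spelling correction: accept only a close match (cutoff is an assumption).
    close = difflib.get_close_matches(term, list(THESAURUS), n=1, cutoff=0.8)
    if close:
        return ("corrected", THESAURUS[close[0]])
    return ("new", None)  # unknown term, left for expert review
```

For example, `lookup("diabtes")` resolves to the same concept as the correctly spelled term, while `lookup("Pain")` is rejected and an unrecognized entry is returned as new so that it can be reviewed before joining the thesaurus.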


**Figure 1.** Schema of the functionality of the terminology server in reference to the pyramid of terminological systems.
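Three of the characteristics listed above (word normalization, lexical matching and term completion) can be illustrated with a minimal Python sketch. It is not the implementation of any particular terminology server: the thesaurus entries, the local concept identifiers (`C001`, ...) and the function names are hypothetical and chosen only for illustration.

```python
import unicodedata

# Hypothetical interface-vocabulary entries: description -> local concept id.
DESCRIPTIONS = {
    "Diabetes mellitus": "C001",
    "Diabetes mellitus tipo 2": "C002",
    "Hipertensión arterial": "C003",
}


def normalize(text: str) -> str:
    """Word normalization: strip accents, lowercase, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


# Normalized description -> concept id, built once for fast lookup.
INDEX = {normalize(desc): cid for desc, cid in DESCRIPTIONS.items()}


def lexical_match(text: str):
    """Return the concept id whose description matches after normalization."""
    return INDEX.get(normalize(text))


def complete(prefix: str, limit: int = 10):
    """Term completion: normalized descriptions that start with the prefix."""
    norm = normalize(prefix)
    return sorted(d for d in INDEX if d.startswith(norm))[:limit]
```

With these toy data, `lexical_match("  Hipertension   ARTERIAL ")` resolves to the same concept as the accented, correctly spaced description, and `complete("diab")` lists both diabetes descriptions. A production server would add spelling correction and semantic navigation on top of this kind of lookup.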

#### **6. The Hospital Italiano de Buenos Aires terminology services experience**

#### **6.1. Setting**

The Hospital Italiano de Buenos Aires (HIBA) is a non-profit academic healthcare center founded in 1853, with over 2700 physicians, 2700 other health team members (including 1200 nurses) and 1800 administrative and support employees. It has been accredited by the Joint Commission International (JCI) since 2015. The HIBA has a network of two hospitals with 750 beds (200 for intensive care), 41 operating rooms, 800 home care beds, 25 outpatient clinics and 150 associated private practices located in Buenos Aires city and its suburban area. It has a Health Maintenance Organization (Plan de Salud) that covers more than 160,000 people and also provides health services to another 1,500,000 people who are covered by affiliated insurers. Each year, over 50,000 inpatients are admitted to its hospitals, 45,000 surgical procedures are performed (50% ambulatory) and there are 3,000,000 outpatient visits. In addition, the HIBA is a teaching hospital, with over 30 medical residency training programs and 34 fellowship programs. There are currently 400 residents and fellows in training. Since 1995, the HIBA has run an in-house developed health information system, which includes clinical and administrative data. Its EHR system, called *Italica*, is an integrated, modular, problem-oriented and patient-centered system that works in different clinical settings (outpatient, inpatient, emergency and home care). *Italica* allows computerized physician order entry for medications and medical tests, as well as storage and retrieval of test results, including images through a picture archiving and communication system. In 2017, HIBA was certified by the HIMSS as level 7 in the EHR Adoption Model, being the first hospital in Argentina and the second in Latin America to reach this stage [55]. Several health informatics standards have been implemented, including HL7, CDA Version 2, ICD-9, DRG, ICD-10, and ICPC.

#### **6.2. Terminology server of HIBA**

The terminology server of HIBA is composed of a local interface terminology (thesaurus) mapped to a reference terminology, SNOMED CT. Our main objective was to design a new terminology system whose goals correspond to the functions of the terminological systems described previously (entry, reference and aggregate terminology) (**Figure 2**).

The interface terminology (IT) is updated every day by a team of experts, who audit each new term, assign codes and link it to SNOMED CT (the reference terminology), and who use the official maps from SNOMED CT to other classifications (such as ICD-9). If SNOMED CT does not offer an official map, the team creates a manual cross-link using functionality of the terminology server [56].

#### **6.3. Terminology server of HIBA: evolution**


In 1998, the terminology work team started centralized secondary coding, in which a small number of trained coders codified the narrative text recorded by the physicians caring for the patients. The coding covered problem lists, diagnoses and procedures [10].

In 2004, we reached 1 million secondarily coded narrative texts. After this, we started an auto-coding process through a thesaurus, using the interface terminology as a centralized service [56].

**Figure 2.** Schema of terminology server.

In 2010, HIBA began to provide remote Terminology Services (RTS) through a transnational and interinstitutional implementation [57].

In 2011, the start-up process took place, with the aim of extracting as much clinical information as possible from the existing system (mostly free text) and adding it, in coded form, to the new clinical data repository. To this end, the extracted data were processed by the RTS and coded whenever possible. These data included allergies, reasons for consultation, habits, risk factors, symptoms and diagnoses that physicians had entered as free text; physicians had coded diagnoses only when they felt it particularly necessary. In the batch processing of these data, the RTS recognized and auto-coded 11,118,760 texts (78.74%, including valid and invalid texts) and did not recognize 3,001,991 (21.26%) of the original data [57].

In 2012, we began creating natural language processing tools and extending the terminology services to the domains of drugs, practices and procedures.

In 2014, as part of an effort to meet international standards of patient care during the JCI accreditation process, the Department of Health Informatics of HIBA implemented a software tool, developed in-house, for synchronous disambiguation of abbreviations in the EHR. Studies have shown that, while abbreviations save time and space during documentation, they can have ambiguous meanings that confuse other healthcare providers and lead to errors in patient care. For this reason, the JCI requires that the use of abbreviations be controlled and documented. To this end, an Abbreviations Regulation Committee was established in our hospital in November 2014 to manage and classify the abbreviations used in historical health records. As a result, 800 abbreviations were classified as doubtful or ambiguous, with a total of 400 replacement variations.

The Synchronous Self-Expanding Abbreviation System (SSAS) detects abbreviations in a free-text field. The system followed a user-centered design, and typical abbreviations and their meanings were collected from different areas of the hospital during its construction. Abbreviations can be "unequivocal" (one meaning), "ambiguous" (more than one meaning) or "undefined" (no defined meaning). SSAS detected about 4000 abbreviations (1000 unequivocal, 5000 ambiguous and 2500 undefined), and the use of abbreviations decreased by almost 40% after implementation [58]. The interface vocabulary takes context parameters under terminology control, such as user preferences, specialty or knowledge domain, to decide on a single SNOMED CT concept to offer. The retrieved concept ID is then used to automatically replace the abbreviation with the preferred term. Using an interface vocabulary thus offers the flexibility to use abbreviations together with the benefits of a reference ontology [59].

#### **6.4. Description of HIBA's current terminology web service** (**Table 5**)

We provide terminology services to several healthcare organizations in Argentina, Chile, and Uruguay. These include:

• a thesaurus tailored to the local needs and jargon of the professionals who interact with the EHR,

• SNOMED CT as reference standard for interoperability and to implement CDSS,

• cross maps to ICD-9, ICD-10, LOINC, ICPC-2, ATC,

• creation of different types of refsets according to the needs of the organization,

• a drug composition service modeled after the UK's dm+d model.

The interface terminology is built on SNOMED CT, which is used as the reference terminology. In this sense, SNOMED CT serves as a uniform back-end representation, allowing our interface terminology to adapt to the local needs of the institutions we serve. SNOMED CT is the most comprehensive clinical terminology: it provides a semantic network with formal structured meanings, has an extensible model, is widely adopted as an international standard, and was designed with EHR implementations in mind.

| Service | Description |
| --- | --- |
| Intelligent prompting | Performs a preliminary search once the first three characters are entered. |
| Term recognition | Searches for the entered text in the interface vocabulary and offers alternatives to improve the medical record. |
| Creation of a new term | Enters a new term in the interface vocabulary and sends it into the audit circuit. |
| List domains | Returns the available domains (problems, procedures, medications, etc.). |
| List domain elements | Returns the terms contained in a domain. |
| List classification | Returns the available classifications. |
| Assign classifier | Given a valid term and a classification, returns the corresponding code. |
| Assign DRG | From a discharge summary encoded with ICD9-CM and other metadata, returns the DRG code. |

**Table 5.** Functionalities included in the terminological web services provided by HIBA.

#### **6.5. Our institutional entry terminology, how does it work?**

The institutional entry terminology is composed of concepts and descriptions. We use the SNOMED definitions of these terms: concepts represent distinct clinical meanings, and descriptions are phrases used to name a concept. Our institutional entry terminology can be divided into several subsets; examples of these are:

• Problems list terminology

• Procedures terminology

• Physical examination subset

• Findings in chest radiography

• Administration routes for drugs

• State of consciousness description

• Liver failure diagnosis

Some subsets are very large, including thousands of concepts (e.g. the problems list subset). Others are short lists (e.g. the liver failure subset). Each subset was designed to be used as the entry terminology in a specific scenario. Concepts are defined only once, regardless of their inclusion in more than one subset; therefore, accessing liver cirrhosis from the problems list or from the liver failure subset brings the user to the same concept.
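The idea that a concept is defined once and merely referenced by each subset can be sketched as follows. This is an illustrative Python model rather than our actual data structure (which follows the SNOMED CT data model); the identifiers `L100` and `L200`, the subset names and the `lookup` function are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A clinical meaning with one or more descriptions naming it."""
    concept_id: str
    descriptions: list = field(default_factory=list)


# Each concept is stored exactly once (hypothetical local identifiers).
CONCEPTS = {
    "L100": Concept("L100", ["liver cirrhosis", "hepatic cirrhosis"]),
    "L200": Concept("L200", ["ankle sprain"]),
}

# Subsets hold only references (concept ids), never copies of concepts.
SUBSETS = {
    "problems_list": {"L100", "L200"},
    "liver_failure": {"L100"},
}


def lookup(subset_name: str, text: str):
    """Find the concept named by `text` within a given subset, if any."""
    wanted = text.strip().lower()
    for concept_id in SUBSETS[subset_name]:
        concept = CONCEPTS[concept_id]
        if wanted in (d.lower() for d in concept.descriptions):
            return concept
    return None
```

Looking up "liver cirrhosis" through `problems_list` and "hepatic cirrhosis" through `liver_failure` returns the very same `Concept` object, whereas "ankle sprain" is not reachable from the `liver_failure` subset.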


The process of adding concepts to the entry terminology and organizing them in subsets is manual. It is carried out by trained coders who previously worked with the same information in secondary coding using classifications [10]. Construction of the problems list subset was one of our biggest challenges. We decided to base our work on the historical database of our EHR, with more than 2 million free-text inputs since 1998. All problems list entries and discharge notes were processed to extract all distinct textual descriptions. We considered that these texts, entered by our own professionals in a completely unconstrained way, would be representative of the local natural language, including abbreviations and jargon. A manual cleaning process, assisted by string normalization functions, led to the creation of the Problems List subset, with 24,800 different concepts and 110,000 descriptions in total.

Other subsets were created using arbitrary lists of concepts selected by the clinical terminology team with user input. New concepts or descriptions were accepted from user interfaces and stored for manual evaluation. The data model for the entry terminology was the standard SNOMED CT data model for concepts, descriptions and subsets [60]. New concepts and descriptions were added to the standard SNOMED CT distribution following official SNOMED rules for creating institutional extensions.

Since 2000, physicians at the HIBA have used an inpatient EHR to create the discharge summary using free text. The discharge summary is a structured abstract of the hospitalization episode in which data are recorded for care and management purposes. We developed and implemented a modification of the discharge summary data entry user interface that allows the selection of already coded terms from a local terminology. To achieve this, we had to introduce a more restrictive user interface that requires users to select terms from an existing list. The new system also needed functions to facilitate migration from the previous unconstrained text entry model. The information contained in the discharge summary is structured in several domains, with the purpose of collecting all the information necessary to group episodes using DRG. In each of these fields, the physician entered free-text descriptions. The previous version of the discharge summary software tried to code the entered text automatically using the terminology server. If the term did not match an existing entry in the local terminology, it was forwarded to the terminology team for secondary manual coding. The terminology team reviewed all the discharge summaries, assigned ICD-9CM codes and manually grouped them into a DRG [61]. The availability of online consultation about the terminology and input terms created acceptance among users, and led us to maximize the benefits of free and structured texts [9].

#### **6.6. Reference terminology: functions and system description**

As regards reference terminology functions, our TS allows the entry terminology to be represented in the reference terminology (SNOMED CT Spanish Language Version); new concepts can be created for institutional terms that cannot be represented with a standard SNOMED CT code. The system also provides tools to take advantage of the knowledge stored in SNOMED CT relationships, such as obtaining more refined or more general terms, and a means of updating to new versions of SNOMED CT without losing information. We used the SNOMED CT Spanish Language Version as the reference terminology, but it is important to note that all language versions of SNOMED CT share the same concepts and relationships; during the translation process, only new descriptions are added. Both entry and reference terminologies were stored following the SNOMED data model, using SNOMED tools to represent the concepts of the entry terminology. SNOMED CT defines concepts by their relationships with other concepts, so we created new relationships as part of our SNOMED CT extension. SNOMED CT has around 300,000 concepts, but in a clinical setting health professionals usually use very detailed expressions, adding modifiers to general concepts, as in "mild ankle sprain". To prevent exponential growth of the nomenclature, SNOMED CT avoids including this level of combination with modifiers; instead, it provides the general concepts (ankle sprain), the possible modifiers (mild) and the rules to relate them correctly (using the "has severity" relationship).
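The combination of a general concept with a modifier can be sketched as a small data structure that renders an expression in the style of the SNOMED CT compositional grammar (`focus : attribute = value`). This is a minimal illustration and not our production model; the class names are hypothetical, and the identifiers shown are given for illustration only and should be verified against the SNOMED CT release in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptRef:
    """Reference to a concept: identifier plus a human-readable term."""
    sctid: str
    term: str

    def render(self) -> str:
        return f"{self.sctid} |{self.term}|"


@dataclass(frozen=True)
class PostCoordinated:
    """A focus concept refined by (attribute, value) pairs."""
    focus: ConceptRef
    refinements: tuple = ()

    def render(self) -> str:
        if not self.refinements:
            return self.focus.render()
        parts = ", ".join(
            f"{attr.render()} = {value.render()}"
            for attr, value in self.refinements
        )
        return f"{self.focus.render()} : {parts}"


# Illustrative identifiers (assumed; check against the release in use).
ANKLE_SPRAIN = ConceptRef("44465007", "Sprain of ankle")
SEVERITY = ConceptRef("246112005", "Severity")
MILD = ConceptRef("255604002", "Mild")

# "Mild ankle sprain" = general concept + severity modifier.
mild_ankle_sprain = PostCoordinated(ANKLE_SPRAIN, ((SEVERITY, MILD),))
```

Calling `mild_ankle_sprain.render()` yields `44465007 |Sprain of ankle| : 246112005 |Severity| = 255604002 |Mild|`, so the detailed local term is expressed without adding a new pre-coordinated concept to the reference terminology.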

Any new concept can be represented using this post-coordination technique, creating more detailed subtypes of existing SNOMED CT concepts. Around 33% of the concepts included in the Problems List subset could be mapped directly to existing SNOMED CT concepts; the other 77% needed the addition of one or more modifiers (post-coordination) to fully represent the meaning of the entry terminology concept. This rate of post-coordination was the result of a very permissive policy that allowed the use of any term requested by the users, often very specific or personalized. In total, 24,800 concepts were represented with 45,000 new relationships. In each subset, professionals sometimes try to enter terms that are not valid for later use, where we would like the doctor to record the proper diagnosis or reason for encounter instead. To reject these terms and to manage invalid terms, we tag them and add an information text so that the professional understands the coding guidelines of the institution. This module provides the tools for tagging these terms and editing the information.
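The tagging of invalid terms described above can be sketched in a few lines. The example term, the guidance text and the function names are hypothetical; they only illustrate the mechanism of rejecting a tagged term while returning an explanatory message.

```python
# Hypothetical registry: normalized invalid term -> guidance for the user.
INVALID_TERMS = {}


def _key(text: str) -> str:
    """Normalize a term for lookup: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def tag_invalid(text: str, guidance: str) -> None:
    """Tag a term as invalid and attach the information text shown to users."""
    INVALID_TERMS[_key(text)] = guidance


def check_term(text: str):
    """Return (is_valid, guidance); guidance is None for accepted terms."""
    guidance = INVALID_TERMS.get(_key(text))
    if guidance is not None:
        return (False, guidance)
    return (True, None)


# Example tag (assumed wording, not an institutional guideline).
tag_invalid(
    "Control",
    "Please record the diagnosis or reason for encounter instead.",
)
```

After this tag is added, `check_term("  control ")` is rejected together with the guidance text, while a term that has not been tagged, such as `check_term("ankle sprain")`, is accepted.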

#### **6.7. Aggregate terminology: functions and system description**


Among the aggregate terminology functions, our TS provides output to several standard classifications: ICD-9CM (diagnoses and procedures); ICD-10 (diagnoses); ICPC-2, the International Classification of Primary Care (diagnoses); ATC, the Anatomical Therapeutic Chemical Classification (drugs); and local billing nomenclatures. It can also aggregate data according to SNOMED CT hierarchies. All these functions run on centralized software and a centralized data structure. The Terminology Server provides these functions to all existing applications in the Health Information System in the form of Web Services. A terminology maintenance software application is also needed to administer the institutional terminology, its relationship with SNOMED CT and the mappings.

For the aggregate terminology, the official SNOMED CT cross-map model was implemented, and a multi-classification interface was created as part of the Terminology Maintenance Software to visualize, test and modify mappings from SNOMED CT to the different classifications. An SQL algorithm (Oracle SQL) was designed to aggregate concepts according to the knowledge stored in SNOMED CT relationships, for example all kinds of diabetes, including diabetes complications and excluding maternal and neonatal diseases. These queries are maintained from a module in the Terminology Maintenance Software.
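The kind of hierarchy-based aggregation described above can be sketched as follows. Our implementation uses Oracle SQL; the sketch below uses Python instead, for illustration only, and works on a toy set of subtype relationships. The concept names and the `aggregate` function are assumptions and do not reproduce actual SNOMED CT content.

```python
from collections import defaultdict

# Toy subtype ("is a") relationships as (child, parent) pairs.
IS_A = [
    ("type 2 diabetes", "diabetes mellitus"),
    ("diabetic retinopathy", "diabetes mellitus"),
    ("gestational diabetes", "diabetes mellitus"),
    ("gestational diabetes", "maternal disease"),
    ("neonatal diabetes", "diabetes mellitus"),
    ("neonatal diabetes", "neonatal disease"),
]

# Parent -> direct children, for top-down traversal.
CHILDREN = defaultdict(set)
for child, parent in IS_A:
    CHILDREN[parent].add(child)


def descendants_or_self(concept: str) -> set:
    """All concepts reachable through subtype links, plus the concept itself."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(CHILDREN.get(current, ()))
    return seen


def aggregate(include: str, exclude=()) -> set:
    """Concepts under `include`, minus everything under each excluded root."""
    selected = descendants_or_self(include)
    for root in exclude:
        selected -= descendants_or_self(root)
    return selected
```

With the toy data, `aggregate("diabetes mellitus", ["maternal disease", "neonatal disease"])` keeps the general concept, type 2 diabetes and diabetic retinopathy, and drops the gestational and neonatal forms because they also descend from the excluded roots.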

To code the terms in the EHR with a specific classification, the coding application asks the operator to select the appropriate classification. The system displays the list of available classifications, and the operator must select one of them. The system then assigns the code for each term. Using this mechanism, it is possible, for example, to select the ICPC-2 classifier for the epidemiological analysis of a problem list from the outpatient EHR, or ICD-9 and ICD-10 for a discharge summary in the inpatient EHR. This mapping is possible because we use the official cross-maps offered by our reference terminology (SNOMED CT) or, where none exists, a mapping created by our own terminology team. From a discharge summary coded in ICD-9, the Assign DRG service can then be applied to obtain the corresponding DRG code.
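The selection of a classification followed by code assignment can be sketched as below. This is a simplified illustration: the map contents, the placeholder concept identifiers (`SCT-A`, `SCT-B`) and the function names are hypothetical and do not reproduce the signatures of our web services.

```python
# Hypothetical maps: classification -> {concept id -> classification code}.
OFFICIAL_MAPS = {
    "ICD-10": {"SCT-A": "E11"},
    "ICPC-2": {},
}
# Mappings created by the local terminology team where no official map exists.
LOCAL_MAPS = {
    "ICD-10": {},
    "ICPC-2": {"SCT-A": "T90"},
}


def list_classifications() -> list:
    """Classifications the operator can choose from."""
    return sorted(set(OFFICIAL_MAPS) | set(LOCAL_MAPS))


def assign_classifier(concept_id: str, classification: str):
    """Return the code for a concept, preferring the official map."""
    if classification not in list_classifications():
        raise ValueError(f"unknown classification: {classification}")
    code = OFFICIAL_MAPS.get(classification, {}).get(concept_id)
    if code is None:
        code = LOCAL_MAPS.get(classification, {}).get(concept_id)
    return code  # None means the concept still needs a mapping


def code_terms(concept_ids, classification: str) -> dict:
    """Assign a code in the chosen classification to each coded term."""
    return {cid: assign_classifier(cid, classification) for cid in concept_ids}
```

In this sketch, a concept that is missing from both maps is returned with `None`, which in practice would correspond to a term sent to the terminology team for mapping.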


#### **6.8. Terminology maintenance software**

The Terminology Maintenance Software includes the following modules:

• Entry Terminology Administration: allows the creation of new concepts, description assignment and modeling of each concept with SNOMED CT.

• Subset Administration: creation of new subsets, addition and removal of concepts from the subsets, defining hierarchies for tree interfaces.

• Pending Concepts or descriptions: all proposed new concepts or descriptions are stored in a list, waiting to be evaluated and modeled, ordered by the number of proposals.

• Cross Maps Administration: existing cross maps can be visualized, edited and tested using this module.

• Data Extraction Rules Administration: a software interface to visualize and update SNOMED based data extraction queries.

#### **6.9. Status report**

Four trained modelers maintain the interface terminology, model pending concepts or descriptions, run routine quality control checks, and maintain subsets.

We created an ad hoc automatic process that used string matching algorithms to recode all historic data in our clinical repository; more than 2,200,000 entries were processed.

Around 85% of the original texts received a concept code from the new entry terminology; 10% were recognized as invalid entries, so 75% were finally mapped to SNOMED CT. The coding services are used online by our ambulatory and inpatient medical records and receive around 55,000 requests each month. Creating an institutional entry terminology demands a lot of work, but it provides an excellent service to users and also isolates the terminology system from SNOMED CT changes in newer versions. Local concepts will always be valid; in the worst case, their modeling against SNOMED CT would need to be corrected. We found that the SNOMED CT cross-map data to ICD-9 are still not adequate for clinical use in our setting and require additional manual work on the maps. This may be caused by a different use of the classification in Argentina and the United States.

Our clinical data extraction process, which uses rules based on SNOMED CT knowledge, is very effective; however, these rules should be revised for each new SNOMED CT version, as changes in hierarchies and models may affect their effectiveness.

Further reduction of manual classification coding will require adjustments to the mapping specifications and user interface changes aimed at reducing the number of new concept proposals and enforcing the selection of existing terms. Because of acceptability issues, we have always tried to minimize user interface constraints, so these changes will be implemented slowly.

Through a much more detailed implementation, the milestone of our new terminology system is the centralization of knowledge representation. The health information system uniformly represents the clinical data entered at any level of care in the institution.

#### **6.10. Terminology service: experience in other settings**

Megasalud, one of the most integrated health networks in Chile, had used an EHR named SiapWin for a decade. In 2007, it decided to develop its own HIS to support longitudinal care of the patients treated in the network, with mentoring from the medical informatics team at HIBA. For this project, HIBA decided to modify its terminology server so that it could provide terminology services to other institutions.

In the information access layer, Web Services were developed in Java (JDK 1.6). The Web Services (WS) were deployed on a Sun GlassFish application server, and the data were stored in an Oracle 11g database. The WS were published on the Internet for remote access by the applications of other institutions.

The first implementation took place at a Chilean provider in 2008. Its historical system held about 14 million unique text phrases. With the terminology services, more than 11 million (78.74%) of these texts were coded automatically. In 8 months, about 600,000 pieces of new text were entered, and about 89.64% of them were successfully recognized by the terminology services. Today, we recognize more than 90% in all regional implementations [57].

The clinical data stored in the legacy system of Megasalud comprised 14,120,751 unique text phrases suitable for processing by the RTS. In batch processing, the RTS recognized and automatically coded 11,118,760 (78.74%) texts (including both valid and invalid texts) and did not recognize 3,001,991 (21.26%) of the original data. Between March 1 and October 1, 2009, the physicians at Megasalud entered 592,249 pieces of text in the problem-oriented EHR; 530,897 (89.64%) of them were recognized in real time by the RTS against the interface terminology of Megasalud. The remaining 61,352 (10.36%) went through the audit process and manual modeling [57].
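The percentages reported above follow directly from the counts. The short Python check below recomputes them; the counts are taken from the text, while the function and constant names are illustrative.

```python
def share(part: int, total: int) -> float:
    """Percentage of total represented by part, rounded to two decimals."""
    return round(100 * part / total, 2)


# Counts as reported in the text for the Megasalud implementation.
LEGACY_TOTAL = 14_120_751
LEGACY_RECOGNIZED = 11_118_760
LEGACY_UNRECOGNIZED = 3_001_991

REALTIME_TOTAL = 592_249
REALTIME_RECOGNIZED = 530_897
REALTIME_PENDING = 61_352


def consistency_report() -> dict:
    """Recompute the shares and check that the parts add up to the totals."""
    return {
        "legacy_counts_add_up":
            LEGACY_RECOGNIZED + LEGACY_UNRECOGNIZED == LEGACY_TOTAL,
        "legacy_recognized_pct": share(LEGACY_RECOGNIZED, LEGACY_TOTAL),
        "legacy_unrecognized_pct": share(LEGACY_UNRECOGNIZED, LEGACY_TOTAL),
        "realtime_counts_add_up":
            REALTIME_RECOGNIZED + REALTIME_PENDING == REALTIME_TOTAL,
        "realtime_recognized_pct": share(REALTIME_RECOGNIZED, REALTIME_TOTAL),
        "realtime_pending_pct": share(REALTIME_PENDING, REALTIME_TOTAL),
    }


if __name__ == "__main__":
    for name, value in consistency_report().items():
        print(f"{name}: {value}")
```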

We see great value in providing services to other institutions through our RTS. Creating and maintaining a shareable Spanish interface vocabulary database across different countries is a big task, because medical Spanish has a rich vocabulary: there are different ways of naming the same clinical entities (polysemy), and acronyms and synonyms differ between countries.

The published WS let other institutions make the most of the progress achieved by HIBA in managing the terminological domain. Several services can be used to process the text entered by a physician in their remote applications [57].

Some examples of other institutions currently consuming the RTS are:

Argentina: Healthcare providers and in progress with the federal government

Chile: Healthcare providers and FONASA (National Agreement)

Uruguay: Healthcare providers and AGESIC (National Agreement)

Colombia: Healthcare providers

Currently, we are translating the thesaurus into Portuguese for Brazilian institutions.

#### **Conflict of interest**

The authors declare that they have no conflict of interest.

#### **Author details**

Daniel Luna\*, Carlos Otero, María L. Gambarte and Julia Frangella

\*Address all correspondence to: daniel.luna@hospitalitaliano.org.ar

Health Informatics Department, Hospital Italiano de Buenos Aires, Buenos Aires City, Argentina

#### **References**

[1] Bates DW, Evans RS, Murff H, Stetson PD, Pizziferri L, Hripcsak G. Detecting adverse events using information technology. Journal of the American Medical Informatics Association [Internet]. 2003;**10**(2):115-128. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=150365&tool=pmcentrez&rendertype=abstract

[2] Bates DW, Boyle DL, Teich JM. Impact of computerized physician order entry on physician time. Proceedings of the Annual Symposium on Computer Application in Medical Care. 1994:996. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2247898&tool=pmcentrez&rendertype=abstract

[3] Rector AL. Clinical terminology: Why is it so hard? Methods of Information in Medicine [Internet]. Dec 1999;**38**(4-5):239-252. Available from: http://www.ncbi.nlm.nih.gov/pubmed/10805008

[4] Shapiro JS, Bakken S, Hyun S, Melton GB, Schlegel C, Johnson SB. Document ontology: Supporting narrative documents in electronic health records. AMIA Annual Symposium Proceedings [Internet]. 2005;**2005**:684-688. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1560738&tool=pmcentrez&rendertype=abstract

[5] "Codify". Merriam-Webster.com. Merriam-Webster, n.d. Web. 19 Apr. 2018

[6] Evans DA, Cimino JJ, Hersh WR, Huff SM, Bell DS, Group C. Toward a Medical-concept Representation Language. The Canon Group. Journal of the American Medical Informatics Association. 1994;**1**(3):207-217

[7] Cimino J. Desiderata for controlled medical vocabularies in the twenty-first century. Methods of Information in Medicine. 2012;**37**(1998):394-403

[8] Cimino JJ. Terminology tools: State of the art and practical lessons. Methods of Information in Medicine. 2001;**40**(4):298-306

[9] Navas H, Osornio AL, Baum A, Gomez A, Luna D, de Quiros FGB. Creation and evaluation of a terminology server for the interactive coding of discharge summaries. Studies in Health Technology and Informatics [Internet]. 2007;**129**(Pt 1):650-654. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17911797

[10] Luna D, De Quirós FGB, Garfi L, Soriano E, Flaherty MO. Reliability of secondary central coding of medical problems in primary care by non medical coders, using the International Classification of Primary Care (ICPC). Objetives: Materials and Methods: Results. 2001;**10**(Pt 2):94-97

[11] Hohnloser JH, Kadlec P, Puerner F. Experiments in coding clinical information: An analysis of clinicians using a computerized coding tool. Computers and Biomedical Research [Internet]. Oct 1995;**28**(5):393-401. Available from: http://www.ncbi.nlm.nih.gov/pubmed/8612401

[12] Wong ET, Pryor TA, Huff SM, Haug PJ, Warner HR. Interfacing a stand-alone diagnostic expert system with a hospital information system. Computers and Biomedical Research [Internet]. Apr 1994;**27**(2):116-129. Available from: http://linkinghub.elsevier.com/retrieve/pii/S0010480984710123

[13] Rosenbloom ST, Miller RA, Johnson KB, Elkin PL, Brown SH. Interface terminologies: Facilitating direct entry of clinical data into electronic health record systems. Journal of the American Medical Informatics Association [Internet]. 2006;**13**(3):277-288. Available from: http://jamia.oxfordjournals.org/content/13/3/277.abstract and http://www.ncbi.nlm.nih.gov/pubmed/16501181

[14] Humphreys BL, Lindberg DA, Schoolman HM, Barnett GO. The unified medical language system: An informatics research collaboration. Journal of the American Medical Informatics Association [Internet]. 1998;**5**(1):1-11. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=61271&tool=pmcentrez&rendertype=abstract

[15] Humphreys BL, McCray AT, Cheh ML. Evaluating the coverage of controlled health data terminologies: Report on the results of the NLM/AHCPR large scale vocabulary test. Journal of the American Medical Informatics Association. 1997;**4**(6):484-500

[16] National Committee on Vital and Health Statistics (NCVHS). Washington, DC: Report on Uniform Data Standards for Patient Medical Record Information; 2000. p. 65. Available from: https://www.ncvhs.hhs.gov/wp-content/uploads/2014/08/hipaa000706.pdf

[17] Malterud K, Hollnagel H. The magic influence of classification systems in clinical practice. Scandinavian Journal of Primary Health Care. 1997;**15**(1):5-6

[18] Cimino JJ. High-quality, standard, controlled healthcare terminologies come of age. Methods of Information in Medicine. 2011;**50**(2):101-104

[19] AHIMA Project Offers Insights into SNOMED, ICD-9-CM Mapping Process [Internet]. Available from: http://library.ahima.org/doc?oid=60018#.Wmozc7ziaM and http://library.ahima.org/doc?oid=60018#.Wmozc7ziaM8

[20] Pan American Health Organization. eHealth in Latin America and the Caribbean: Interoperability Standards Review. Washington, DC: PAHO; 2016. Available from: http://iris.paho.org/xmlui/bitstream/handle/123456789/28189/9789275118818\_eng.pdf?sequence=1&isAllowed=y

[21] Giannangelo K. Healthcare Code Sets, Clinical Terminologies, and Classification Systems. 2nd ed. Chicago: American Health Information Management Association (AHIMA); 2007

[22] Davoudi S, Dooling JA, Glondys B, Jones TD, Kadlec L, Overgaard SM, Ruben K, Wendicke A. Data quality management model (2015 update). Journal of American Health Information Management Association [Internet]. 2015;**86**(10):62-65. Available from: http://library.ahima.org/PB/DataQualityModel#.Wmo5FrziaM9

[23] Bishop J, Bronnert J, Cook J, Macksoud L, Mastronardi C, Morsch M, et al. Automated coding workflow and CAC practice guidance. Journal of American Health Information Management Association. 2010;**81**(7):51-56

[24] Chute CG. Clinical classification and terminology. Journal of the American Medical Informatics Association. 2000;**7**(3):298-303

[25] Elkin PL, Brown SH, Carter J, Bauer BA, Wahner-Roedler D, Bergstrom L, et al. Guideline and quality indicators for development, purchase and use of controlled health vocabularies. International Journal of Medical Informatics [Internet]. Dec 18, 2002;**68**(1-3):175-186. Available from: http://www.ncbi.nlm.nih.gov/pubmed/12467801

[26] Cornet R, Chute C. Health concept and knowledge management: Twenty-five years of evolution. IMIA Yearbook [Internet]. 2016;**25**(Suppl. 1):S32-S41. Available from: http://www.schattauer.de/index.php?id=1214&doi=10.15265/IYS-2016-s037

[27] Alakrawi ZM. Clinical terminology and clinical classification systems: A critique using AHIMA's data quality management model. Perspectives in Health Information Management [Internet]. 2016. Retrieved April 2018. Available from: http://perspectives.ahima.org/clinical-terminology-and-clinical-classification-systems-a-critique

[28] Vladeck BC. Diagnostic-related groups. Journal of the American Medical Association [Internet]. Jun 25, 1982;**247**(24):3314-3315. Available from: http://www.ncbi.nlm.nih.gov/pubmed/6806489

[29] Available from: http://www.who.int/classifications/icd/factsheet/en/

[30] DRG [Internet]. Available from: https://www.healthlawyers.org/hlresources/Health Law Wiki/Diagnosis-related group (DRG).aspx

[31] LOINC [Internet]. Available from: https://loinc.org/faq/basics/

[32] NANDA [Internet]. Available from: http://www.nanda.org/nanda-international-glossary-of-terms.html

[33] ISO/TS 17117:2002(E): Health Informatics - Controlled health terminology - Structure and high-level indicators: Technical Committee ISO/TC 215, Health Informatics; 2002. p. 18. Available from: https://www.iso.org/obp/ui/#iso:std:iso:ts:17117:en

[34] International Standards Organization (ISO). TC 215 – Health Informatics. 2007. (Tc 215). Available from: http://www.iso.org/iso/iso\_catalogue/catalogue\_tc/catalogue\_tc\_browse.htm?commid=54960

[35] Chute CG, Cohn SP, Campbell JR. A framework for comprehensive health terminology systems in the United States: Development guidelines, criteria for selection, and public policy implications. Journal of the American Medical Informatics Association. 1998;**5**(6):503-510

[36] Perspectives in Health Information Management. Fall 2014 introduction. Perspectives in Health Information Management [Internet]. 2014;**11**:1a. Available from: http://www.ncbi.nlm.nih.gov/pubmed/25593567

[37] SNOMED CT [Internet]. Available from: https://www.snomed.org/snomed-ct/what-is-snomed-ct/history-of-snomed-ct

[38] Imel M, Campbell J. Mapping from a clinical terminology to a classification. In: AHIMA's 75th Anniversary National Convention [Internet]. 2003. Available from: http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Mapping+from+a+Clinical+Terminology+to+a+Classification#0

[39] Chavis S. Two systems, one direction. For the Record [Internet]. 2013;**25**:10. Available from: http://www.fortherecordmag.com/archives/1013p10.shtml

[40] Centers for Disease Control and Prevention. International Classification of Diseases. 10th revision. Clinical Modification (ICD-10-CM). 2016 [Internet]. Available from: http://www.cdc.gov/nchs/icd/icd10cm.htm

[41] Rassinoux AM, Miller RA, Baud RH, Scherrer JR. Compositional and enumerative designs for medical language representation. Proceedings of the AMIA Annual Fall Symposium [Internet]. 1997;**1997**:620-624. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2233357&tool=pmcentrez&rendertype=abstract

[42] Rosenbloom ST, Miller RA, Johnson KB, Elkin PL, Brown SH. Interface terminologies: Facilitating direct entry of clinical data into electronic health record systems. Journal of the American Medical Informatics Association [Internet]. May 1, 2006;**13**(3):277-288. Available from: https://academic.oup.com/jamia/article-lookup/doi/10.1197/jamia.M1957

[43] Chute CG, Cohn SP, Campbell KE, Oliver DE, Campbell JR. The content coverage of clinical classifications. For the Computer-Based Patient Record Institute's Work Group on Codes & Structures. Journal of the American Medical Informatics Association. 1996;**3**(3):224-233

[44] Duarte J, Castro S, Santos M, Abelha A, Machado J. Improving quality of electronic health records with SNOMED. Procedia Technology [Internet]. 2014;**16**:1342-1350. Available from: http://linkinghub.elsevier.com/retrieve/pii/S2212017314003788

[45] Gøeg KR, Chen R, Højen AR, Elberg P. Content analysis of physical examination templates in electronic health records using SNOMED CT. International Journal of Medical Informatics [Internet]. 2014 Oct;**83**(10):736-749. Available from: http://linkinghub.elsevier.com/retrieve/pii/S1386505614001075

[46] Lee D, de Keizer N, Lau F, Cornet R. Literature review of SNOMED CT use. Journal of the American Medical Informatics Association [Internet]. Feb 1, 2014;**21**(e1):e11-e19. Available from: https://academic.oup.com/jamia/article-lookup/doi/10.1136/amiajnl-2013-001636

[47] Gambarte ML, Osornio AL, Martinez M, Reynoso G, Luna D, de Quiros FGB. A practical approach to advanced terminology services in health information systems. Studies in Health Technology and Informatics [Internet]. 2007;**129**(Pt 1):621-625. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17911791

[48] Spackman KA. Normal forms for description logic expressions of clinical concepts in SNOMED RT. Proceedings of the AMIA Symposium [Internet]. 2001;**2001**:627-631. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2243264&tool=pmcentrez&rendertype=abstract

[49] Hammond WE, Stead WW, Straube MJ, Jelovsek FR. Functional characteristics of a computerized medical record. Methods of Information in Medicine [Internet]. Jul 1980;**19**(3):157-162. Available from: http://www.ncbi.nlm.nih.gov/pubmed/7412563

[50] Chute CG, Elkin PL, Sherertz DD, Tuttle MS. Desiderata for a clinical terminology server. Proceedings of the AMIA Symposium [Internet]. 1999;**2**(1):42-46. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2232621&tool=pmcentrez&rendertype=abstract

[51] Kahn CE, Wang K, Bell DS. Structured entry of radiology reports using world wide web technology. Radiographics. 1996;**16**(3):683-691

[52] The Computer-Based Patient Record [Internet]. Washington, D.C.: National Academies Press; 1997. Available from: http://www.nap.edu/catalog/5306

[53] Masarie FE, Miller RA, Bouhaddou O, Giuse NB, Warner HR. An interlingua for electronic interchange of medical information: Using frames to map between clinical vocabularies. Computers and Biomedical Research [Internet]. Aug 1991;**24**(4):379-400. Available from: http://linkinghub.elsevier.com/retrieve/pii/001048099190035U

[54] Lopez Osornio A, Gambarte ML, Otero C, Gomez A, Martinez M, Soriano E, et al. Desarrollo de un servidor de terminología clínico. 8mo Simp Informática en Salud – 34 JAIIO. 2005;**34**(April):29-43

[55] Luna D, Plazzotta F, Otero C, González Bernaldo de Quirós F, Baum A, Benítez S. Incorporación de tecnologías de la información y de las comunicaciones en el Hospital Italiano de Buenos Aires. Spanish; 2012-01. Serie: Documentos de Proyectos No.459, CEPAL; https://www.youtube.com/watch?v=5Pb\_3q6TQ58&feature=youtu.be

[56] Osornio AL, Luna D, Gambarte ML, Gomez A, Reynoso G, de Quirós FGB. Creation of a local interface terminology to SNOMED CT. Studies in Health Technology and Informatics [Internet]. 2007;**129**(Pt 1):765-769. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17911820

[57] Luna D, Lopez G, Otero C, Mauro A, Casanelli CT, de Quirós FGB. Implementation of interinstitutional and transnational remote terminology services. AMIA Annual Symposium Proceedings [Internet]. 2010;**2010**:482-486. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3041368&tool=pmcentrez&rendertype=abstract

[58] Rodriguez JF, Lede DR, Pérez D, Benítez S, Luna D. Abbreviations System: Preliminary Results of a Satisfaction Study. Cbis. 2016;**2016**:1-6

[59] Williams M d PA, Perez D. Building an Snomed CT content extension with abbreviation support. Systematized Nomenclature of Medicine. SNOMED CT EXPO 2015;**2015**

[60] International Health Terminology Standards Development Organization (IHTSDO). SNOMED CT® Clinical Terms User Guide. January 2010;**2010**:99

[61] Lopez Osornio A, Luna D, Bernaldo de Quiros FG. Creación de un sistema para la codificación automática de una lista de problemas. 5to Simp Informática en Salud – 31 JAIIO [Internet]. 2002;**2002**:31. Available from: http://www.sis.org.ar/sis2002/paperssis/SIS30.pdf


**Chapter 3**


© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **Multivariate-Stepwise Gaussian Classifier (MSGC): A New Classification Algorithm Tested Over Real Disease Data Sets**

DOI: 10.5772/intechopen.74703

#### Alexandre Serra Barreto

Additional information is available at the end of the chapter


#### Abstract

In data mining, classification is the process of assigning one of a set of previously known classes to a new observation. Mathematical algorithms are used intensively for classification. These algorithms infer a generalization from the data in order to classify new cases, or individuals. An algorithm may misclassify an individual if the inference machine cannot discriminate it sufficiently. It is therefore necessary to go further into the analysis of the information provided by the individual, until it can be identified with sufficient confidence as belonging to a class. This chapter develops this idea to improve a certain class of classifiers, using medical data sets to validate the new algorithm proposed here: the Multivariate-Stepwise Gaussian Classifier (MSGC). The results showed that MSGC is at least as competitive as the Gaussian Maximum Likelihood Classifier: MSGC attained the greatest accuracy rate in two of the data sets and obtained identical results in the two remaining data sets. Concerning medical applications, once a classification method has been successfully validated on a particular scope of data, the recommendation would be to use it to support the best diagnosis. Meanwhile, other algorithms could be tested until they prove effective enough to be put into practice.

Keywords: data mining, classification, algorithm, medical diagnosis and prediction

#### 1. Introduction

Mankind has performed classification since ancient times, as a part of daily life and survival. As humanity has evolved, our motivation to classify has become more complex and broad, encompassing a wide variety of fields such as engineering, management, banking, marketing, psychology and medical diagnosis and prediction.


In the context of data mining, classification can be understood as the process of assigning to a new observation (sample) one among a set of previously known classes. The rapid increase in computational processing capacity, coupled with the low cost of storage, has contributed to the greater use of supervised or unsupervised mathematical algorithms for computational classification. In the learning phase of these algorithms, a certain kind of generalization is inferred from the data, so that new cases, or individuals, can be classified by the inference machine.


In the medical field, there are several examples of research that successfully applies computational classification as an aid to medical diagnosis. For instance, the research in [1] applies a multivariate statistical analysis to explore the Dermatology Data Set (available in the UCI data repository [2]) and constructs a classifier, based only on the 12 clinical attributes, as an aid for the first medical consultation and diagnosis of erythemato-squamous dermatological diseases. The results provide enhanced knowledge that can help to enrich dermatological diagnoses made by doctors. The classifier, developed using linear discriminant analysis (LDA), obtains a high mean accuracy rate across the six diseases (83.73% correct classifications). This rate means that patients have a good chance of being treated adequately, while biopsies may still be requested to confirm the diagnosis. A classification algorithm developed in [3] was also tested on the Dermatology Data Set. That study reported mean accuracy rates of 96.2% and, for a modified version of the algorithm, 99.2%. Note that it used all 34 features in the data set (clinical + histological attributes), which certainly gives the classifier more information, since it works knowing the biopsy results. In [4], the Dermatology Data Set is classified by decision trees using all 34 features; the authors reported an error rate of 5.5 ± 1.46, and a modified decision tree based on a genetic algorithm for attribute selection achieved an error rate of 4.2 ± 0.96. In [5], a classification algorithm based on genetic algorithms that discover comprehensible IF-THEN rules is demonstrated; applied to all 34 features in the Dermatology Data Set, it reached a 95% accuracy rate. Many other studies on medical data sets are listed on the UCI data repository website and can be accessed by the reader.

However, occasionally such generalization may not correctly classify an individual if the inference machine is not able to sufficiently discriminate it among the possible classes. Therefore, it is necessary to go more deeply into the analysis of the information provided by the individual being classified, until it can be sufficiently identified as belonging to a class.

This pursuit may be analogous to the efforts made by physicians when performing their crucial diagnoses. Medical theory and practice acknowledge a basic foundation of medicine: no two individuals are alike, either in health or in illness. For this reason, more and more medical guidelines follow this maxim, considering individualities in the midst of large numbers; examples include family physician programs, homeopathy, psychoanalysis, the encouragement of anamnesis rather than quick, machine-driven consultations, and recent considerations involving slow medicine. So as not to lengthen this subject too much, reference is made to the works ([6] p. 5–6, [7] p. 11–12, [8], and [9] p. 3). It would be possible to refer to a series of other initiatives that denote the search for health in its individual fullness, but what is important is that common sense says that such a foundation should also inform statistical methods and artificial intelligence applied to the classification of individuals.

In this context, this chapter seeks to develop this idea for the improvement of a certain class of well-known classifiers; for this purpose, it uses real medical data sets for the validation of the algorithm proposed here.


#### 2. Classifiers based on the assumption that the form for the underlying density function is known

In parametric classification techniques, we learn from data under the assumption that the form of the underlying density function is known. The most common procedure is to consider the normal distribution, as is the case of the Gaussian Maximum Likelihood Classifier (GMLC). Suppose there are c distinct classes. Given a sample vector X<sup>T</sup> = (x<sub>1</sub>, x<sub>2</sub>, …, x<sub>p</sub>) depicting p measurements made on the sample from p attributes, GMLC will assign to X the class h (h = 1, …, c) having the highest likelihood among the classes. GMLC assumes that the data follow the multivariate normal density function:

$$f\left(\mathbf{X}|\boldsymbol{\mu}\_{h},\boldsymbol{\Sigma}\_{h}\right) = \frac{1}{\sqrt{(2\pi)^{p}|\boldsymbol{\Sigma}\_{h}|}} \exp\left[-\frac{1}{2}\left(\mathbf{X}-\boldsymbol{\mu}\_{h}\right)^{T}\boldsymbol{\Sigma}\_{h}^{-1}\left(\mathbf{X}-\boldsymbol{\mu}\_{h}\right)\right].\tag{1}$$

In this equation, μ<sub>h</sub> is the mean vector of class h, Σ<sub>h</sub> is the covariance matrix for class h and |Σ<sub>h</sub>| is the determinant of Σ<sub>h</sub>. Usually, these parameters are not known and must be estimated from training samples. The sample mean is typically the estimate for the density mean, and the covariance matrix is usually estimated via the sample covariance matrix or the maximum likelihood covariance matrix estimate. The sample mean and the maximum likelihood covariance matrix estimates maximize the joint likelihood of training samples, which are assumed to be statistically independent [10].
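The estimation and decision rule just described can be illustrated with a minimal Python sketch. It is not the chapter's implementation; the function names, the toy data and the use of NumPy are assumptions made here for illustration. It estimates the class means and maximum likelihood covariance matrices, evaluates the logarithm of Eq. (1), and assigns the class with the highest likelihood.

```python
import numpy as np

def fit_gaussian_classes(X, y):
    """Estimate the mean vector and ML covariance matrix of each class."""
    params = {}
    for h in np.unique(y):
        Xh = X[y == h]
        mu = Xh.mean(axis=0)
        centered = Xh - mu
        # Maximum likelihood estimate: divide by n_h, not n_h - 1.
        sigma = centered.T @ centered / Xh.shape[0]
        params[int(h)] = (mu, sigma)
    return params

def log_density(x, mu, sigma):
    """Logarithm of the multivariate normal density in Eq. (1)."""
    p = mu.shape[0]
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)
    quad = diff @ np.linalg.solve(sigma, diff)  # (x - mu)^T Sigma^-1 (x - mu)
    return -0.5 * (p * np.log(2.0 * np.pi) + logdet + quad)

def gmlc_predict(x, params):
    """Assign x to the class whose density (likelihood) is highest."""
    return max(params, key=lambda h: log_density(x, *params[h]))

# Toy data (hypothetical): two well-separated classes in two dimensions.
X_toy = np.array([[0.0, 0.0], [1.0, 0.2], [0.2, 1.0], [1.0, 1.0],
                  [5.0, 5.0], [6.0, 5.2], [5.2, 6.0], [6.0, 6.0]])
y_toy = np.array([0, 0, 0, 0, 1, 1, 1, 1])
params = fit_gaussian_classes(X_toy, y_toy)
print(gmlc_predict(np.array([0.5, 0.5]), params))
print(gmlc_predict(np.array([5.5, 5.5]), params))
```

Working with the log-density avoids numerical underflow and does not change which class attains the maximum.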

The GMLC formulation described above is essentially the basis of some well-known classifiers such as linear discriminant analysis (LDA), quadratic discriminant analysis (QDA) and regularized discriminant analysis (RDA), which are trustworthy classifiers based on GMLC computations that reach good results in several data situations. A basic difference between these three classifiers is that LDA assumes that each class h comes from a normal distribution with a class-specific mean vector and a common global covariance matrix, whereas QDA assumes as many covariance matrices (Σ<sub>h</sub>) as there are classes (h). RDA provides a mix of the two by means of tuning parameters (λ, γ), which, once tuned, provide an optimal mix of the sample covariance matrix, the global covariance matrix and the identity matrix; for instance, if λ = 0 and γ = 0, RDA represents QDA, and if λ = 1 and γ = 0, RDA represents LDA. It is important to note that none of these methods is considered better than the others. For instance, in [11], it is possible to see that the performance of a classification method varies according to the database considered. The reader may refer to [12], ([13] p. 331–335) and [14] for the foundations of LDA, QDA and RDA.
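The role of the two tuning parameters can be sketched in a few lines of Python. This is a simplified convex-combination form assumed here for illustration; published RDA formulations typically also weight the mix by class sample counts, and the function name `rda_covariance` is hypothetical.

```python
import numpy as np

def rda_covariance(sigma_h, sigma_pooled, lam, gamma):
    """Simplified RDA-style covariance: blend class, pooled and identity matrices."""
    p = sigma_h.shape[0]
    # lam moves from the class covariance (QDA) towards the pooled one (LDA).
    mixed = (1.0 - lam) * sigma_h + lam * sigma_pooled
    # gamma shrinks the result towards a scaled identity matrix.
    scaled_identity = (np.trace(mixed) / p) * np.eye(p)
    return (1.0 - gamma) * mixed + gamma * scaled_identity

sigma_class = np.array([[2.0, 0.5], [0.5, 1.0]])  # hypothetical class covariance
sigma_global = np.eye(2)                          # hypothetical pooled covariance
print(rda_covariance(sigma_class, sigma_global, 0.0, 0.0))  # QDA end point
print(rda_covariance(sigma_class, sigma_global, 1.0, 0.0))  # LDA end point
```

With γ = 0 the two end points reproduce the class-specific and the common covariance matrices, matching the QDA and LDA cases mentioned in the text.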

These methods nevertheless have their shortcomings. Barreto [11] lists those most commonly identified in the literature. The mean and covariance estimates are optimal only asymptotically and can produce lower classification accuracy when the training sample is small; in fact, unless many more than p + 1 samples are available, the true covariance matrix is poorly estimated. The assumption that the form of the underlying density function is known may also be questionable in most applications. Furthermore, the method involves inverting the estimate of Σ<sub>h</sub>, and in some cases this matrix can be ill-conditioned or even singular, making matrix inversion unfeasible. Despite the research proposing improvements, specifically concerning covariance matrix estimation, of which RDA is a legitimate representative, these approaches still operate under the key assumption that the form of the underlying density function is known.
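The small-sample problem can be seen in a short numerical sketch. The data below are synthetic and the helper name `ml_covariance` is an assumption made for illustration: with fewer than p + 1 samples, the estimated covariance matrix cannot have full rank, so it cannot be inverted.

```python
import numpy as np

def ml_covariance(X):
    """Maximum likelihood covariance estimate of the rows of X."""
    centered = X - X.mean(axis=0)
    return centered.T @ centered / X.shape[0]

rng = np.random.default_rng(0)
p = 5
few = rng.normal(size=(4, p))     # n = 4, fewer than p + 1 samples
many = rng.normal(size=(200, p))  # many more than p + 1 samples

rank_few = np.linalg.matrix_rank(ml_covariance(few))
rank_many = np.linalg.matrix_rank(ml_covariance(many))
print(rank_few, rank_many)  # the first rank is below p, so that matrix is singular
```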


Beyond these problems, this chapter discusses the fact that these methods maximize p(X|h)·p(h) to predict the class for the data vector X, that is, p(h|X). But p(X|h) is calculated on the basis of the density in Eq. (1), which involves the calculation of the well-known Mahalanobis distance from the multivariate mean, (X − μ<sub>h</sub>)<sup>T</sup> Σ<sub>h</sub><sup>−1</sup> (X − μ<sub>h</sub>), which is a positive measure. Formally, the Mahalanobis distance represents a dimensionless multivariate measure of the distance between the multivariate vector X, with p dimensions, and the class mean μ<sub>h</sub>, which has the same p dimensions. The smaller the distance with respect to a specific class mean, say μ<sub>c</sub>, the greater the probability that X belongs to class c. Therefore, by inspection of Eq. (1), it is easy to see that this density will have problems classifying a sample whose Mahalanobis distances to the means of the classes involved are close to one another, a particular situation that induces misclassification errors.
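One way to make this concrete, offered as a sketch rather than as part of the chapter's own derivation, is to take the logarithm of Eq. (1). Writing M<sub>h</sub> (a symbol introduced here only for illustration) for the quadratic form (X − μ<sub>h</sub>)<sup>T</sup> Σ<sub>h</sub><sup>−1</sup> (X − μ<sub>h</sub>), the log-density and the log-ratio between two classes are:

$$\ln f\left(\mathbf{X}|\boldsymbol{\mu}\_{h},\boldsymbol{\Sigma}\_{h}\right) = -\frac{p}{2}\ln(2\pi) - \frac{1}{2}\ln|\boldsymbol{\Sigma}\_{h}| - \frac{1}{2}M\_{h}$$

$$\ln\frac{f\left(\mathbf{X}|\boldsymbol{\mu}\_{1},\boldsymbol{\Sigma}\_{1}\right)}{f\left(\mathbf{X}|\boldsymbol{\mu}\_{2},\boldsymbol{\Sigma}\_{2}\right)} = -\frac{1}{2}\ln\frac{|\boldsymbol{\Sigma}\_{1}|}{|\boldsymbol{\Sigma}\_{2}|} - \frac{1}{2}\left(M\_{1} - M\_{2}\right)$$

When M<sub>1</sub> and M<sub>2</sub> are close, the last term nearly vanishes. The decision then rests almost entirely on the determinant term, and if the two covariance matrices are similar the two likelihoods are nearly equal, so a small perturbation of the sample can flip the assigned class.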

The solution proposed is to benefit both from the training set and from the information provided by the new sample to be classified. In this way, the classifier can take into account new information that improves the overall generalization provided by these traditional methods. The proposal of this chapter is therefore to make the classification algorithm able to identify and treat the samples that present close values for their Mahalanobis distances, until each such sample reveals its actual class more clearly to the Gaussian classifier.

#### 3. The Multivariate-Stepwise Gaussian Classifier: A new classification algorithm

What is proposed is a new classification method: the Multivariate-Stepwise Gaussian Classifier (MSGC). MSGC builds on the GMLC method described above. Its contribution is to treat a sample individually if it presents close values for its Mahalanobis distances with respect to the class means involved in the classification, so that the discrimination made by the classifier is, in principle, inconclusive. In this case, the algorithm employs dimensionality reduction, disregarding one by one, in a stepwise process, the p dimensions involved in the calculation of the Mahalanobis distances until the calculated distances are dissimilar enough to give greater accuracy (likelihood) to the classification made by the method.

The key question is: what is the best numerical dissimilarity between the Mahalanobis distances obtained from a sample and the class means so that its classification is optimal? It can be anticipated that the answer depends on the database in question, which requires a prior calibration of the method proposed here.
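The calibration reported later in the chapter considers values of Δ from 0 to 1.1 in steps of 0.1, keeps the value with the greatest cross-validated accuracy, and breaks ties in favour of the lower Δ. A minimal Python sketch of that selection rule follows. The names `calibrate_delta` and `accuracy_for` are hypothetical, and the accuracy callback stands in for a cross-validation run that is not shown here.

```python
def calibrate_delta(accuracy_for, deltas):
    """Return (delta, accuracy) with the highest accuracy; ties go to the lowest delta."""
    best_delta, best_acc = None, float("-inf")
    for delta in sorted(deltas):
        acc = accuracy_for(delta)
        if acc > best_acc:  # strict ">" keeps the lowest delta on ties
            best_delta, best_acc = delta, acc
    return best_delta, best_acc

# Candidate grid: 0.0, 0.1, ..., 1.1.
grid = [round(0.1 * k, 1) for k in range(12)]

# Placeholder accuracies (made up for the example, not results from the chapter).
toy_scores = {0.3: 0.80, 0.5: 0.80}
chosen_delta, chosen_acc = calibrate_delta(lambda d: toy_scores.get(d, 0.70), grid)
print(chosen_delta, chosen_acc)
```

In this toy run two candidates tie at the top accuracy, and the lower one is returned.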

#### 3.1. Description of the algorithm


Given c predefined classes and n sample vectors X<sub>i</sub><sup>T</sup> = (x<sub>i1</sub>, x<sub>i2</sub>, …, x<sub>ip</sub>), i = 1, …, n, depicting p measurements (dimensions) made on the sample from p attributes, x<sub>ij</sub> denotes the jth measurement, j = 1, …, p, for the ith sample. Let X be a data matrix of type (n, p) with the measurements x<sub>ij</sub> as elements, j = 1, …, p and i = 1, …, n.

The MSGC algorithm functions as described below, considering X as a training set [with p attributes (variables), n instances (training samples) and c classes], x<sub>hij</sub> as an element of X belonging to class h, h = 1, …, c, and μ<sub>h</sub>, h = 1, …, c, as the class mean vector for the training set. The matrix U of type (ν, p) is a new unknown set [with measurements u<sub>sj</sub>, s = 1, …, ν, j = 1, …, p, as elements, with p attributes (variables), ν cases (unknown samples) and c classes, each row of U being a sample vector U<sub>s</sub><sup>T</sup> = (u<sub>s1</sub>, u<sub>s2</sub>, …, u<sub>sp</sub>)]; these unknown sample vectors are to be classified by the algorithm.

Also, consider the density f(X|μ<sub>h</sub>, Σ<sub>h</sub>) and its complete description in Eq. (1). In what follows, MD<sub>hs</sub> refers to the Mahalanobis distance for U<sub>s</sub><sup>T</sup> = (u<sub>s1</sub>, u<sub>s2</sub>, …, u<sub>sp</sub>) in relation to the class mean vector μ<sub>h</sub>, and Δ is the threshold on the numerical dissimilarity between the Mahalanobis distances (MD<sub>hs</sub>, h = 1, …, c) calculated from a sample and the class means μ<sub>h</sub>. The Multivariate-Stepwise Gaussian Classifier (MSGC) algorithm pseudocode is (considering c = 2):

```
(0) begin algorithm (initialize variables and counters, s = 1);
(1) while s < ν + 1 do:
    (1.1) j = p;
    (1.2) while j > 0 do:
        (1.2.1) calculate the Mahalanobis distances MD_hs for U_s^T = (u_s1, u_s2, …, u_sj)
                in relation to each class mean vector μ_h of the training data, h = 1, 2;
        (1.2.2) calculate f(U_s | μ_h, Σ_h) for each class h, h = 1, 2;
        (1.2.3) if j = p then do:
                    ff_h = f(U_s | μ_h, Σ_h) for each class h, h = 1, 2;
                end if;
        (1.2.4) if |MD_1s − MD_2s| < Δ then do:
                    j = j − 1;
                else
                    j = 0;
                end if;
    end do (referring to step 1.2);
    (1.3) if |MD_1s − MD_2s| < Δ then do:
        (1.3.1) if ff_1 > ff_2 then do:
                    assign to sample U_s the class h = 1;
                end if;
        (1.3.2) if ff_2 > ff_1 then do:
                    assign to sample U_s the class h = 2;
                end if;
          else
        (1.3.3) if f(U_s | μ_1, Σ_1) > f(U_s | μ_2, Σ_2) then do:
                    assign to sample U_s the class h = 1;
                end if;
        (1.3.4) if f(U_s | μ_2, Σ_2) > f(U_s | μ_1, Σ_1) then do:
                    assign to sample U_s the class h = 2;
                end if;
          end if;
    (1.4) s = s + 1;
end do (referring to step 1);
end of the algorithm.
```

Note that, for simplicity of exposition, the above pseudocode was written for c = 2, but steps (1.2.4) and (1.3) can be extended to any value of c. Another important observation is that if Δ is set to zero in steps (1.2.4) and (1.3), the new algorithm functions strictly as a GMLC.
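The pseudocode can be turned into a short Python sketch for one sample and two classes. This is an illustrative reading, not the author's R implementation, and it rests on several assumptions: the closeness test compares the absolute difference of the two distances with Δ (which is what makes Δ = 0 reduce to GMLC), the first j attributes are kept at each step, log-densities replace densities (the comparison is unchanged), and an exact tie, which the pseudocode leaves unassigned, falls to class 2. The function names and toy parameters are hypothetical.

```python
import numpy as np

def distance_and_log_density(u, mu, sigma):
    """Quadratic form (u - mu)^T Sigma^-1 (u - mu) and the log of Eq. (1)."""
    diff = u - mu
    md = float(diff @ np.linalg.solve(sigma, diff))
    _, logdet = np.linalg.slogdet(sigma)
    logf = -0.5 * (len(mu) * np.log(2.0 * np.pi) + logdet + md)
    return md, logf

def msgc_classify(u, mus, sigmas, delta):
    """Classify one sample u as class 1 or 2, following steps (1.1) to (1.3)."""
    p = len(u)
    j = p
    while j > 0:
        # Steps (1.2.1)-(1.2.2): use only the first j attributes.
        stats = [distance_and_log_density(u[:j], mus[h][:j], sigmas[h][:j, :j])
                 for h in (0, 1)]
        md = [s[0] for s in stats]
        logf = [s[1] for s in stats]
        if j == p:
            full_logf = list(logf)      # step (1.2.3): keep full-dimension densities
        if abs(md[0] - md[1]) < delta:  # step (1.2.4): distances still too close
            j -= 1
        else:
            j = 0
    # Step (1.3): if never resolved, fall back to the full-dimension densities.
    chosen = full_logf if abs(md[0] - md[1]) < delta else logf
    return 1 if chosen[0] > chosen[1] else 2  # assumption: exact ties go to class 2

# Toy parameters (hypothetical): identity covariances, means (0, 0) and (2, -2).
mus = [np.array([0.0, 0.0]), np.array([2.0, -2.0])]
sigmas = [np.eye(2), np.eye(2)]
u = np.array([1.5, -0.4])
print(msgc_classify(u, mus, sigmas, delta=0.0))  # behaves as GMLC
print(msgc_classify(u, mus, sigmas, delta=0.5))  # drops the last attribute first
```

In this toy case the two full-dimension distances differ by about 0.4, so with Δ = 0.5 the sketch discards the second attribute and decides on the first one alone, whereas with Δ = 0 it returns the plain GMLC decision.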

A 10-fold cross validation is widely used in the related literature like [13, 30] to present a more


To calibrate the MSGC algorithm and define the best value of Δ to be applied to each validation block, a preliminary 10-fold cross-validation was performed for each of the 10 training blocks. In the calibration process, the chosen criterion was the greatest accuracy rate reached over the 10-fold cross-validation. Values of Δ from 0 up to 1.1 were considered, in increments of 0.1 at each iteration. If there was a tie during calibration, the lower Δ was chosen. Thereafter, MSGC was submitted to each validation block (once adjusted with the best Δ regarding the corresponding training block and its proper process of

For comparison GMLC was also implemented in the R program and applied to exactly the

Pima Indians Diabetes Data Set comprises 768 entries (8 medical and demographical attributes and a class variable), 550 of the entries classified as 0 and 268 classified as category 1. Attribute information: (1) number of times pregnant, (2) plasma glucose concentration a 2 hours in an oral glucose tolerance test, (3) diastolic blood pressure (mm Hg), (4) triceps skin fold thickness (mm), (5) 2-hour serum insulin (mu U/ml), (6) body mass index (weight in kg/(height in m)^2), (7) diabetes pedigree function, (8) age (years) and (9) class variable (0 or 1). Ten mutually exclusive folds were randomly sampled from Pima Indians Diabetes Data Set (9 validation folds including 77 entries and the tenth fold comprising 75). The key importance involved in the classification of Pima Indians Diabetes Data Set lies in the possibility of diagnosing diabetes disease, considering the numerical attributes, since class 1 is interpreted as tested positive

Breast Cancer Winsconsin (Original) Data Set comprises 699 entries (9 attributes and a class variable), 458 of them classified as category 2 "benign" and 241 classified as category 4 "malignant" (recoded as 0 and 1, respectively). Attribute information: (1) sample code number (id number), (2) clump thickness: 1–10, (3) uniformity of cell size: 1–10, (4) uniformity of cell shape: 1–10, (5) marginal adhesion: 1–10, (6) single epithelial cell size: 1–10, (7) bare nuclei: 1–10, (8) bland chromatin: 1–10, (9) normal nucleoli: 1–10, (10) mitoses: 1–10 and (11) class: (2 for benign and 4 for malignant). Ten mutually exclusive folds were randomly sampled from the Breast Cancer Winsconsin (Original) Data Set (9 validation folds including 69 entries and the tenth fold comprising 62). Sixteen original entries with missing data were removed. As for Breast Cancer Wisconsin (Original) Data Set, this data set can be used to predict the severity (benign or malignant) of a clump of cells in relation to the nine numerical

Haberman's Survival Data Set comprises 306 entries (three attributes and a class variable), 81 of them classified as category 2 and the remaining 225 classified as category 1 (recoded as 1 and 0, respectively). Attribute information: (1) age of patient at time of operation (numerical), (2) patient's year of operation (year-1900, numerical), (3) number of positive axillary nodes

same blocs generated by the depicted 10-fold cross-validation process.

4.2. Presentation of data sets and comparison of classification results

stable estimate of the performance of a classification method. Then, it was used here.

10-fold cross-validation).

for diabetes.

attributes.

Finally, it should be added that recent literature on classifiers that are in some way based on the GMLC method makes no mention of an algorithm that works like MSGC; see [15–28].

The Multivariate-Stepwise Gaussian Classifier (MSGC) algorithm was implemented by means of The R Program for Statistical Computing [29] (version 2.14.0).

#### 4. Comparing MSGC with traditional GMLC method

#### 4.1. Methodology

Some real data sets from the UCI repository [2] (available from: http://archive.ics.uci.edu/ml/datasets/) are used to compare MSGC with the GMLC method.

A 10-fold cross-validation is widely used in the related literature, for example in [13, 30], to obtain a more stable estimate of the performance of a classification method, so it was used here.
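One simple way to form ten mutually exclusive folds is sketched below in Python. The chunking rule (folds of size ceil(n/10), with whatever remains going to the last fold) is an assumption of this sketch: it happens to reproduce the fold sizes reported in Section 4.2, for example nine folds of 77 entries and one of 75 for 768 entries, but the chapter does not state which rule was used. The names `make_folds` and `train_validation_split` and the `seed` parameter are hypothetical.

```python
import math
import random


def make_folds(n_entries, k=10, seed=0):
    """Shuffle the indices 0..n_entries-1 and cut them into k disjoint folds.

    Every fold has ceil(n_entries / k) indices except the last, which takes
    the remainder. For very small n_entries some trailing folds can be empty.
    """
    indices = list(range(n_entries))
    random.Random(seed).shuffle(indices)
    size = math.ceil(n_entries / k)
    return [indices[i * size:(i + 1) * size] for i in range(k)]


def train_validation_split(folds, i):
    """Fold i is the validation bloc; the remaining folds form the training bloc."""
    validation = folds[i]
    training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
    return training, validation
```

Calling `train_validation_split(folds, i)` for i = 0, ..., 9 yields the ten training/validation pairs of one 10-fold cross-validation.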

To calibrate the MSGC algorithm and choose the value of Δ to be applied to each validation bloc, a preliminary 10-fold cross-validation was performed within each of the 10 training blocs. The calibration criterion was the greatest accuracy rate reached over this inner 10-fold cross-validation. Values of Δ from 0 up to 1.1 were considered, in increments of 0.1. If there was a tie during calibration, the lowest Δ was chosen. MSGC was then applied to each validation bloc, using the best Δ found for the corresponding training bloc.
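The selection rule described above (grid from 0 to 1.1 in steps of 0.1, highest accuracy wins, ties go to the lower Δ) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's R code: `cv_accuracy` is a hypothetical callable that returns the mean accuracy of the inner 10-fold cross-validation on one training bloc for a given Δ, and the function names are invented for this example.

```python
def delta_grid(start=0.0, stop=1.1, step=0.1):
    """Candidate values 0.0, 0.1, ..., 1.1, rounded to avoid float drift."""
    n_steps = round((stop - start) / step)
    return [round(start + i * step, 10) for i in range(n_steps + 1)]


def calibrate_delta(cv_accuracy, grid=None):
    """Return (best_delta, best_accuracy) for one training bloc.

    cv_accuracy: hypothetical callable, delta -> mean inner-CV accuracy.
    The grid is scanned in increasing order and a candidate replaces the
    current best only if it is strictly better, so ties keep the lower delta.
    """
    grid = delta_grid() if grid is None else grid
    best_delta, best_acc = None, float("-inf")
    for delta in sorted(grid):
        acc = cv_accuracy(delta)
        if acc > best_acc:
            best_delta, best_acc = delta, acc
    return best_delta, best_acc
```

Running `calibrate_delta` once per training bloc gives one Δ per bloc, which is then used to classify the corresponding validation bloc.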

For comparison, GMLC was also implemented in R and applied to exactly the same blocs generated by the 10-fold cross-validation process described above.
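Once both classifiers have been run on the same validation blocs, their per-bloc accuracy rates can be summarized by a mean and a standard error, which is the form in which results are reported in Table 2. The Python sketch below assumes the usual definition of the standard error (sample standard deviation divided by the square root of the number of blocs); the chapter does not spell out the formula. It also computes sensitivity, specificity and precision for a binary problem. All function names are hypothetical.

```python
import math
import statistics


def accuracy(y_true, y_pred):
    """Fraction of entries whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_and_se(values):
    """Mean and standard error (sample sd / sqrt(n)); needs at least two values."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and precision from the 2x2 confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # An undefined ratio (zero denominator) is reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }
```

In use, `accuracy` would be evaluated once per validation bloc and the ten resulting values passed to `mean_and_se`, separately for MSGC and for GMLC.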

#### 4.2. Presentation of data sets and comparison of classification results


Pima Indians Diabetes Data Set comprises 768 entries (8 medical and demographic attributes and a class variable), 500 of the entries classified as category 0 and 268 classified as category 1. Attribute information: (1) number of times pregnant, (2) plasma glucose concentration at 2 hours in an oral glucose tolerance test, (3) diastolic blood pressure (mm Hg), (4) triceps skin fold thickness (mm), (5) 2-hour serum insulin (mu U/ml), (6) body mass index (weight in kg/(height in m)^2), (7) diabetes pedigree function, (8) age (years) and (9) class variable (0 or 1). Ten mutually exclusive folds were randomly sampled from the Pima Indians Diabetes Data Set (9 validation folds including 77 entries and the tenth fold comprising 75). The key importance of classifying the Pima Indians Diabetes Data Set lies in the possibility of diagnosing diabetes from the numerical attributes, since class 1 is interpreted as tested positive for diabetes.

Breast Cancer Wisconsin (Original) Data Set comprises 699 entries (9 attributes and a class variable), 458 of them classified as category 2 "benign" and 241 classified as category 4 "malignant" (recoded as 0 and 1, respectively). Attribute information: (1) sample code number (id number), (2) clump thickness: 1–10, (3) uniformity of cell size: 1–10, (4) uniformity of cell shape: 1–10, (5) marginal adhesion: 1–10, (6) single epithelial cell size: 1–10, (7) bare nuclei: 1–10, (8) bland chromatin: 1–10, (9) normal nucleoli: 1–10, (10) mitoses: 1–10 and (11) class: (2 for benign and 4 for malignant). Sixteen original entries with missing data were removed. Ten mutually exclusive folds were randomly sampled from the remaining entries of the Breast Cancer Wisconsin (Original) Data Set (9 validation folds including 69 entries and the tenth fold comprising 62). This data set can be used to predict the severity (benign or malignant) of a clump of cells in relation to the nine numerical attributes.

Haberman's Survival Data Set comprises 306 entries (three attributes and a class variable), 81 of them classified as category 2 and the remaining 225 classified as category 1 (recoded as 1 and 0, respectively). Attribute information: (1) age of patient at time of operation (numerical), (2) patient's year of operation (year-1900, numerical), (3) number of positive axillary nodes detected (numerical) and (4) survival status (class attribute), 1 = the patient survived 5 years or longer, 2 = the patient died within 5 years. Ten mutually exclusive folds were randomly sampled from Haberman's Survival Data Set (9 validation folds including 31 entries and the tenth fold comprising 27). The main interest in classifying Haberman's Survival Data Set is the attempt to predict the 5-year survival of patients undergoing breast cancer surgery, taking into account their age at the time of surgery and the number of positive axillary nodes detected.

Mammographic Mass Data Set presents discrimination of benign and malignant mammographic masses based on BI-RADS attributes and the patient's age. It comprises 961 entries of data (five attributes and a class variable). The class associated with each record is the field 'severity,' 0 or 1. Attribute information: (1) BI-RADS assessment: 1–5 (ordinal), (2) age: patient's age in years (integer), (3) shape: mass shape: round = 1, oval = 2, lobular = 3, irregular = 4 (nominal), (4) margin: mass margin: circumscribed = 1, microlobulated = 2, obscured = 3, ill-defined = 4, spiculated = 5 (nominal), (5) density: mass density: high = 1, iso = 2, low = 3, fat-containing = 4 (ordinal) and (6) severity: benign = 0 or malignant = 1 (binominal). A total of 131 original entries with missing data were removed. Ten mutually exclusive folds were randomly sampled from the Mammographic Mass Data Set (all of them with 83 entries). In relation to the Mammographic Mass Data Set, [2] states that "Mammography is the most effective method for breast cancer screening available today. However, the low positive predictive value of breast biopsy resulting from mammogram interpretation leads to approximately 70% unnecessary biopsies with benign outcomes. (…) This data set can be used to predict the severity (benign or malignant) of a mammographic mass lesion from BI-RADS attributes and the patient's age."

To illustrate the 10-fold cross-validation process for MSGC calibration, Table 1 summarizes the values of Δ that gave the greatest accuracy rate (%) for each of the 10 training blocs. Recall that all the data sets considered have two classes. Afterward, these best settings for Δ (in Table 1) were applied to steps (1.2.4) and (1.3) of the MSGC algorithm in order to classify the corresponding validation blocs.

Table 1 shows that the best value was Δ = 0.0 for every training bloc of the Breast Cancer Wisconsin (Original) Data Set and Haberman's Survival Data Set, so for these two data sets MSGC works optimally as a traditional GMLC.

| DATA | Bloc 1 | Bloc 2 | Bloc 3 | Bloc 4 | Bloc 5 | Bloc 6 | Bloc 7 | Bloc 8 | Bloc 9 | Bloc 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| PI | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.1 |
| BR | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| HB | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| MA | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 |

DATA SETS: PI = PIMA; BR = BREAST; HB = HABERMAN'S; MA = MAMMOGRAPHIC.

Table 1. Summary of the 10-fold cross-validation calibration process - The Δ settings giving best accuracy rate concerning training blocs.

Table 2 shows synoptically the accuracy rate mean and standard error for all data sets and methods (the best results for each data set are highlighted in bold). Both methods were proficient in classifying data and obtained relatively similar results.

From Table 2, we can see that MSGC attained the greatest accuracy rate in two out of four data sets (PIMA and MAMMOGRAPHIC). For HABERMAN'S and BREAST, both methods achieved identical results, since for these data sets MSGC was set with Δ = 0.0. As noted above, the source of the Mammographic Mass Data Set [2] reports that, because of the low positive predictive value of the exam, about 70% of the biopsies are actually unnecessary, as they end up showing benign lesions. With the practical use of classification algorithms such as MSGC and GMLC, a favorable new situation is achieved, with levels of diagnostic accuracy of 80% or more. This rate means that patients can be treated adequately, while biopsies may be requested as a secondary measure.

Note that accuracy rate was chosen as the criterion for comparison, but in medicine the physician sometimes needs to know other criteria such as sensitivity, specificity or precision; in this case, the data analyst should take care to also calculate them from the algorithm results.

We should also remark on a positive aspect: these results for the MSGC algorithm reach beyond GMLC itself. Since GMLC is the basis of other traditional classification methods (namely RDA, QDA and LDA), an improvement made in GMLC, such as the one obtained through the MSGC method, will probably imply improvements in performance for RDA, QDA and LDA as well. Future research should verify this.

| DATA | MSGC | GMLC |
|---|---|---|
| PIMA | **73.67 (2.05)** | 73.41 (2.25) |
| BREAST | 94.92 (0.75) | 94.92 (0.75) |
| HABERMAN'S | 75.10 (2.42) | 75.10 (2.42) |
| MAMMOGRAPHIC | **80.36 (1.27)** | 80.00 (1.11) |

SE: Standard error for accuracy rate mean.

Table 2. Classification results for real data sets: accuracy rate mean % (SE).

#### 5. Conclusion

A new classification algorithm is presented in this chapter: The Multivariate-Stepwise Gaussian Classifier (MSGC).

MSGC theoretically works on the basis of the Gaussian Maximum Likelihood Classifier (GMLC) method. Its contribution is to give individual treatment to a sample whose Mahalanobis distances to the class means involved in the classification are close to one another, so that the discrimination made by the classifier is, in principle, inconclusive. In this case, MSGC employs dimensionality reduction, disregarding one by one, in a stepwise process, the p dimensions involved in the calculation of the Mahalanobis distances until the calculated distances are dissimilar enough to give greater accuracy to the classification made by the base method (GMLC).

For better performance, MSGC may be calibrated beforehand by means of a training set. A 10-fold cross-validation process was used to calibrate the algorithm.

MSGC was applied to data classification and its performance was compared with the traditional GMLC method on four real medical data sets available in the UCI data repository. These data represent a range of different types of data dependence structure and dimensionality. The results showed that the performance of the MSGC algorithm is at least as competitive as that of GMLC. MSGC attained the greatest accuracy rate in two of the data sets (PIMA and MAMMOGRAPHIC). For the HABERMAN'S and BREAST data sets, both methods achieved identical results. It was concluded that MSGC can be used as an effective classification tool in a wide range of data sets.

The presented results for the MSGC algorithm reach beyond GMLC itself. Since GMLC is the basis of other traditional classification methods (namely RDA, QDA and LDA), an improvement made in GMLC, such as the one obtained through the MSGC algorithm, will probably imply improvements in performance for RDA, QDA and LDA as well.

After reaching these conclusions, an additional discussion arises. With the emergence of big data, a robust successor to data mining that grew out of the exponential development of computers and storage media since the 1990s, there has been a tendency to consider the intensive use of multiple algorithms simultaneously, in supervised or unsupervised approaches, to analyze data and discover patterns. This certainly makes sense since, as already mentioned in this chapter, no single classification method or algorithm is better than all others. Beyond the greater robustness or scalability of some methods over others, what concretely exists is a dependence of the results on the target database.

Therefore, in this context, and considering a matter as important as clinical medicine, once a classification method has been tested and successfully validated on a particular scope of data, the recommended course would be to use it to obtain the best diagnosis. Meanwhile, if possible, known or new algorithms could be tested on data for various diseases and symptoms until they prove to be robust and effective enough to be put into medical practice.

Finally, it is important to remark that a mathematical classifier serves as an aid to the crucial medical diagnosis made by the physician.

#### Acknowledgements

The author would like to thank the UCI Machine Learning Repository and the data donors for putting real data sets at the disposal of the scientific community and would also like to thank The R Foundation for Statistical Computing and its contributors for developing and making the R program available to the public. The Breast Cancer Wisconsin (Original) Data Set was obtained from the University of Wisconsin Hospitals, Madison, from Dr. William H. Wolberg, and the author would like to thank him.

#### Nomenclature

| Abbreviation | Definition |
|---|---|
| GMLC | Gaussian Maximum Likelihood Classifier |
| LDA | linear discriminant analysis |
| MSGC | Multivariate-Stepwise Gaussian Classifier |
| QDA | quadratic discriminant analysis |
| RDA | regularized discriminant analysis |

#### Author details

Alexandre Serra Barreto

Address all correspondence to: alexsbdr@gmail.com

Ministry of Finance (Brazil), Brasília-DF, Brazil

#### References

[1] Barreto AS. Multivariate statistical analysis for dermatological disease diagnosis. In: Proceedings of the IEEE International Conference On Biomedical And Health Informatics (BHI '2014); 1-4 June 2014; Valencia. New York: IEEE; 2014. p. 500-504

[2] Dua D, Karra Taniskidou E. UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science. 2017. http://archive.ics.uci.edu/ml

[3] Demiroz G, Guvenir HA, Ilter N. Learning differential diagnosis of erythemato-squamous diseases using voting feature intervals. Artificial Intelligence in Medicine. 1998;13(3):147-165. DOI: 10.1016/S0933-3657(98)00028-1

[4] Pappa GL, Freitas AA, Kaestner CAA. Attribute selection with a multi-objective genetic algorithm. In: Proceedings of 16th Brazilian Symposium on Artificial Intelligence (SBIA); 11–14 November, 2002; Recife: Springer; 2002. p. 280-290

[5] Fidelis MV, Lopes HS, Freitas AA. Discovering comprehensible classification rules with a genetic algorithm. In: Proceedings of the 2000 Congress on Evolutionary Computation (CEC00). 16-19 July, 2000; La Jolla. New York: IEEE; 2000. p. 805-810

[6] Grandgeorge D. The Spirit of Homeopathic Medicines. Berkeley: North Atlantic Books; 1998

[7] Freud S. The Essentials of Psychoanalysis. London: Penguin Books; 1986

[8] Ali N, Khuwaja A, Kausar S, Nanji K. Patients' evaluations of family practice care and attributes of a good family physician. Radcliffe Publishing, Quality in Primary Care. 2012;20:375-383

[9] Rakel R. Family physician. In: Rakel R, Rakel D, editors. Textbook of Family Medicine. 8th ed. Philadelphia, PA: Elsevier Saunders; 2011. p. 1-14. DOI: 10.1016/B978-1-4377-1160-8.10001-6.ch1

[10] Hoffbeck JP, Landgrebe DA. Covariance matrix estimation and classification with limited training data. Pattern Analysis and Machine Intelligence. 1994;18(7):763-767. DOI: 10.1109/34.506799

[11] Barreto AS. Weighted correlation matrix similarity: A new classification algorithm. In: Proceedings of the IADIS European Conference on Data Mining 2012: Part of the IADIS Multi Conference on Computer Science and Information Systems 2012 (IADIS DATA MINING 2012); 17–23 July 2012; Lisbon. Lisbon: IADIS; 2012. p. 79-90. ISBN: 978-972-8939-69-4

[12] Friedman JH. Regularized discriminant analysis. Journal of the American Statistical Association. 1989;84(405):165-175. DOI: 10.1080/01621459.1989.10478752

[13] Venables WN, Ripley BD. Modern Applied Statistics with S. 4th ed. New York: Springer; 2002. 495 p. ISBN: 0-387-95457-0

[14] Rpubs. Classification: Linear Discriminant Analysis [Internet]. 2014. Available from: https://rpubs.com/ryankelly/LDA-QDA [Accessed: 2017-12-24]

[15] Skolidis G, Sanguinetti G. Bayesian multitask classification with Gaussian process priors. IEEE Transactions on Neural Networks. 2011;22(12):2011-2021. DOI: 10.1109/TNN.2011.2168568

[16] Maugis C, Celeux G, Martin-Magniette M-L. Variable selection in model-based discriminant analysis. Journal of Multivariate Analysis. 2011;102(10):1374-1387. DOI: 10.1016/j.jmva.2011.05.004

[17] Kim H-C, Ghahramani Z. Bayesian Gaussian process classification with the EM-EP algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2006;28(12):1948-1959. DOI: 10.1109/TPAMI.2006.238

[18] Opper M, Winther O. Gaussian processes for classification: Mean-field algorithms. Neural Computation. 2000;12(11):2655-2684. DOI: 10.1162/089976600300014881

[19] Ye J, Janardan R, Li Q, Park H. Feature reduction via generalized uncorrelated linear discriminant analysis. IEEE Transactions on Knowledge and Data Engineering. 2006;18(10):1312-1321. DOI: 10.1109/TKDE.2006.160

[20] Guo P, Jia Y, Lyu MR. A study of regularized Gaussian classifier in high-dimension small sample set case based on MDL principle with application to spectrum recognition. Pattern Recognition. 2008;41:2842-2854. DOI: 10.1016/j.patcog.2008.02.004

[21] Halbe Z, Aladjem M. Regularized mixture discriminant analysis. Pattern Recognition Letters. 2007;28:2104-2115. DOI: 10.1016/j.patrec.2007.06.009

[22] Ji S, Ye J. Kernel uncorrelated and regularized discriminant analysis: A theoretical and computational study. IEEE Transactions on Knowledge and Data Engineering. 2008;20(10):1311-1321. DOI: 10.1109/TKDE.2008.57

[23] Licheng J et al. An organizational coevolutionary algorithm for classification. IEEE Transactions on Evolutionary Computation. 2006;10(1):67-80

[24] Liu J, Chen S, Tan X. A study on three linear discriminant analysis based methods in small sample size problem. Pattern Recognition. 2008;41:102-116. DOI: 10.1016/j.patcog.2007.06.001

[25] Lu H, Plataniotis KN, Venetsanopoulos AN. Uncorrelated multilinear discriminant analysis with regularization and aggregation for tensor object recognition. IEEE Transactions on Neural Networks. 2009;20(1):103-123. DOI: 10.1109/TNN.2008.2004625

[26] Lu H, Plataniotis KN, Venetsanopoulos AN. Regularized discriminant analysis for the small sample size problem in face recognition. Pattern Recognition Letters. 2003;24:3079-3087. DOI: 10.1016/j.patcog.2007.06.001

[27] Peng J, Zhang P, Riedel N. Discriminant learning analysis. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics. 2008;38(6):1614-1625. DOI: 10.1109/TSMCB.2008.2002852

[28] Xu P, Brock G, Parrish R. Modified linear discriminant analysis approaches for classification of high-dimensional microarray data. Computational Statistics & Data Analysis. 2009;53:1674-1687. DOI: 10.1016/j.csda.2008.02.005

[29] R Development Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN: 3-900051-07-0. URL: http://www.R-project.org/. 2011

[30] Jiao L, Liu J, Zhong W. An organizational coevolutionary algorithm for classification. IEEE Transactions on Evolutionary Computation. 2006;10(1):67-80. DOI: 10.1109/TEVC.2005.856068


**Chapter 4**


#### **Moving towards Sustainable Electronic Health Applications**

DOI: 10.5772/intechopen.75040

Sahr Wali, Karim Keshavjee and Catherine Demers

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/intechopen.75040

**Abstract**

Electronic healthcare applications, both web-based and mobile health (mHealth), provide new modalities for managing chronic disease. These tools allow patients to track their symptoms and help them manage their condition. The sustainability of these tools is often not considered during their development. To ensure these applications can be adopted and sustained where policy differs among states and provinces, we must present the benefits of our findings to justify their development. For technology to be sustainable, it has to use infrastructure that is secure and stable, and it has to be agile enough to be deployed quickly with minimal interruption to patients, family members and healthcare professionals.

**Keywords:** sustainability, self-care, eHealth, mHealth, technology, co-design

#### **1. Introduction**

Within the healthcare industry, innovation remains the leading force in the quest to balance health care quality and cost containment [1]. Mobile health (mHealth) applications are one of the fastest growing segments in the drive for innovation in the health care sector. With the rising use of mobile phones, mHealth applications (apps) give individuals a simple and accessible way to manage their health at their fingertips [1].

Unfortunately, many mHealth interventions continue to be developed without considering long-term sustainability, which has left many apps with vast potential but no way to move forward. This is one of the growing problems with health app development: in spite of advances in technology, apps fail to be used because of the methodological challenges associated with designing for sustainability [2]. In this chapter, we focus on the main issues app publishers face during the design process. We then outline the key components that should be included to ensure the sustainability of an electronic health app.

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We define the issues with innovation in terms of three main components: (1) end-user usability, (2) clinician and informal caregiver (spouse, children, friend) input and (3) impact of agencies outside development. Many electronic health apps fail to consider these major factors in their design, which in turn is often what limits the sustainability of their use. We believe that these three factors are essential because they evaluate the health app's design according to the user, the main members of their care team and, finally, the environment in which it is used. If the health app being designed does not simplify or improve the current model of care, there is no incentive for its use. Instead, the benefits associated with the app will be overshadowed by its complications or drawbacks.

This leads to our section on designing for sustainability. We start by emphasizing the importance of putting the end-user first and then introduce the information system research (ISR) framework, which helps identify user needs, design preferences and potential barriers in order to increase health app adoption. This is followed by the next stage of designing for sustainability, where we outline the steps to get all the potential players on board in support of the new innovation. We highlight the inevitable resistance to change that will occur, and explain the concept of 'behavioral intention' to use a technology and how this will help improve health app sustainability.

Finally, for the purpose of long-term sustainability, we expand on preparing for the expected and unexpected by evaluating the change management plans and regulations in place during health app design. Towards the end of the chapter, we develop a market and feasibility analysis framework for the adoption and scalability of the health app on a national scale. This allows us to ensure all key factors have been addressed, leaving the app's design and efficacy beyond question.

#### **2. Issues with innovation**

Over the past decade, there have been a number of advancements within the healthcare industry, yet there is still strong resistance to the implementation of health innovation [3]. The lack of certainty about an intervention's independent sustainability is one of the leading factors responsible for this resistance [4]. Electronic health apps may hold great promise for better health tracking [5], providing education [6], changing and enforcing health behaviors [7], and monitoring treatment adherence [8]; however, despite these benefits, they are still not being used [9]. This can be attributed to nine key design barriers outlined in the literature review by Chindalo et al. [9] (**Table 1**).

These barriers have created a stigma around stakeholders investing in health app development. Currently, the perceived return on investment (ROI) with health apps remains low, as the issues with innovation remain high. The beneficial impact of health apps may seem promising, but from a sustainability standpoint, they fail to address the underlying question: 'will these benefits outweigh the cost of its development?'

| Barrier | Explanation |
|---|---|
| 1. Apps provide information conflicting with what is received from clinicians | Patients/end-users are less likely to use an app when it conflicts with information from their clinician. They will not feel confident in the content provided or its functionality |
| 2. Language used too complex for end-user health literacy level | Patients/end-users often have lower levels of health literacy. They require technology to be adapted to their needs or the app will not be used |
| 3. Manual data input required | Treatment regimens are already perceived as complex by patients/end-users. Manual data input further complicates this process, as it is exhaustive and error-prone |
| 4. Information provided has no value/meaningless data | If the app information has no beneficial tie to the patient (e.g. cannot order diagnostic testing or prescribe medications) then the content becomes useless |
| 5. Daily app use not required | Health apps aiming to help patients/end-users with their treatment regimen should be used daily and in accordance with their prescribed treatment. If the health app does not require daily use, this can reduce treatment adherence as the patient will not get into the habit of using it sufficiently |
| 6. Lack of incentives to use | Any source of change is viewed as burdensome; thus, if no incentive (cost savings or social approval) for a patient/end-user to utilize the tool is present, they are less likely to use it |
| 7. Data collected not valued by clinician | If the data collected brings forward information important for the patient/end-user's care, the clinician would be more likely to promote its use. However, if there is no functional value for the data, both clinicians and end-users/patients will not use the tool |
| 8. No way for physicians to use data collected | Health apps may collect large amounts of data, but if they cannot visualize or analyze the meaning behind the data, it becomes useless |
| 9. No way to integrate app data into electronic medical record (EMR) for analysis or follow-up | If the data collected cannot be combined with previous medical information the context required for analysis will be lost, leaving the data to be meaningless |

**Table 1.** Design barriers associated with decreased health app usage (adapted from [9]).

To ensure that the resources spent on an application are being used adequately and the above barriers are addressed, the needs, wants and expectations of the health app's primary stakeholders should be evaluated [10]. However, it is this lack of stakeholder consideration within app design that builds the three prime issues with innovation, which we describe below [1, 9].

#### **1. Poor end-user usability: who are you designing for?**


The overall hype of innovation and mHealth solutions has led developers into a cycle where app ideas centered on addressing patient challenges seem to forget about the patient once in development [11]. Consequently, this lack of end-user engagement has led health app usage to fall to 2% amongst patients at hospitals in the US [11]. The low percentage for health app usage may seem surprising, but when a tool does not suit the needs or capabilities of the end-user, the percentage becomes less surprising and more understandable.

Findings reveal that patients with chronic disease, such as heart failure and diabetes, have a positive attitude towards using mobile technology if it is simple and effective [12]. However, the key issue here is that app developers seem to be motivated more by the cleverness of the technology than by improvements in health outcomes, which often results in complicating the app's functionality [13]. Thus, in the eyes of the developer the app may seem effective, but they do not consider that the individual they are designing it for will not have the same understanding. As a result, apps will not meet user needs or capabilities, which in turn leads to the development of the first six barriers highlighted in **Table 1**. This poor product usability can be attributed to the lack of end-user involvement or input during the development process [11]. Some would argue the most successful health apps are those that address real-life challenges in the context in which the patient lives. Therefore, to ensure the sustainability of a health application, the question the developer must ask is not 'does it solve a problem', but rather 'does it help the patient directly'. If an application is in any way a burden, or adds more effort to their treatment, it will not be used.


#### **2. Lack of clinician and informal caregiver input: What are you designing it for and how will it improve clinical outcomes?**

Because the primary objective of a health application is to improve clinical outcomes and reduce the level of work required, clinicians and informal caregivers (spouse, children, friend) play a pivotal role in establishing the criteria for these improvements [14]. Clinicians provide the complete medical background surrounding patient care as well as clinical workflow operations, while informal caregivers give developers a magnified look into the day-to-day challenges that prevent adherence and worsen symptoms [3, 14]. Both key members of care contribute substantively to increasing adherence and improving self-care, quality of life and outcomes for patients [14]. However, the reality is that numerous apps have been, and still are, developed with no or only some clinician and/or caregiver feedback, even though the inclusion of both is pivotal to ensuring sustainability.

As every tool must have an objective for its development, clinician and caregiver input provides developers with the necessary content to build their objective around. The lack of clinician and caregiver feedback in current health apps limits app efficacy and is responsible for design barriers seven and eight in **Table 1**. Without consideration of both the *physiological* and *social* factors provided by clinicians and caregivers, health apps will continue to be designed around the question 'does it solve a problem', and developers will inevitably fall short in the effectiveness of their app design [10].

#### **3. Failure to consider the impact of agencies outside development: does it affect current operations?**

Aside from the lack of usability considerations, other factors, including government regulations and organizational operations, are also commonly neglected. The development and implementation of healthcare innovations are bounded by a set of regulations that must be followed [15]. These regulations are set in place as a standard to prevent public health risk and improve patient safety. In the United States, the Food and Drug Administration (FDA) issued a draft for the regulation of mobile medical applications [16]. However, as many of the standards currently in place are set to regulate medical devices, a large group of apps still do not fall within the categories for regulation. This leaves them to be generated without regulatory precaution or guidance, which in turn promotes the development of less effective and less integrative health apps [17]. For example, one key element that would increase app use and sustainability is data integration between EMRs and the respective health app. This would allow the data collected to be combined with current medical information, which would potentially improve clinical outcomes and future diagnosis [9]. However, as this is not a required component within health app development, it has become a barrier, instead of a benefit, for health app usage (**Table 1**).

Nevertheless, one area of regulation all health apps must adhere to involves the privacy and security of personal health information. Ensuring that the introduction of a health app does not threaten the privacy of the data obtained is one of the pillars of a sustainable health intervention. However, as many health apps are very development-focused, these pivotal components are not acknowledged early on. This complicates the design process and restrains development in the later stages [10]. Similarly, as many workflow operations may be changed with the introduction of an electronic health app, failing to consider the necessary requirements early in the design process makes implementation more disruptive than beneficial [18]. An effective health app design would identify the key organizational barriers and resistance points that may occur prior to actual implementation. Thus, the problems associated with security and workflow operations stem from the underlying issue that they are not considered until the start of implementation.

Furthermore, electronic health apps may have undeniable potential to improve health outcomes in a cost-effective manner, but the underlying issues with innovation are preventing their potential from being fully utilized [15]. Stakeholders may have different objectives for the outcome of a health app, but regardless, the app designer must still address their needs regarding usability, content, safety, and clinical and cost-effectiveness accordingly. There is already resistance to innovation in healthcare; therefore, to build a model for health app sustainability, we outline a series of frameworks to minimize the occurrence of these issues.
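The link between the nine barriers in **Table 1** and the three issues described above can be illustrated with a small audit helper. The following is only a sketch under stated assumptions: the barrier labels are shortened from Table 1, the grouping (barriers 1 to 6 under end-user usability, 7 and 8 under clinician and caregiver input, 9 under agencies outside development) is one reading of the text above, and the names `BARRIERS`, `ISSUE_OF_BARRIER` and `audit` are hypothetical rather than part of any published tool.

```python
# Shortened labels for the nine design barriers in Table 1 (wording is ours).
BARRIERS = {
    1: "Information conflicts with clinician advice",
    2: "Language too complex for end-user health literacy",
    3: "Manual data input required",
    4: "Information has no value to the patient",
    5: "Daily app use not required",
    6: "Lack of incentives to use",
    7: "Data collected not valued by clinician",
    8: "No way for physicians to use data collected",
    9: "No way to integrate app data into the EMR",
}

# Assumed grouping of barriers under the three issues with innovation.
ISSUE_OF_BARRIER = {
    **{n: "end-user usability" for n in range(1, 7)},
    7: "clinician and caregiver input",
    8: "clinician and caregiver input",
    9: "agencies outside development",
}


def audit(present_barriers):
    """Group the barriers observed in an app design by the issue they fall under."""
    report = {}
    for n in sorted(set(present_barriers)):
        if n not in BARRIERS:
            raise ValueError(f"unknown barrier number: {n}")
        report.setdefault(ISSUE_OF_BARRIER[n], []).append(BARRIERS[n])
    return report


if __name__ == "__main__":
    # Hypothetical app that needs manual input and cannot export to an EMR.
    for issue, barriers in audit([3, 9]).items():
        print(f"{issue}: {barriers}")
```

A design review could call `audit` with the barrier numbers observed in a prototype to see which stakeholder group needs to be consulted before further development.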

#### **3. Designing for sustainability**


Prior to designing an electronic health application or any innovative health tool, we must consider the who, what, where, when and how of the intervention. Who will be using it? What will it do? Where and when will it be used? Lastly, how will it work? These questions allow the designer of the tool to recognize its primary stakeholders, the risks and costs of innovation and whether it will work with the current operations in place [10]. When designing for sustainability, the goal should be to provide a practical solution to a prominent issue. Therefore, addressing these common questions has framed our guideline for moving towards sustainable electronic health applications. In this section, we start by identifying the end-user and their needs, follow with an outline for creating a user-centered health app, and end with the steps required to gain support for app implementation.

#### **3.1. Putting the end-user first**

Whether older adults with heart failure or adolescents with diabetes will use a health application, identifying the end-user and their needs at the start of the design process is pivotal for the next steps towards development [19, 20]. Nevertheless, although the importance of end-user evaluation has been emphasized, various studies confirm the lack of available health apps suiting their needs and capabilities [1, 21–23]. In the Delphi study, a literature review was conducted on the determinants of innovation in health care organizations [24]. The results indicated that many innovation studies failed to adjust their strategies according to the feedback obtained, or that the data on the determinants was insignificant as it came from non-users [24]. In this case, the study highlights that it is not enough to simply obtain random feedback; we must obtain useful input and apply it to the design [24].

Often, the benefits of the end product overshadow the content required for adequate usability, leaving both the app developer and the end-user at a disadvantage. For example, simplifying intervention processes and health education is expected to improve clinical outcomes. However, what is not considered is that these benefits will only be sustained when the app is user-friendly and end-users can use it independently and with confidence [25, 26]. To reach this stage, developers must recognize which components intended users need, so that the app becomes both easy to use and useful. To accomplish this, a user-centered framework has been developed, which we summarize below.


Moving towards Sustainable Electronic Health Applications

http://dx.doi.org/10.5772/intechopen.75040


#### *3.1.1. Information system research (ISR) framework*

Because many electronic health interventions are designed around current healthcare system processes, their potential impact is limited compared to interventions that involve end-user input [27]. The ISR framework uses three research cycles, (1) the relevance cycle, (2) the design cycle and (3) the rigor cycle, to identify user needs, design preferences and any barriers that would prevent the uptake or sustained use of the app [27, 28].

In the *Relevance Cycle*, developers or researchers seek to understand the end-user in the context of their environment [19]. Because the environment shapes the specifics of the problems that arise, the purpose of this cycle is to provide the requirements for the health app, as well as a set of criteria for users to evaluate its functionality [28, 29]. To meet these goals, focus-group-style sessions with intended stakeholders and end-users are commonly used [27, 29]. By the end of this cycle, we should be able to answer the question, 'Does this app improve the user's environment, and how?' [28].

The heart of development occurs during the *Design Cycle*, as the content from the relevance cycle is used to build the health app and evaluate it accordingly [28]. This cycle continues in an iterative manner, where a series of designs is generated and evaluated against the respective user requirements until all key components are addressed. The design of the app can move relatively quickly; however, it is the continual evaluation and feedback for refinement that challenges developers [29]. Nevertheless, end-users often stop using apps that do not immediately engage them, so repeatedly conducting prototype testing with key stakeholders increases the expected usability and sustainability of the end product.

Finally, the *Rigor Cycle* is the background check of the ISR framework [29]. It reviews and evaluates the current knowledge base within the desired application's domain [27, 28]. This enhances the degree of innovation of the health app's design. In many cases, this cycle is conducted after the relevance cycle to increase the overall effectiveness of the app's design [27] (**Figure 1**).

**Figure 1.** The ISR framework divided into three design science research cycles, (1) relevance cycle, (2) design cycle and (3) rigor cycle [28].
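The iterative evaluation described for the design cycle can be illustrated with a short sketch. The Python below is only a hypothetical model and not part of the ISR framework itself: the `Prototype` class, the `design_cycle` function, the `fix_one` revision step and the example requirement names are all assumptions introduced for illustration. It shows one plausible way to keep refining a prototype until every requirement gathered in the relevance cycle has been addressed, or until an iteration budget runs out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Prototype:
    """A hypothetical app prototype and the user requirements it currently meets."""
    version: int = 1
    met: Set[str] = field(default_factory=set)


def design_cycle(
    requirements: Set[str],
    prototype: Prototype,
    revise: Callable[[Prototype, Set[str]], Prototype],
    max_iterations: int = 10,
) -> List[Set[str]]:
    """Evaluate and revise a prototype until all requirements are met.

    Returns the set of unmet requirements observed at each evaluation,
    so the history of the iterations can be inspected afterwards.
    """
    history: List[Set[str]] = []
    for _ in range(max_iterations):
        unmet = requirements - prototype.met  # evaluate against user requirements
        history.append(unmet)
        if not unmet:  # every key component is addressed
            break
        prototype = revise(prototype, unmet)  # refine using the feedback
    return history


def fix_one(prototype: Prototype, unmet: Set[str]) -> Prototype:
    """Toy revision step (assumed): address one unmet requirement per iteration."""
    addressed = sorted(unmet)[0]
    return Prototype(prototype.version + 1, prototype.met | {addressed})


# Example requirements, as might come out of relevance-cycle focus groups (assumed).
requirements = {"large text", "medication reminders", "offline access"}
history = design_cycle(requirements, Prototype(), fix_one)
print([len(unmet) for unmet in history])
```

In this toy run the number of unmet requirements shrinks by one per evaluation. In practice the `revise` step would stand for a round of prototype testing with end-users and stakeholders, and the `max_iterations` budget reflects that continual evaluation, rather than the initial build, is usually the limiting cost.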

#### *3.1.2. Creating a user-centered design: ISR and end-user co-design*


56 eHealth - Making Health Care Smarter


The foundation of a user-centered design rests on three major components: (1) understanding how the device will be used, (2) curating information relevant to the end-user and (3) framing the tool in the user's environmental context and lifestyle [4]. The ISR framework allows developers to assess the needs of the end-user while evaluating the interventions currently in place [28]. The co-design method moves one step further by using a participatory approach in which end-users and primary stakeholders work together on all aspects of the health app's development [30, 31]. By using the ISR framework in parallel with the co-design method, we believe this iterative process will lead to a more effective user-centered end product over the long term [29, 31].

Many electronic health app interventions fail to engage users in the design and usability stages [31]. In a systematic review of co-designed mHealth interventions, studies included patients in the development stages, but none assessed the intervention's effectiveness afterwards [31]. Conversely, in another study, users evaluated the intervention's usability but were not involved in its design [32]. The lack of fluidity between mHealth development and user input reduces end-user empowerment and overall app usability. The healthcare system is already burdened with various premature innovation investments that have fallen short of their expected return. Therefore, from a sustainability standpoint, using the ISR framework allows all factors surrounding the end-user and the current knowledge base to be covered, whereas the co-design aspect is pivotal to assuring usability.

#### **3.2. How to get everyone on board**

One of the greatest obstacles to developing sustainable electronic health interventions is gaining the support of primary stakeholders for their development and implementation [3]. This challenge has been shaped by the three paradoxes of innovation [33]. First, the uptake of the dubious and rejection of the good: the explosion of electronic health apps created a consumer fad in which a number of 'breakthrough' apps left individuals in regret and stakeholders reluctant to invest again. Second, the wisdom and failings of democracy: working with professional groups can be effective in ensuring implementation of a new technology; however, relying solely on their cooperation can kill the product before it is even complete. Third, health systems are not able to keep up: innovation causes change in an organization, but this creates challenges that innovators are often not prepared for and can cause more disruption than improvement [33].

In order to move past these challenges, we must address the following questions:

**1.** What evidence is there that it will improve outcomes and how will it affect current operations?

**2.** Will any additional support be needed before it can be introduced?

**3.** How should it be monitored during introduction?

The first question allows us to determine whether the electronic health app will be worth the investment. The second and third questions are key for its sustainability, as they identify components pivotal for a smooth implementation procedure [33]. Breaking down the barriers built by failed innovative interventions may be difficult, but it is well worthwhile in order to develop an effective health app. Answering these questions will be essential when developing a plan to obtain stakeholder support; we therefore discuss below the specific steps to break down the resistance and prepare for the change.

#### *3.2.1. Battling the resistance to change*

With any type of change there is an inevitable build-up of resistance. This resistance derives from the fear of failure, similar to the first paradox of innovation; executives and end-users do not want to waste their time on another unbeneficial intervention [3, 33]. With this in mind, putting a new intervention into practice will be an uphill road to climb. Nevertheless, the two models described below help shape the key factors and steps involved in achieving this goal.

#### *3.2.1.1. Technology acceptance model (TAM)*

The TAM was developed to drive the use of new technology and increase its acceptance by assessing the end-user's *perceived ease of use* (*PEOU*) and *perceived usefulness* (*PU*) [33, 34]. This model suggests that by clarifying that a new source of innovation will reduce the amount of effort required (PEOU) and enhance performance (PU), it will be more likely to be accepted amongst end-users and other key stakeholders [35]. Thus, in the context of health care, executives, clinicians and patients will find an electronic health app more useful and user-friendly if they are familiar with the technology [3, 34].

With this in mind, when designing an electronic health app, developers must understand what the stakeholders' needs, wants and expectations are. Once these are understood, we can adequately highlight how the health app will benefit each of them specifically. Finally, when a foundation of acceptance for the app has been established, appropriate training protocols should be put in place to prevent any former resistance from re-establishing (**Figure 2**).

**Figure 2.** The technology acceptance model assessing the end-user's perceived ease of use and perceived usefulness of technology to determine their behavioral intention and actual usage potential [34].

#### *3.2.1.2. Diffusion of innovation (DOI) theory*

The DOI theory is used to increase the adoption of technology [3]. It is one of the oldest theories, yet it continues to be used in innovative design. The DOI theory states that an organization will consider a technology to be innovative if it is perceived as new and relevant. It proposes that four main elements contribute to the diffusion of an idea: (1) the idea itself, (2) communication channels, (3) time and (4) a social system [36]. Similar to the TAM, the DOI theory suggests highlighting the perceived advantage and relevance of the innovation's development. Therefore, in the context of electronic health apps, the benefit of the app should be communicated amongst various influencers, and stakeholder support must then be established before the app can be readily adopted. As a health app must be widely adopted amongst all stakeholders and end-users before it can become self-sustaining, tackling this challenge will be key for the app's long-term success.

#### *3.2.1.3. Presenting the benefits: results to support longevity*

In both the TAM and the DOI theory, the key message for obtaining stakeholder support is simply to present why the app is beneficial for them. Why should they care about what we are developing? The real question developers should ask is, 'How will it help them?' This leads into one of the key components of breaking down the barrier of stakeholder resistance and moving towards a sustainable health app. To adequately present the benefits of an application, we must have appropriate evidence to support our claim.

In many cases, a health app may be the first of its kind, or an idea may be an advancement of a previous intervention. Regardless of whether a pilot study has been conducted to support the benefits of its use or its benefits have yet to be evaluated, the success of the product can be supported by answering the same question mentioned above, 'How will it help them?' We must outline what the problem currently is and why the development of this health app will help address it. It is not enough to state that a problem exists; it is the reasoning behind the solution that justifies its development.

Moreover, it is important to present the evidence supporting the benefits of the health app, but we must also present this support in the context of each stakeholder. The stakeholders will differ depending on the type of app being designed, but to increase each claim's value, we must understand the factors that will influence health app acceptance and evaluate them accordingly.

relationships. Thus, by incorporating the UTAUT within the health app design process, this will allow us to predict users' intention to adopt the app in an organizational context [38]. Nevertheless, as majority of health apps are focused on the users setting and the challenges

Moving towards Sustainable Electronic Health Applications

http://dx.doi.org/10.5772/intechopen.75040

61

Compared to the original UTAUT, the extended UTAUT2 has shown to improve the variance in behavioral intention and technology use [38]. The UTAUT2 builds on the core UTAUT principles around extrinsic motivation, and adds three other components to improve the pre-

**1. Hedonic motivation:** hedonic motivation is defined as how enjoyable the technology is to use. It is used to analyze the emotional and psychological aspect of the technology, which is often overlooked by most evaluative models. The functionality of a health app will only go so far in influencing technology acceptance; it is the user experience that ultimately determines its long-term use and sustainability. Therefore, by evaluating the users internal

**2. Price value:** the price value determines whether the benefit of the technology is greater than its monetary cost. If the price value is high then individuals feel the benefit of use is greater than the cost of investment. However, this is not always the case and it is pivotal to

**3. Habit:** habit is the degree individuals automatically perform tasks due to learnt behaviors [38]. This construct was added to the UTAUT2 as it helps assess whether user-activity will be sustained. Often times, there is a fall out in health app usage due to the tasks becoming burdensome. Thus, if we were able to make these tasks more like a reflex than an extra step, this would reduce the effort required and improve technology acceptance (**Figure 3**).

With the addition of these factors to the UTAUT2, this helps tailor the health app evaluation to users in their context, which in turn helps improve its overall acceptance (**Table 2**). In many cases, it is these factors that prevent the sustainability of a health app. App publishers are focused on eliciting a behavior change or improving clinical outcomes, so they tend forget about the individual in their context. To reduce the resistance to change and move towards acceptance, it is the responsibility of the app publisher to ensure that the tool they are introducing is not only effective, but also it is 'fun', affordable, easy and relatable, or else it will simply not be used. Therefore, we believe by using the UTAUT2 principles during health app design, this will allow for the development of a more effective product, and will lead into a

The implementation of any new source of innovation will come with many challenges that are both expected and unexpected, however, being prepared for both is what ensures optimal sustainability. We highlight below two of the major areas that result in impeding health app implementation, which is (1) regulations and (2) change management. Both are essential to incorporate during the design plan of a health app, we describe the detailed steps we recommend to tackle

they face, we recommend the usage of the more updated UTAUT2.

satisfaction, this will result in improving technology usage [38, 39].

assess this aspect of health app design to ensure its sustainability [38].

diction of behavioral intention, which we describe below.

smoother transition phase during implementation.

*3.2.2. Preparing for the expected and the unexpected*

both areas effectively.

#### *3.2.1.4. Testing health app acceptance: unified theory of acceptance and use of technology (UTAUT)*

After assessing stakeholder needs and presenting the advantages associated with the health app, one of the next steps towards sustainability is to evaluate whether the proposed solution will be accepted amongst various users [37]. The UTAUT is a technology acceptance model commonly used to predict a user's behavioral intention to use a technology [37]. This model is based on four key components, [38].


These four concepts have a direct influence on the behavioral intention to use a health app. Age, gender, experience and voluntariness are also associated with indirectly influencing behavioral intention and technology usage, as they moderate the four UTAUT component relationships. Thus, by incorporating the UTAUT within the health app design process, this will allow us to predict users' intention to adopt the app in an organizational context [38]. Nevertheless, as majority of health apps are focused on the users setting and the challenges they face, we recommend the usage of the more updated UTAUT2.

Compared to the original UTAUT, the extended UTAUT2 has shown to improve the variance in behavioral intention and technology use [38]. The UTAUT2 builds on the core UTAUT principles around extrinsic motivation, and adds three other components to improve the prediction of behavioral intention, which we describe below.


With the addition of these factors to the UTAUT2, this helps tailor the health app evaluation to users in their context, which in turn helps improve its overall acceptance (**Table 2**). In many cases, it is these factors that prevent the sustainability of a health app. App publishers are focused on eliciting a behavior change or improving clinical outcomes, so they tend forget about the individual in their context. To reduce the resistance to change and move towards acceptance, it is the responsibility of the app publisher to ensure that the tool they are introducing is not only effective, but also it is 'fun', affordable, easy and relatable, or else it will simply not be used. Therefore, we believe by using the UTAUT2 principles during health app design, this will allow for the development of a more effective product, and will lead into a smoother transition phase during implementation.

#### *3.2.2. Preparing for the expected and the unexpected*

Moreover, it is important to present the evidence supporting the benefits of the health app, but we must also present this support in the context of each stakeholder. Depending on the type of app being designed the stakeholders will differ, but to increase each claims value, we must understand the factors that will influence health app acceptance and evaluate them accordingly.

After assessing stakeholder needs and presenting the advantages associated with the health app, one of the next steps towards sustainability is to evaluate whether the proposed solution will be accepted amongst various users [37]. The UTAUT is a technology acceptance model commonly used to predict a user's behavioral intention to use a technology [37]. This model

**1. Performance expectancy:** providing an incentive to use a technology is key to ensuring its acceptance. Performance expectancy is defined as the extent to which the technology will benefit the end-user in completing a certain task. Increasing the health app's beneficial value is expected to increase users' behavioral intention. Performance expectancy is constructed from four main evaluative criteria: (1) perceived usefulness: how much users believe the technology will improve their performance, (2) extrinsic motivation: what other valued outcomes (money, fame) they will receive from its use, (3) job fit: how suitable the technology is for increasing performance and (4) relative advantage: the benefit of the new technology compared to what it will cost [39].

**2. Effort expectancy:** ease of use is a critical component of technology acceptance. Effort expectancy refers to how easy or difficult the technology is perceived to be to use. Past technology acceptance models, such as TAM and DOI, have shown that applications that are simpler to use are more often accepted [40]. Thus, to reduce the expected effort and increase acceptance rates, health apps should be less complex and easier to use [39].

**3. Social influence:** in many cases, the decision-making process is influenced by specific individuals or the social norm. Social influence is the degree to which a user perceives that key individuals (family, friends) believe the use of the technology is important. This can arise through informational influence, where information from other people affects a decision, or through normative influence, where a user conforms their decision to what is defined as 'acceptable' by a certain group or situation. Regardless, social influence can be an ultimate determinant of the overall acceptance and usage of a health app [39].

**4. Facilitating conditions:** to ensure technology acceptance, users must feel its implementation is both feasible and realistic. Facilitating conditions are the degree to which individuals believe that the existing organizational and technical infrastructure is present to support its usage. These conditions can vary depending on a health app's objective, but regardless, they have a significant impact on its adoption and usage [40].

These four concepts directly influence the behavioral intention to use a health app. Age, gender, experience and voluntariness indirectly influence behavioral intention and technology usage, as they moderate the four UTAUT components.

#### *3.2.1.4. Testing health app acceptance: unified theory of acceptance and use of technology (UTAUT)*

is based on four key components, [38].

The implementation of any innovation comes with many challenges, both expected and unexpected; being prepared for both is what ensures optimal sustainability. Below we highlight two of the major areas that impede health app implementation: (1) regulations and (2) change management. Both are essential to address in the design plan of a health app, and we describe the steps we recommend to tackle each effectively.


| Construct | Item | Statement | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Performance expectancy | PE1 | I find the health app useful in my daily life | | | | | |
| | PE2 | Using the health app increases my chances of achieving things that are important to me | | | | | |
| | PE3 | Using the health app helps me accomplish things more quickly | | | | | |
| | PE4 | Using the health app increases my productivity | | | | | |
| Effort expectancy | EE1 | Learning how to use the health app is easy for me | | | | | |
| | EE2 | My interaction with the health app is clear and understandable | | | | | |
| | EE3 | I find the health app easy to use | | | | | |
| | EE4 | It is easy for me to become skillful at using the health app | | | | | |
| Social influence | SI1 | People who are important to me think that I should use the health app | | | | | |
| | SI2 | People who influence my behavior think that I should use the health app | | | | | |
| | SI3 | People whose opinions I value prefer that I use the health app | | | | | |
| Facilitating conditions | FC1 | I have the resources necessary to use the health app | | | | | |
| | FC2 | I have the knowledge necessary to use the health app | | | | | |
| | FC3 | The health app is compatible with other technologies I use | | | | | |
| | FC4 | I can get help from others when I have difficulties using the health app | | | | | |

Moving towards Sustainable Electronic Health Applications

http://dx.doi.org/10.5772/intechopen.75040

**Figure 3.** The UTAUT2 model, with the addition of hedonic motivation, price value and habit to determine behavioral intention and technology use. All factors contribute to influencing behavioral intention to use except for 'facilitating conditions' and 'habit', which directly affect use behavior [38].
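As an illustration of how responses to the questionnaire in Table 2 might be summarized, the sketch below averages one respondent's 1 to 5 ratings per construct. Grouping items by their code prefix (PE, EE, SI, FC, HM, PV, HT, BI) and using a simple mean are assumptions made for this example; the chapter does not prescribe a scoring rule, and the function name `construct_scores` is hypothetical.

```python
from statistics import mean

# Item-code prefixes follow Table 2. Grouping by prefix and averaging is an
# assumption of this sketch, not a scoring rule stated in the chapter.
CONSTRUCTS = {
    "PE": "Performance expectancy",
    "EE": "Effort expectancy",
    "SI": "Social influence",
    "FC": "Facilitating conditions",
    "HM": "Hedonic motivation",
    "PV": "Price value",
    "HT": "Habit",
    "BI": "Behavioral intention",
}


def construct_scores(responses):
    """Average one respondent's 1-5 ratings for each UTAUT2 construct."""
    grouped = {}
    for item, rating in responses.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"{item}: rating {rating} is outside the 1-5 scale")
        prefix = item[:2]
        if prefix not in CONSTRUCTS:
            raise KeyError(f"unknown item code: {item}")
        grouped.setdefault(CONSTRUCTS[prefix], []).append(rating)
    return {name: mean(ratings) for name, ratings in grouped.items()}


# Hypothetical answers from a single respondent (illustrative values only).
example = {"PE1": 4, "PE2": 5, "PE3": 3, "PE4": 4, "EE1": 2, "EE2": 3}
scores = construct_scores(example)
```

Comparing construct averages across respondents is one possible way to see which of the UTAUT2 factors is holding back acceptance of a given app; any threshold for "low" acceptance would have to be chosen by the evaluators.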

#### *3.2.2.1. What regulations?*

When implementing any new intervention in the health care industry, there is a series of regulations that must be reviewed [15]. The FDA and Health Canada have issued a set of restrictions for developing an electronic application used as a medical device; however, most health apps do not fall under this category [17]. Nevertheless, one aspect of regulation that all health apps must comply with involves privacy and security. As many health apps are mobile phone based, this creates a challenging situation in which more data can be obtained but data privacy is not assured. Policymakers are still in the process of establishing the specific criteria required for patient safety; in the meantime, we have listed a series of components that should be included in a health app's design to protect data integrity and prevent unexpected threats.

(1) Data Sharing and Consent Management—Who can share my data

All data shared must have consent, as well as meet the Health Insurance Portability and Accountability Act (HIPAA) standards for data sharing [17].

(2) Access Control and Authentication—Who can access data

To ensure that only approved individuals can access the data, an authentication procedure should be implemented. For most health data, data encryption and a corresponding login passcode are usually required.

(3) Confidentiality and Anonymity—Who knows it's my data


Depending on the level of consent and the data obtained, most personal data should remain confidential and possibly anonymous if used for public-health purposes [17].

Higher degrees of data security can be implemented in health app designs; however, we believe that incorporating these three aspects of privacy and security will make the app more desirable for both its end-users and its stakeholders during development. Ultimately, being prepared with the proper security measures gives stakeholders confidence in the product and will ease the process of change management.
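The three aspects above can be sketched in a few lines of code. The example below is a minimal illustration using only the Python standard library; the function names, the record layout and the iteration count are assumptions made for this sketch. It omits encryption of stored data (which would need an additional cryptography library), and keyed pseudonymization is only one possible step towards anonymity. Running code like this does not by itself make an app compliant with HIPAA or any other regulation.

```python
import hashlib
import hmac
import secrets


def hash_passcode(passcode, salt=None):
    """Derive a salted hash so the passcode itself is never stored."""
    salt = salt or secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", passcode.encode(), salt, 200_000)
    return salt, digest


def verify_passcode(passcode, salt, expected):
    """(2) Access control: compare derived hashes in constant time."""
    _, digest = hash_passcode(passcode, salt)
    return hmac.compare_digest(digest, expected)


def pseudonymize(patient_id, secret_key):
    """(3) Confidentiality: replace the identifier with a keyed hash token."""
    return hmac.new(secret_key, patient_id.encode(), hashlib.sha256).hexdigest()


def share_record(record, consented_purposes, purpose, secret_key):
    """(1) Consent: release data only for a purpose the patient agreed to."""
    if purpose not in consented_purposes:
        raise PermissionError(f"no consent recorded for purpose: {purpose}")
    shared = dict(record)  # copy, so the stored record is left unchanged
    shared["patient_id"] = pseudonymize(record["patient_id"], secret_key)
    return shared
```

In this sketch a record is shared only when the requested purpose appears in the patient's recorded consents, and the shared copy carries a pseudonym in place of the real identifier.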




| Construct | Item | Statement | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Hedonic motivation | HM1 | Using the health app is fun | | | | | |
| | HM2 | Using the health app is enjoyable | | | | | |
| | HM3 | Using the health app is very entertaining | | | | | |
| Price value | PV1 | The health app is reasonably priced | | | | | |
| | PV2 | The health app is a good value for the money | | | | | |
| | PV3 | At the current price, the health app provides a good value | | | | | |
| Habit | HT1 | The use of the health app has become a habit for me | | | | | |
| | HT2 | I am addicted to using the health app | | | | | |
| | HT3 | I must use the health app | | | | | |
| | HT4 | Using the health app has become natural to me | | | | | |
| Behavioral intention | BI1 | I intend to continue using the health app in the future | | | | | |
| | BI2 | I will always try to use the health app in my daily life | | | | | |
| | BI3 | I plan to continue to use the health app frequently | | | | | |

**Table 2.** UTAUT2 questionnaire used to evaluate health app acceptance amongst end-users.

#### *3.2.2.2. Change management is key for smooth sailing*

Introducing any new intervention changes the workflow, in some cases disruptively. These challenges are expected, but to ensure that implementation runs smoothly, a set of change management plans can be developed in advance [41]. To develop a proper strategy, we must consider three primary levels of change management.

(1) Individual change management: how people experience the change and what they need in order to make the transition successfully.

(2) Organizational/initiative change management: which primary groups will be directly impacted and what changes each will need to complete.

(3) Enterprise change management capability: the overall organizational approach to managing change. It usually involves executive discussion and reflects the organization's capability to allow and embrace change. This level is key, as top-down support directly affects how a change will be perceived at the lower levels.

All three levels of change management can be addressed through the three-phase change management process (**Figure 4**) [41]. In Phase 1, preparing for change, we must determine who will be impacted by the change and what level of support will be needed to move forward smoothly. During this phase, it is key to understand all the challenges in play, as we will need to instill the belief that the app's perceived usefulness outweighs them. Phase 2, managing change, focuses on supporting the individuals impacted by the change. With respect to the implementation of health apps, this phase is heavily focused on the end-users and on any additional training that may be required to increase the app's perceived ease of use and prevent resistance from re-establishing. Phase 3, reinforcing change, evaluates the current status of the intervention to identify any issues and address them accordingly. This phase is key for the long-term sustainability of the health app, as it ensures the change is maintained and provides evidence to support its benefits [41]. By following this three-phase change management plan, the health app design process will move more efficiently and will lead to a higher adoption rate.

**Figure 4.** The change management process, indicated by three phases: (1) preparing for change, (2) managing change and (3) reinforcing change [41].

## **4. Ensuring adoption and scalability**

Beyond end-user usability and primary stakeholder needs, app publishers must also consider the underlying factors for national adoption. State- or province-specific regulations, needs, and available resources are likely to differ across a country. Thus, for optimal scalability, the technology should be agile enough to use the infrastructure already in place without causing disruption.

For instance, in Canada, policies regarding home care differ between provinces, leaving some provinces with funding and others with none. When introducing a health app with objectives similar to those of home care, developers must consider how the app can be used both independently of and concurrently with existing services without losing its value. To ensure long-term adoption and scalability, the ideal health app should be designed to fit seamlessly with the practices already in place. Below we describe the evaluative steps we recommend to maximize a health app's national potential.

#### **4.1. Long-term sustainability: where will it work?**

We have described the components that should be included to satisfy various stakeholders and ensure a smooth transition process. However, true sustainability stems from an app's capability to be used seamlessly in multiple settings. To accomplish this, we must conduct a market analysis to understand what interventions and regulations are currently in place, followed by a feasibility assessment to determine whether those factors will jeopardize its implementation on a national scale [2, 42].

Depending on the health app being developed, different market factors will be evaluated. However, before evaluating any components, the objective of the health app must be defined. This will narrow the scope of market factors that need to be considered. We will then need to determine whether the health app will improve a current intervention or stand alone in its functionality. Once this has been decided, we can evaluate which market factors are at play by answering the set of questions outlined in **Figure 5**. From here, the steps may differ depending on whether the health app is used concurrently with an existing intervention. Nonetheless, both branches of the evaluation converge on assessing app feasibility by determining what regulations/policies are in place, how cost-effective the app will be and, most importantly, whether any funding is available to support its implementation (**Figure 5**). This evaluative framework of market forces and feasibility will determine whether the app will be sustainable across national regulations or whether it requires substantial changes to its design.
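The final feasibility step described above can be sketched as a simple checklist applied per state or province. The class and field names below are hypothetical, and reducing each question (regulations/policies, cost-effectiveness, funding) to a yes/no answer is a simplifying assumption of this example, not part of the framework itself.

```python
from dataclasses import dataclass


@dataclass
class RegionAssessment:
    """Yes/no answers to the three feasibility questions for one region."""
    region: str
    regulations_permit_use: bool
    cost_effective: bool
    funding_available: bool

    def blockers(self):
        """Return the feasibility questions this region fails."""
        checks = {
            "regulations/policies": self.regulations_permit_use,
            "cost-effectiveness": self.cost_effective,
            "funding": self.funding_available,
        }
        return [name for name, passed in checks.items() if not passed]


def national_feasibility(assessments):
    """Map each region to its blockers; an empty list flags no redesign need."""
    return {a.region: a.blockers() for a in assessments}


# Hypothetical regions and answers, for illustration only.
result = national_feasibility([
    RegionAssessment("Province A", True, True, True),
    RegionAssessment("Province B", True, True, False),
])
```

A region with a non-empty list of blockers is one where the app may need re-design or modification before it can be sustained there.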

important from a usability standpoint, but it shapes whether it will be sustainable across the country. Designing for sustainability may be a tiresome process, but if executed properly, the

[1] Research2guidance. App Developer Economics 2015. 2015. Available from: http://research-2guidance.com/r2g/r2g--App-Developer-Economics-2015.pdf [Accessed: November 27,

[2] Proctor E, Luke D, Calhoun A, McMillen C, Brownson R, McCrary S, Padek M.Sustainability of evidence-based healthcare: Research agenda, methodological advances, and infrastructure support. Implementation Science. 2015;**10**(88):1-13. DOI: 10.1186/s13012-015-0274-5

[3] Thakur R, Hsu SHY, Fontenot G. Innovation in healthcare: Issues and future trends. Journal

[4] Marshall D, Demers C, O'Brien B, Guyatt G. Economic evaluation. In: Dicenso A, Guyatt G, Ciliska D, editors. Evidence-Based Nursing: A Guide to Clinical Practice. St Louis: Elsevier

[5] Abril EP. Tracking myself: Assessing the contribution of mobile technologies for selftrackers of weight, diet, or exercise. Journal of Health Communication. 2016;**21**(6):638-646

[6] Hartin PJ, Nugent CD, McClean SI, Cleland I, Tschanz JT, Clark CJ, Norton MC. The empowering role of mobile apps in behavior change interventions: The gray matters

[7] Carter MC, Burley VJ, Nykjaer C, Cade JE. Adherence to a smartphone application for weight loss compared to website and paper diary: Pilot randomized controlled trial.

[8] Cook KA, Modena BD, Simon RA. Improvement in asthma control using a minimally burdensome and proactive smartphone application. The Journal of Allergy and Clinical

randomized controlled trial. JMIR mHealth uHealth. 2016;**4**(3):e93

Journal of Medical Internet Research. 2013;**15**(4):e32

Immunology. In Practice. 2016;**4**(4):730-737

\*

Moving towards Sustainable Electronic Health Applications

http://dx.doi.org/10.5772/intechopen.75040

67

and Catherine Demers1

end results will bring more value than anticipated.

\*Address all correspondence to: demers@hhsc.ca

of Business Research. 2012;**65**:562-569

Mosby; 2005. pp. 298-317

, Karim Keshavjee<sup>2</sup>

1 McMaster University, Hamilton, Canada

2 InfoClin Inc, Toronto, Canada

**Author details**

Sahr Wali1

**References**

2017]

In the health app industry, it is common for enthusiasm regarding innovation to overshadow the drive to tackle sustainability, let alone feasibility. In this chapter, we focused on the highlighting the importance beyond designing sustainable electronic health applications. We started by addressing what barriers regarding sustainability were present and outlined what steps were needed to avoid them. The importance of identifying the needs, wants and expectations of the health apps primary stakeholders were also signified, as we understand that it is not only

**Figure 5.** Market evaluation and feasibility analysis framework to help determine national sustainability of the intervention. The analysis begins by highlighting the health apps objective and determining whether it will improve current processes or introduce a new one. This leads to evaluating how the health app will function with current practices in place and if it will be suitable moving forward. In this framework, each block represents a form of analysis that is conducted, which leads to the final block that assesses overall feasibility and/or potential health app re-design/ modifications needed.

important from a usability standpoint, but it shapes whether it will be sustainable across the country. Designing for sustainability may be a tiresome process, but if executed properly, the end results will bring more value than anticipated.

#### **Author details**

Depending on the health app being developed, different factors contributing to the market will be evaluated. However, before evaluating any components, the objective of the health app must be distinguished. This will narrow the scope of market factors that we will need to consider. We will then need to determine whether the health app will be an improvement of a current intervention or if it will stand alone in its functionality. Once this has been decided, we can readily evaluate what market factors are at play by answering a set of questions as outlined in **Figure 5**. Moving forward, depending on the con-current usage of the health app the steps may differ. Nonetheless, both ends of the evaluation will move in the same direction towards evaluating app feasibility, by determining what regulations/policies are in place, how cost-effective the app will be and most importantly if there is any funding available to support its implementation (**Figure 5**). This evaluative framework regarding the market forces and feasibility will determine whether the app will be sustainable across national regu-

In the health app industry, it is common for enthusiasm regarding innovation to overshadow the drive to tackle sustainability, let alone feasibility. In this chapter, we focused on the highlighting the importance beyond designing sustainable electronic health applications. We started by addressing what barriers regarding sustainability were present and outlined what steps were needed to avoid them. The importance of identifying the needs, wants and expectations of the health apps primary stakeholders were also signified, as we understand that it is not only

**Figure 5.** Market evaluation and feasibility analysis framework to help determine national sustainability of the intervention. The analysis begins by highlighting the health apps objective and determining whether it will improve current processes or introduce a new one. This leads to evaluating how the health app will function with current practices in place and if it will be suitable moving forward. In this framework, each block represents a form of analysis that is conducted, which leads to the final block that assesses overall feasibility and/or potential health app re-design/

modifications needed.

lations, or if it requires substantial changes in its design construct.

66 eHealth - Making Health Care Smarter

Sahr Wali1 , Karim Keshavjee<sup>2</sup> and Catherine Demers1 \*

\*Address all correspondence to: demers@hhsc.ca

1 McMaster University, Hamilton, Canada

2 InfoClin Inc, Toronto, Canada

#### **References**


[9] Chindalo P, Keshavjee K, Karim A, Brahmbhatt R, Saha N. Health apps by design: A reference architecture for mobile engagement. International Journal of Handheld Computing Research. 2016;**7**(2):34-43

[24] Fleuren M, Wiefferink K, Paulussen T. Determinants of innovation within health care organizations: Literature review and Delphi study. International Journal for Quality in

Moving towards Sustainable Electronic Health Applications

http://dx.doi.org/10.5772/intechopen.75040

69

[25] Gandhi S, Chen S, Hong L, Sun K, Gong E, Li C, Yang LL, Schwalm JD. Effect of mobile health interventions on secondary prevention of cardiovascular disease: Systematic

[26] Bonoto BC, de Araujo VE, de Lemos LL, Godman B, Bennie M, Diniz LM, Junior AA.Efficacy of mobile apps to support the care of patients with analysis of randomized controlled trials.

[27] Schnall R, Rojas M, Bakken S, Brown W, Carballo-Dieguez A, Carry M, Gelaude D, Patterson Mosley D, Travers J. A user-centered model for designing consumer mobile health (mHealth) applications (apps). Journal of Biomedical Informatics. 2016;**60**:243-251

[28] Hevner ARA. Three cycle view of design science research. Scandinavian Journal of

[29] Cronholm S, Gobel H. Evaluation of the information systems research framework: Empirical evidence from a design science research project. The Electronic Journal

[30] Donetto S, Pierri P, Tsianakas V, Robert G. Experience-based co-design and healthcare improvement: Realising participatory design in the public sector. International Journal

[31] Eyles H, Jull A, Dobson R, Firestone R, Whittaker R, Te Morenga L, Goodwin D, Ni Mhurch C. Co-design of mhealth delivered interventions: A systematic review to assess

[32] Lloyd T, Buck H, Foy A, Black S, Pinter A, Pogash R, Eismann B, Balaban E, Chan J, Kunselman A, Smyth J, Boehmer J. The Penn state heart assistant: A pilot study of a webbased intervention to improve self-care of heart failure patients. Health Informatics

[33] Dixon-Woods M, Amalberti R, Goodman S, Bergman B, Glasziou P. Problems and promises of innovation: Why healthcare needs to rethink its love/hate relationship with

[34] Davis FD. Perceived usefulness, perceived ease of use, and user acceptance of informa-

[35] Holden RJ, Karsh BT. The technology acceptance model: Its past and its future in health

[36] Rogers EM. Diffusion of Innovations. 5th ed. New York, NY: Free Press; August 2003.

[37] Venkatesh V, Morris MG, Davis GB, Davis FD. User acceptance of information technology:

key methods and processes. Current Nutrition Report. 2016;**5**(30):160-167

review and meta-analysis. Canadian Journal of Cardiology. 2017;**33**:210-231

Health Care. 2004;**16**(2):107-123

JMIR mHealth uHealth. 2017;**5**(3):e4

Information Systems. 2007;**19**(2):87-92

Information Systems Evaluation. 2016;**19**(3):158-168

for All Aspects of Design. 2015;**18**(2):227-248

the new. BMJ Quality and Safety. 2011;**20**(1):47-51

tion technology. MIS Quarterly. 1989;**13**(3):319-340

care. Journal of Biomedical Informatics. 2010;**43**(1):159-172

Toward a unified view. MIS Quarterly. 2003;**27**(3):425-478

Journal. 2017:1-12. Web

pp. xv-xxi


[24] Fleuren M, Wiefferink K, Paulussen T. Determinants of innovation within health care organizations: Literature review and Delphi study. International Journal for Quality in Health Care. 2004;**16**(2):107-123

[9] Chindalo P, Keshavjee K, Karim A, Brahmbhatt R, Saha N. Health apps by design: A reference architecture for mobile engagement. International Journal of Handheld Computing

[10] Omachonu VK. Innovation in healthcare delivery systems: A conceptual framework.

[11] Patient Empowerment Network. Designing with the patient in mind. 2016. Retrieved from: https://powerfulpatients.org/2016/01/26/designing-with-the-patient-in-mind/ [12] Kuerbis A, Mulliken A, Muench F, Moore AA, Gardner D. Older adults and mobile technology: Factors that enhance and inhibit utilization in the context of behavioral health.

[13] Patient Empowerment Network. Are we ready for mobile health. 2016. Retrieved from:

[14] Kotz D, Gunter CA, Kumar S, Weiner JP. Privacy and security in mobile health: A research

[15] Faulkner A, Kent J.Innovation and regulation in human implant technologies: Developing

[16] Hollis C, Moriss R, Martin J, Amani S, Cotton R, Denis M, Lewis S. Technological innovations in mental healthcare: Harnessing the digital revolution. British Journal of Psychiatry.

[17] Barton AJ. The regulation of mobile health applications. BMC Medicine. 2012;**10**(46):1-4

[18] Avgar AC, Litwin AS, Pronovost PJ. Drivers and barriers in health IT adoption. Applied

[19] Aidemark J, Asskenas L, Nygardh A, Stromberg A. User involvement in the co-design of self-care support systems for heart failure patients. Procedia Computer Science. 2015;**64**:

[20] McCurdie T, Taneva S, Casselman M, Yeung M, McDaniel C, Ho W, Cafazzo J. mHealth consumer apps: The case for user-centered design. Horizons: Technology and Design.

[21] Brahmbhatt R, Niakan S, Saha N, Tewari A, Pirani A, Keshavjee N, Mugambi D, Alavi N, Keshavjee K. Diabetes mHealth apps: Can they be effective. Studies in Health Technology

[22] Martinez-Perez B, Torre-Diez I, Lopez-Coronado M, Herreros-Gonzalex J. Mobile apps

[23] Creber RMM, Maurer MS, Reading M, Hiraldo G, Hickey KT, Iribarren S. Review and analysis of existing mobile phone apps to support heart failure symptom monitoring and self-care management using the mobile application rating scale (MARS). JMIR mHealth

in cardiology: Review. JMIR mhealth uhealth. 2013;**1**(2):e15

https://powerfulpatients.org/2016/01/04/are-we-ready-for-mobile-health/

comparative approaches. Social Science & Medicine. 2001;**53**(7):895-913

Research. 2016;**7**(2):34-43

68 eHealth - Making Health Care Smarter

Public Sector Innovation Journal. 2010;**15**(1):1-20

Mental Health and Addiction Research. 2017;**2**(2):1-11

agenda. IEEE Computer. 2016;**49**(6):22-30

Clinical Informatics. 2012;**3**:488-500

and Informatics. 2017;**234**:49-53

uHealth. 2016;**4**(2):e74

2015;**206**(4):263-265

118-124

2012;**46**:49-56


[38] Venkatesh V, Thong JYL, Xu X. Consumer acceptance and use of information technology: Extending the unified theory of acceptance and use of technology. MIS Quarterly. 2012;**36**(1):157-178

**Chapter 5**

**Provisional chapter**

**The Practice of Medicine in the Age of Information**

**The Practice of Medicine in the Age of Information** 


#### **The Practice of Medicine in the Age of Information Technology**

DOI: 10.5772/intechopen.75482

Mark Dominik Alscher and Nico Schmidt

Additional information is available at the end of the chapter

#### **Abstract**

[38] Venkatesh V, Thong JYL, Xu X. Consumer acceptance and use of information technology: Extending the unified theory of acceptance and use of technology. MIS Quarterly.

[39] Huang CY, Kao YS. UTAUT2 based predictions of factors influencing the technology acceptance of phablets by DNP. Mathematical Problems in Engineering. 2015;**2015**:1-23

[40] Chang A.UTAUT and UTAUT2: A review and agenda for future research. The Winners. 2012;

[41] Prosci. What is change management. 2018. Retrieved from: https://www.prosci.com/

[42] Swerissen H, Crisp BR. The sustainability of health promotion interventions for different levels of social organization. Health Promotion International. 2004;**19**(1):123-130

change-management/what-is-change-management

2012;**36**(1):157-178

70 eHealth - Making Health Care Smarter

**13**(2):106-114

The practice of medicine has to face the opportunities and challenges of all aspects of e-Health; the term "digitalization," however, is broader and spans all of these aspects. The digitalization of medicine offers solutions to pressing problems. We know the factors that lead to excellence in medicine: without sufficient experience built on a solid foundation of knowledge, excellence is not achievable. The problem today is that, owing to restricted working hours, personal life goals ("work-life balance"), and the expectations of Generation Y, almost no medical training still provides the roughly 10,000 h of practical experience needed for excellence. We will therefore see medical excellence fade unless we establish other systems. One solution may be found in decision-support systems, which first require the digitalization of all health data. We certainly do not have enough evidence for all aspects of the practice of medicine, and intuition is fading, so we have to look for other solutions. Big data generated by the digitalization of all health data could be the problem solver. In combination, IT will help to improve the quality of care.

**Keywords:** quality, practice of medicine, digitalization, health care, intuition, big data, randomized controlled trial (RCT)

#### **1. Introduction**

Nowadays, the frameworks of all professions are undergoing many changes. The terms "digitalization," "Internet of Things," "disruption," and "big data" cover some aspects of these changes at different hierarchical levels. The practice of medicine has to face the opportunities and challenges of all aspects of e-Health; the term "digitalization," however, is broader and spans all of these aspects [1]. In this chapter, we try to highlight some of these aspects, especially with regard to practical medicine.

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **2. Excellence in the practice of medicine**

We know the factors that lead to excellence in medicine. Without sufficient experience built on a solid foundation of knowledge, excellence is not achievable. Conversely, without knowledge and the ability to process experience, excellence cannot be reached either [2]. Experience alone is therefore not the key to excellence [3]; what counts is the combination of genius, knowledge base, and experience. Simon and Chase found that master chess players need about 10 years of practice [4]. During that time, around 50,000 different patterns are stored that are essential for the intuitive part of the game. Ericsson was able to extend these findings to musicians and physicians [2]. Excellence requires about 10,000 h of practical work.

Let us take a closer look at excellence in the practice of medicine. Physicians were asked what distinguishes excellent colleagues [5]. They named four factors:

**1.** Extensive practical experience.

**2.** Mastery in taking the medical history from patients.

**3.** Precise and critical integration of all information into the process of diagnostic reasoning.

**4.** Continuous learning from clinical practice.

In internal medicine, the process of diagnostic reasoning is key to excellence [6]. This process can be divided into two parts:

System 1: Intuition

System 2: Analysis

The two systems have different properties (**Table 1**). Because of time pressure and the large number of patients and problems a typical doctor has to treat, handle, and solve in a short time, System 1 (intuition) is the only practical way of working in everyday health care, and it depends largely on the amount of experience gained before. From time to time, however, the doctor faces a problem that cannot be solved by intuition and has to switch to System 2 (analysis), which is time-consuming. The ideal would be an automatic adjustment between both systems in all decision-making during the process of diagnostic reasoning (**Figure 1**).

| System 1: Intuition | System 2: Analysis |
| --- | --- |
| Experimental-induction | Hypothesis-deduction |
| Rational limitations | Rational unlimited |
| Heuristic | Normative |
| Pattern recognition | Robust decisions |
| Modular ("hard-wired") decisions | Critical-logical thinking |
| Guidance by pattern recognition | Decision trees |
| Gut feeling | Logical reasoning |

**Table 1.** Medical decision-making after Croskerry [6].

**Figure 1.** Combination of System 1 and System 2 in diagnostic reasoning [6].
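The interplay of the two systems can be illustrated with a small sketch. It is not taken from Croskerry's model or from any existing decision-support product; the pattern table, the confidence values, the threshold, and all names (`KNOWN_PATTERNS`, `system1`, `system2`, `assess`) are hypothetical placeholders. The idea shown is only the routing: a fast pattern lookup answers when it is confident enough, and otherwise the case is handed to a slower analytical step.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    result: str        # suggested diagnosis or work-up note
    confidence: float  # hypothetical score between 0.0 and 1.0
    system: str        # which system produced the result


# Hypothetical table of familiar patterns; stands in for System 1 (intuition).
KNOWN_PATTERNS = {
    frozenset({"finding_a", "finding_b"}): ("diagnosis_x", 0.9),
    frozenset({"finding_c"}): ("diagnosis_y", 0.4),
}


def system1(findings):
    """Fast pattern match; unknown patterns yield (None, 0.0)."""
    return KNOWN_PATTERNS.get(frozenset(findings), (None, 0.0))


def system2(findings):
    """Placeholder for slow analysis: list the findings to be worked up."""
    return "work-up required: " + ", ".join(sorted(findings))


def assess(findings, threshold=0.8):
    """Use System 1 when it is confident enough, otherwise fall back to System 2."""
    diagnosis, confidence = system1(findings)
    if diagnosis is not None and confidence >= threshold:
        return Assessment(diagnosis, confidence, "System 1")
    return Assessment(system2(findings), confidence, "System 2")
```

With these placeholder values, `assess({"finding_a", "finding_b"})` is answered by System 1, whereas `assess({"finding_c"})` falls below the assumed threshold of 0.8 and is routed to System 2. In a real setting, the threshold would have to be chosen and validated clinically.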

The problem today is that, owing to restricted working hours, personal life goals ("work-life balance"), and the expectations of Generation Y, almost no medical training still provides the 10,000 h of practical experience mentioned above. We will therefore see medical excellence fade unless we establish other systems that can take over the role of System 1. Since the days of Flexner, the United States has led the way in ensuring excellence in medicine through education [7]. The basis of this excellence, however, was the precise and holistic learning of the medical knowledge base [8], and this is now under pressure [9–11].

One solution may be found in decision-support systems. These, however, first require the digitalization of all health data. Electronic patient records are the key to accomplishing that task [12]. Since no holistic solution exists, interfaces and standards for those interfaces are urgently needed [13, 14]. The analysis of data from the health system, often called "big data," depends on a solution to those issues. Algorithms should help doctors in a world of overwhelming information load and relieve them of the burden of long working hours. This is the promise of big data from the viewpoint of the practitioner.

#### **3. Big data versus randomized controlled trials (RCT)**

The measure of quality in diagnostic reasoning and decision-making is the evidence-based decision, which rests on precise evidence generated, at best, in randomized controlled trials (RCTs) [15]. RCTs have revolutionized medicine, and every year we receive evidence from 40,000 trials [16]. However, we certainly do not have enough evidence for all aspects of the practice of medicine, intuition is fading away, and we therefore have to look for other solutions. Big data could be the problem solver [16]. The combination of RCTs and big data could be key to assuring quality and excellence in medicine in the future (**Figure 2**) [17].

**Figure 2.** Big data (yellow) and randomized controlled trials (RCT = blue) [17].

#### **4. Impact of modern machine learning on today's and future medical practice**

Machine learning (ML) technology already has a big impact on today's medical practice. From image classification in radiographs and epidemic outbreak prediction to genome sequencing, computer algorithms are becoming more and more prominent in modern medicine [18–21]. Projects of major companies, such as IBM Watson or Google's DeepMind Health, as well as numerous smaller privately and publicly funded research projects, are working to close the gaps between medicine, mathematics, and computer science [22].

For narrow applications with clear rules, ML algorithms already outperform human capabilities by far and even create new, previously unseen strategies, as recently shown by Google's AlphaGo Zero [22]. In this case, a self-trained algorithm mastered the game of Go, which is more complex than chess, in less than 3 days. Although the reinforcement learning strategy used in this case might not be suitable for many medical applications, it is an indication of the potential of modern learning algorithms.

A closer look at the technology gives hope that similar algorithms will not be limited to single, narrow applications but can evolve to address the challenges of preserving the knowledge and intuition of experts and of improving the quality of RCTs.

The reason for the great potential of machine learning lies in the nature of most of these new algorithms. Regardless of whether Random Forests, Support Vector Machines, or Deep Neural Networks (the most popular choice today) are applied, they derive their functionality by converting the information contained in thousands, millions, or even billions of examples into a highly nonlinear mathematical model [23–25]. In some ways, this is similar to the human learning process, and such models may therefore be an appropriate tool for conserving human experience.

With medical data becoming increasingly available in digital form, not only from clinical trials but also from everyday medical practice, databases are growing in size and depth [26]. This makes it possible for algorithms not only to become more precise in their predictions but also to become more general, in the sense that they can include a huge variety of factors in their decision process. Some aspects of fading intuition may thus be replaceable by recommendation systems that are based not on thousands of hours of experience but on millions of decisions already made by experts in the past. So far, humans are still more efficient learners, but the sheer scale of the data contained in algorithmic models may compensate for this lack of efficiency. Nevertheless, in the very near future, recommendation systems driven by machine learning algorithms are more likely to complement a physician's intuition than to replace it completely.
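How a recommendation could be derived from decisions that experts made in the past can be sketched with a nearest-neighbour vote. This is only one simple technique chosen for illustration; the chapter does not prescribe a method. The past cases, the two-dimensional feature vectors, and the names `PAST_CASES` and `recommend` are assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical past cases: (feature vector, decision taken by an expert).
PAST_CASES = [
    ((1.0, 0.0), "option A"),
    ((0.9, 0.1), "option A"),
    ((0.8, 0.3), "option A"),
    ((0.0, 1.0), "option B"),
    ((0.1, 0.9), "option B"),
]


def recommend(case, past_cases=PAST_CASES, k=3):
    """Return the majority decision among the k most similar past cases
    and the share of those k cases that agree with it."""
    # Sort past cases by Euclidean distance to the new case and keep the k nearest.
    nearest = sorted(past_cases, key=lambda pc: math.dist(case, pc[0]))[:k]
    votes = Counter(decision for _, decision in nearest)
    decision, count = votes.most_common(1)[0]
    return decision, count / k
```

The returned share of agreeing neighbours gives a rough indication of how uniform the past decisions were for similar cases, so a physician can treat a low share as a signal to rely on personal judgment rather than on the recommendation.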

In a similar manner, classical RCTs can benefit from big data. In semi-supervised learning, for example, datasets with known and unknown outcomes are considered. One task here is to identify corner cases in the data in order to increase model accuracy most efficiently [27]. Such methods could help to identify suitable candidates for clinical trials and so make their results more robust. Another opportunity in this direction is the analysis of already trained models by feature extraction methods, which may generate promising hypotheses for further investigation in RCTs. Finally, although RCTs form the scientific backbone of medicine, factors such as publication bias and poor reproducibility rates show that permanent monitoring of standard clinical practice is necessary [28, 29]. Because machine learning algorithms can constantly adapt to new situations, they seem predestined for such a monitoring task.
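One common way to find such corner cases is uncertainty sampling, a heuristic from active learning. It is used here purely as an illustration and is not necessarily the method of reference [27]. The sketch assumes that a model has already produced a predicted probability for each case without a known outcome; the case identifiers, the probabilities, and the function names are hypothetical.

```python
def uncertainty(probability):
    """Map a predicted probability to an uncertainty score:
    1.0 at p = 0.5 (model undecided), 0.0 at p = 0 or p = 1."""
    return 1.0 - 2.0 * abs(probability - 0.5)


def select_corner_cases(predictions, n=2):
    """predictions: dict mapping case id -> predicted probability of the outcome.
    Returns the n case ids about which the model is least certain."""
    ranked = sorted(
        predictions,
        key=lambda case_id: uncertainty(predictions[case_id]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical model outputs for cases without a known outcome.
unlabeled = {"case-1": 0.97, "case-2": 0.52, "case-3": 0.10, "case-4": 0.41}
```

Calling `select_corner_cases(unlabeled)` selects the two cases whose predicted probability lies closest to 0.5. Cases selected in this way are the ones for which a confirmed outcome would be expected to improve the model most.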

#### **5. Conclusion**

Excellence in the practice of medicine used to be tied to long working hours and a relatively small knowledge base. Today, the framework for the practice of medicine protects the individual from long working hours; in combination with the fast-growing knowledge base, however, this puts the practice of medicine under pressure regarding quality and excellence. The big data approach could help to find a solution, but it first requires the digitalization of all data used in the practice of medicine. In combination, IT will help to improve the quality of care.

#### **Conflict of interest**

There is no "conflict of interest" declaration necessary.

#### **Author details**

Mark Dominik Alscher<sup>1</sup>\* and Nico Schmidt<sup>2</sup>

\*Address all correspondence to: dominik.alscher@rbk.de

1 Robert-Bosch-Hospital, Stuttgart, Germany

2 Bosch - Healthcare Solution GmbH, Waiblingen, Germany

#### **References**

[1] Alscher MD. Medicine in the digital age: Let's take the opportunities! Deutsche Medizinische Wochenschrift. 2017;**142**(5):313

[2] Ericsson KA. Deliberate practice and the acquisition and maintenance of expert performance in medicine and related domains. Academic Medicine. 2004;**79**:S70-S81

[3] Galton F. Hereditary Genius: An Inquiry into Its Laws and Consequences. London: Julian Friedman Publishers; 1979. Originally published in 1869

[4] Simon HA, Chase WG. Skill in chess. American Scientist. 1973;**61**:394-403

[5] Mylopoulos M, Lohfeld L, Norman GR, Dhaliwal G, Eva KW. Renowned physicians' perceptions of expert diagnostic practice. Academic Medicine. 2012;**87**(10):1413-1417

[6] Croskerry P. A universal model of diagnostic reasoning. Academic Medicine. 2009;**84**:1022-1028

[7] Flexner A. Medical education in the United States and Canada. Carnegie Foundation for the Advancement of Teaching. 1910; Bulletin No. 4

[8] Stern DT, Papadakis M. The developing physician—Becoming a professional. New England Journal of Medicine. 2006;**355**:1794-1799

[9] Greenberg A, Verbalis JG, Amin AN, Burst VR, Chiodo 3rd JA, Chiong JR, et al. Current treatment practice and outcomes. Report of the hyponatremia registry. Kidney International. 2015;**88**(1):167-177

[10] Bickel J, Brown AJ. Generation X: Implications for faculty recruitment and development in academic health centers. Academic Medicine. 2005;**80**:205-210

[11] Eckleberry-Hunt J, Tucciarone J. The challenges and opportunities of teaching "generation y". Journal of Graduate Medical Education. 2011;**3**(4):458-461

[12] Kühn S, Haas P. Elektronische Patientenakte: Von der Dokumentenakte zur feingranularen Akte. Dtsch Arztebl International. 2008;**105**(18):20

[13] Henderson ML, Dayhoff RE, Titton CP, Casertano A. Using IHE and HL7 conformance to specify consistent PACS interoperability for a large multi-center enterprise. Journal of Healthcare Information Management. 2006;**20**(3):47-53

[14] Heinze O, Birkle M, Koster L, Bergh B. Architecture of a consent management suite and integration into IHE-based regional health information networks. BMC Medical Informatics and Decision Making. 2011;**11**:58

[15] Windeler J, Antes G, Behrens J, Donner-Banzhoff N, Lelgemann M. Randomised controlled trials (RCTs). Zeitschrift für Evidenz, Fortbildung und Qualität im Gesundheitswesen. 2008;**102**(5):321-325

[16] Angus DC. Fusing randomized trials with big data: The key to self-learning health care systems? Journal of the American Medical Association. 2015;**314**(8):767-768

[17] Sim I. Two ways of knowing: Big data and evidence based medicine. Annals of Internal Medicine. 2016;**164**:262-263

[18] DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics. 2011;**43**(5):491-498

[19] Akgul CB, Rubin DL, Napel S, Beaulieu CF, Greenspan H, Acar B. Content-based image retrieval in radiology: Current status and future directions. Journal of Digital Imaging. 2011;**24**(2):208-222

[20] Kircher M, Stenzel U, Kelso J. Improved base calling for the Illumina Genome Analyzer using machine learning strategies. Genome Biology. 2009;**10**(8):R83

[21] Penedo MG, Carreira MJ, Mosquera A, Cabello D. Computer-aided diagnosis: A neural-network-based approach to lung nodule detection. IEEE Transactions on Medical Imaging. 1998;**17**(6):872-880

[22] Gibney E. What Google's winning Go algorithm will do next. Nature. 2016;**531**(7594):284-285

[23] Basu S, Kumbier K, Brown JB, Yu B. Iterative random forests to discover predictive and stable high-order interactions. Proceedings of the National Academy of Sciences of the United States of America. January 19, 2018;**2017**:11236. (published ahead of print)

[24] Goodfellow I, Yoshua B, Courville A. Deep Learning. Cambridge: MIT press; Springer-Verlag New York. 2016. http://www.springer.com/978-0-387-77241-7

[25] Steinwart I, Andreas C. Support Vector Machines. New York: Springer-Verlag, Springer Science & Business Media; 2008

[26] Bellazzi R, Blaz Z. Predictive data mining in clinical medicine: Current issues and guidelines. International Journal of Medical Informatics. 2008;**77**(2):81-97

[27] Cohn D, Rich C, McCallum A. Semi-supervised clustering with user feedback. Constrained Clustering: Advances in Algorithms, Theory, and Applications. 2003;**4**(1):17-32

[28] Mobley A, Linder SK, Braeuer R, Ellis LM, Zwelling L. A survey on data reproducibility in cancer research provides insights into our limited ability to translate findings from the laboratory to the clinic. PLoS One. 2013;**8**(5):e63221

[29] Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. Publication bias in clinical research. Lancet. 1991;**337**(8746):867-872


**Section 3**

**Applications**
