**Meet the editor**

Professor Vladimir M. Koleshko is an esteemed scientist in the field of intelligent systems, solid-state micro- and nanoelectronics, and integrated circuit technology. He made a fundamental scientific discovery in brain acoustoelectronic phenomena (USSR No. 395, 18.02.1988), and his discovery of the hyperconductivity of thin-film structures is protected by the patent "Thin-film structures of high-temperature superconductors" (T = 82 K) in the Russian Federation. Professor Koleshko is the author of 9 scientific-and-innovative monographs (1978-1988) and 5 electronic books, "Cognitive technology of consciousness", with 3D films: "Brain nanoelectronics" (Vol. 1), "Control of objects by power of thought" (Vol. 2), "Intelligent systems in biometrics" (Vol. 3), "Intelligent sensory micro-nanosystems and networks" (Vol. 4) and "Cell phones, smartphones and ageing of organism" (Vol. 5). He has also written 35 booklets and educational-methodological textbooks for undergraduate, Master's and PhD students, 550 scientific articles in peer-reviewed journals and 620 innovative patents.

### Contents

**Preface**

Chapter 1 **Intelligent Systems in Technology of Precision Agriculture and Biosafety**

Vladimir M. Koleshko, Anatolij V. Gulay, Elena V. Polynkova, Viacheslav A. Gulay and Yauhen A. Varabei

Chapter 2 **Knowledge Management in Bio-Information Systems**

Kuodi Jian

Chapter 3 **Efficiency of Knowledge Transfer by Hearing a Conversation While Doing Something**

Eiko Yamamoto and Hitoshi Isahara

Chapter 4 **Algorithm Selection: From Meta-Learning to Hyper-Heuristics**

Laura Cruz-Reyes, Claudia Gómez-Santillán, Joaquín Pérez-Ortega, Vanesa Landero, Marcela Quiroz and Alberto Ochoa

Chapter 5 **Experiences and Obstacles in Industrial Applications of Intelligent Systems**

Leonardo M. Reyneri and Valentina Colla

Chapter 6 **Intelligent Problem Solvers in Education: Design Method and Applications**

Nhon Van Do

Chapter 7 **Logic of Integrity, Fuzzy Logic and Knowledge Modeling for Machine Education**

Fatma Khanum Bunyatova

Chapter 8 **Morphosyntactic Linguistic Wavelets for Knowledge Management**

Daniela López De Luise



### Preface

Human progress is currently passing through the stage of an information-based, innovative, intelligent society. In the near future, machines (systems) produced by human genius will not only be smarter than a human being but will also surpass the human intellect. Intelligent machines will come in different sizes, shapes and functions and will be equipped with an initial program (a technogene); their ability to learn and to perform operations will depend not only on the technogene but also on what the machines are trained for. All of this is conditioned by the intellectualization of every system and technological process that humankind implements, following a development paradigm in which everything must become sensory and motoric and able to make decisions. Smart machines help people not only to make use of their own intellect but also to grow smarter themselves. Intelligent systems are able to train themselves and to make their own decisions in support of management activities in financial institutions, economics, the energy sector, logistics, and industrial, commercial and social systems. The same holds for remotely piloted satellite monitoring systems for broad-spectrum applications and for communication systems with remotely distributed intelligence, which improves the reliability of the intelligent system as a whole. In addition, intelligent systems can be used for governing a state or for controlling a holding company, a concern or a firm, as well as for the early recognition and prediction of sustainability and the prolongation of life, for maximizing functional, creative and cognitive human activity, and for supporting personal and social safety.

Intelligent systems can serve as authoritative advisers or consultants on all sorts of questions, and they will also be able to solve a large number of emerging problems that result from human interference. They can acquire new knowledge by operating with the semantic, pragmatic, heuristic and hyper-heuristic features of intelligent information as they generate, and approximate, a functional model of natural intelligence. They can also produce adaptive, self-learning, self-organizing cognitive systems, making it possible to uncover new secrets of nature and to produce even more intelligent devices, machines, technologies and products.

An intelligent system is an automatic or automated system that is capable of internal and external sensing, is based on artificial intelligence, and has the following features:

	- Self-learning: the ability not only to execute built-in functions and programs but also to adapt them to the assigned task
	- Self-organization: the ability to change its structure and architecture according to the assigned task, or to improve itself through self-learning, self-diagnostics and self-preservation
	- The capability to solve problems for which standard methods or solution algorithms fail or are unknown

For the first time, this book brings together research by scientists from many countries: Argentina, Australia, Belarus, Brazil, Bulgaria, Czech Republic, Denmark, India, Iran, Italy, Japan, Greece, Mexico, Portugal, Russia, Slovakia, United Kingdom, USA and Vietnam. It will be useful to a wide range of readers, especially students, young scientists, engineers, and business people and investors with a strong interest in future innovations.

#### **Prof. Vladimir Mikhailovich Koleshko**

Belarusian National Technical University, Mechanical Engineering Faculty, Department of Intelligent Systems, Minsk, Republic of Belarus

### **Intelligent Systems in Technology of Precision Agriculture and Biosafety**

Vladimir M. Koleshko, Anatolij V. Gulay, Elena V. Polynkova, Viacheslav A. Gulay and Yauhen A. Varabei *Belarusian National Technical University / Dept. of Intelligent Systems Minsk, Belarus* 

#### **1. Introduction**

The 21st century builds on up-to-date intelligent systems and self-learning wireless distributed sensory networks. Their applications aim to make the whole of the space around us sensory and motoric, and also to maintain human health and life and to improve production, output quality and product biosafety. A bedrock principle of precision agriculture is the wide application of intelligent systems for control and decision support in the technological operations of agricultural production [1, 2]. Precise positioning of agricultural machines by satellite systems makes it possible to build an intelligent agrarian production system that applies dosed amounts of fertilizers, and of chemical weed and pest killers, according to the information patterns sensed at a specific spot of the tillable field. Microsensory intelligent systems on a chip of the "electronic eye" (e-eye) type, with LED-based data acquisition, quickly form light-colour information patterns of soil (to obtain the maximum quantity of quality products), of foods, or of biomatter (blood, saliva, sweat, urine, tears, etc.) for ecological, personal and social biosafety as well as for real-time monitoring of human health. The LED technology amounts to optical microtomography of the functional states of bio-objects on an e-eye chip. Intelligent control in agro-industrial production makes it possible to generate electronic information maps, for example of the distribution of nutrients and organic fertilizers applied to the soil; virtual crop-yield maps that take into account the technological preparation of the land for growing crops and the micronutrients carried out of it by earlier harvests; electronic satellite maps of fields; and electronic maps of quality, of the information-microbial biosafety of foodstuffs, of human health, and of ecological environmental conditions.

Distributed wireless sensory systems and networks with self-learning software support the development of intelligent precision agriculture, including the recognition of information patterns of agrotechnical processes, agricultural products and external ecological conditions in a space of multidimensional sensory data. The use of the intelligent information technology CIMLS (Continuous Intelligent Management and Life Cycle Support), together with the developed intelligent systems of data superprotection, maintains and controls the whole life cycle of agricultural production.


#### **2. Intelligent sensory systems and networks of precision agriculture**

#### **2.1 LED technology in precision agriculture**

The main principle of intelligent precision agriculture is the high-precision, dosed application of fertilizer to a specified small patch of ground according to the physical and chemical state of the soil (colour, structure, organic content, moisture, temperature), so that organic fertilizers are distributed evenly. It relies on controlled actuators and on electronic, virtual and intellect maps for agro-industrial production, foodstuff biosafety and the maintenance of human life. Intelligent technologies in precision agriculture save weed and pest killers, fertilizers and energy resources; they support ecological sustainability and raise crop yield, field quality, the biosafety of agricultural products and the efficiency of agricultural production. The most effective method for monitoring soil and quickly forming its information patterns is to estimate its spectral reflectance as a set of optical parameters in the ultraviolet, visible and near-infrared ranges. The LED technology we present measures soil brightness coefficients over a broadband optical range (10<sup>11</sup>–10<sup>15</sup> Hz) using a set of light-emitting and light-sensitive microelements that illuminate a small controlled patch of soil and record the reflected optical signal. The LED technology of precision agriculture with differentiated fertilization rests on the wide application of intelligent sensory systems and on the fast control of soil information patterns at every spot of a cultivated field [1, 3].
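As a rough check on the quoted band, wavelength and frequency are related by f = c/λ. The sketch below converts between the two and tests whether a wavelength lies inside the band of 10<sup>11</sup>–10<sup>15</sup> Hz. The function and constant names are illustrative and do not come from the source.

```python
"""Relate the optical band quoted in Hz to wavelengths in nm (illustrative helper names)."""

C = 299_792_458.0  # speed of light in vacuum, m/s

# Band quoted in the text, in Hz.
BAND_HZ = (1e11, 1e15)


def frequency_hz(wavelength_nm: float) -> float:
    """Return the frequency in Hz for a vacuum wavelength given in nanometres."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return C / (wavelength_nm * 1e-9)


def wavelength_nm(frequency: float) -> float:
    """Return the vacuum wavelength in nanometres for a frequency given in Hz."""
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return C / frequency * 1e9


def in_band(wl_nm: float, band=BAND_HZ) -> bool:
    """Return True if the wavelength corresponds to a frequency inside the band."""
    low, high = band
    return low <= frequency_hz(wl_nm) <= high


if __name__ == "__main__":
    for wl in (400.0, 800.0, 2400.0):
        print(f"{wl:7.1f} nm -> {frequency_hz(wl):.3e} Hz, in band: {in_band(wl)}")
```

The band edges correspond to wavelengths of roughly 3 mm and 300 nm, so the 400-2400 nm controlled spectrum width listed in Table 1 falls inside it.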

#### **2.2 Mobile microsensory system for precision agriculture**

The mobile microsensory system "ISSE", which we developed using LED technology for recognizing light-colour information patterns, can analyze the state of the soil from within and apply to each spot of a field exactly the fertilizer dose that the spot requires. Soil optical characteristics are registered by means of light-emitting microdiodes with emission wavelengths of 405 nm (violet), 460 nm (blue), 505 nm (green), 530 nm (green), 570 nm (yellow), 620 nm (orange) and 660 nm (red), as well as at spectral points in the infrared (760-2400 nm) and in white light (an integrated index) [3, 4]. The light-emitting microdiodes radiate electromagnetic waves in a broadband frequency range, while photosensitive microdiodes register the quantitative change in the reflected radiation. The optimal spectrum width corresponds to wavelengths of 400-800 nm, because at longer wavelengths the vibrational spectrum of H2O molecules in the soil begins to show and introduces additional errors into the diagnostics of a soil horizon. The multisensory system "ISSE" includes an electronic optical module that forms and registers optical pulses; it consists of an analog-to-digital converter with a microcontroller and a pulse-shaping module (Fig. 1). The system also compares the obtained sensory information patterns with experimental soil characteristics for local field areas using special self-learning software [3].
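The source does not describe the self-learning software itself. Purely to illustrate what comparing a measured pattern with reference soil characteristics can look like, the sketch below uses plain nearest-neighbour matching by Euclidean distance, which is a stand-in technique and not the authors' method. The reference names and reflectance values are invented for the example.

```python
import math

# Hypothetical reference patterns: reflectance (0..1) at the seven visible LED
# lines 405, 460, 505, 530, 570, 620 and 660 nm. All values are made up.
REFERENCES = {
    "humus-rich": (0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11),
    "iron-rich": (0.06, 0.08, 0.12, 0.15, 0.22, 0.30, 0.34),
    "carbonate-rich": (0.35, 0.40, 0.45, 0.47, 0.50, 0.53, 0.55),
}


def distance(a, b):
    """Euclidean distance between two patterns of equal length."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same number of channels")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(pattern, references=REFERENCES):
    """Return the name of the reference pattern closest to the measured one."""
    return min(references, key=lambda name: distance(pattern, references[name]))


if __name__ == "__main__":
    measured = (0.34, 0.41, 0.44, 0.48, 0.50, 0.52, 0.56)
    print(classify(measured))
```

A real system would learn its reference patterns from field data; the fixed dictionary here only keeps the example self-contained.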

Fig. 1. Functional diagram of the electron-optical module "ISSE" for the sensory system "CDOT": 1 – microcontroller for control and information processing; 2 – control circuit of the light-emitting microdiodes; 3 – microphotodetector coupling; 4 – temperature monitoring circuit; 5 – secondary voltage source; 6 – COM-port connector

The light-emitting microdiodes are equally spaced around the perimeter of a 20 mm circle and tilted at about 10° to the vertical, so that they sit 30 mm above the controlled surface. Eight numbers in binary-coded decimal notation, each in the range 0…1000 and corresponding to the reflectivity factor for one of the eight spectral lines, are output over an RS-232 or RS-485 interface. The value 1000 corresponds to the reflection coefficient of the reference surface used to calibrate the electronic optical module. A microprocessor-based device that generates the soil sensory information pattern processes the output signal of the microphotodetector [4]. With "ISSE" it is possible to analyze the coefficients of absorption, refraction, light scattering, gradient change and polarization, as well as the coefficients of variation (intensity, amplitude and phase) of the electromagnetic wave and its space-time field distribution. The spectroscopic data make it possible to produce an information pattern of soil, agricultural products, foodstuffs and human biomatter. A gridded recording unit periodically performs real-time satellite navigation and control of soil parameters. The specially developed "ISSE" software can be applied in the intelligent system "CDOT" (Control of Distribution of Organics and Temperature) on an "electronic eye" chip, which is of interest in precision agriculture for controlling the soil humus-accumulative horizon at depths from 20-30 mm to 180-200 mm. A small intelligent sensory "mole" ("CDOT") contains "ISSE" in a metal housing that protects it from stones and sunlight. The optical beam reaches the controlled soil surface through a transparent sapphire coating, an extra-hard material, so "CDOT" can be attached, for example, to a mini-tractor or to other agricultural units. "ISSE" explores the ground at a depth of 5-10 cm to detect organic substances, moisture, temperature, colour and granulometric composition and to analyze the fertile topsoil. Using GPS (Global Positioning System) navigation, it rapidly determines exactly how much fertilizer the micromechatronic system has to apply at a specific place in the field while the mini-tractor with its attached drawbar hitch moves along an optimal path (Fig. 2) [1, 3, 4].
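The text says only that eight BCD-coded values in the range 0…1000 are sent over a serial interface and that 1000 corresponds to the calibration reference surface. As a minimal sketch, assume that each value arrives as two bytes of packed BCD (four decimal digits, most significant first). This frame layout is an assumption made for illustration, not the documented "ISSE" protocol.

```python
FULL_SCALE = 1000  # reading that corresponds to the calibration reference surface
CHANNELS = 8       # number of spectral lines reported per frame


def decode_packed_bcd(pair: bytes) -> int:
    """Decode bytes of packed BCD (two decimal digits per byte) into an integer."""
    value = 0
    for byte in pair:
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError(f"invalid BCD byte: {byte:#04x}")
        value = value * 100 + high * 10 + low
    return value


def decode_frame(frame: bytes) -> list:
    """Convert a 16-byte frame (assumed layout) into eight coefficients in 0..1."""
    if len(frame) != 2 * CHANNELS:
        raise ValueError(f"expected {2 * CHANNELS} bytes, got {len(frame)}")
    coefficients = []
    for i in range(0, len(frame), 2):
        raw = decode_packed_bcd(frame[i:i + 2])
        if raw > FULL_SCALE:
            raise ValueError(f"reading {raw} exceeds full scale {FULL_SCALE}")
        coefficients.append(raw / FULL_SCALE)  # relative to the reference surface
    return coefficients


if __name__ == "__main__":
    frame = bytes.fromhex("1000 0500 0250 0000 0999 0001 0123 0075")
    print(decode_frame(frame))
```

Dividing by the full-scale value expresses each reading relative to the reference surface, so a reading of 500 becomes a coefficient of 0.5.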

The depth to which the multisensory system "CDOT" penetrates in order to follow the topsoil is set according to the structural features of the soil profile and the location of the humus-accumulative horizon. The hydraulic lift of the mini-tractor controls the actuators that raise "CDOT" and sink it into the soil. The positioning of the units is simultaneously based on data from ultrasonic, microwave and electrostatic sensory modules. The intelligent system "CDOT" binds the measured soil information pattern to ground control points from a GPS receiver and stores the data in its memory for postprocessing and sensory information pattern recognition. The principal parameters of the mobile multisensory system "CDOT" that we developed for soil control in precision agriculture are presented in Table 1 [4].
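One way to picture the binding of a soil pattern to a GPS position is a record that stores both together. The sketch below is illustrative only: the field names, the in-memory list and the bounding-box query are assumptions, not the "CDOT" data format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class SoilSample:
    """One soil information pattern bound to the position where it was taken."""
    latitude_deg: float
    longitude_deg: float
    depth_cm: float
    reflectance: Tuple[float, ...]  # one coefficient per spectral line


@dataclass
class SampleLog:
    """In-memory store of samples kept for later postprocessing."""
    samples: List[SoilSample] = field(default_factory=list)

    def add(self, sample: SoilSample) -> None:
        """Validate the coordinates and append the sample to the log."""
        if not -90.0 <= sample.latitude_deg <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= sample.longitude_deg <= 180.0:
            raise ValueError("longitude out of range")
        self.samples.append(sample)

    def within(self, lat_min, lat_max, lon_min, lon_max) -> List[SoilSample]:
        """Return the samples that fall inside a latitude/longitude box."""
        return [
            s for s in self.samples
            if lat_min <= s.latitude_deg <= lat_max
            and lon_min <= s.longitude_deg <= lon_max
        ]


if __name__ == "__main__":
    log = SampleLog()
    log.add(SoilSample(53.90, 27.56, 10.0, (0.12, 0.15, 0.18)))
    print(len(log.within(53.0, 54.0, 27.0, 28.0)))
```

Keeping the position and the pattern in one immutable record means a stored measurement cannot be separated from the place where it was taken.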

Fig. 2. Multisensory system "CDOT" for the light-colour soil control: (a) intelligent mechatronic system for precision agriculture; (b) block diagram of the intelligent sensory system


| Technical characteristics of "CDOT" | Data description |
|---|---|
| Controlled spectrum width | 400-2400 nm |
| Spatial resolution of agricultural unit location | 2-5 m |
| Spatial resolution for the control of soil | 0.5 m |
| Maximal output current of light-emitting microdiodes | 40 mA |
| Duration of information pattern generation | 120 ms |
| Time between pulses | 5 ms |
| Depth of measurements | 8-15 cm |
| Speed of the mini-tractor | 2.83 m/s |
| Control of organic matter content in the soil humus-accumulative horizon | 0.1–6 % |
| Control of soil moisture | 0-20 % |
| Control of temperature | 3-50 °C |

Table 1. Principal parameters of the mobile multisensory system "CDOT"

The fundamental purpose of the developed intelligent multisensory system "CDOT" for precision agriculture is to optimize processing quality, in particular by controlling the mechatronic mechanism of the agricultural unit and its mechatronic positioning system. The intelligent system analyses sensory processing information, computes the optimal motion and changes the control criteria, preliminarily programming a movement pattern and maintaining power-saving engine behaviour. Solar energy converters can serve as an auxiliary or alternative power source for the intelligent mobile system "ISSE" for optical control of soil quality. The soil light-colour information pattern is taken into account when dosing the applied fertilizers and is treated as a control parameter according to the model of inorganic plant nutrition:

$$F_{\mathrm{IF}} = F - F_0, \tag{1}$$

where F<sub>IF</sub> is the cumulative dose of applied fertilizers, F is the plant nutrition level and F<sub>0</sub> is the initial fertility of the soil.
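The dose model is a simple difference between the required nutrition level and the initial soil fertility, which can be sketched as follows. Clamping negative results to zero (no fertilizer when the soil already meets the target) is an assumption added for the example, as are the function name and the choice of units.

```python
def fertilizer_dose(nutrition_level: float, initial_fertility: float) -> float:
    """Cumulative dose F_IF = F - F0, in the same units as the inputs.

    Assumption for this sketch: a negative difference means no fertilizer
    is needed, so the result is clamped to zero.
    """
    if nutrition_level < 0 or initial_fertility < 0:
        raise ValueError("nutrition level and fertility must be non-negative")
    return max(0.0, nutrition_level - initial_fertility)


if __name__ == "__main__":
    # Hypothetical values in kg/ha.
    print(fertilizer_dose(120.0, 45.0))
```

In a field application the initial fertility would come from the soil pattern measured at each spot, so the dose varies from point to point.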

The metering microdevices of the mechatronic module apply fertilizer to the soil or feed weed and pest killers. They use annular ultrasonic microactuators whose acoustic vibrations drive a diaphragm mechanism that controls the material flow delivered by the metering microdevice (Fig. 3) [1, 5].

Fig. 3. Metering microdevice of the free-flowing material and the nomogram for its parameterization: 1 – electroacoustic element; 2 – diaphragm mechanism; 3 – flow of the dosed material

For example, the intelligent system doses the applied mineral fertilizers according to the organic content at a specified point of the field by controlling the pulse characteristics of the high-frequency generator that drives the ultrasonic microactuator. The volumetric capacity P<sub>v</sub> of a metering device of the presented design is:

$$P_v = \left(S_0 - \frac{P_1 \delta_0}{2.3}\right) V_0, \tag{2}$$

where S<sub>0</sub>, δ<sub>0</sub> and V<sub>0</sub> are the flow area, the granule diameter and the flow velocity of the dosed material, and P<sub>1</sub> is the part of the metering-hole perimeter formed by edges that are fixed relative to the material flow. The hole formed by the blades of the dosing unit is approximately circular, so P<sub>1</sub> can be written in this form:

$$P_1 = 2 \alpha (\pi S_0)^{0.5}, \tag{3}$$

where 0 ≤ α ≤ 1 is a coefficient characterizing the degradation of dosing performance caused by the reduction of the flow area. The coefficient P<sub>1</sub> is taken into account provided that P<sub>1</sub> > 0.025·S<sub>0</sub>/δ<sub>0</sub>, and for the considered dosing unit:

$$\alpha \geq 6 \cdot 10^{-3}\, \xi, \tag{4}$$

where ξ = D<sub>0</sub>/δ<sub>0</sub> and D<sub>0</sub> is the diameter of the metering hole.

Nomographic charts in the form of the S<sub>0</sub>-V<sub>0</sub> relation were calculated for the metering device we developed, for different granule sizes δ<sub>0</sub> and values of the coefficient α. The dependences are linear for relatively low α and δ<sub>0</sub> but become nonlinear for α > 0.5 and δ<sub>0</sub> > 0.5 mm, especially at and around small cross-sectional areas of the metering hole. As α and δ<sub>0</sub> increase, a higher flow velocity of the dosed material is required to attain the same performance as for α = 0.

#### **2.3 Recognition of soil light-colour information patterns**

Every soil information pattern is characterized by inhomogeneous agrochemical and agrophysical values. We investigated multicomponent soil information patterns using soil reference patterns with contrasting colour tones in accordance with the triangle of soil coloration. The triangle rests on the assumption that soil humus colours the soil in grey and dark-grey tones and iron compounds colour it in brown, reddish and yellowish tones, while many soil components (silicon dioxide, quartz, carbonates and calcium sulphates) are white. Light-colour information patterns were obtained as a set of brightness coefficients of the form:

$$R = I / I_0, \tag{5}$$

where I and I<sub>0</sub> are the light intensities reflected from the controlled soil sample and from a standard white surface, respectively.

At the same time, the set of brightness coefficients in the soil humus-accumulative horizon defines its light-colour information pattern (Fig. 4). Histograms of the soil particle size distribution and the soil microstructure recorded by scanning electron microscopy supplement the soil information pattern. We developed special software for visualizing the reflection indexes of the optical radiation, for preprocessing and for data transmission [3, 5].

Fig. 4. Soil information patterns in the form of the triangle of soil coloration: BL – black; W – white; R – red; reflected light: V – violet; B – blue; G – green; Y – yellow; O – orange; R – red; IR – infrared radiation

Fig. 5. Correlation of soil information patterns with the soil composition and moisture

The following conclusions result from the experimental studies of the developed intelligent multisensory system "CDOT" [1-5]:

- there is a fairly strong correlation between the organic content of soil and its moisture, because moisture is generally retained in the organic components of soil, whereas the mineral components do not absorb water (Fig. 5a);
- water changes the reflection, and light scattering by soil particles increases especially strongly in the visible spectrum, so the brightness coefficient falls slowly, but the soil becomes darker as the water content increases (Fig. 5b);
- the ferric oxide content of soil considerably influences the reflection coefficients, so that absorption with minimum energy occurs in the range of 570-660 nm, but the absorption effect increases if the organic content is more than 2 % (Fig. 6a,b);


where 1 ≥ α ≥ 0 – coefficient characterizing the dosing performance degradation because of the reduction of the flow area. A value of the coefficient P1 is taken into consideration on

Nomographic charts in the form of the S0-V0 relation for different values of granules sizes δ<sup>0</sup> and the coefficient α were calculated for the metering device developed by us. The given dependences have a linear character for relatively low values α and δ0, but these ones take the nonlinear form for α > 0,5 and δ0 > 0,5 mm especially in and around small values of the sectional area of the metering hole. The increase in α and δ0 requires rising in flow velocity

Every soil information pattern is characterized by inhomogeneous agrochemical and agrophysical values. We investigated soil multicomponent information patterns using soil reference patterns with contrast colour tones in accordance with a triangle of the soil coloration. This one is produced from the assumption that soil humus colours in grey and dark-grey tones, iron compounds – in brown, reddish, yellowish ones, but many soil components (silicon dioxide, quartz, carbonates, and calcium sulphates) have a white colour. Light-colour information patterns were obtained as a set of values of brightness

where I ,I0 – light intensity reflected from a soil controlled sample and a standard white

At the same time, a set of brightness coefficients in the soil humus-accumulative horizon defines its information light-colour pattern (Fig. 4). Histograms of a size distribution of soil particles and the soil microstructure registered by a method of scanning electron microscopy supplement a soil information pattern. We developed a special software for the data visualization of reflection indexes of the optical radiation, preprocessing, the data

The following conclusions result from undertaken experimental studies of the developed

 reflection coefficients increase in the examined broadband wavelength range if the irradiation intensity goes up especially fast when the wavelength rises, but soil is

 the more soil fine particles, the higher the reflection coefficient which exponentially increases when sizes of soil particles reduce from 2500 μm to 25 μm, so large particles

reflect less energy of the optical radiation because of a long space between ones; there are significant changes of the organics content for the mixture with light soil, and there are especially more significant differences of information patterns in the range of

α ≥ 6·10-3·ξ , (4)

R = I / I0, (5)

conditions that P1> 0,025·S0/δ0 and for the considered dosing unit:

of the dosed material to attain the same performance as for α=0.

**2.3 Recognition of soil light-colour information patterns** 

coefficients in this form:

surface, respectively.

transmission [3, 5].

lighter;

intelligent multisensory system "CDOT" [1-5]:

620-660 nm in contrast to the one of 460-505 nm;

where ξ=D0/δ0, D0 – diameter of the metering hole.


Fig. 4. Soil information patterns in the form of the triangle of the soil coloration: BL-black; W-white; R-red; reflected light: V-violet; B-blue; G-green; Y-yellow; O-orange; R-red; IR-infrared radiation

Fig. 5. Correlation of soil information patterns with the soil composition and moisture

- the ferric oxide content in soil considerably influences the reflection coefficients, so that absorption with minimum energy occurs in the range of 570-660 nm, and the absorption effect grows if the organics content exceeds 2 % (Fig. 6a,b);


Fig. 6. Correlation of the organics content in soil and its reflection of the optical radiation: (a) well structured soil; (b) soil with a high content of sand

- using the self-learning intelligent system "ISSE", it is possible to determine the phosphorus and potassium content in soil, which varies in direct proportion to the reflection coefficient; verification of the estimated model against the experimental data of the network outputs shows a strong linear dependence.

Predictive models and specially developed evaluation indicators based on indexes of the soil physical state were used to recognize soil information patterns and to compare them with reference patterns in precision agriculture. The neural network algorithms with genetic optimization that we used then make it possible to detect a set of basis information patterns of soil. These characterize not only the individual soil state (Fig. 7) but also its agrophysical state in general, which helps to raise the crop yield, the quality and biosafety of crops and foods, and the information-microbial state of soil.

Fig. 7. Soil information patterns using "CDOT"

Sensory information processing and the control of agricultural operations in the intelligent system "CDOT" for precision agriculture are based on the self-learning ability of expert systems, e.g., by means of neural network modelling. The recognition of multiparameter information patterns generated by transforming the output data of the sensory modules occurs on the first network level. The experimental studies of sensory pattern recognition were carried out using "CDOT" for the presented light-colour technology of soil control, which underlies the operation of the neural networks on the first level. A complex control parameter for the technological production process of an agricultural field is formed on the second level. The neural network on the third level makes it possible to predict the value in a spot of the field, based on changes of the generalized parameter, for the point in time when the processing machine with its actuator reaches that spot.

To obtain reference colour patterns, a special palette of 10×10 colour cells was developed: the primary printing colours of the standard CMYK (C - cyan, M - magenta, Y - yellow, K - black) system are placed in the corner cells, and all the other colour tones are obtained by mixing the primary colours. The advantages of this model of reference colour patterns are the precise identification of palette colours and, correspondingly, of soil colour tones, as well as its applicability to colour matching systems, e.g., Pantone (R). Surfaces of reflection coefficients for every colour of the optical radiation are produced using the developed palette (Fig. 8) [1, 5]. The minimum Euclidean distance is chosen as the decision rule for the nearest reference pattern (soil colour) in accordance with the soil reflection coefficients registered by the sensory system "ISSE", and the soil evaluation information is stored in the database of the intelligent system "CDOT".

Fig. 8. Sensory modules and neural networks (NN) in precision agriculture with dependences of reflection coefficients for different wavelengths on the reference colour: 1 – dark-grey soil sample; 2 – light-grey one
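The minimum-distance decision rule mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the reference names, the three-channel reflection coefficients, and the function `nearest_reference` are hypothetical and do not reproduce the "ISSE" or "CDOT" implementation.

```python
import math


def nearest_reference(measured, references):
    """Return the name of the reference pattern with the minimum
    Euclidean distance to the measured reflection coefficients."""
    return min(references, key=lambda name: math.dist(measured, references[name]))


# Hypothetical reference patterns: reflection coefficients at three wavelengths.
references = {
    "dark-grey": (0.10, 0.12, 0.15),
    "light-grey": (0.35, 0.40, 0.45),
    "brown": (0.15, 0.22, 0.35),
}

# Hypothetical measurement of one soil sample.
label = nearest_reference((0.32, 0.41, 0.47), references)
print(label)
```

With more spectral channels the same rule applies unchanged, since `math.dist` works for points of any equal dimension.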

#### **2.4 Electronic virtual maps in precision agriculture**

Having generated soil light-colour information patterns, the intelligent microsensory system "CDOT" can produce, e.g., electronic virtual maps of the fertility level of soil spots in some spectral ranges, including soil electronic maps of the organics content, moisture, temperature, granulometric composition, and colour. Electronic maps of the mineral fertilizer distribution on fields, or virtual maps of the planned crop yield that use imaging data to estimate growth conditions and cropping, are realized by the dosed application of fertilizers to soil [1, 3, 5]. The optimal strategy of agricultural production can be reached quickly by overlaying electronic virtual maps and by using current information about tillage, nutrient carry-over from soil with harvested crops, and the characteristics of the agricultural units in use. It then becomes possible to control the operations of the agricultural machinery and to keep track of how much fuel is consumed or whether fertilizers are applied.

To produce electronic maps, we used the point kriging method, which estimates the distributed random function at an arbitrary point as a linear combination of its values at the initial points. A variogram defines the form of the optimal interpolated hypersurface in the space between the reference spots of the sensory control. According to the kriging method, the estimated value of the soil quality in a given spot p is calculated from a set of k neighbouring spots as the weighted mean of the values measured in those spots:

$$\nu_p = \sum_{i=1}^{k} W_i \cdot \nu_i, \tag{6}$$

where Wi – weighting coefficient of the soil quality index νi of the i-th neighbouring spot in relation to the estimated spot p.

The kriging method requires solving the set of equations:

$$\begin{aligned} \sum_{j=1}^{k} W_j \cdot \gamma(\xi_{ij}) + \lambda &= \gamma(\xi_{ip}), \\ \sum_{i=1}^{k} W_i &= 1, \end{aligned} \tag{7}$$

where γ(ξij), γ(ξip) – semivariogram values for the distances ξij and ξip between the point i and the points j and p, respectively, i = 1, …, k; λ – Lagrange factor.
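To make the use of formulas (6) and (7) concrete, the following minimal sketch assembles the kriging system for k measured spots, solves it for the weights Wi and the Lagrange factor λ, and evaluates the estimate. It is an illustration only: the linear semivariogram `linear_gamma`, the coordinates, and the soil quality values are assumptions, and the plain Gaussian elimination stands in for whatever solver the "ISIDP" software actually uses.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular kriging system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def krige(points, values, target, gamma):
    """Estimate the value at `target` from measured `points` via (6) and (7)."""
    k = len(points)
    # Rows i = 1..k of (7): sum_j W_j * gamma(xi_ij) + lambda = gamma(xi_ip)
    a = [[gamma(math.dist(points[i], points[j])) for j in range(k)] + [1.0]
         for i in range(k)]
    b = [gamma(math.dist(points[i], target)) for i in range(k)]
    # Last row of (7): the weights sum to one.
    a.append([1.0] * k + [0.0])
    b.append(1.0)
    weights = solve(a, b)[:k]  # the final unknown is the Lagrange factor
    estimate = sum(w * v for w, v in zip(weights, values))  # formula (6)
    return estimate, weights


def linear_gamma(h):
    """Hypothetical linear semivariogram model, gamma(h) = h."""
    return h


# Hypothetical spots at the corners of a unit square with measured soil quality.
spots = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
quality = [2.0, 3.0, 4.0, 5.0]
estimate, weights = krige(spots, quality, (0.5, 0.5), linear_gamma)
print(estimate, weights)
```

For the centre of the square the four weights are equal by symmetry, so the estimate is the plain average of the measured values; at a measured spot the estimate reproduces the measurement.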

The unknown weighting coefficients Wi are computed by solving the set of equations (7), and the value of the controlled variable in the spot p is then calculated using formula (6). On the boundary between spots with different agricultural backgrounds in precision agriculture, the semivariogram changes sharply; therefore, the mathematical model under consideration exhibits the nugget effect. When the soil quality in an agricultural spot q is estimated from k1, k2, …, km controlled values in the m corresponding agricultural backgrounds, the set of kriging equations for models with the nugget effect can be presented as:

$$\begin{aligned} \beta_{1q} \cdot \sum_{j=1}^{k_1} W_{1j} \cdot \gamma(\xi_{ij}) + \beta_{2q} \cdot \sum_{j=1}^{k_2} W_{2j} \cdot \gamma(\xi_{ij}) + \dots + \beta_{mq} \cdot \sum_{j=1}^{k_m} W_{mj} \cdot \gamma(\xi_{ij}) + \lambda &= \gamma(\xi_{ip}), \\ \beta_{1q} \cdot \sum_{i=1}^{k_1} W_{1i} + \beta_{2q} \cdot \sum_{i=1}^{k_2} W_{2i} + \dots + \beta_{mq} \cdot \sum_{i=1}^{k_m} W_{mi} &= 1, \end{aligned} \tag{8}$$


where $\beta_{jq} = \bar{\nu}_j / \bar{\nu}_q$, $j = \overline{1,m}$, $j \neq q$; $\bar{\nu}_j$, $\bar{\nu}_q$ – mean values of the soil quality in agricultural spots j, q determining a semivariogram jump ξjq on their area boundary: $\xi_{jq} = (\bar{\nu}_j - \bar{\nu}_q)^2 / 2$.

The main advantage of modelling on the basis of the nugget effect is that it remains applicable even when experimental points are scarce, e.g., because the investigated spot of the agricultural field is small. Fig. 9 shows the process of generating soil electronic maps with the developed software "ISIDP" for processing the data of the sensory soil control and implementing precision agriculture. The semivariance of distances is determined in accordance with the accepted modelling algorithm, and the developed application performs interpolated curve fitting by nearest-neighbour approximation and by bilinear, bicubic, and cubic splines in order to generate isoline maps of the distributed initial data. If the initial data are grid-point data, they can be presented in the form of a matrix. Every contour curve of the initial data is visualized after the matrix of distributed values has been entered (Fig. 9a,b) [5]. To improve the visual perception of the electronic map, the spots are filled with colour according to the chosen colour legend (Fig. 9c). The isolines obtained at this stage are precise only at nodal points, so bivariate data interpolation is used (Fig. 9d).

Fig. 9. Modelling soil electronic maps
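Of the interpolation options listed above, the bilinear one is the simplest to illustrate on grid-point data stored as a matrix. The sketch below is a generic textbook version, not the "ISIDP" code; the function name and the humus values are hypothetical.

```python
def bilinear(grid, x, y):
    """Interpolate a matrix of nodal values at fractional column x and row y.

    grid[row][col] holds the value measured at the node (col, row);
    the grid needs at least two rows and two columns.
    """
    rows, cols = len(grid), len(grid[0])
    if not (0 <= x <= cols - 1 and 0 <= y <= rows - 1):
        raise ValueError("point lies outside the grid")
    # Lower-left node of the cell containing (x, y); clamp on the far edges.
    c0 = min(int(x), cols - 2)
    r0 = min(int(y), rows - 2)
    tx, ty = x - c0, y - r0
    # Interpolate along x on both cell edges, then along y between them.
    near = grid[r0][c0] * (1 - tx) + grid[r0][c0 + 1] * tx
    far = grid[r0 + 1][c0] * (1 - tx) + grid[r0 + 1][c0 + 1] * tx
    return near * (1 - ty) + far * ty


# Hypothetical 3x3 matrix of humus content (%) at the nodes of a sampling grid.
humus = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
]
value = bilinear(humus, 0.5, 1.5)
print(value)
```

At the nodes the function returns the stored values exactly, which matches the remark that the isolines are precise at nodal points; between nodes the result is only as good as the bilinear assumption.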

The developed microsensory system "ISED" can be used by farm enterprises, individual entrepreneurs, and agricultural holdings, letting users know exactly where fertilizers have to be applied and which crops should be grown in a given spot. "ISED" includes a multichannel sensor for the detection of organic substances in soil, a satellite navigation receiver, a data processing and logging controller, and special software for it (Fig. 10). "ISED" can send information automatically to a home computer or to the mobile devices (smartphone, communicator, iPad, etc.) of farmers, and satellite positioning allows "ISED" to be applied not to hectares but to plots of a few hundred square metres with an accuracy of 5 cm.


Fig. 10. (a) Structure chart of the laboratory portable multisensory system "ISED" and its design (b) for farming and individual entrepreneurs

#### **2.5 Electronic intellect-maps for the maintenance of the human health and biosafety**

The top priorities of society in the XXI century are the maximal prolongation of life and the continuous maintenance of human activity. The object of research for intelligent systems in precision agriculture aimed at personal and social biosafety is the information patterns of farm crops and of the foods produced from them. The biochemical composition of food products is generally determined by genetic features, cultivation conditions, soil contamination, and the tillage technology used during agrotechnical operations, but also by the quality of animal feed crops, the intensity of fertilizer application to soil, radiation levels, the ecological state of the environment, etc. However, fertilizers introduced into soil to raise the crop yield contain many toxic chemical substances, which can accumulate over time in plant and animal foods, cause the development of dangerous diseases and the spread of infectious ones, and thus endanger human health. Organic microelements are distributed nonuniformly in soil and accumulate in separate spots, forming regions with active microbial communities. The number of microbes in soil determines the synthesis of high-molecular compounds and the storage of nutrients in soil, as well as the productive capacity of soil, increases in productivity, information-microbial maps, etc. The intelligent system "ISMP" that we developed makes it possible to generate electronic microbial maps of soil for intelligent precision agriculture and for maintaining personal and social biosafety (Fig. 11).


Fig. 11. Microbial maps of soil using the intelligent system "ISMP"

The growth in the number of microbial communities and their vitality in soil is determined chiefly by the humus level, the pH value, and the distance from pollution sources. Soil that is not worked or used for agriculture has a natural microbiological biosphere. The active use of pesticides in precision agriculture reduces specific microbial communities within the next few years (Fig. 12). Pesticide application also promotes the accumulation of toxic and dangerous substances in cultivated plants and in animal and human organisms. Intelligent systems are therefore needed for the protection of human health and for the control of the microbial biosafety of consumed foods.

The developed intelligent system "ISLB" is intended for the control of personal and social biosafety and for the prevention of long-term general toxic influences on the human organism, e.g., of allergic, mutagenic, teratogenic or carcinogenic factors. Even very small amounts of toxins, at concentrations below the adopted biosafety standard, are enough to cause nonspecific changes in the human biosystem. It is therefore necessary to use the intelligent system "ISLB" for generating electronic intellect-maps of the biosafety of farm crops as well as virtual information-microbial maps of soil and food maps.

Fig. 12. (a) Changes in the number of microbial communities over several years. (b) Information patterns of soil when agrotechnical methods are carried out


Sod-podzolic soils predominate among agricultural soils in the Republic of Belarus. Effective fertilizer application is possible only on the basis of field information patterns together with an analysis of agrochemical data and soil acidity. Some results of the recognition of soil information patterns in the Republic of Belarus are shown in Fig. 13. A high humus concentration determines the productive capacity of soil, an increased number of microbes, and their enhanced vitality.

Fig. 13. (a) Information patterns of different types of soils using "ISLB". (b) Presented sensory patterns of sod-podzolic soils for main regions of the Republic of Belarus


Fig. 14. Intelligent system in the wristwatch or the smartphone for non-invasive measuring

Fig. 15. Non-invasive LED analysis of blood information patterns for the hungry man and the sated one at rest and after physical activity

#### **3. LED technology for the analysis of biological fluids**

#### **3.1 Optical microtomography for the pattern recognition of biomatters**

Human biological fluids (blood, saliva, sweat, urine, tears, etc.) are very sensitive to any external influences, and their information sensory pattern can be generated by means of our intelligent system "ISLB" with wireless mobile retransmitters. Using the received sensory data on the information patterns of biomatters, it is possible to produce electronic virtual maps and intellect-maps of the quality of food products or of environmental conditions for the maintenance of human health and of personal and social biosafety. "ISLB" can determine the spectral-response characteristics of biomatters, e.g., absorption, reflection, and polarization factors and changes in the intensity, phase, and amplitude of an electromagnetic wave, in the broadband frequency range of 10<sup>11</sup>-10<sup>15</sup> Hz. "ISLB" is suited to individual use, e.g., in wristwatches, watch and mobile phones, smartphones, communicators, iPads, and PDAs with embedded software, for the continuous maintenance and monitoring of human health, the prolongation of life, and the improvement of vital activity [6]. An important advantage of "ISLB" is the fast recognition of information patterns of biomatters, so it needs neither special operating conditions nor a costly remote laboratory.
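To relate the stated frequency band to the wavelengths quoted elsewhere in the chapter, the limits can be converted with λ = c / f. The short sketch below does this for the two band edges; the function name is hypothetical, and the vacuum speed of light is used, so wavelengths inside tissue would be shorter.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_m(frequency_hz):
    """Return the vacuum wavelength in metres for a frequency in hertz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return C / frequency_hz


low_edge = wavelength_m(1e11)   # about 3 mm
high_edge = wavelength_m(1e15)  # about 300 nm
print(f"{low_edge * 1e3:.2f} mm, {high_edge * 1e9:.0f} nm")
```

The band therefore spans roughly 3 mm down to 300 nm, i.e., from millimetre waves through the infrared and visible ranges into the near ultraviolet.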

#### **3.2 Blood**

When a mobile device (smartphone, communicator, iPad, PDA, wristwatch, watch or mobile phone, etc.) with the microsensory system "ISLB" is used, the electromagnetic wave emitted by the micro light-emitting diode falls on the surface of the human skin, where it is absorbed, scattered, and reflected (Fig. 14) [1, 6].

The absorption of the radiation arises from the interaction of photons with different chromophores, whereas scattering is caused by changes of the reflection coefficient. The consumption of poor-quality crops, natural foods, or food products disturbs the human biosystem, the functioning of its separate organs, and its biochemical processes. The biochemical and spectral characteristics of blood change considerably and individually depending on the cognitive and functional states of the human organism [6, 7]. The intelligent system "ISLB" with the developed software makes it possible to maintain personal and social biosafety and human health in real time. It is known that the hormone ghrelin is produced in the stomach of a hungry man; it is at its maximum before eating and then decreases gradually during a meal. The satiety hormone PYY3-36, which acts on the hypothalamus, is at its highest point after eating and then decreases slowly over the following hours [7]. The lipid and carbohydrate composition of blood varies because of the absorption of nutrients from food after a meal. An increase in the blood glucose concentration during eating causes neurons with sensing membrane channels to stop sending signals and producing the hormone orexin, which keeps the human organism awake, eating moderately, and learning quickly. This explains the essential differences between the information patterns of a hungry and a sated man (Fig. 15), the excessive somnolence after a meal, and the risk-taking behaviour of a hungry man. In this case, the changes in the levels of leukocytes, glucose, and total protein are the most clearly defined (Table 2).


Fig. 14. Intelligent system in the wristwatch or the smartphone for non-invasive measuring

Fig. 15. Non-invasive LED analysis of blood information patterns for the hungry man and the sated one at rest and after physical activity


Fig. 16. Non-invasive LED analysis of blood information patterns of the right and left human hands of young men at a rest state and after making twenty handclaps

Fig. 17. Non-invasive LED analysis of blood information patterns of the right and left human hands at a rest state and after stamping during 10 sec

| Blood components | Healthy man (norm) | Hungry man | Sated man |
|---|---|---|---|
| leukocytes, ·10⁹/l | 4-9 | 4.3-11.3 | 5.8-13.4 |
| glucose, mmol/l | 3.3-5.5 | 2.4-4.3 | 4.3-5.7 |
| total protein, g/l | 65-85 | 56-74 | 68-80 |

Table 2. Some of the most variable parameters of human blood during everyday life

Intensive glycolysis in human blood and the formation of adenosine triphosphate (ATP) take place during physical activity, so a man does not feel hunger, danger or strong mental agitation. Short-term physical activity raises the blood glucose level because glycogen mobilization is amplified, whereas physical activity over a long period of time leads to a low glucose content in blood [7]. In subjects who do not practise sports, physical activity can increase insulin activity after eating and reduce the blood glucose level. The level of lactic acid rises from 1.1-1.5 mmol/l to 5-20 mmol/l, and the level of haemoglobin goes up from 7.5-10 mmol/l to 13-15 mmol/l (Table 3). Strong changes in blood information parameters therefore result from intensive physical activity, emotional states, humoral mechanisms, nutrition and other factors [8].

| Values of blood parameters | Norm | Without physical activity | Short-term physical activity | Long-term physical activity |
|---|---|---|---|---|
| erythrocytes, ·10¹²/l | 4-5 | 4.7 | 4.4 | 4.8 |
| haemoglobin, % | 13.8-18 | 15.5 | 14.3 | 15.7 |
| hematocrit, % | 40-48 | 44.6 | 38.2 | 40.7 |
| reticulocytes, % | 2-10 | 6.7 | 3.6 | 8.1 |
| mean cell haemoglobin in erythrocyte, ·10⁻¹² g | 24-33 | 27.3 | 29.6 | 35.3 |
| mean corpuscular volume, μm³ | 75-95 | 83.2 | 83.7 | 88 |

Table 3. Results of the clinical blood analysis during the human physical activity

The blood information patterns of the right and left hands differ distinctly between rest and the state after clapping one's hands or stamping (Figs. 16, 17).

This is connected with variations in the carbohydrate and protein metabolism of blood, e.g., with an increase in the lactic acid level and a reduction in oxygen metabolism (Table 4). An elevated lactic acid content in blood also occurs in a state of complete fatigue, with unbalanced eating, or with a lack of animal proteins or vitamins in the diet. Handclaps and stamping thus make it possible to improve cognitive and motor skills, relieve stress, improve blood circulation and enhance metabolic processes in the human organism.
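To illustrate how the reference ranges in Table 2 might be used, the sketch below checks which of the three columns (norm, hungry, sated) is consistent with a set of measured values. It is a hypothetical range check, not the recognition method of "ISLB"; the dictionary keys, the function name `matching_states` and the sample values are assumptions made for illustration.

```python
# Reference ranges copied from Table 2 (decimal points instead of commas).
# Units: leukocytes in 10^9/l, glucose in mmol/l, total protein in g/l.
REFERENCE_RANGES = {
    "norm": {"leukocytes": (4.0, 9.0), "glucose": (3.3, 5.5), "total_protein": (65.0, 85.0)},
    "hungry": {"leukocytes": (4.3, 11.3), "glucose": (2.4, 4.3), "total_protein": (56.0, 74.0)},
    "sated": {"leukocytes": (5.8, 13.4), "glucose": (4.3, 5.7), "total_protein": (68.0, 80.0)},
}


def matching_states(sample: dict) -> list:
    """Return the Table 2 columns whose ranges contain every value in the sample."""
    return [
        state
        for state, ranges in REFERENCE_RANGES.items()
        if all(low <= sample[name] <= high for name, (low, high) in ranges.items())
    ]


# Hypothetical measurements, chosen only to exercise the check.
sample_a = {"leukocytes": 10.0, "glucose": 3.0, "total_protein": 60.0}
sample_b = {"leukocytes": 7.0, "glucose": 5.0, "total_protein": 70.0}

print(matching_states(sample_a))
print(matching_states(sample_b))
```

Because the ranges overlap, a sample can match more than one column, which is one reason a single parameter is not enough to separate the states.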



| Values of blood parameters | Norm | At rest | After clapping one's hands and stamping |
|---|---|---|---|
| lactic acid, mmol/l | 0.35-0.78 | 0.75 | 0.8 |
| glucose, mmol/l | 3.3-5.5 | 5.6 | 5.4 |
| creatine, mg/l | 1-4 | 3.1 | 3.3 |
| residual nitrogen, mmol/l | 14-28 | 25 | 27 |
| blood urea, mmol/l | 2.5-8.3 | 6.5 | 6.8 |
| creatinine, mmol/l | 0.09-0.17 | 0.11 | 0.12 |
| indican, mmol/l | 0.7-5.4 | 4.1 | 4.2 |
| total lipids, g/l | 3.5-8 | 4.3 | 4.5 |

Table 4. Results of the biochemical blood analysis before/after clapping and stamping

#### **3.3 Saliva**

The intelligent system "ISLB" can also analyse highly informative saliva patterns for monitoring food, soil and human biosafety. The simplicity of saliva sampling makes it possible to monitor human health and biosafety in real time, e.g., to recognize physical and functional states of the human organism. Figure 18 shows a structure chart of the basic saliva components relevant to intelligent monitoring systems.

Fig. 18. Saliva structural pattern of a man

The sensory information pattern of saliva changes under the influence of different physical activities and also depends on satiety (saliva of a hungry man versus a sated one) [8, 9]. In addition, the saliva pattern changes considerably over the course of the day and is determined by the intensity of physical activity, as Figure 19 shows.

Fig. 19. Non-invasive LED analysis of saliva information patterns during the day

Enzymes of the serous secretion of the salivary glands suppress microflora, which determines the antimicrobial function of the coating formed by the saliva of a hungry man. The saliva pH of a sated man (pH 8.5 after breakfast) greatly exceeds that of a hungry one (pH 6.5-6.8 on awakening, pH 7 before a meal), and the difference is especially distinct after a carbohydrate meal because of the acid-producing activity of the oral microflora, which changes the structural properties of saliva [9]. The information pattern obtained after brushing the teeth with toothpaste (pH 9.4) is distinctly different from those for the other sampling conditions and indicates that immune restoration is impossible as a result of carbohydrate food intake (Fig. 20).

The structural properties of saliva are impaired, and the continued use of such toothpastes will therefore degrade the biochemical patterns of saliva. Toothpastes whose pH is close to that of the initial saliva pattern, with a normal pH of about 6-7.5, promote the recovery of the structural properties of saliva. Not only food but also physical fatigue (pH 5.5) changes saliva information patterns. At the same time, saliva acidity is genetically individual and varies with the composition of the nutrients consumed. Nervous excitement and mental or emotional strain also affect saliva information patterns: the protein level in human saliva rises to as much as 5 mg/ml, whereas it does not exceed 2 mg/ml at rest.
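Because pH is a logarithmic scale (pH = -log₁₀[H⁺]), the pH values quoted above correspond to large differences in hydrogen ion concentration. The sketch below makes that arithmetic explicit, using the pre-meal value of pH 7 as the baseline; the function name `hydrogen_ion_ratio` and the dictionary labels are illustrative assumptions.

```python
def hydrogen_ion_ratio(ph_a: float, ph_b: float) -> float:
    """Return how many times higher [H+] is at ph_a than at ph_b."""
    return 10.0 ** (ph_b - ph_a)


# Saliva pH values quoted in the text.
saliva_ph = {
    "before a meal": 7.0,
    "after breakfast": 8.5,
    "after toothpaste": 9.4,
    "physical fatigue": 5.5,
}

baseline = saliva_ph["before a meal"]
for condition, ph in saliva_ph.items():
    ratio = hydrogen_ion_ratio(ph, baseline)
    print(f"{condition}: pH {ph}, [H+] is {ratio:.3g} times the pre-meal level")
```

A shift of 1.5 pH units, as between the pre-meal and fatigue values, therefore means a change in hydrogen ion concentration by a factor of about 30.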

Physical activity entails increased consumption of adenosine triphosphate in the muscles, a strong oxygen demand of the human organism and an increase in lactic acid. Glycogen is consumed mainly at the beginning of physical work, and its consumption by the organism decreases during continuous work. Salivary protein and enzymatic components therefore characterize the functional state of a person during physical activity. The number of antibodies decreases depending on the physical activity; e.g., the secretion of immunoglobulin A (IgA) is reduced for a long period of time, especially after coffee and alcohol. Simultaneous exposure to different ecological factors is known to have profound direct and indirect effects on the human organism.

Fig. 20. Non-invasive LED analysis of saliva information patterns cleaning teeth

The geomagnetic factor, i.e. the variability of the Earth's magnetic field caused by increased solar activity, has a particularly strong impact on human health [10]. Solar variability changes emotional and functional states and leads to chronic diseases of the nervous, circulatory and respiratory systems. Under the influence of solar radiation, the metal content (K, Mg, P, Pb, Cu and Zn) in the saliva of men and women increases significantly, while the concentration of Na decreases (Fig. 21) [10]. This means that solar radiation received during sunbathing can change the information patterns of saliva and of other human biomatters (blood, sweat, urine, tears, etc.).


Fig. 21. Metal content in human saliva for low and disturbed geomagnetic activities

We have developed a self-learning intelligent system "ISCR" for monitoring and recognizing a carcinoma in a broadband spectral range; it makes it possible to predict the disturbances that a carcinoma causes in the human organism with a forecast precision of about 80 % (Fig. 22) [8].

Fig. 22. Information patterns of a carcinoma using non-invasive LED eye with "ISCR"

At the same time, intelligent systems equipped with "ISCR" can transfer information to users' mobile devices (mobile and watch phones, smartphones, communicators, wristwatches, PDAs, iPads, etc.), after which the data are processed to produce an electronic information map of diseases for an individual subject. The use of mobile systems with "ISCR" therefore enables non-invasive monitoring of human personal and social activities (Fig. 23).

Fig. 23. Non-invasive recognition of information patterns of cells using the smartphone with LED eye

#### **3.4 Sweat**

Interest in sweat monitoring is increasing because sweat collection is convenient and non-invasive in comparison with traditional specimens (blood, urine, tears, etc.). The chemical composition of sweat and the ratio of its individual components depend on body perspiration (Table 5), the metabolic intensity, and the health and the emotional and functional states of the person (Fig. 24) [11].

| Values of sweat parameters | Sweat pattern before taking a shower | Sweat pattern after taking a shower |
|---|---|---|
| total protein, g/l | 0.8 | 0.38 |
| albumins, g/l | 0.41 | 0.11 |
| urea, mmol/l | 16.3 | 16.5 |
| creatinine, μmol/l | 52.5 | 35.9 |
| ammonia, mmol/l | 20.6 | 19.4 |
| amino acids, mg/l | 35 | 20.5 |
| glucose, mmol/l | 0.35 | 0.17 |
| pH value | 6.2 | 5.3 |

Table 5. Results of the sweat biochemical analysis in the morning

Fig. 24. Non-invasive LED analysis of sweat information patterns depending on human functional states

The intelligent system "ISLB" can analyse sweat sensory patterns to recognize harmful and dangerous substances in the human organism. Table 6 compares fluid parameters for sweat, tap water and filtered water, which clarifies the information patterns presented in Figure 24. This makes it possible to use a sweat pattern for real-time monitoring of human emotions and for improving emotional self-regulation. An emotion is an experienced feeling that motivates, regulates and orientates our perception, thinking and activity; it can be super-intellectual and generate new innovative ideas. "ISLB" monitors and is trained on the body's response to various stimuli, and afterwards recognizes from the state of health (wet skin, temperature, etc.) whether a man is in good spirits or depressed. Clammy sweat together with a temperature rise to 38.5 °C is indicative of pre-infarction angina or something worse [8]. The intelligent system "ISLB" is able to predict the state of health and the quality of human life from sweat patterns.

| Fluid parameters | Sweat (norm), mg | Tap water, mg/dm³ | Filtered water, mg/dm³ |
|---|---|---|---|
| pH | 6.2 | 7.3 (pH 7 - pure water) | 7.1 |
| Cu | 0.006 | 0.006 | not defined |
| Mn | 0.006 | 0.1 | 0.1 |
| Ca | 8.7 | 56 | 1 |
| Mg | 2.9 | 20.6 | 1.8 |
| Fe | 0.047 | 0.34 | 0.12 |
| Na | 134 | 120 | 12.72 |
| Cl | 161 | 0.7 | 0.2 |
| K | 39 | 180 | 53 |

Table 6. Comparison of a biochemical information pattern of sweat with some liquids


#### **3.5 Urine**

Human urine is a complex multicomponent biomatter containing organic components. Information patterns of urine describe general functional well-being, since urine passes through the human organism many times. Changes in urine information patterns are connected above all with its pH (Fig. 25). Urine has a low pH (5-6.8) in the morning, becomes neutral two hours after eating and then turns alkaline (pH 7-8.5).

Fig. 25. Non-invasive LED analysis of information patterns of urine during the day

During the day the pH of urine remains at 6.6-6.8. Ketone bodies are produced in the liver of a hungry man or after long-term physical activity and characterize fat oxidation. The urine of a healthy man contains no glucose, but glucose can appear because of excessive carbohydrate intake during a meal or because of physical activity [11, 12].

#### **3.6 Tears**

Lacrimal fluid is a multicomponent secretion in which, e.g., total protein is a general indicator of metabolic disorder. It varies considerably depending on the state of health: total protein averages 5.4 g/l in a healthy man but rises to 7.8 g/l in the case of, e.g., corneal inflammation. The glucose content of tears correlates with its level in blood, so the information carried by lacrimal fluid makes it possible to recognize patterns of emotional and functional states of the human organism. Moreover, a person's state and health can be diagnosed from the activity of alpha-amylase in tears, which catalyses the hydrolysis of starch and glycogen [13]. The concentration of amylase in lacrimal fluid is 4 times higher than in blood. Amylase activity in the tears of healthy men lies in the range of 130-250 unit/l, whereas a value above 300 unit/l indicates, e.g., acute pancreatitis [14].
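The two amylase thresholds quoted above can be expressed as a simple rule. The sketch below is illustrative only and is not a diagnostic procedure; the function name `classify_tear_amylase` and the wording of the returned labels are assumptions, and the text does not say how values below 130 unit/l or between 250 and 300 unit/l should be interpreted.

```python
HEALTHY_RANGE = (130.0, 250.0)  # unit/l, range for healthy men quoted in the text
ALERT_THRESHOLD = 300.0         # unit/l, level above which the text mentions acute pancreatitis


def classify_tear_amylase(activity_unit_per_l: float) -> str:
    """Classify tear amylase activity against the thresholds quoted in the text."""
    if activity_unit_per_l < 0:
        raise ValueError("activity cannot be negative")
    low, high = HEALTHY_RANGE
    if low <= activity_unit_per_l <= high:
        return "within healthy range"
    if activity_unit_per_l > ALERT_THRESHOLD:
        return "above alert threshold"
    # The source gives no interpretation for these values.
    return "outside healthy range, not interpreted in the text"


for value in (180.0, 270.0, 320.0):
    print(value, "->", classify_tear_amylase(value))
```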

#### **4. Intelligent systems in technology of biosafety**


#### **4.1 Intelligent system with virtual broadband polarized "electronic eye"**

We have developed an intelligent mobile hardware and software microsensory system "ISPB" with a broadband polarized "electronic eye" to recognize information patterns of, e.g., biomatters, soil and food products. The sensor for measuring light polarization consists of an emitting module and a virtual polarizer with self-learning software. If polarized light penetrates, e.g., a human biomatter (blood, saliva, sweat, urine, tears), the plane of polarization is rotated through an angle that depends on the concentration of the individual components of the biological fluid. The refractive index of blood components depends strongly on the polarizability of protein structures in particular. Virtual polarization also makes it possible to detect polluting and foreign surface layers on the investigated matter and to produce information patterns of objects, e.g., of food products for personal and social biosafety (Fig. 26a), and of the human functional state. Fig. 26b,c,d,e shows the average reflection factors obtained for scattered and polarized light for the investigated food products (soda, salt, water, milk) with the plane of polarization rotated by 0°, 30°, 60° and 90° relative to the plane of light incidence. If there are surfaces whose refractive indices differ from that of the investigated matter, part of the light is reflected back, while the rest passes through the matter, is partially reflected within it and then emerges again. The reflection coefficient therefore differs from the reflection factor of the investigated matter. Thus, the intelligent system "ISPB" makes it possible to determine refractive indices and reflection coefficients for any angle of incidence and any degree of polarization, and to study and control the quality of matters containing impurities or covered with foreign or polluting surface films. "ISPB" is especially important for applications in biotechnology, ecology, the food industry and precision agriculture, and for personal and social biosafety [1, 2, 8].

Fig. 26. Intelligent system "ISPB" with the virtual polarized "electronic eye" for the calculation of reflection coefficients of polarized light and forming information patterns
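The dependence of the reflection coefficient on the angle of incidence and on the orientation of the plane of polarization can be illustrated with the standard Fresnel equations for a single smooth interface. The following Python sketch is not the "ISPB" software: the function names, the refractive index of 1.33 (roughly that of water) and the 45° angle of incidence are assumptions chosen only for illustration.

```python
import math


def fresnel_reflectance(n1, n2, theta_i_deg):
    """Return (R_s, R_p), the power reflectances of a smooth interface
    for s- and p-polarized light going from index n1 into index n2."""
    theta_i = math.radians(theta_i_deg)
    sin_t = n1 * math.sin(theta_i) / n2  # Snell's law
    if abs(sin_t) > 1.0:
        return 1.0, 1.0  # total internal reflection
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - sin_t ** 2)
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    r_p = (n2 * cos_i - n1 * cos_t) / (n2 * cos_i + n1 * cos_t)
    return r_s ** 2, r_p ** 2


def reflectance_for_polarization(n1, n2, theta_i_deg, phi_deg):
    """Reflectance for linearly polarized light whose electric field makes
    the angle phi with the plane of incidence (phi = 0 is pure p-polarization)."""
    r_s, r_p = fresnel_reflectance(n1, n2, theta_i_deg)
    phi = math.radians(phi_deg)
    return r_p * math.cos(phi) ** 2 + r_s * math.sin(phi) ** 2


if __name__ == "__main__":
    # Assumed example: air (n = 1.0) over a water-like liquid (n = 1.33).
    for phi in (0, 30, 60, 90):
        r = reflectance_for_polarization(1.0, 1.33, 45.0, phi)
        print(f"polarization angle {phi:2d} deg: R = {r:.4f}")
```

A surface film with a different refractive index changes both reflectances, which is why a measured value that departs from the reference pattern can indicate a foreign or polluting layer.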

#### **4.2 Recognition of food information patterns for the human biosafety**

The developed intelligent system "ISSE" can be used to recognize information patterns of foods in order to maintain and control human health and biosafety. The physical and biochemical properties of foods determine, in particular, their optical reflection and light-scattering coefficients, as well as the quality and biosecurity of the produced food substances. Table 7 lists optical reflection coefficients for light and dark flour. The lighter the flour, the higher its quality, so higher-quality flour reflects more strongly in the visible spectral range. The optical information patterns of different berries and foodstuffs (bread, butter, curds, flour, etc.) presented in Figs. 27 and 28 can be used as reference information for detecting harmful or dangerous toxic components and heavy metals, which accumulate in soil, especially because of the nonuniform application of mineral fertilizers and microbial contamination, and are retained in harvested crops and produced foods [1, 3-5].


| Wavelength | Reflection coefficient, light flour | Reflection coefficient, dark flour |
|---|---|---|
| 405 nm (violet) | 0.18 | 0.16-0.17 |
| 460 nm (blue) | 0.84 | 0.72-0.76 |
| 505, 530 nm (green) | 0.92 | 0.81-0.83 |
| 570 nm (yellow) | 0.78-0.82 | 0.7-0.6 |
| 620 nm (orange) | 0.87-0.82 | 0.81 |
| 660 nm (red) | 0.87-0.93 | 0.85 |
| 405-650 nm (white colour) | 0.77-0.81 | 0.7 |
| 760-2400 nm (infrared light) | 0.89 | 0.86-0.87 |

Table 7. Comparison of reflection coefficients for different kinds of flour

Fig. 27. Noncontact LED e-eye for measuring information patterns of berries

Fig. 28. Noncontact LED e-eye for measuring information patterns of foods
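As a minimal sketch of how the reflection coefficients of Table 7 could serve as reference patterns, the Python fragment below stores the tabulated values, finds the spectral band that separates light and dark flour best, and assigns a measured sample to the nearest reference pattern. Taking the midpoint of each tabulated range and using the Euclidean distance are assumptions made for this illustration; the chapter does not state how "ISSE" performs the comparison.

```python
import math

# Table 7 values as (low, high) ranges; a single value is written as (v, v).
LIGHT_FLOUR = {
    "violet": (0.18, 0.18), "blue": (0.84, 0.84), "green": (0.92, 0.92),
    "yellow": (0.78, 0.82), "orange": (0.87, 0.82), "red": (0.87, 0.93),
    "white": (0.77, 0.81), "infrared": (0.89, 0.89),
}
DARK_FLOUR = {
    "violet": (0.16, 0.17), "blue": (0.72, 0.76), "green": (0.81, 0.83),
    "yellow": (0.7, 0.6), "orange": (0.81, 0.81), "red": (0.85, 0.85),
    "white": (0.7, 0.7), "infrared": (0.86, 0.87),
}


def midpoints(table):
    """Collapse every tabulated range to its midpoint (an assumption)."""
    return {band: (a + b) / 2.0 for band, (a, b) in table.items()}


REFERENCES = {
    "light flour": midpoints(LIGHT_FLOUR),
    "dark flour": midpoints(DARK_FLOUR),
}


def most_discriminative_band():
    """Band with the largest gap between the two reference patterns."""
    light, dark = REFERENCES["light flour"], REFERENCES["dark flour"]
    return max(light, key=lambda band: abs(light[band] - dark[band]))


def classify(sample):
    """Name of the reference pattern closest to the measured sample."""
    def distance(reference):
        return math.sqrt(sum((sample[b] - reference[b]) ** 2 for b in reference))
    return min(REFERENCES, key=lambda name: distance(REFERENCES[name]))


if __name__ == "__main__":
    print("best separating band:", most_discriminative_band())
    measured = {band: value - 0.02 for band, value in REFERENCES["light flour"].items()}
    print("measured sample classified as:", classify(measured))
```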

Current techniques for recognizing sensory information patterns of foods are labour-intensive, costly and extremely time-consuming, and they require highly skilled specialists. The intelligent microsensory system "ISLB" is a unique microlaboratory on a chip of the e-eye type. It is intended for solving important practical problems in intelligent precision agriculture, e.g., for monitoring cultivated crops (maize) and generating their sensory information patterns (Fig. 29).

Fig. 29. Information patterns of maize using "ISLB"

A distinctive feature of "ISLB" is its capability to work with satellite navigation monitoring technology; "ISLB" can therefore serve as a mobile retransmitter and be applied locally on unmanned vehicles or satellite mobile devices (Fig. 30) [8].

Fig. 30. (a) Design of the intelligent system "ISLB" with (b) the LED optical technology

After the information patterns of biomatters (blood, saliva, sweat, urine, tears, etc.), food products or soil have been registered, a data packet is generated, encrypted (cryptographically transformed) and antinoise encoded for transfer to a server (Fig. 31). The transmission is carried out through a socket defined in the client software or on the server. On the server, the received data packet is decoded and decrypted, after which data preprocessing and self-learning of the intelligent system, based on expert judgements and previously obtained results, are performed. A high learning rate is achieved by the intelligent self-learning software "ISLB", which is based on multicore programming algorithms. Prediction results are transferred to the client software (watch phone, mobile phone, smartphone, communicator, iPad, PDA, wristwatch, etc.). The screen displays a full electronic intellect-map of the operation of agricultural or farm enterprises during the production steps of precision agriculture, as well as an electronic map of the biosafety of cultivated crops for controlling applied fertilizers, toxic substances in soil or the level of crop yield. Moreover, an electronic virtual map of human health with sensory information patterns of biomatters can be shown on the screen of a smartphone or another portable mobile device. Intelligent client applications built in Visual Studio with .NET Framework 3.5 ensure high adaptability and fast self-learning, the Parallel Extensions library accelerates data handling and self-training depending on the number of available processor cores, and SQL Server 2005 makes the development of Web applications possible.
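The client-to-server path described above can be sketched in Python with a deliberately simple stand-in for the antinoise code. The chapter does not name the code that "ISLB" uses, so the triple-repetition code with a bitwise majority vote below is an assumption, as are all function names; encryption, modulation and the socket transfer itself are left out to keep the example short.

```python
import json
import random


def build_packet(pattern):
    """Serialize a registered information pattern into a byte packet."""
    return json.dumps(pattern, sort_keys=True).encode("utf-8")


def antinoise_encode(data):
    """Triple-repetition code: every byte is transmitted three times."""
    return bytes(copy for byte in data for copy in (byte, byte, byte))


def antinoise_decode(coded):
    """Recover each byte by a bitwise majority vote over its three copies."""
    decoded = bytearray()
    for i in range(0, len(coded), 3):
        a, b, c = coded[i:i + 3]
        decoded.append((a & b) | (a & c) | (b & c))
    return bytes(decoded)


def noisy_channel(coded, n_errors, seed=0):
    """Flip one random bit in n_errors different triples (a toy noise model)."""
    rng = random.Random(seed)
    corrupted = bytearray(coded)
    for triple in rng.sample(range(len(coded) // 3), n_errors):
        position = 3 * triple + rng.randrange(3)
        corrupted[position] ^= 1 << rng.randrange(8)
    return bytes(corrupted)


if __name__ == "__main__":
    pattern = {"sample": "maize", "reflection": [0.84, 0.92, 0.8]}
    sent = antinoise_encode(build_packet(pattern))
    received = noisy_channel(sent, n_errors=5)
    restored = json.loads(antinoise_decode(received))
    print("pattern restored without loss:", restored == pattern)
```

The repetition code triples the packet size; a production system would more likely use a block or convolutional code with lower overhead, but the encode, transmit, decode sequence stays the same.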

On-the-fly computation of information patterns enables effective self-learning of the intelligent system "ISLB" and offers a better opportunity to prevent, at an early stage, human diseases caused by consuming poor food products or crops hazardous to human health that are grown in soils contaminated by dangerous viruses and bacteria. Scientific forecasting of the accumulation of chemical and toxigenic substances that endanger human health in natural plant and animal foods and food products is therefore very important for recognizing offending foods and non-specific changes in the state of a biosystem [8, 12].

Fig. 31. Functional diagram of "ISLB": 1 – registration of information patterns; 2 – sensory information patterns of biomatters (blood, saliva, sweat, urine, tears, etc.), food products, crops or soil; 3 – authentication and identification of biometric patterns; 4 – generation of biometric patterns (micro-nanostructure of investigated matters); 5 – drivers of sensory devices; 6 – sensory data acquisition; 7 – periodicity of the control; 8 – forming the data packet for transfer to the server; 9 – encoding and cryptography of the data packet; 10 – antinoise coding; 11 – modulator; 12 – data transfer using the socket defined in the client software or on the server; 13 – transmission channel; 14 – demodulator; 15 – antinoise decoding; 16 – deciphering the data packet transferred to the server; 17 – expert evaluation (statistical analysis); 18 – data pre-processing; 19 – using optimization criteria to define temporal information patterns on the basis of the minimization of intercluster centroid distances; 20 – database of stored reference patterns; 21 – calculation of distance functions in a multidimensional space; 22 – self-learning of the intelligent system; 23 – multicore parallelization of data processing; 24 – neural networks, genetic algorithms; 25 – calculation of reference bioinformation patterns; 26 – generation of prediction results (statistical probability, error of bioinformation pattern recognition); 27 – intelligent system of decision making for transfer to the client software; 28 – encoding and cryptography of the data packet; 29 – deciphering the data packet transferred to the server; 30 – data transmission of decisions taken by the intelligent system
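Blocks 20 to 22 of Fig. 31 (stored reference patterns, distance functions in a multidimensional space, self-learning) can be illustrated by a nearest-centroid recognizer whose reference patterns are refined with every labelled measurement. This is only a sketch under stated assumptions: the class name, the running-mean update and the Euclidean distance are illustrative choices, not the published "ISLB" algorithm.

```python
import math


class PatternRecognizer:
    """Stores one centroid per class and refines it as new samples arrive."""

    def __init__(self):
        self.centroids = {}  # label -> reference pattern (list of floats)
        self.counts = {}     # label -> number of samples seen so far

    def learn(self, label, pattern):
        """Self-learning step: move the class centroid to the running mean."""
        if label not in self.centroids:
            self.centroids[label] = list(pattern)
            self.counts[label] = 1
            return
        self.counts[label] += 1
        n = self.counts[label]
        centroid = self.centroids[label]
        for i, value in enumerate(pattern):
            centroid[i] += (value - centroid[i]) / n

    def recognize(self, pattern):
        """Return (label, distance) of the closest stored reference pattern."""
        distances = {
            label: math.dist(pattern, centroid)
            for label, centroid in self.centroids.items()
        }
        best = min(distances, key=distances.get)
        return best, distances[best]


if __name__ == "__main__":
    recognizer = PatternRecognizer()
    # Hypothetical two-band reflection measurements.
    recognizer.learn("safe", [0.90, 0.80])
    recognizer.learn("safe", [0.80, 0.90])
    recognizer.learn("contaminated", [0.40, 0.50])
    print(recognizer.recognize([0.80, 0.80]))
```

Because each class is summarized by a single centroid, the update costs the same for every new sample, which suits on-the-fly processing on the server.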

#### **4.3 E-eye with SAW retransmitter for wireless sensory networks**

The hardware and software complex on a chip of the "electronic eye" type that we developed can be technically improved by using surface acoustic wave (SAW) retransmitters with RFID technology for real-time monitoring of individual human biosafety (Fig. 32a,b) [1].

Fig. 32. (a) Wireless active (passive) SAW sensory micro-nanosystem with the LED technology e-eye. (b) Intelligent wristwatch with the RFID technology and LED e-eye. (c) Smartphone with the RFID technology for data processing from the wristwatch

The wireless SAW micro-nanosensory system consists of an antenna, interdigital transducers (IDTs) and a set of reflecting electrodes located on a piezoelectric crystal. When the antenna of the SAW sensor picks up a radio-frequency electromagnetic wave, the received signal is passed to the IDTs, which are plane-parallel electrodes on the substrate surface connected to each other through common buses, and then to a special control unit that starts the optical information readout. The outgoing time-delayed signal of spectral reflection characteristics strengthens an alternating electromagnetic field, which acts on the anisotropic dielectric and, through the inverse piezoelectric effect, gives rise to harmonic mechanical oscillations, stresses and strains in the SAW sensor substrate. Electric charges of opposite sign are produced on the crystal surface, which gives rise to an electric potential between the IDT electrodes and to an electrostatic field. The superposition of the source field and this complementary field yields a field with an elliptically polarized component, which determines the arising acoustic wave. After reflection, the acoustic wave acts on the IDTs and, through the direct piezoelectric effect, redistributes the electric charges between the IDT electrodes and generates an electromagnetic signal. The SAW velocity and frequency change depending on the conditions of the propagation medium. The SAW propagates from the IDTs towards the reflectors and then back to the IDTs, where it is transformed into an electromagnetic signal that the antenna emits to a reader.

Thanks to the low SAW velocity, long time delays can be obtained and echo signals suppressed; the sampling time exceeds 10⁻⁵ s, so the system can be used on moving objects (agricultural machines, portable devices of farmers, etc.). The intelligent system makes it possible to extract the initial signal containing the spectral reflection characteristics of the investigated biomatter. The principal advantages of the developed e-eye system on a chip with wireless passive SAW retransmitters are low power consumption, high reliability, low cost and unlimited operating life. The intelligent LED microsensory system with a wireless active SAW retransmitter additionally has an energy source and semiconductor microcircuits for signal amplification and for increasing the operating distance, but the working life of such devices is shorter. Active SAW retransmitters can be identified by changing the distance between two IDTs. The readout distance of an active SAW retransmitter with a 10 W power source can therefore reach up to 50 km [1, 2, 15].

The main future trend in mobile technologies calls for the development of intelligent sensory systems and networks that can adapt flexibly to different conditions of real-time monitoring of individual human biosafety. If agrotechnical machines and farm enterprises with portable mobile analyzers and mobile devices are equipped with the "CDOT" system developed by us, agriculture will become more precise and economical, and the food biosafety of countries will improve (Fig. 33a,b). Information patterns of soil are presented as electronic and virtual maps, e.g., electronic maps of applied nutrients and organic fertilizers, of the level of crop yield, of soil productivity over recent years, or of the information-microbial state of soil. Human biosafety is better controlled by intelligent systems that use an electronic satellite map of the field and electronic maps of the biosafety of crops and of plant and animal foods, along with electronic intellect-maps for precision agriculture [3, 4, 8].

Fig. 33. (a) Intellect-map and virtual electronic maps for precision agriculture and human biosafety. (b) Intelligent sensory system on a chip e-eye for the pattern recognition of soil
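The role of the low SAW velocity in Section 4.3 can be made concrete with a short calculation of the round-trip delay between the IDTs and a reflector, and with a toy identification of a retransmitter from its measured delays. The SAW velocity of 3500 m/s is an assumed order-of-magnitude value for piezoelectric substrates, and the distances, tag names and tolerance are hypothetical; none of these numbers come from the chapter.

```python
SAW_VELOCITY_M_S = 3500.0         # assumed typical order of magnitude
LIGHT_VELOCITY_M_S = 299_792_458.0


def round_trip_delay(distance_m, velocity_m_s=SAW_VELOCITY_M_S):
    """Time for a wave to travel to a reflector and back, in seconds."""
    return 2.0 * distance_m / velocity_m_s


def identify_tag(measured_delays_s, tag_layouts, tolerance_s=5e-8):
    """Return the tag whose reflector distances explain the measured delays.

    tag_layouts maps a tag name to its list of reflector distances in metres.
    """
    measured = sorted(measured_delays_s)
    for tag, distances in tag_layouts.items():
        expected = sorted(round_trip_delay(d) for d in distances)
        if len(expected) == len(measured) and all(
            abs(m - e) <= tolerance_s for m, e in zip(measured, expected)
        ):
            return tag
    return None


if __name__ == "__main__":
    d = 0.0035  # hypothetical IDT-to-reflector distance of 3.5 mm
    saw = round_trip_delay(d)
    em = round_trip_delay(d, LIGHT_VELOCITY_M_S)
    print(f"SAW delay: {saw * 1e6:.2f} us, electromagnetic delay: {em * 1e12:.1f} ps")

    layouts = {"tag-A": [0.002, 0.003], "tag-B": [0.002, 0.0035]}
    print(identify_tag([1.15e-6, 1.99e-6], layouts))
```

With these assumed values the acoustic echo arrives microseconds after the interrogation pulse, long after the direct electromagnetic echoes from the surroundings have died away, which is the separation the text refers to.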

#### **6. References**

[1] Koleshko V.M. (1979, 1980, 1982, 1983, 1984, 1985, 1987, 1988, 1989, 1992). *Certificate of USSR Authorship for Invention* № 491824, № 677586, № 683475, № 722449, №

#### **4.4 Data security for precision agriculture and biosafety**

To ensure the confidentiality and integrity of data during processing and transfer in wireless self-learning intelligent sensory systems and networks, the sensory systems "CDOT" and "ISLB" use antinoise coding, superencryption with a private key generated from biometric data, and individual cryptographic data protection with an intelligent technique of personal authentication superprotection patented by us. These measures serve to recognize falsification, imitation and unauthorized use of synthetic biometric data by analyzing additional micro-nanoinformation patterns of the investigated biomatters (Fig. 34a,b,c) [1, 3-5, 16, 17].
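The idea of a key generated from biometric data can be sketched with Python's standard `hashlib` and `hmac` modules. The sketch is not the patented superprotection technique: the coarse quantization of the feature vector, the salt, the iteration count and all function names are assumptions, and the derived key is used here only to authenticate a packet rather than to encrypt it.

```python
import hashlib
import hmac


def quantize(features, step=0.05):
    """Map each feature to a coarse integer bin so that small measurement
    noise usually yields the same bytes (a crude, assumed stabilization)."""
    return bytes(int(round(f / step)) & 0xFF for f in features)


def derive_key(features, salt, iterations=100_000):
    """Derive a 32-byte key from quantized biometric features."""
    return hashlib.pbkdf2_hmac("sha256", quantize(features), salt, iterations)


def sign(packet, key):
    """Authentication tag that protects the integrity of a data packet."""
    return hmac.new(key, packet, hashlib.sha256).digest()


def verify(packet, tag, key):
    """Constant-time check that the packet was not altered."""
    return hmac.compare_digest(sign(packet, key), tag)


if __name__ == "__main__":
    salt = b"device-0001"  # hypothetical per-device salt
    enrolled = [0.31, 0.72, 0.18, 0.55]
    remeasured = [0.312, 0.718, 0.181, 0.548]  # same person, slight noise
    key = derive_key(enrolled, salt)
    packet = b"reflection pattern of sample 17"
    tag = sign(packet, key)
    print("accepted with remeasured biometrics:",
          verify(packet, tag, derive_key(remeasured, salt)))
    print("accepted after tampering:", verify(packet + b"!", tag, key))
```

Simple binning fails when a feature lies close to a bin boundary; practical biometric key generation relies on error-tolerant constructions such as fuzzy extractors, which this example does not attempt to reproduce.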

A mobile application with a quick response (QR) encoding technique has been developed on the Android virtual platform for mobile and watch phones, smartphones, communicators, PDAs and iPads to protect information patterns of bioobjects. A photo taken with the scanning device, e.g., with an embedded microcamera, is QR encoded as a whole and stored in a database together with its QR code; in parallel, it is divided into spectral (RGB) components, each of which is QR encoded separately in a transducer. A more precise QR code can be generated by analyzing and comparing the QR-encoded complete pattern with the spectral ones. To read the QR code, the QR codes of the spectral images are compared with the base QR code of the photo to produce the precise image pattern (Fig. 34).

Fig. 34. Android virtual platform with QR encoding of information patterns of bioobjects and the superprotection technology for biometric data: (a) nanostructure of the fingerprint; (b) bivariate and (c) three-dimensional cross-correlation function between the fingerprint and the imaged reference one
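The cross-correlation function between a fingerprint and a reference image, shown in Fig. 34b,c, can be approximated by a normalized cross-correlation evaluated over a small range of shifts. The pure-Python sketch below works on tiny binary images and is only meant to show the calculation; the function names, the image size and the shift range are assumptions.

```python
import math


def normalized_cross_correlation(image, reference, dy=0, dx=0):
    """Correlation coefficient between reference[y][x] and image[y+dy][x+dx],
    computed over the pixels where both images overlap."""
    a, b = [], []
    for y, row in enumerate(reference):
        for x, ref_value in enumerate(row):
            yy, xx = y + dy, x + dx
            if 0 <= yy < len(image) and 0 <= xx < len(image[0]):
                a.append(image[yy][xx])
                b.append(ref_value)
    if not a:
        return 0.0
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    numerator = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    denominator = math.sqrt(
        sum((p - mean_a) ** 2 for p in a) * sum((q - mean_b) ** 2 for q in b)
    )
    return numerator / denominator if denominator else 0.0


def correlation_surface(image, reference, max_shift=1):
    """Correlation value for every shift (dy, dx) within +/- max_shift."""
    shifts = range(-max_shift, max_shift + 1)
    return {
        (dy, dx): normalized_cross_correlation(image, reference, dy, dx)
        for dy in shifts for dx in shifts
    }


if __name__ == "__main__":
    reference = [[0, 1, 0, 0], [1, 1, 0, 1], [0, 0, 1, 0], [1, 0, 1, 1]]
    # The same pattern moved one pixel to the right (zero-filled on the left).
    image = [[0] + row[:3] for row in reference]
    surface = correlation_surface(image, reference)
    best_shift = max(surface, key=surface.get)
    print("best shift:", best_shift, "correlation:", round(surface[best_shift], 3))
```

A sharp peak close to 1 indicates that the imaged fingerprint matches the reference after alignment, whereas a flat, low surface suggests a different or synthetic pattern.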

Intelligent precision agriculture requires wide use of CIMLS technologies for continuous information support, with an information security system, at all life-cycle stages of agricultural production, in service sectors and at all levels of personal and social activity, in the course of mathematical and software modelling of the prospects and consequences of managerial and technological decisions [1, 18]. The functioning systems are then equipped with an intelligent interface and integrated sensory micro-nanosystems for sensing and adaptive control, with wireless identification of objects, products, managerial decisions, services, etc. These principles are implemented in accordance with the requirements of international standards, which at the same time regulate electronic data interchange with nanotechnological biosafety and information security.

The CIMLS technology provides distributed data storage in a networked computer system that includes many services and subdivisions of farm enterprises. It uses a unified system of rules for data representation, storage, coding and communication. A main principle of CIMLS is that information generated at any stage of the life cycle is stored in CIMLS and made accessible to every participant of this and other stages in accordance with their access permissions to these data. This makes it possible to avoid duplication, unauthorized data substitution, falsification, imitation, alteration and control-system errors, and to reduce labour, time and costs. Actions of government officials are open to public scrutiny, so there is an integrated logistic control process with an intelligent system for making decisions, provisions, decrees and laws. Information in the CIMLS technology is generated, transformed, encoded, stored and transmitted using intelligent software, e.g., Agro, ADAMIS and ADMOS, with an "electronic description" of all life-cycle objects, so that human embellishment or information hiding is identified and minimized or eliminated completely [18].
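The CIMLS principle of storing every life-cycle record once and releasing it according to access permissions can be illustrated by a small in-memory store. Everything in this Python sketch is hypothetical: the class, the stage and user names, and in particular the hash chain used to reveal substituted records, which is a common tamper-evidence technique chosen for the example and not a documented feature of CIMLS.

```python
import hashlib
import json

GENESIS = "0" * 64


def _digest(body):
    """SHA-256 of a record body in a canonical JSON form."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class LifeCycleStore:
    """Append-only record store with per-user read permissions."""

    def __init__(self, permissions):
        self.permissions = permissions  # user -> set of readable stages
        self.records = []

    def append(self, stage, author, data):
        """Store a record and link it to the previous one by its hash."""
        previous = self.records[-1]["hash"] if self.records else GENESIS
        body = {"stage": stage, "author": author, "data": data, "prev": previous}
        self.records.append({**body, "hash": _digest(body)})

    def read(self, user, stage):
        """Return the data of one stage if the user is allowed to see it."""
        if stage not in self.permissions.get(user, set()):
            raise PermissionError(f"{user} may not read stage {stage!r}")
        return [r["data"] for r in self.records if r["stage"] == stage]

    def verify(self):
        """True if no stored record was substituted or altered."""
        previous = GENESIS
        for record in self.records:
            body = {k: record[k] for k in ("stage", "author", "data", "prev")}
            if record["prev"] != previous or _digest(body) != record["hash"]:
                return False
            previous = record["hash"]
        return True


if __name__ == "__main__":
    store = LifeCycleStore({"agronomist": {"sowing", "fertilizing"}})
    store.append("sowing", "agronomist", {"crop": "maize", "field": 7})
    store.append("fertilizing", "agronomist", {"nitrogen_kg_per_ha": 60})
    print(store.read("agronomist", "sowing"), store.verify())
    store.records[0]["data"]["field"] = 8  # simulated unauthorized substitution
    print("still consistent:", store.verify())
```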

We developed mathematical models of the controlled process of complex agricultural production for intelligent precision agriculture, and we also developed the intelligent system "ISAG" for the control of farm enterprises. One important direction in precision agriculture is the improvement of control algorithms: the use of steering functions that include nonlinear constraints, the acceleration of the controlled variable, and information about the characteristics of the controlled system, actuators, sensors, etc.

#### **5. Conclusion**

The intelligent microsensory systems presented in this chapter are intended for the recognition of optical information patterns of a technology, a product or environment external conditions in the space of multidimensional sensory data on the basis of the LED technology e-eye. These ones enable to solve important practical problems in an agro-industrial complex, e.g., in intelligent precision agriculture and for the control of the biosafety of farm products, soils, plant, and animal foods. The developed intelligent system "ISLB" can inform users about probable homeostatic threats and recognize timely any changes in functioning the human organism by means of self-learning on optical data of biomatters, e.g., using systems on a chip of the type "electronic eye". The application of the intelligent systems developed by us in mobile retransmitters (smartphones, watch and mobile phones, communicators, iPads, PDAs, wristwatches, etc.) enables the fast and individual monitoring of the human biosafety, to detect whether there is the departure from the norm and quality standards in the production process, give other important information about the quality of the agricultural production and environmental conditions. At the same time, the developed microsensory systems can function not only using satellite technologies, but also local in mobile retransmitters or in pilotless vehicles. It makes possible to produce electronic and virtual maps of soil, crop yield, foods, information-microbial patterns, intellect-maps for the maintenance of intelligent precision agriculture with the CIMLS-technology. The intelligent systems with the LED e-eye can be used in micro-nanoelectronics, biotechnology, agriculture, medicine, food industry, computer and communication systems and networks.

#### **6. References**

[1] Koleshko V.M. (1979, 1980, 1982, 1983, 1984, 1985, 1987, 1988, 1989, 1992). *Certificate of USSR Authorship for Invention* № 491824, № 677586, № 683475, № 722449, № 780713, № 950145, № 1050511, № 1122160, № 1182939, № 1262317, № 1313292, № 1366018, № 1452423, № 1499691, № 1518985, № 1722068, № 1748575

[2] Koleshko V.M., Goidenko P.P. & Buiko L.D. (1979). *Control in Technology of Microelectronics*, Science and technology, Belarus, Minsk

[3] Koleshko V.M., Gulay A.V. & Luchenok S.A. (2006). Sensory System for Soil Express-Diagnostics in Technology of Precision Agriculture, *Proceedings of Scientific and Practical Conference on Sensory Electronics and Microsystem Technologies*, Russia, Odessa, June, 2006

[4] Koleshko V.M., Gulay A.V. & Luchenok S.A. (2006). Intelligent Sensory Systems for Technology of Precision Agriculture, *Proceedings of Scientific and Practical Conference on Scientific Innovative Activity and Entrepreneurship in Agricultural Sector: Problems of Efficiency and Management*, Belarus, Minsk, February, 2006

[5] Koleshko V.M., Gulay A.V. & Luchenok S.A. (2005). Network Technologies – a Basis for the Making of Intelligent Systems for Precision Agriculture, *Proceedings of 2nd Belarusian Space Congress*, Belarus, Minsk, October, 2005

[6] Koleshko V.M., Varabei Y.A. & Khmurovich N.A. (2010). Intelligent Sensory System for Optical Analysis of Biomatters, *Proceedings of Scientific and Practical Conference on Progressive Technologies and Development Prospects*, Russia, Tambov, November, 2010

[7] Vatansever-Ozen S., Tiryaki-Sonmez G., Bugdayci G. & Ozen G. (2011). The Effects of Exercise on Food Intake and Hunger: Relationship with Acylated Ghrelin and Leptin, *Sports Science and Medicine*, Vol. 10, pp. 283-291

[8] Koleshko V.M., Varabei Y.A. & Khmurovich N.A. (2011). *Cell Phones, Smartphones and Aging of Organism*, Belarusian National Technical University, Belarus, Minsk

[9] Papacosta E. & Nassis G.P. (2011). Saliva As a Tool for Monitoring Steroid, Peptide and Immune Markers in Sport and Exercise Science, *Science and Medicine in Sport*, Vol. 14, No. 5, pp. 424-434

[10] Khodonovich O.A. & Beljkov M.V. (2007). Influence of a Geomagnetic Factor on the Saliva Macro-and Microelemental Composition of Children, *Medical Journal*, Vol. 2, pp. 90

[11] Gleeson M. & Pyne D.B. (2000). Exercise Effects on Mucosal Immunity, *Immunology and Cell Biology*, Vol. 78, pp. 536–544

[12] Koleshko V.M., Varabei Y.A. & Khmurovich N.A. (2010). Multicore Intelligent System of Pattern Recognition, *Proceedings of International Conference on Perspective Technologies and Methods in MEMS Design (MEMSTECH-2009)*, Ukraine, Lviv, April, 2010

[13] Gupta R., Gigras P., Mohapatra H., Goswami V.K. & Chauhan B. (2003). Microbial alpha-amylases: a Biotechnological Perspective, *Process Biochemistry*, Vol. 38, No. 11, pp. 1599-1616

[14] Terehina N.A., Khlebnikov V.V. & Krivcov A.V. (2002). *Certificate of Russian Federation Authorship for Invention* № 2189044

[15] Koleshko V.M., Polynkova E.V. & Pautino, A.A. (2005). Sensory Systems with Radio Frequency Identification, *Theoretical and Applied Mechanics*, Vol. 22, pp. 51-62

[16] Koleshko V.M., Varabei Y.A., Azizov P.M. & Khudnitsky A.A. (2009). Intelligent System for Biotesting of Thoughts in Production Process. *Proceedings of the Samara Scientific Center of the Russian Academy of Sciences (Special Edition)*, Russia, Samara, April 2-3, 2009

[17] Koleshko V.M., Varabei Y.A., Azizov P.M., Khudnitsky A.A. & Snigirev S.A. (2009). *Prospective techniques of biometrical authentication and identification*, Belarusian National Technical University, Belarus, Minsk

[18] Koleshko V.M., Snigirev S.A. & Marushkevich E.V. (2010). CIMLS-technology for the control of electronic document management and flows of financial enterprises funds, *Informational systems and technologies*, pp. 513-516


## **Knowledge Management in Bio-Information Systems**

Kuodi Jian *Metropolitan State University, Saint Paul, MN, USA*

### **1. Introduction**

Knowledge management is a broad topic that means different things to different people. For business people, the phrase refers to accumulated procedures, processes, and experiences (organizational assets), and to the ways of facilitating their use and retaining them within an organization. Wikipedia gives the following definition of knowledge management:

**"Knowledge Management (KM)** comprises a range of strategies and practices used in an organization to identify, create, represent, distribute, and enable adoption of insights and experiences. Such insights and experiences comprise knowledge, either embodied in individuals or embedded in organizational processes or practice." (Internet resource: http://en.wikipedia.org/wiki/Knowledge\_management. Retrieved on 5/30/2011)

For computer scientists, especially expert system developers, the term "knowledge management" has a different meaning. We are concerned with knowledge representation, data mining, and knowledge structures that facilitate storing and retrieving knowledge with computers. Thus, we define knowledge management as follows:

**Knowledge Management (KM)** comprises a wide range of methods/activities that extract information/knowledge from a body of unstructured raw data; organize the extracted information into structured form called knowledge; and design knowledge databases that are able to store and retrieve knowledge in an efficient way using computers.

The above definition mentions several terms: raw data, information, and knowledge. What are the differences and the relationships among them? And, more fundamentally, how do we reason when faced with these entities? The following sections address these questions. First, the next section outlines the contributions of this chapter.

#### **2. Contributions**

In this chapter, we introduce a computer reasoning method called "evidence theory" that is based on Bayes' theorem. We describe the relationships among raw data, knowledge, and information, and we implement a prototype of the evidence based reasoning software component in the context of the bio-information system framework. The prototype is implemented in Java and is applied to a medical case example: colorectal cancer. We expect the evidence based reasoning theory proposed in this chapter to have a significant impact on computer reasoning and artificial intelligence research.

As the raw power of computer hardware increases, the search for better intelligent systems never ends, and the research topics cover a wide range of areas. For example, some studies focus on the emotional aspect of an intelligent system (Fujita et al., 2010), while others use statistical reasoning methods to classify news articles (Asy'arie, A. & Pribadi, A., 2009). Compared to the existing literature and reasoning methods, our presentation of computer reasoning (evidence based reasoning) is thorough. In addition, the reasoning method we propose is generic in nature, so it can be used in any domain. One key feature of our method is its simple calculation, and this simplicity becomes more important as the number of pieces of evidence grows. Of course, this does not come for free: the degrees for the pieces of evidence must be calculated in advance. But the saving is well worth the effort.

#### **3. Background for data, information, knowledge, and wisdom**

We can view data, information, knowledge, and wisdom as a hierarchy in the context of knowledge management. With data at the bottom and wisdom at the top, we move from concrete to abstract, and from no relationship to strong relationship, as we go from the bottom left to the top right. This hierarchy is shown in Figure 1.

Fig. 1. Hierarchy of data to wisdom

Raw data are just meaningless points in data space. There are no references or relationships among these points. Raw data are like a phrase out of context: by themselves, they mean nothing (Bellinger, 2004).


As humans, we often want to make sense of raw data. When we encounter a piece of data, we usually try to assign meaning to it and to find relationships for it. This is done by associating it with other things (or other data points). For example, consider the shapes in Figure 2.

Fig. 2. Data points in data space

On seeing the items in Figure 2, we automatically assign an "equal" relationship to item e and item b, and the same relationship to item c and item d. We will probably assign a "similarity relationship" to item b and item a.

As another example, on seeing the digits "3 4 5 …", we will most likely assign the meaning "a sequence of positive integers starting from 3 and extending to infinity." The point is that without context, these raw data have no meaning. When we try to assign meanings to raw data, we are trying to create context for them. When raw data are put into context, new things emerge.

The new things are information. There are some differences and relationships between raw data and information. First, information is not just a bunch of raw data piled together. Second, information is the interaction between the raw data and something we call "knowledge." Information depends on the understanding of the person perceiving the data. For example, the symbol "网页" means nothing to an English speaker (to that reader, it is just raw data), but it conveys information to a Chinese reader (to whom the same symbol means a web page). The point is that whether some raw data represent meaningful information depends on the context, and the context is our prior experiences (often, we call these prior experiences "knowledge"). There is no guarantee that the information we extract from raw data is correct. The correctness and usefulness of the information depend on the knowledge of the person receiving the data. Another thing to point out is that the experiences/knowledge influence the interpretation of the data: the same piece of data may carry different meanings under different contexts (knowledge).

Knowledge is our prior experience. In other words, knowledge is the accumulated relationships and patterns that a person perceives among raw data. For example, if a layman sees a blood glucose test result of 230 mg/dL, he may have no clue as to what it means. But to a trained doctor's eye, it means that the person who took the test is diabetic. The only difference here is the pattern. In the doctor's mind, from prior training, a series of patterns such as:

Glucose level of 230 mg/dL -> diabetic

Diabetic -> risk of blindness

Diabetic -> risk of kidney failure

exist. On the other hand, there are no such patterns in the layman's mind. In essence, knowledge is the factoring of patterns (this includes summarization, abstraction, and crystallization of patterns). In the world of knowledge management in computer science, the knowledge is accumulated and crystallized patterns and relationships; and the information is the product of the interaction between data and knowledge. In other words, when connecting the dots, you are producing information.
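The doctor/layman contrast can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `PatternKnowledge`, its methods, and the string labels are hypothetical names chosen here, not part of the chapter's prototype. Knowledge is modelled as stored "pattern -> consequence" links, and information is whatever can be derived when a datum meets those links.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: knowledge as stored "pattern -> consequence" links. */
final class PatternKnowledge {
    // Each key is an observed or derived fact; the value lists what it implies.
    private final Map<String, List<String>> patterns = new LinkedHashMap<>();

    PatternKnowledge addPattern(String from, String to) {
        patterns.computeIfAbsent(from, key -> new ArrayList<>()).add(to);
        return this;
    }

    /** Information = data + knowledge: follow every pattern reachable from the datum. */
    List<String> interpret(String datum) {
        List<String> derived = new ArrayList<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(datum);
        while (!frontier.isEmpty()) {
            String fact = frontier.poll();
            for (String next : patterns.getOrDefault(fact, Collections.<String>emptyList())) {
                if (!derived.contains(next)) {   // skip facts already derived
                    derived.add(next);
                    frontier.add(next);
                }
            }
        }
        return derived;
    }

    /** The three patterns listed in the text. */
    static PatternKnowledge doctor() {
        return new PatternKnowledge()
                .addPattern("glucose 230 mg/dL", "diabetic")
                .addPattern("diabetic", "risk of blindness")
                .addPattern("diabetic", "risk of kidney failure");
    }

    /** No stored patterns: the same datum stays raw data. */
    static PatternKnowledge layman() {
        return new PatternKnowledge();
    }

    public static void main(String[] args) {
        System.out.println(doctor().interpret("glucose 230 mg/dL"));
        System.out.println(layman().interpret("glucose 230 mg/dL"));
    }
}
```

With the doctor's patterns, the datum yields "diabetic", "risk of blindness", and "risk of kidney failure"; with the layman's empty pattern store, the same datum yields nothing.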

Wisdom is the highest form of deep patterns. Usually, we only attribute wisdom to intelligent beings. Bellinger (2004) has the following description about wisdom:

"Wisdom arises when one understands the foundational principles responsible for the patterns representing knowledge being what they are. And wisdom, even more so than knowledge, tends to create its own context. I have a preference for referring to these foundational principles as eternal truths, yet I find people have a tendency to be somewhat uncomfortable with this labeling. These foundational principles are universal and completely context independent. Of course, this last statement is sort of a redundant word game, for if the principle was context dependent, then it couldn't be universally true now could it?"

In this chapter, we focus on data, information, and knowledge, and leave the topic of wisdom to philosophers. In particular, we deal with computer reasoning and knowledge management using knowledge databases. Before presenting our methods for knowledge representation, reasoning, and knowledge management, we need to answer a philosophical question: is there any difference between human reasoning and computer reasoning? Our answer is "Yes."

Computer reasoning and human reasoning are different. One of the biggest differences has to do with creative ideas. Often, we see someone with so-called "killer ideas": ideas that are revolutionary, creative, and do not conform to the norms of the contemporary generation. For example, Sir Isaac Newton's law of gravity, Albert Einstein's theory of relativity, and the idea of a ten-dimensional universe are all killer ideas. How exactly these "killer ideas" are produced is still open for debate. However, computers appear to be incapable of producing such ideas (at least for the time being), because we have not yet seen any computer produce a meaningful killer idea. Thus, we conclude that computer reasoning and human reasoning are different. With current technology, we can delegate to computers the reasoning with low-level entities (in the lower left of Figure 1) in the knowledge management hierarchy. This is because at lower levels, reasoning is more objective and concrete. The theoretical foundations of the reasoning at lower levels can be captured by Bayes' theorem.

#### **4. Theoretical foundation of computer reasoning**

The insight from the above discussion of raw data, information, knowledge, and wisdom is that reasoning at the lower levels is easier than reasoning at the highest level, since at the lower levels we only need to deal with knowledge finding (data mining) and the application of the appropriate knowledge to some evidence. At the highest level (the killer idea level), we do not even know the mechanism that produces creative ideas; therefore, it is much harder to reason at this level. As mentioned in the previous section, the theoretical foundation of computer reasoning is Bayes' theorem. What is Bayes' theorem? It can be expressed as Formula 1:

$$P(A \mid X) = \frac{P(X \mid A) \ast P(A)}{P(X \mid A) \ast P(A) + P(X \mid {\sim}A) \ast P({\sim}A)} \tag{\text{Formula 1}}$$

The notation P(A|X) means the probability (or the chance) that the event A will happen given the evidence (or the observation) of X. In probability theory, this is called conditional probability. Depending on the quality of evidence X, the probability of event A happening may be heavily affected by the presence of the evidence X.

The symbol "~" means complement, that is, the opposite of what follows it. For example, if P(A) is the probability that event A will happen, then P(~A) is the probability that event A will not happen. Note that there are three pieces in Formula 1: the reasoning about the occurrence of an event A (the left side of the equation), the evidence (X), and the causal relationship between the evidence X and the event A (embodied by P(X|A) and P(X|~A)).
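As a minimal sketch, Formula 1 maps directly onto a small Java method. The class name `BayesFormula`, the method name `posterior`, and the parameter names are illustrative choices made here, and the inputs in `main` (a prior of 0.2%, P(X|A) = 85%, P(X|~A) = 6%) are illustrative values only.

```java
/** Minimal sketch of Formula 1; class and method names are illustrative. */
final class BayesFormula {
    /**
     * Computes P(A|X) from Formula 1.
     *
     * @param priorA      P(A), the prior probability of event A
     * @param pXGivenA    P(X|A), the chance of seeing evidence X when A holds
     * @param pXGivenNotA P(X|~A), the chance of seeing evidence X when A does not hold
     */
    static double posterior(double priorA, double pXGivenA, double pXGivenNotA) {
        double priorNotA = 1.0 - priorA;                 // P(~A) is the complement of P(A)
        double numerator = pXGivenA * priorA;            // P(X|A) * P(A)
        double denominator = numerator + pXGivenNotA * priorNotA;
        if (denominator == 0.0) {
            throw new IllegalArgumentException("evidence X has zero probability");
        }
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Illustrative inputs: prior 0.2%, P(X|A) = 85%, P(X|~A) = 6%.
        System.out.println(posterior(0.002, 0.85, 0.06));
    }
}
```

With these inputs the posterior is roughly 0.028, far below the 85% that P(X|A) alone might suggest, because the small prior P(A) dominates the denominator.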

In a nutshell, Formula 1 says that if we see a piece of evidence X, we can reason about the chance of event A occurring, given that the evidence X and the event A have a causality relationship. This is exactly the behavior that a rational person will display given a piece of evidence related to the event. Formula 1 can be extended to two, three, or many pieces of evidence. All we need to do is apply the formula multiple times. For example, if both X and Y contribute to the occurrence of event A, we first apply Formula 1 to get the probability of A given evidence X. Then we apply Formula 1 again for evidence Y, only this time we substitute the results of the first iteration for the prior probabilities P(A) and P(~A). In this way, we can repeatedly apply Formula 1 to reason about any number of pieces of evidence.
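The repeated application described above can be written out explicitly. The following is a sketch under an assumption added here for illustration: the two pieces of evidence X and Y are independent of each other once we know whether A holds. The first line is Formula 1; the second line reuses its result in place of the priors.

$$P(A \mid X) = \frac{P(X \mid A) \, P(A)}{P(X \mid A) \, P(A) + P(X \mid {\sim}A) \, P({\sim}A)}$$

$$P(A \mid X, Y) = \frac{P(Y \mid A) \, P(A \mid X)}{P(Y \mid A) \, P(A \mid X) + P(Y \mid {\sim}A) \, P({\sim}A \mid X)}, \qquad P({\sim}A \mid X) = 1 - P(A \mid X)$$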

To get a better handle on how the Bayes' theorem works, let's work through a concrete example. Suppose that we have the following problem statement:

**Example 1:** "Lung cancer is the leading cause of cancer death in the United States." (Williams, 2003, p. 463) Suppose that about 0.2% of the population living in US with age above 20 has lung cancer. When doing an annual check, suppose that 85% of the people with lung cancer will show positive for the chest x-ray test. On the other hand, chest x-ray will have false alarms: 6% of the people without lung cancer will also show positive for the chest

$$P(A \,\&\, B) = P(A \mid B) \, P(B) \tag{Formula 3}$$


1. We use the Bayes' theorem:

$$P(\text{cancer} \mid \text{positive x-ray}) = \frac{P(\text{positive x-ray} \mid \text{cancer}) \, P(\text{cancer})}{P(\text{positive x-ray} \mid \text{cancer}) \, P(\text{cancer}) + P(\text{positive x-ray} \mid {\sim}\text{cancer}) \, P({\sim}\text{cancer})}$$

2. And plug in the following data:

P(cancer) = 0.2% (20 out of 10,000 have cancer) (1)

P(~cancer) = 99.8% (9980 out of 10,000 have no cancer) (2)

P(positive x-ray | cancer) = 85% (85% of people with lung cancer have positive x-ray) (3)

P(positive x-ray | ~cancer) = 6% (6% of people without lung cancer have positive x-ray) (4)

3. Plugging the above data into the above expression, we get:

P(cancer | positive x-ray) = 85%\* 0.2% / (85%\*0.2%+6%\*99.8%) = 0.0017 / (0.0017+0.06) = 0.0017 / 0.0617 = 0.028

This is exactly the same answer we got in the previous section.

Bayes' reasoning needs three pieces of information (all appear on the right side of the equation at the beginning of step 5): the percentage of people with lung cancer, the percentage of people without lung cancer who have false alarms, and the percentage of people with lung cancer who show positive on the test. The first piece of information, which is part of the priors, is the baseline knowledge. The second and third pieces of information, which also belong to the priors, measure the quality of the evidence. Bayes' reasoning uses the evidence to change the belief/knowledge (shifting the baseline upwards with positive evidence or downwards with negative evidence). We will use more examples to show how this change of belief (the machine reasoning) happens. The left-side probability is the posterior probability. It is the revised view of the world in the light of the evidence on the right side of the equation.

To see how the first piece of information affects the Bayes' result, let's assume that the batch of people doing the annual check consists of high-risk smokers. According to Williams (Williams, 2003, p. 464), a smoker's chance of getting lung cancer is 13 times higher than a non-smoker's. Now, let's ask the same question: what is the probability that a person has lung cancer if he/she has a positive x-ray test, given that the cancer rate in this group is 2.6% (2.6% is obtained from 0.2% \* 13)? Sure enough, the final answer should be different. In fact, the new answer is 27.4%. The following analysis shows how we get the correct answer:

1. We use the Bayes' theorem:

$$P(\text{cancer} \mid \text{positive x-ray}) = \frac{P(\text{positive x-ray} \mid \text{cancer}) \, P(\text{cancer})}{P(\text{positive x-ray} \mid \text{cancer}) \, P(\text{cancer}) + P(\text{positive x-ray} \mid {\sim}\text{cancer}) \, P({\sim}\text{cancer})}$$

2. And plug in the following data:

P(cancer) = 2.6% (260 out of 10,000 have cancer) (5)

P(~cancer) = 97.4% (9740 out of 10,000 have no cancer) (6)

P(positive x-ray | cancer) = 85% (85% of people with lung cancer have positive x-ray) (7)

P(positive x-ray | ~cancer) = 6% (6% of people without lung cancer have positive x-ray) (8)

3. Plugging the above data into the above Bayes' theorem, we get:

P(cancer | positive x-ray) = 85%\* 2.6% / (85%\*2.6%+6%\*97.4%) = 0.0221 / (0.0221+0.0584) = 0.0221 / 0.0805 = 0.274

As you can see, compared with the general population (where the probability of having cancer given a positive x-ray is 0.028), the probability of 0.274 for a person in the high-risk group is much higher. This makes sense, since the prior probability of getting lung cancer is higher in this high-risk group. In this new example, the quality of the x-ray equipment does not change. The only thing that changes is the prior cancer rate, from 0.2% to 2.6%. At first glance, most people will give the same wrong answer of 85% to the new problem. But Bayes' reasoning gives us a more objective and correct answer. Here is an example where computer reasoning can be better than a human's!
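The two calculations above differ only in the prior, which a few lines of code make easy to see. The following is a minimal sketch of Formula 1; the function name `bayes_posterior` and its parameter names are illustrative choices, not part of the chapter.

```python
def bayes_posterior(prior, p_pos_given_a, p_pos_given_not_a):
    """Formula 1: P(A | X) from the prior P(A) and the two conditional probabilities."""
    numerator = p_pos_given_a * prior
    denominator = numerator + p_pos_given_not_a * (1.0 - prior)
    return numerator / denominator


# Same x-ray test (85% true positives, 6% false positives), two different priors.
general = bayes_posterior(prior=0.002, p_pos_given_a=0.85, p_pos_given_not_a=0.06)
smokers = bayes_posterior(prior=0.026, p_pos_given_a=0.85, p_pos_given_not_a=0.06)

print(f"general population: {general:.3f}")  # about 0.028
print(f"high-risk smokers:  {smokers:.3f}")  # about 0.274
```

Because the test characteristics are passed in unchanged, any difference between the two results comes from the prior alone.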

Bayes' reasoning can be used in situations that have multiple pieces of evidence. Let's use Example 2, which is an extension of Example 1, to illustrate how this is done.

**Example 2:** "Lung cancer is the leading cause of cancer death in the United States." (Williams, 2003, p. 463) Suppose that about 0.2% of the population living in the US above the age of 20 has lung cancer. When doing an annual check, assume that 85% of the people with lung cancer will show positive for the chest x-ray test. On the other hand, the chest x-ray will have false alarms: 6% of the people without lung cancer will also show positive for the chest x-ray test. Suppose that a hospital does two lung cancer screening tests for each annual check patient (assume the two tests are independent). The second test, a CT scan, is done to improve the accuracy of diagnosis. Suppose that the CT scan has the following characteristics: it returns positive for 85% of the people with lung cancer; it has a lower false-positive rate than the x-ray test and will return a false positive for one out of one thousand people without lung cancer. If a person went through the annual check and had positives on both the chest x-ray and the CT scan, what is the probability that he/she has lung cancer?

**Answer:** We can solve this problem by using the Bayes' theorem twice. We already know that the probability that a person has cancer given a positive x-ray is 2.8%, and the probability that a person has no cancer given a positive x-ray is 97.2%. We can use this result and continue to solve the problem as follows:

P(positive CT scan | cancer) = 85% (85% of people with lung cancer have positive CT scan) (11)


cancer is reduced only from 20 to 17. Thus, the proportion of 17 within 616 (the total number of people with positive x-ray) is much larger than the proportion of 20 within 10,000.

#### **5.2 Conditional probabilities play the role as shifters**

From Example 2, you may have already seen the role played by the two conditional probabilities P(positive x-ray | cancer) and P(positive x-ray | healthy). They are the shifters: P(positive x-ray | cancer) shifts our view positively and P(positive x-ray | healthy) shifts our view negatively. In other words, a large value of P(positive x-ray | cancer) increases our confidence in predicting that a person has cancer given a positive test, and a small value of P(positive x-ray | healthy) does the same. The quality of a test in altering our view of the world is the interplay of these two conditional probabilities. They map the number of people with cancer and the number of healthy people in one world into another world. Their ratio can be used as a measure of how effective a test is as evidence.

We will show later that for a test to be effective, its positive conditional probability cannot have the same value as its negative conditional probability. Otherwise, the test shifts our view by the same amount in both directions and the net effect is nil.

The second application of the Bayes' theorem alters the ratio of healthy people to people with cancer even further. In the second mapped world, the number of people with cancer is 14, and the number of healthy people is 0.6. In this second new world, seeing both pieces of positive evidence (a positive x-ray and a positive CT scan) is convincing evidence that the person has lung cancer (96% probability).
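The mapping from one world to the next can be followed with head counts instead of probabilities. The sketch below starts from 10,000 people and keeps only those who test positive at each step; the helper name `filter_world` is hypothetical, and the two tests are assumed independent, as in Example 2.

```python
POPULATION = 10_000


def filter_world(cancer, healthy, true_pos_rate, false_pos_rate):
    """Keep only the people who test positive; return the new (cancer, healthy) counts."""
    return cancer * true_pos_rate, healthy * false_pos_rate


# Initial world: 0.2% of 10,000 people have lung cancer.
cancer0, healthy0 = POPULATION * 0.002, POPULATION * 0.998

# First mapped world: people with a positive x-ray (85% / 6%).
cancer1, healthy1 = filter_world(cancer0, healthy0, 0.85, 0.06)

# Second mapped world: of those, people with a positive CT scan (85% / 0.1%).
cancer2, healthy2 = filter_world(cancer1, healthy1, 0.85, 0.001)

for label, c, h in [("initial", cancer0, healthy0),
                    ("after x-ray", cancer1, healthy1),
                    ("after CT scan", cancer2, healthy2)]:
    print(f"{label}: cancer={c:.1f}, healthy={h:.1f}, share with cancer={c / (c + h):.1%}")
```

The counts after the x-ray are roughly 17 and 599 (616 in total), and after the CT scan roughly 14 and 0.6, which matches the figures quoted in the text.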

Bayes' theorem is important for understanding the basic mechanism of statistical reasoning. In its original form, however, it is not easy to use, especially with multiple pieces of evidence. In the next section, we will introduce a computer reasoning theory, evidence theory, that is based on Bayes' theorem.

#### **6. The evidence theory of computer reasoning**

In this section, we are going to present a computer reasoning method called evidence theory that is more convenient and easier to use than the Bayes' theorem. To help our presentation, we will define some terms and use some mathematical formulas along the way.

If we take an abstract view, computer reasoning can be summarized as: capture the causality relationships from raw data, build a knowledge database using these relationships, and make a judgment (or inference) on pieces of evidence based on the existing knowledge database. The essence of this summary is shown in Figure 4.

The computer reasoning mechanism shown in Figure 4 can be explained as having two stages: the knowledge/pattern building and the application of knowledge to the new evidence. In the first stage (indicated by arrows from the Raw data to the Knowledge database), knowledge is produced either by data mining from raw data or by direct human insertion; in the second stage (indicated by arrows from the Knowledge database to the Solution), the reasoning occurs by applying the knowledge from the Knowledge database to the evidence.

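[Figure 4 labels: Raw data; Knowledge database; Intelligent system; evidence; Solution; Stage 1: Model building (data mining); Stage 2: Inferencing (using knowledgebase and evidence theory). Example knowledge database entries: HasLungCancer(positive x-ray, 0.028); HasLungCancer(smoker, positive x-ray, 0.274).]

#### Fig. 4. Abstract view on computer reasoning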

One of the important components in Figure 4 is evidence. In computer reasoning, evidence is the main factor that influences a computer's judgment, and one of its important characteristics is its quality. In this abstract view, reasoning is driven by the presence of evidence. For example, without any evidence, our view of the initial world is that the probability of a person having lung cancer is 0.2%. With the first piece of evidence, a positive x-ray test, our view of the modified world is that this probability is 2.8%. With two pieces of evidence, a positive x-ray and a positive CT scan, our view of the new world changes again, and the probability becomes 96%.
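This progression of belief can be traced by feeding each piece of evidence into Formula 1 in turn. The sketch below is illustrative only: the names `update`, `XRAY`, and `CT_SCAN` are assumptions, and the tests are treated as independent, as Example 2 states.

```python
def update(prior, test):
    """One application of Formula 1; `test` is (P(pos | cancer), P(pos | ~cancer))."""
    p_pos_cancer, p_pos_healthy = test
    hit = p_pos_cancer * prior
    return hit / (hit + p_pos_healthy * (1.0 - prior))


XRAY = (0.85, 0.06)      # true-positive rate, false-positive rate
CT_SCAN = (0.85, 0.001)

belief = 0.002           # initial world: no evidence yet
history = [belief]
for test in (XRAY, CT_SCAN):
    belief = update(belief, test)   # the posterior becomes the next prior
    history.append(belief)

print([f"{p:.1%}" for p in history])  # ['0.2%', '2.8%', '96.0%']
```

Each pass through the loop corresponds to one arrow from the knowledge database to the solution in Figure 4: stored conditional probabilities are applied to a new piece of evidence.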

#### **6.1 The properties and the definition of evidence (or a test)**

The main role of a piece of evidence is its influence on a rational mind. To see how this influence is realized, we need to investigate the properties of evidence. In this section, we give the definition of evidence and describe its properties. Both are given in the following highlight box.


#### **Evidence Theory and Evidence Properties**

**The Main Interest:** Suppose that A represents an event of interest and E represents a piece of evidence. The main interest of the evidence theory is to calculate the probability:

P(A|E) (Formula 4)

**Definition of Evidence:** we define evidence (or a test) E as a piece of information that has the ability to change the value of the probability defined in Formula 4. The underlying reason for this ability is the causality relationship that exists between the event A and the evidence E.

**Assumption about Event A:** in the absence of any evidence, we will assume that the probability of event A occurring is the same as the probability of its not occurring. That is, P(A) = 50%.

**Properties of Evidence:** evidence has the following three properties:

**Property 1:** if evidence E increases the probability of event A, then the evidence E is positive evidence relative to event A.

**Property 2:** if evidence E decreases the probability of event A, then the evidence E is negative evidence relative to event A.

**Property 3:** the quality of evidence E is measured in terms of evidence strength (which will be defined in the next section).

#### **6.2 The quality of evidence (evidence strength)**

As mentioned before, one important function of a piece of evidence is its influence on a rational mind. Thus, the quality of a piece of evidence should also be measured by its ability to influence. For example, if evidence A convinced us that an event (or goal achievability) will happen with 80 percent certainty while evidence B convinced us that the same event will happen with 90 percent certainty, then we would say evidence B is better. We can quantify the quality of evidence by introducing the concept of evidence strength. With this measurement criterion in mind, try to answer the following question:

**Question 1:** With regard to the two tests mentioned in Example 2: the x-ray test, and the CT scan test, which one is better in swaying us to believe that the person in question has lung cancer?

Here is a recap of some statistics for the two pieces of evidence (a medical test can be regarded as evidence from the point of view of Bayes' theorem):

X-ray test: 85% of the people with lung cancer will show positive; 6% of the people without lung cancer will also show positive.

CT scan test: 85% of the people with lung cancer will show positive; 0.1% of the people without lung cancer will also show positive.

Before answering the above question, let's define some terms. In the following, "Posi|Cause" means that the existence of "Cause" causes the evidence "Posi" to appear; "Posi|~Cause" means that the absence of "Cause" causes the evidence "Posi" to appear. Now, we will define the strength of evidence as follows:

**Definition of evidence strength:** we define the strength of evidence (or a test) as the probability that the evidence gives a true positive divided by the probability that the evidence gives a false positive. In other words, it can be represented as the following formula:

$$\text{strength(evidence)} = \frac{P(\text{Posi} \mid \text{Cause})}{P(\text{Posi} \mid {\sim}\text{Cause})} \tag{Formula 5}$$

One thing to point out is that the sum of P(Posi|Cause) and P(Posi|~Cause) is not necessarily 1. Having defined evidence strength, we can divide evidence into two categories: positive evidence and negative evidence. When the value of strength is greater than 1, the evidence shifts our belief in the positive direction, so we call it positive evidence; when the value of strength is smaller than 1, the evidence shifts our belief in the negative direction, so we call it negative evidence.

The probability P(Posi | Cause) on the right side of Formula 5 captures the causality relationship in the real world. It is the probability that the cause makes the evidence (test) positive. In our Example 1, it takes the form P(positive x-ray | cancer), the probability that lung cancer causes the x-ray to be positive; P(positive x-ray | ~cancer) is the probability of a false alarm.

Now, let's give some observations about evidence. First, as mentioned before, to be effective evidence, the value of a test's positive conditional probability cannot have the same value as its negative conditional probability. Thus, in terms of strength, we have the following observation:

**Observation 1:** when the evidence strength is 1, it is not good evidence. Using the above definition, the effectiveness of a test (or a piece of evidence) is measured in terms of its strength. If the value of strength is 1, then the test is useless as a piece of evidence (it is neutral). When the value of strength is greater than 1, it is positive evidence (seeing the evidence will shift our view regarding the trueness of the event "Cause" to the positive side); when the value of strength is smaller than 1 and greater than 0, it is negative evidence (diminishes our view about the trueness of the "Cause").
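One way to see why a strength of 1 is neutral is to rewrite Bayes' theorem in odds form. The rearrangement below is a standard identity rather than something stated in this chapter; E denotes a positive test result, A the event "Cause", and s the strength from Formula 5.

$$\frac{P(A \mid E)}{P({\sim}A \mid E)} = \frac{P(E \mid A)}{P(E \mid {\sim}A)} \cdot \frac{P(A)}{P({\sim}A)} = s \cdot \frac{P(A)}{P({\sim}A)}$$

With s = 1 the posterior odds equal the prior odds, so the evidence changes nothing; s > 1 raises the odds and 0 < s < 1 lowers them. Under the assumption P(A) = 50% made earlier in this section, the prior odds are 1 and the posterior reduces to P(A | E) = s / (1 + s).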

For example, suppose we are asked whether flipping a fair coin is a good test for predicting whether a person has lung cancer (assume that a head means the person has cancer and a tail means the person has no cancer). We can proceed as follows:

1. First, we calculate the strength of flipping a coin as a test and it will be:

strength(flipping a coin) = P(head | cancer) / P(head | ~cancer) = 0.5 / 0.5 = 1

**Note:** the reason that P(head | cancer) = 0.5 is the fact that the information of a patient has cancer has nothing to do with the outcome of flipping a coin. The chance of getting a head is still governed by its old chance of 50%. We will have the same argument for the probability P(head | ~ cancer).

2. Based on our evidence theory, we know it shifts our belief to the same distance for positive and negative direction. Thus, we conclude that it's not a good test.
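The coin example can be checked with a few lines of Python. This is only a minimal sketch: the function names `evidence_strength` and `classify_evidence` and the numeric tolerance are our own assumptions for illustration, not part of the chapter's software.

```python
def evidence_strength(p_pos_given_cause: float, p_pos_given_not_cause: float) -> float:
    """Return the strength P(Posi | Cause) / P(Posi | ~Cause)."""
    if p_pos_given_not_cause <= 0.0:
        raise ValueError("P(Posi | ~Cause) must be greater than 0")
    return p_pos_given_cause / p_pos_given_not_cause


def classify_evidence(strength: float, tolerance: float = 1e-9) -> str:
    """Label evidence as neutral, positive, or negative (Observation 1)."""
    # The tolerance is an illustrative assumption to absorb rounding error.
    if abs(strength - 1.0) <= tolerance:
        return "neutral"
    return "positive" if strength > 1.0 else "negative"


# Fair coin: the outcome does not depend on whether the patient has cancer.
coin = evidence_strength(0.5, 0.5)
print(coin, classify_evidence(coin))
```

A strength of exactly 1 is classified as neutral, which matches the conclusion that a coin flip is useless as a test.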

With regard to the cause of strong evidence, we have the following observation:


**Observation 2:** strong evidence is not caused by a very high probability that the cause leads to a positive test; rather, it is caused by a very low probability that something other than the cause could have led to the positive test.

For example, if it is raining, the grass in my front yard (there is no roof) is likely to be wet. But seeing the grass wet does not necessarily mean that it is raining (maybe it is caused by the sprinkler). In other words, when seeing the evidence of wet grass, we cannot reason that it is raining with certainty. This is a case of high cause-effect probability but weak evidence.

On the other hand, suppose we are watching an area where there is no sprinkler. Then seeing wet grass would always mean that it is raining, even if we assume a weak causal link, such as rain making the grass wet only 60% of the time. This is a case of low cause-effect probability but strong evidence.

Now, let's answer Question 1. We will use the evidence strength value to help us reach a conclusion. For the x-ray test, we have:

strength(x-ray test) = P(positive x-ray | cancer) / P(positive x-ray | ~cancer) = 0.85 / 0.06 = 14.17

For the CT scan, we have:

strength(CT scan test) = P(positive CT scan | cancer) / P(positive CT scan | ~cancer) = 0.85 / 0.001 = 850

Since 850 is greater than 14.17, we conclude that the CT scan test is better evidence for convincing us that the patient in question has lung cancer.

#### **6.3 The relationship between the evidence strength and its influence power**

The discussion above gives us some insight into evidence. In this section, we investigate the relationship between the strength of a piece of evidence and its power to influence our belief about an event. Specifically, we want to see how a piece of evidence shifts our belief: in which direction and by how much (possibly as a rough estimate). Based on our intuition about evidence, we make the following claim.

**Claim 1:** the influence power of a given piece of evidence is proportional to its evidence strength. For positive evidence, the larger the strength value, the stronger the influence power; for negative evidence, the smaller the strength value, the stronger the influence power.

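We will use the following example to give some insight into Claim 1:

**Example 3:** Using the data in Example 2, calculate the strength of the x-ray test and the strength of the CT scan test. Then calculate the distance that each test moves our belief (including the direction) in terms of percentage change. We repeat the main points and data below:

1. About 0.2% of the US population above age 20 has lung cancer.
2. When doing an annual check, assume that 85% of the people with lung cancer will show positive on the chest x-ray test. About 6% of the people without lung cancer will also show positive on the chest x-ray test.
3. The second test, called a CT scan, is done independently. It returns positive for 85% of the people with lung cancer; its false-positive rate is 0.1%.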


**Answer:** For the first part of the question, we can reuse the result from the previous section. For the x-ray test, we have:

strength(x-ray test) = P(positive x-ray | cancer) / P(positive x-ray | ~cancer) = 0.85 / 0.06 = 14.17

For CT scan, we have:


strength(CT scan test) = P(positive CT scan | cancer) / P(positive CT scan | ~cancer)

= 0.85 / 0.001 = 850

For the second part of the problem (how far each test sways our belief), we proceed as follows.

We start in the initial world with the following probabilities:

P(cancer) = 0.2%, P(healthy) = 99.8%

For a person in this initial world, the probability of having lung cancer is 0.2% (quite low). If we use the x-ray as a membership test, the probabilities become the following (already calculated in previous sections):

P(cancer | positive x-ray) = 2.8%, P(healthy | positive x-ray) = 97.2%

The x-ray test shifts our view from P(cancer) = 0.2% to P(cancer | positive x-ray) = 2.8%. It is positive evidence. The increase is 2.6 percentage points.

Now, let's see how much the CT scan test shifts our view. Starting from the initial world, if we use the CT scan as a membership test, the probability can be calculated as follows:

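We use Bayes' theorem:

$$P(\text{cancer} \mid \text{positive CT scan}) = \frac{P(\text{positive CT scan} \mid \text{cancer}) \times P(\text{cancer})}{P(\text{positive CT scan} \mid \text{cancer}) \times P(\text{cancer}) + P(\text{positive CT scan} \mid {\sim}\text{cancer}) \times P({\sim}\text{cancer})}$$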

We plug in the following data:

P(cancer) = 0.2% (20 out of 10,000 have cancer)

P(~cancer) = 99.8% (9980 out of 10,000 have no cancer)

P(positive CT scan | cancer) = 85%


(85% of people with lung cancer have positive CT scan)

P(positive CT scan | ~cancer) = 0.1%

(0.1% of people without lung cancer have positive CT scan)

Plugging the above data into Bayes' theorem, we get:

$$P(\text{cancer} \mid \text{positive CT scan}) = \frac{85\% \times 0.2\%}{85\% \times 0.2\% + 0.1\% \times 99.8\%}$$

$$\approx 0.0017 / (0.0017 + 0.001)$$

$$= 0.0017 / 0.0027$$

$$= 0.63$$

This result tells us that the CT scan test shifts our belief in the positive direction. The increase is 62.8 percentage points. These results support Claim 1.
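The two posterior probabilities in Example 3 can be reproduced directly from Bayes' theorem. The sketch below is illustrative only; the function name `posterior` and its parameter names are assumptions of ours, and the inputs are the figures from Example 2.

```python
def posterior(prior: float, p_pos_given_cause: float, p_pos_given_not_cause: float) -> float:
    """Bayes' theorem for a single positive test result."""
    numerator = p_pos_given_cause * prior
    # Total probability of a positive result: true positives + false alarms.
    evidence = numerator + p_pos_given_not_cause * (1.0 - prior)
    return numerator / evidence


prior = 0.002                          # P(cancer) = 0.2%
xray = posterior(prior, 0.85, 0.06)    # x-ray: 6% false alarm rate
ct = posterior(prior, 0.85, 0.001)     # CT scan: 0.1% false alarm rate

for name, value in (("x-ray", xray), ("CT scan", ct)):
    shift = 100.0 * (value - prior)    # shift in percentage points
    print(f"{name}: posterior = {value:.3f}, shift = {shift:.1f} points")
```

Both tests have the same P(positive | cancer); only the false alarm rate differs, and that alone accounts for the large gap between the two posteriors.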

Note that the x-ray test and the CT scan test have the same positive cause-effect probability but different false alarm rates. For the x-ray test, the false alarm probability P(positive x-ray | ~cancer) is 6%, while for the CT scan test, the false alarm probability P(positive CT scan | ~cancer) is only 0.1%. This example shows that a **low false alarm probability is the dominant factor** in determining the strength of evidence.

#### **6.4 The logarithmical representation of evidence degrees**

In the previous section, we used the ratio of two conditional probabilities as the strength measurement. Under our abstract view of the reasoning model in Figure 4, evidence is used to distort the world space. As indicated in that figure, reasoning is the process of making a judgment using the knowledge (embedded in the conditionals) based on the evidence presented (the right side of "|" on the left side of Formula 1). Note that our abstract reasoning model can be applied to multiple pieces of evidence.

To capture the essence of low-level reasoning in situations with multiple pieces of evidence, we can use the mathematical tool of ratios and the statistical concept of odds. These tools also make reasoning with multiple pieces of evidence easier. Odds capture the same information as probability. In statistics, the odds of an event are defined as the ratio of the probability that it occurs to the probability that it does not occur. Solving the problem in Example 2 with odds goes like this: in the initial world, the lung cancer rate is 0.2%. Thus, 2 out of 1000 people have lung cancer, and 998 out of 1000 people do not. Using odds, we define the event of interest as a person having lung cancer vs. a person having no cancer. So the 0.2% cancer rate can be expressed as the following odds:

2:998

The evidence strengths of the two tests, x-ray and CT scan, can be expressed in odds notation as:

14.17:1 (from 0.85/0.06)


850:1 (from 0.85/0.001)

To get the answer for low-level reasoning, we calculate the odds of a person with cancer scoring positive on both tests versus a person without cancer scoring positive on both tests. Using basic algebra, these odds can be calculated as follows:

2\*14.17\*850 : 998\*1\*1 = 24089:998

Once we have the final odds, we can get the probability that a person has lung cancer given that both tests are positive:

P(cancer | positive x-ray & positive CT scan) = 24089 / (24089+998) = 24089 / 25087 = 96%

This is the same answer as we got using Bayes' theorem in section 5.
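The odds calculation can be written as a short Python sketch. The helper name `odds_to_probability` is an assumption for illustration. The code uses the unrounded ratio 0.85/0.06 instead of 14.17, so the left side of the odds comes out slightly below 24089, while the probability still rounds to 96%.

```python
import math


def odds_to_probability(odds_for: float, odds_against: float) -> float:
    """Convert odds of the form for:against into a probability."""
    return odds_for / (odds_for + odds_against)


prior_for, prior_against = 2.0, 998.0       # prior odds 2:998
strengths = [0.85 / 0.06, 0.85 / 0.001]     # x-ray and CT scan strengths

# Each independent positive test multiplies the "for" side by its strength.
posterior_for = prior_for * math.prod(strengths)
probability = odds_to_probability(posterior_for, prior_against)

print(f"{posterior_for:.0f}:{prior_against:.0f} -> {probability:.2f}")
```

Adding a third independent test would only append one more factor to `strengths`, which is the practical advantage of the odds form over expanding Bayes' theorem by hand.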

As you can see, using ratios and odds is simpler than using Bayes' theorem directly. We can simplify the calculation even further with logarithms. Before we can take advantage of logarithms, we need a new definition, called evidence degree.

**Definition of evidence degree:** we define the evidence degree of a test by the following formula:

$$\text{degree(test)} = 10 \log_{10} \text{strength(test)} \tag{Formula 6}$$

To get the strength from the degree, we use the following formula:

$$\text{strength(test)} = 10^{\text{degree(test)} / 10} \tag{Formula 7}$$

Once evidence is represented in logarithmic form (degree of evidence), the aggregated effect of several pieces of evidence toward a goal can be obtained by simple addition instead of multiplication.
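Formulas 6 and 7 translate into two one-line functions. The sketch below is a minimal illustration, with function names of our own choosing, that also checks the additive property: the degree of a product of strengths equals the sum of the individual degrees.

```python
import math


def degree(strength: float) -> float:
    """Formula 6: degree = 10 * log10(strength)."""
    return 10.0 * math.log10(strength)


def strength_from_degree(degree_value: float) -> float:
    """Formula 7: strength = 10 ** (degree / 10)."""
    return 10.0 ** (degree_value / 10.0)


xray_strength = 0.85 / 0.06
ct_strength = 0.85 / 0.001

# Adding degrees corresponds to multiplying strengths.
combined_degree = degree(xray_strength) + degree(ct_strength)

print(round(degree(xray_strength), 1), round(degree(ct_strength), 1))
print(math.isclose(combined_degree, degree(xray_strength * ct_strength)))
```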

#### **6.5 The evidence based reasoning**

As mentioned before, in low-level reasoning the logic employed by a human is the same as Bayes' theorem. In this section, we show how to reason using evidence expressed in the form of degrees. As the section title suggests, the focus of our reasoning method is on evidence. The method addresses questions of the following type:

**Question Type:** Given a set of evidence and the prior probability of an event A, we want to reason about the posterior probability of A (the event of interest can be anything, such as the chance of surviving a disease or the goal in a planning problem). In other words, we want to find the value of the left side of the following equation:

P(A | seen evidences x, y, z, …) = ?


This method assumes that the pieces of evidence are independent of one another. In practice, the assumption often works acceptably even for evidence that is not strictly independent. Studies show that systems based on Bayes' theorem with the same assumption, such as Hidden Naïve Bayes (Jin et al., 2007), are robust because the model construction can accommodate minor factors easily. This robustness stems from the fact that the model itself has already captured the main causality; further accuracy considerations improve the result only slightly and, in a sense, merely add complexity.
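To make explicit where the independence assumption enters, the following short derivation may help. It is our own illustration for two pieces of evidence x and y, written in the odds form of Bayes' theorem. If x and y are conditionally independent given A and given ~A, the joint conditional probabilities factor, and the posterior odds become

$$\frac{P(A \mid x, y)}{P({\sim}A \mid x, y)} = \frac{P(A)}{P({\sim}A)} \times \frac{P(x \mid A)}{P(x \mid {\sim}A)} \times \frac{P(y \mid A)}{P(y \mid {\sim}A)}$$

Each of the last two factors is an evidence strength in the sense of Formula 5. Applying Formula 6 to both sides turns the product into a sum:

$$\text{degree(answer)} = \text{degree(prior odds)} + \text{degree}(x) + \text{degree}(y)$$

Without conditional independence, the joint terms P(x, y | A) and P(x, y | ~A) do not factor, and the sum of degrees is only an approximation.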

Our reasoning method can be represented as the following algorithm:

**EvidenceBasedReasoning Algorithm**

**Inputs:** raw data; an input question about the probability of an event of interest.

**Output:** posterior probability information (the answer to the input question).

Step 1: construct models (or knowledge) from raw data.

Step 2: calculate the quality of the evidence related to the input question in terms of evidence degree, with the help of Formulas 6 and 7.

Step 3: calculate the overall evidence degree.

Step 4: interpret the information by converting the overall evidence degree back to a probability (using Formulas 6 and 7 again).
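The steps above can be sketched in Python as follows. This is a hedged illustration rather than the chapter's implementation: the function name `evidence_based_reasoning` and its signature are assumptions, Step 1 is taken as already done (the model is passed in as a prior plus one pair of conditional probabilities per test), and the prior is converted to odds before taking its degree.

```python
import math
from typing import Iterable, Tuple


def degree(strength: float) -> float:
    """Formula 6: degree = 10 * log10(strength)."""
    return 10.0 * math.log10(strength)


def evidence_based_reasoning(prior: float, tests: Iterable[Tuple[float, float]]) -> float:
    """Return the posterior probability after observing every test as positive.

    Each test is a pair (P(positive | cause), P(positive | ~cause)).
    The tests are assumed to be independent of one another.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    # Step 2: degree of the prior odds and of each piece of evidence.
    total_degree = degree(prior / (1.0 - prior))
    for p_pos_cause, p_pos_not_cause in tests:
        # Step 3: accumulate the overall degree by simple addition.
        total_degree += degree(p_pos_cause / p_pos_not_cause)
    # Step 4: Formula 7 gives the posterior odds; convert odds to probability.
    posterior_odds = 10.0 ** (total_degree / 10.0)
    return posterior_odds / (1.0 + posterior_odds)


# Data from Example 2: prior 0.2%, x-ray (85%, 6%), CT scan (85%, 0.1%).
answer = evidence_based_reasoning(0.002, [(0.85, 0.06), (0.85, 0.001)])
print(f"{answer:.2f}")
```

Passing a neutral test such as `(0.5, 0.5)` adds a degree of 0 and leaves the prior unchanged, which is a quick sanity check on the additive formulation.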

We have the following comments about the degree of evidence:

1. The critical point for the degree of evidence is 0. A degree of 0 means the evidence is neutral: the positive conditional probability equals the negative conditional probability, so the evidence adds nothing to shift our view of the world.
2. If the evidence's degree is greater than 0, it shifts our view toward believing event A is true; if the degree is less than 0, it shifts our view toward believing event A is not true.
3. Degree is measured on a logarithmic scale, in orders of magnitude of strength. If evidence A's degree is 10 and evidence B's degree is 20, then evidence B is one order of magnitude (10 times) stronger than evidence A in persuasive power.

Now, let's use an example to illustrate how EvidenceBasedReasoning works.

**Example 4:** Solve the problem in Example 2 again using the EvidenceBasedReasoning algorithm. We repeat the main points and assumptions below:

1. About 0.2% of the US population above age 20 has lung cancer.
2. When doing an annual check, assume that 85% of the people with lung cancer will show positive on the chest x-ray test. About 6% of the people without lung cancer will also show positive on the chest x-ray test.
3. The second test, called a CT scan, is done independently. It returns positive for 85% of the people with lung cancer; its false-positive rate is 0.1%.
4. If a person went through the annual check and tested positive on both the chest x-ray and the CT scan, what is the probability that he/she has lung cancer?

**Answer:** We will solve this problem using the EvidenceBasedReasoning algorithm. Using Bayes' theorem, we already solved the problem and know that the correct answer to that question is

P(cancer | positive x-ray & positive CT scan) = 0.96

Here, we show that our new reasoning framework gives the result more easily. The analysis and steps for finding the answer are as follows:

First, we decide what the question is: after reading the problem statement, we know the question is P(cancer | positive x-ray & positive CT scan) = ?

Second, we calculate the degree for the prior probability (having cancer in the population) and the degrees for the two tests (x-ray and CT scan):

degree(prior) = 10 log10 (0.002) = -27 (from 2:998)

degree(x-ray) = 10 log10 (14.17) = 11.5 (from 0.85/0.06)

degree(CT scan) = 10 log10 (850) = 29.3 (from 0.85/0.001)

Third, we get the overall degree by adding all of the above degree values:

degree(answer) = -27 + 11.5 + 29.3 = 13.8

Fourth, we extract the answer as a probability by using Formula 7:

$$\text{strength(answer)} = 10^{\text{degree(answer)} / 10} = 10^{13.8 / 10} = 23.99$$

Converted to a probability, this equals P = 23.99 / (23.99 + 1) = 0.96

Thus the final answer is:

P(cancer | positive x-ray & positive CT scan) = 0.96 (the same value as obtained in Example 2)

As you can see, our evidence based reasoning is easier than applying Bayes' theorem directly when dealing with many pieces of evidence. Note that evidence based reasoning can be used in many areas, for example bioinformatics, data mining, and category classification, to name a few.

#### **7. Knowledge management in bio-information system architecture**

We have described the fundamentals of computer reasoning and proposed the EvidenceBasedReasoning algorithm. In this section, we introduce a framework of knowledge management in the context of bio-information system architectures. Based on this framework, we introduce a prototype implementation of the bio-information knowledge management system.

#### **7.1 Knowledge management framework**

In a typical knowledge management system, there are many components. Figure 5 shows an information system architecture upon which we base our reasoning framework and knowledge management methods.


Fig. 5. A bio-information knowledge management framework

In Figure 5, NLP stands for natural language processor. The NLP translates a problem written in a natural language such as English or Chinese into a well-formed problem statement that is understood by the reasoning engine, which is enclosed inside the dotted region in Figure 5. The reasoning engine consists of the Knowledge Query and Manipulation Language (KQML), the evidence based reasoning software, and the action planner. KQML is used to manage data stored in the knowledge database; its role in an expert system is much like the role that SQL plays in a database management system. The action planner is the component that drives the system. Note that the reasoning engine works with the help of the knowledge database.

#### **7.2 The evidence based reasoning software**

As you can see from Figure 5, a complete bio-information knowledge management system has both software and hardware. In this presentation, we focus on the software side, in particular on one software component: the evidence based reasoning software (expert software). We assume that the other components are already implemented and working.

#### **7.3 The potential areas of using the evidence based reasoning system**

One of the application areas of our evidence reasoning system is a bio-information system for consulting terminal patients. When a patient is diagnosed with a terminal illness, the first reaction is disbelief. Then the patient will probably ask questions such as: what is the prognosis (for example, how long can I live); what is the etiology (the cause of the disease); and whether it is hereditary. These questions are usually answered by doctors or nurses. Often, the answers a patient gets are generic and based on average cases. Also, because of the tight schedules of doctors and nurses, sometimes the patient cannot get answers in a timely manner. Here, we will develop a prototype system that answers most of the questions a terminal patient will have. The answers from our system will also be tailored to individual patients. Ideally, our system should relieve doctors and nurses of much of this burden.

#### **7.4 The evidence based reasoning software design ideas**

We are going to develop a prototype of the evidence based reasoning software component. In the following, we will outline our design ideas.

**Main design ideas:** we strive for the following:


Fig. 6. A screen capture of the user interface for the evidence based reasoning system

With the above design ideas in mind, we set the boundaries of the system and make the following assumptions.


**Assumption:** we assume the following:

1. All other components shown in Figure 5 are developed and working. The only component that we focus on is the evidence based reasoning software.
2. The diagnosis of the illness is already known.
3. The component answers only a predefined set of questions (those most important to a patient), such as the cause of the disease (etiology) and, once diagnosed, how long a person can live (prognosis).

To emphasize the main point, our implementation uses a simple design. Without loss of generality, we load the data from a file instead of asking the user to enter them from a keyboard. We also simplified some features. For example, the whole knowledge database is replaced by hard-coded logic.

#### **7.5 A case example: Colorectal cancer**

To help our presentation, we use a medical case example to illustrate some features of our evidence based reasoning system. The medical case is colorectal cancer, and we use its most common hereditary form: hereditary nonpolyposis colon cancer (HNPCC), also called *Lynch syndrome*. The following are some facts related to this disease.

**Some facts of colorectal cancer:** "Cancer of the large bowel is second only to lung cancer as a cause of cancer death in the United States; 146,940 new cases occurred in 2004, and 56,730 deaths were due to colorectal cancer." (Kasper, 2005, p. 527) This disease has hereditary factors. "As many as 25% of patients with colorectal cancer have a family history of the disease, suggesting a hereditary predisposition." (Kasper, 2005, p. 527) Once diagnosed, the prognosis "is related to the depth of tumor penetration into the bowel wall and the presence of both regional lymph node involvement and distant metastases. These variables are incorporated into the staging system introduced by Dukes and applied to a TNM classification method, in which T represents the depth of tumor penetration, N the presence of lymph node involvement, and M the presence or absence of distant metastases (Table 1)."

Table 1. Staging of and Prognosis for Colorectal Cancer (Kasper, 2005, p. 529-530)

The prevalent belief is that the disease is caused by the interplay between the environment and the cancer suppressing genes. Colorectal cancers (in fact, any type of cancer) arise because the body loses control of cell growth. The growth of normal cells is controlled by the information in their DNA; these cells know when to stop. In a cancer cell (whether caused by spontaneous mutation or by hereditary predisposition), this control is lost, so the cell grows unchecked and misshapen. Environmental factors such as a diet high in animal fat, radiation exposure, Streptococcus bacterial infection (bacteremia), and inflammatory bowel disease make a person susceptible to colorectal cancers, but these factors do not mean that a person has cancer. Cells in our body have an innate ability to fight cancers, which rests on the fact that normal cells have cancer suppressing genes. For example, "the long arm of chromosome 5 (including the APC gene)" is responsible for suppressing the development of one type of colon cancer (polyposis coli). "The loss of this genetic material (i.e., allelic loss) results in the absence of tumor-suppressor genes whose protein products would normally inhibit neoplastic growth." (Kasper, 2005, p. 528) Thus, when we see a cancer, it is the result of both the presence of environmental risk factors and the absence of the cancer fighting genes.

#### **7.6 Sample runs of the evidence based reasoning software**

In this section, we apply our prototype reasoning software to the case example introduced in the previous section. To show the effect of evidence, we show two outputs: one without specific personal information and one with it. The case information for the run without specific personal information is the following:

#### **Case 1: we use the following general information (with no specific personal information):**

Suppose that the patient (Michael Dodd) is diagnosed with (HNPCC) colon cancer stage III.

The information stored in the knowledge database is contained in Table 1.

Using the input information in case 1, we will get the default 5-year survival chance. Figure 7 is the output screen capture for case 1.


Fig. 7. A screen capture of the general 5-year survival probability for a person with colon cancer of stage III

The case information for the second run, which has specific personal information, is the following:

#### **Case 2: we use the following specific information (with personal information):**

Suppose that the patient (Michael Dodd) is diagnosed with (HNPCC) colon cancer stage III.

Michael's father had colon cancer; the time between his diagnosis and death was 3 years.

Michael's older sister had colon cancer; the time between her diagnosis and death was 4 years.

The information stored in the knowledge database is contained in Table 1.

Using the input information in case 2, we get the revised 5-year survival chance. Figure 8 is the output screen capture for case 2.

Fig. 8. A screen capture of the individualized 5-year survival probability for a person with colon cancer of stage III

As the output in Figure 8 shows, the 5-year survival probability is revised downward. Because we have more information in this case (the cancer histories of the patient's father and older sister), the evidence based reasoning software takes the new information into account and produces more accurate output. With regard to the event of 5-year survival, these pieces of evidence reduce the probability; thus, they are negative evidence according to our evidence theory. Specifically, the 5-year survival probability is revised from 35-65% down to 15-45%. The rationale and steps for this new result are as follows.

1. We first calculate the degree of the prior probability (in this case, taking the value < 65% from Table 1):

   $$\text{degree}(\text{prior}) = 10 \log_{10}(0.65) \approx -1.9$$

2. We rate the evidence E1 (Michael's father lived 3 years after diagnosis) as negative relative to the event of interest, namely that Michael is able to live 5 years. Considering that Michael's father is his direct relative, we assign degree(father's condition -> Michael 5-year survival) = -1.
3. Similarly, we rate the evidence E2 (Michael's older sister lived 4 years after diagnosis) as negative relative to the event of interest. Considering that Michael's sister is his direct relative and that 4 years is close to 5, we assign half the degree: degree(older sister's condition -> Michael 5-year survival) = -0.5.
4. Calculate the overall degree: FinalDegree = -1 + (-0.5) + (-1.9) = -3.4.
5. Convert the degree to the final strength:

   $$\text{strength}(\text{answer}) = 10^{\text{degree}(\text{answer})/10} = 10^{-3.4/10} \approx \frac{1}{2.2} \approx 0.45$$

6. Thus, the revised range will be: 15-45%.

**Note:** our assignments of degrees to the two pieces of evidence (in steps 2 and 3) are arbitrary in the sense that they are not verified. In a real situation, these values should be determined by clinical trials.

As a consequence of these reasoning steps, the evidence reasoning software produces the revised survival probability as shown in Figure 8.
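The arithmetic of the steps above can be checked with a few lines of Java. This is a minimal sketch, not the chapter's List 1: the class and method names are hypothetical, and rounding the prior degree to one decimal is an assumption made only to mirror the worked example. Evaluated without intermediate rounding of the exponent, the strength comes out at about 0.457, which the text rounds to 0.45.

```java
// Sketch of the degree/strength arithmetic from the worked example (hypothetical names).
class EvidenceDegreeSketch {

    /** degree = 10 * log10(p), rounded to one decimal as in the worked example. */
    static double degreeOf(double probability) {
        double raw = 10.0 * Math.log10(probability);
        return Math.round(raw * 10.0) / 10.0;
    }

    /** strength = 10^(degree / 10), converting a degree back to a probability-like value. */
    static double strengthOf(double degree) {
        return Math.pow(10.0, degree / 10.0);
    }

    /** Evidence is combined by adding its degrees to the prior degree. */
    static double combine(double priorDegree, double... evidenceDegrees) {
        double total = priorDegree;
        for (double d : evidenceDegrees) {
            total += d;
        }
        return total;
    }

    public static void main(String[] args) {
        double prior = degreeOf(0.65);   // upper bound for stage III used in step 1
        double fatherDegree = -1.0;      // E1, value assigned in step 2
        double sisterDegree = -0.5;      // E2, value assigned in step 3
        double finalDegree = combine(prior, fatherDegree, sisterDegree);

        System.out.println("prior degree = " + prior);
        System.out.println("final degree = " + finalDegree);
        System.out.println("strength     = " + strengthOf(finalDegree));
    }
}
```

Because degrees are logarithms of probabilities, adding degrees corresponds to multiplying the underlying factors, which is why each negative piece of evidence scales the survival estimate down.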

#### **7.7 Java code for the reasoning software**

The screen captures in the previous section are produced by Java code. We implemented the prototype in the popular Java language. List 1 shows the code that produces the output screens.

List 1: Java code to produce the output screens


#### **8. Conclusion**

In this chapter, we described the relationships among data, knowledge, and intelligence. We proposed one reasoning theory, the evidence based reasoning theory, and gave the Java code for a prototype implementation. Future work includes a more detailed mapping between the evidence strength value and its percentage change, the implementation of missing components such as the knowledge database, and the strengthening of the simplified features.

#### **9. Acknowledgement**

I want to give thanks to my family for their support for this book writing project: Enlu Peng, Yuqing Peng, and Daniel Jian.

#### **10. References**

Asy'arie, A., & Pribadi, A. (2009). Automatic News Articles Classification in Indonesian Language by Using Naive Bayes Classifier method, *In Proceedings of iiWAS2009*, Kuala Lumpur, Malaysia, 2009

Bellinger, G. (2004). Knowledge Management - Emerging Perspectives, (Internet resource: http://www.systems-thinking.org/kmgmt/kmgmt.htm. Retrieved on 5/30/2011)

Fujita, H. et al. (2010). Virtual Doctor System (VDS): Medical Decision Reasoning Based On Physical and Mental Ontologies, *In Proceedings of IEA/AIE'10 Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems*, Volume Part III, 2010

Jin, X. et al. (2007). Automatic Web Pages Categorization with ReliefF and Hidden Naive Bayes, *In Proceedings of SAC 2007*, pp. 617-621, Seoul, Korea, March, 2007

Kasper, D. et al. (2005). *Harrison's Principles of Internal Medicine* (Sixteenth Edition), McGraw-Hill Companies, Inc., ISBN 0-07-139140-1, USA

Williams, L. & Hopper, P. (2003). *Understanding Medical Surgical Nursing* (2nd Edition), F. A. Davis Company, ISBN 0-8036-1037-8, Philadelphia, PA, USA

### **Efficiency of Knowledge Transfer by Hearing a Conversation While Doing Something**

Eiko Yamamoto and Hitoshi Isahara

*Gifu Shotoku Gakuen University, Toyohashi University of Technology, Japan*

#### **1. Introduction**

One of the most common means of acquiring useful knowledge is reading suitable documents and websites. However, this is time-consuming and cannot be done in parallel with other tasks. Is there a way to acquire knowledge when we cannot read written texts, such as while driving a car, walking around or doing housework? It is not easy to remember the contents of a document simply by listening to its reading aloud from the top, even if we concentrate while listening. In contrast, it is sometimes easier to remember words heard on the radio or television even if we are not concentrating on them.

While we are doing something, listening to conversation is better than listening to a precise reading out of a draft or summary for memorizing the contents and turning them into knowledge. We are therefore trying to improve the efficiency of knowledge transfer<sup>1</sup> by "hearing a conversation while doing something."

In order to support knowledge acquisition by humans, we aim to develop a system which provides people with useful knowledge while they are doing something or not concentrating on listening. We did not try to edit notes to be read out, or to summarize documents; rather, we aimed to develop a way of transferring knowledge. Specifically, in order to provide knowledge efficiently with computers, we consider how to turn the content into a dialogue that is easily remembered, and develop a system to produce dialogue by which one can easily acquire knowledge.

In the next section of this article, we explain our prototype system named "Sophisticated Eliza" (Isahara et al., 2005). Then, we discuss the idea of "Efficient knowledge transfer by hearing a conversation while doing something" (Yamamoto & Isahara, 2008).

<sup>1</sup> In this paper, "(knowledge) transfer" is a movement of knowledge/information from a knowledge source, including a human, to a human recipient. That is to say, the term "knowledge transfer" means not only transferring knowledge between people but also transferring knowledge from computers to human. "Acquisition" is a process of understanding/memorizing knowledge by the human recipient. We focus on the process of synthesizing conversation being uttered for knowledge transfer, which relates to the "externalization" in SECI model (Nonaka & Takeuchi, 1995), in order to realize efficient knowledge acquisition by the recipient, which relates to the "combination" in the model.


In section 4, we evaluate the effectiveness of knowledge transfer via listening to conversation, comparing it with listening to monologues. We first choose several topics and select suitable documents on each topic. Then we extract informative sentences from the documents and form a conversation by splitting each sentence into conversational fragments. To verify our hypothesis, we evaluate with human subjects the usefulness of listening to conversation formed from these fragments. The experiments also suggest how to select a suitable domain for our system.

In section 5, we introduce our prototype system and present some examples of the conversations extracted by the system. Although the final target of the information transfer system is to handle practically useful topics, such as knowledge from newspapers, encyclopedias and Wikipedia, as a first step we tried to compile rules for small procedural domains such as cooking recipes.

#### **2. Sophisticated Eliza**

Recently, thanks to the improvement of natural language processing (NLP) technology, development of high-performance computers and the availability of huge amounts of stored linguistic data, there are now useful NLP-based systems. There are also practical speech synthesis tools for reading out documents and tools for summarizing documents. These tools do not necessarily use state-of-the-art technologies to achieve deep and accurate language understanding, but are based on huge amounts of linguistic resources that used not to be available. Although current computer systems can collect huge amounts of knowledge from real examples, it is not obvious how to transfer knowledge more naturally between such powerful computer systems and humans. We need to develop a novel way to transfer knowledge from computers to humans.

We believe that, based on large amounts of text data, it is possible to devise a system which can generate dialogue by a simple mechanism to give people the impression that two intelligent persons are talking. We verified this approach by implementing a system named Sophisticated Eliza which can simulate conversation between two persons on a computer. Sophisticated Eliza is not a Human-Computer Interaction system; instead, it simulates conversation by two people and users acquire information by listening to the conversation generated by the system. Concretely, using an encyclopedia in Japanese (Kodansha International, 1998) as a knowledge base, we develop rules to extract information from the knowledge base and create fragments of conversation. We extract rules with syntactic patterns to make a conversation, for example, "What is A?" "It's B." from "A is B." The system extracts candidate fragments of conversation using these simple scripts and two voices then read the conversation aloud. This system cannot generate long conversations as humans do on one topic, but it can simulate short conversations from stored linguistic resources and continue conversations while changing topics.
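The "A is B." to "What is A?" / "It's B." rule mentioned above can be illustrated with a toy Java sketch. It is an assumption-laden stand-in: the real system applies rules to the output of a Japanese parser, whereas this example uses a plain English regular expression, and the class name, method name, and sample sentence are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy illustration of one extraction rule; not the rule format used by Sophisticated Eliza.
class CopulaRuleSketch {

    // Matches "<A> is <B>." as a crude English stand-in for a parser-based syntactic pattern.
    private static final Pattern A_IS_B = Pattern.compile("^(.+?)\\s+is\\s+(.+?)\\.?$");

    /** Returns {question, answer} for an "A is B." sentence, or empty if the pattern does not match. */
    static Optional<String[]> toExchange(String sentence) {
        Matcher m = A_IS_B.matcher(sentence.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        String question = "What is " + m.group(1) + "?";
        String answer = "It's " + m.group(2) + ".";
        return Optional.of(new String[] {question, answer});
    }

    public static void main(String[] args) {
        toExchange("Sumo is a traditional Japanese sport.").ifPresent(pair -> {
            System.out.println("Voice 1: " + pair[0]);
            System.out.println("Voice 2: " + pair[1]);
        });
    }
}
```

Sentences that do not fit the pattern simply yield no fragment, which matches the idea of collecting candidate fragments from a large text rather than converting every sentence.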

Figure 1 shows a screenshot of Sophisticated Eliza and Figure 2 shows its system flow. Figure 3 shows examples of conversation generated by the system.

Fig. 1. Screenshot of Sophisticated Eliza

Fig. 2. System Flow of Sophisticated Eliza

The encyclopedia utilized here covers all aspects of Japan, e.g., history, culture, economy and politics. All sentences in the encyclopedia are analyzed syntactically using a Japanese parser, and we use rules to extract the fragments of conversation using information in the encyclopedia. As for the manual compilation of rules, we carefully analyzed the output of the Japanese parser and found useful patterns to extract knowledge from the encyclopedia.



The terms extracted during the syntactic analysis are stored in the keyword table and are used for selection of topics and words during the conversation synthesis.



#### Fig. 3. Examples of Generated Conversation

Note that our current system takes Japanese documents as input. Because it uses only the syntactic information output by the Japanese parser, the mechanism is also applicable to other languages such as English. We use a rather simple mechanism to generate the actual conversations, consisting of rules that select fragments containing similar words and rules that change topics. The contents of the encyclopedia are divided into seven categories: geography, history, politics, economy, society, culture and life. When a conversation moves from one topic to another, the system generates an utterance that signals the move. For speech synthesis, we use the synthesizer "Polluxstar" developed by Oki Electric Industry Co. Ltd., Japan. The two authors of this paper, one male and one female, recorded 400 sentences each, and the two characters in the system talk to each other in imitations of our voices. The images of the two characters are also based on the authors.
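The selection and topic-change rules described above can be illustrated with a minimal sketch. It is not the authors' implementation: the fragments, the stopword list, the helper names (`build_keyword_table`, `next_fragment`, `generate`) and the wording of the topic-change utterance are all assumptions made for illustration, and English text stands in for the Japanese input.

```python
import re

# Hypothetical fragments as (category, text) pairs. The real system extracts
# fragments from a Japanese encyclopedia; these English ones are invented.
FRAGMENTS = [
    ("geography", "Mount Fuji is the highest mountain in Japan."),
    ("geography", "Many climbers visit Mount Fuji in summer."),
    ("history", "Kyoto was the capital of Japan for centuries."),
    ("culture", "Tea ceremony developed in Kyoto temples."),
]
STOPWORDS = {"is", "the", "in", "of", "was", "for", "a", "many"}


def keywords(text):
    """Return the set of lower-cased content words in a fragment."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def build_keyword_table(fragments):
    """Map each keyword to the indices of the fragments that contain it."""
    table = {}
    for i, (_, text) in enumerate(fragments):
        for w in keywords(text):
            table.setdefault(w, set()).add(i)
    return table


def next_fragment(current, used, fragments, table):
    """Pick the unused fragment sharing the most keywords with the current one."""
    scores = {}
    for w in keywords(fragments[current][1]):
        for j in table.get(w, ()):
            if j not in used:
                scores[j] = scores.get(j, 0) + 1
    if not scores:
        return None  # nothing similar is left
    best = max(scores.values())
    return min(j for j, s in scores.items() if s == best)  # deterministic tie-break


def generate(fragments, start=0, turns=4):
    """Alternate speakers A and B, following similar words and announcing topic moves."""
    table = build_keyword_table(fragments)
    current, used = start, {start}
    lines = [f"A: {fragments[current][1]}"]
    while len(lines) < turns:
        speaker = "AB"[len(lines) % 2]
        nxt = next_fragment(current, used, fragments, table)
        if nxt is None:
            # Topic-change rule (assumed): fall back to the first unused fragment.
            remaining = [j for j in range(len(fragments)) if j not in used]
            if not remaining:
                break
            nxt = remaining[0]
        category, text = fragments[nxt]
        if category != fragments[current][0]:
            # Utterance that signals the move to another category.
            text = f"Now, let's turn to {category}. {text}"
        lines.append(f"{speaker}: {text}")
        used.add(nxt)
        current = nxt
    return lines
```

Calling `print("\n".join(generate(FRAGMENTS)))` prints a four-turn exchange in which the second turn is chosen by word overlap ("Mount", "Fuji") and the later turns are prefixed with a topic-change phrase.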

Because this system uses simple template-like knowledge, it cannot generate semantically deep conversation on a topic by considering context or by compiling highly precise rules to extract script-like information from text. Thus, the mechanism used in this system has room for improvement to create conversations for knowledge transfer.

#### **3. Efficiency of hearing a conversation compared with hearing a monologue**

In everyday knowledge transfer, such as a cooking program on TV, the presenter does not simply read the recipe aloud; the cook and the assistant also converse. Through such conversations, the information that viewers want to know, and that they should memorize, is conveyed to them naturally.

To test this claim, we conducted experiments with human subjects.

#### **3.1 Settings**


We used the speech synthesizer "Polluxstar" by Oki Electric Industry Co. Ltd., which can synthesize speech in a specific person's voice. We supplied voice data from the two authors of this paper (one male and one female).

We prepared three materials to be synthesized for the experiments: two recipes and one sports news article. The recipes were taken from a Japanese recipe site with videos (http://recipe.gnavi.co.jp/movie/sweetkitchen/). One is for a rice bowl with chicken and eggs, and the other is for gratin. For each dish, the site provides a short video featuring a chef and an assistant, together with a written recipe. For the dialogue version, we transcribed the entire conversation between the chef and the assistant and had the speech synthesizer read it aloud. For the monologue version, we simply had the speech synthesizer read the written recipe aloud with one of the two voices in the system. For the news material, we chose a Japanese newspaper article about women's soccer games. For its monologue, we had the speech synthesizer read the article aloud with one of the two voices. For its dialogue, we manually added inquiries about some points of the news and had the speech synthesizer read the result aloud.

We recruited participants for our experiment from among students of Toyohashi University of Technology, Japan. We had 33 participants and an additional 4 male student participants. Because the main topic of the experiment was recipes, we recruited mainly female students. The participants were asked to listen to all six synthesized speeches, i.e. two cooking dialogues, two cooking monologues, one news dialogue and one news monologue, and to fill in a questionnaire after each speech.

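The items asked in the questionnaire are as follows:

1. Recipe
   - 1a: whether you can make the dish or not
   - 1b: ingredients
   - 1c: cooking procedure
   - 1d: important points for the procedure
   - 1e: comparison between monologue and dialogue
2. News
   - 2a: impression of monologue
   - 2b: impression of dialogue
   - 2c: comparison between monologue and dialogue
3. Overall comparison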

#### **3.2 Results**

#### **3.2.1 Recipe**

The objective features extracted from the questionnaire are as follows.

For the rice bowl recipe:

| Feature | Dialogue | Monologue |
|---|---|---|
| Number of ingredients | 8.1 items (111) | 7.3 items (100) |
| Number of words for the cooking procedure | 99 words (96) | 103 words (100) |
| Number of words for the important points of the procedure | 61 words (124) | 49 words (100) |

Figures in parentheses are indices relative to the monologue condition (monologue = 100).
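The parenthesized figures in the results above are consistent with expressing each dialogue value as an index with the monologue value set to 100. The following sketch reproduces that arithmetic; the dictionary layout and the function name `relative_index` are illustrative choices, and only the six reported values come from the text.

```python
# Values per condition as reported in the results above.
results = {
    "ingredients (items)": {"dialogue": 8.1, "monologue": 7.3},
    "cooking procedure (words)": {"dialogue": 99, "monologue": 103},
    "important points (words)": {"dialogue": 61, "monologue": 49},
}


def relative_index(value, baseline):
    """Express value as an index where the baseline equals 100."""
    return round(100 * value / baseline)


# Index of the dialogue condition relative to the monologue condition.
indices = {
    name: relative_index(v["dialogue"], v["monologue"])
    for name, v in results.items()
}
```

Here `indices` holds 111, 96 and 124 for the three features, matching the reported figures.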

For the rice bowl recipe, dialogue appears slightly better than monologue. However, the experiment with the gratin recipe showed a different result: monologue was better than dialogue. We examined the results carefully and found the following.

The group that listened to the gratin dialogue heard it at the beginning of the experiment, whereas the group that listened to the gratin monologue had already heard the rice bowl speech. Each participant who listened to the gratin monologue therefore already knew what kinds of questions would be asked and could concentrate on grasping those points. We ran additional experiments with a smaller number of participants, in which each participant listened to the gratin dialogue after listening to the rice bowl speech. The results then became closer to those of the rice bowl case. This suggests that the gratin dialogue is not really much worse than its monologue. In the free-answer section of the questionnaire, more participants wrote that they preferred dialogue to monologue than the reverse.

The same situation occurred in the rice bowl case, i.e. the rice bowl monologue was heard after the gratin speech. The true difference between dialogue and monologue for the rice bowl recipe may therefore be larger than the figures above indicate.

#### **3.2.2 News**

We asked the participants whether they preferred the monologue or the dialogue. More than two thirds of the participants explicitly wrote that they preferred the dialogue.

#### **3.2.3 Discussion**

For recipe listening, dialogue seems slightly better than monologue. However, several factors in our experiment biased the result in favor of monologue. We used written text from the web as the monologue text. In that text, the important points of the recipe are listed at the end, which makes them memorable to listeners. If the dialogue text were created from the written text, the dialogue results would probably be better than in our current setting.

For news listening, the second speaker only inserted a few inquiries about the topic to be discussed next. This is less a conversation than something like an interview. Some participants strongly preferred this style. We need to establish a way to generate dialogues properly from texts.

Our hypothesis is that dialogue is more useful for acquiring information while doing something. In this experiment, however, participants were asked to listen to the monologue and dialogue and then answer a questionnaire. This differs from the situation we originally had in mind, so we need to design a more suitable way to verify our hypothesis.


#### **4. Efficient knowledge transfer by hearing a conversation while doing something**

We started to develop a mechanism to achieve natural knowledge acquisition for humans by turning information that is written in documents into conversational text. Efficient methods of acquiring knowledge include not only "reading documents" and "listening to passages read aloud," but also "hearing a conversation while doing something," provided that information is appropriately embedded into the conversation. We believe that we can verify that this "conversation hearing" can assist knowledge acquisition by developing a system for synthesizing conversations by collecting fragments of conversation and conducting experiments by using the system.

As a means to transfer information, contents conveyed by an interpretive reading with pronounced intonation are better retained in memory than if read monotonously from a document or summary. Furthermore, by turning contents into conversation style, even someone who is not concentrating on listening may become interested in the topic and acquire the contents naturally. This suggests that several factors in conversations, such as throwing in words of agreement, pauses and questions, which may appear to decrease the density of information, are actually effective means of transferring information matching humans' ability to acquire knowledge with limited concentration. Based on this idea, we propose a novel mechanism of an information transfer system by considering the way of transferring knowledge from computers to humans.

Various dialogue systems have already been developed as communication tools between humans and computers (Weizenbaum, 1966; Matsusaka et al., 1999). In our approach, however, the dialogue system treats the user as an outsider: it presents a conversation between two speakers in the computer on a subject of interest to the user, and thereby provides the user with useful knowledge.

There are dialogue systems (Nadamoto & Tanaka, 2004; ALICE; UZURA) which can take part in a conversation between a human and a computer, but they simply create fragments of conversation and so do not sound like an intelligent human speaker. One reason is that they do not aim to provide knowledge or transfer information to humans; moreover, few theoretical evaluations have been done in this field. In this research, we consider how to transfer knowledge and develop a conversation system that generates dialogue between two speakers in the computer, from which humans can acquire knowledge. We then analyze how knowledge is transferred to humans with this system. This kind of research is beneficial not only from an engineering viewpoint but also from the viewpoints of cognitive science and cognitive linguistics. Furthermore, speech synthesis systems in which two participants automatically conduct a spoken conversation are rare. In this research, we develop an original information-providing system that assigns the conversation to two speakers in the computer in order to transfer knowledge to humans.

#### **5. System implementation**

The principle of Sophisticated Eliza is that because a large amount of text data is available, even if the recall of information extraction is low, we can obtain sufficient information to generate short conversations. However, the rules still need to be improved by careful analysis of input texts.
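The argument above can be made concrete with a rough estimate. The symbols and numbers below are illustrative assumptions, not figures from the system: let $N$ be the number of sentences in the corpus that contain a usable fragment, $r$ the recall of the extraction rules, and $k$ the number of fragments needed to sustain short conversations.

$$
\mathbb{E}[\text{fragments extracted}] = r\,N, \qquad \text{which is sufficient whenever } r\,N \ge k .
$$

For example, with a hypothetical $N = 100{,}000$ and $r = 0.05$, about $5{,}000$ fragments would be extracted, which is ample if $k$ is a few hundred. The estimate covers quantity only; it says nothing about the precision or coherence of the extracted fragments.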


As for the information transfer system, our final target is to handle practically useful topics, such as knowledge from newspapers, encyclopedias and Wikipedia; as a first step, we are compiling rules for small procedural domains such as cooking recipes. Concretely, we are developing the new system by iterating over the following five steps.

1. Enlargement of conversational script and template in order to generate sentences in natural conversation

We have already compiled simple templates for extracting fragments of conversation as a part of Sophisticated Eliza. We are now enlarging the set of templates to handle wider contexts, domain-specific knowledge and insertion of words. This enlargement is basically being done manually. Here, domain-specific knowledge includes domain documents in a specific format, such as recipes. Insertion of words includes words of agreement and encouragement for the other speaker, part of which is already introduced in Sophisticated Eliza. An example of synthesized conversation is shown in Figure 4.

2. Implementation of system in which two speakers (agents/characters) make conversation in a computer considering dialogue and document contexts

Using the conversational templates extracted based on the contexts, the system continues conversation with two speakers. Fundamental functions of this kind have already been developed for Sophisticated Eliza.

Here, there are two types of "context." One is the context in the documents, i.e. knowledge-base. For the recipe example, cooking heavily depends on the order of each process and on the result of each process. The other type is the context in the conversation.

A: Let's make boiled scallop with lettuce and cream.

B: It is 244 Kcal for one person.

A: What kinds of materials are needed?

B: Lettuce and scallop. For four persons, four pieces of tomatoes and ……

……………

A: How will we cook lettuce?

B: Pour off the hot water after boiling it. Then cool it.

A: How about tomatoes?

B: Remove seeds and dice them.

#### Fig. 4. Example conversation

If every subevent of an event were uttered explicitly in the conversation, the conversation would be very dull and harder to understand. For example, "Make hot water in a pan. Peel potatoes and boil them" is enough; it is not necessary to say "boil peeled potatoes in the hot water in a pan." Appropriate use of ellipsis and anaphoric expressions based on the context of the conversation makes the content easier to understand.

Though speech synthesis itself is out of the scope of our research, pauses in utterances are also important in natural communication.


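3. Mechanism to extract (fragment of) knowledge from text

Sophisticated Eliza outputs informative short conversations, but the content of the conversation is not consistent as a whole. In this research, we are developing a system to provide people with useful knowledge. We therefore have to recognize the useful part of the knowledge base and give great weight to the useful part extracted from the text. We previously reported how to extract an informative set of words using a measure of inclusive relations (Yamamoto et al., 2005), and will apply a similar method to this conversation system.

4. Improvement of conversation script and template considering "fragment of knowledge"

By considering the useful part of the information written in the knowledge base, we modify the templates used to extract conversational text. Contextual information such as ellipsis and anaphora is also treated in this step. As a first step, we will handle anaphora resolution in a specific domain, such as cooking, considering the factors described in step 2. We will use domain knowledge about cooking, such as cookware, cooking methods and ingredients.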

5. Evaluation


We will conduct tests with participants to evaluate our methodology and verify the effectiveness of our method for transferring knowledge. So far, a small number of participants have reported that the voice of the system is rather easy to listen to; however, objective evaluation remains future work.
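As an illustration of the template-based generation outlined in the steps above, the following sketch turns a small structured recipe into a two-speaker exchange in the style of the example conversation. It is a simplified assumption about how such templates might look, not the system's actual rule set: the `RECIPE` structure, the function name `recipe_to_dialogue`, the question templates and the inserted word of agreement are all hypothetical.

```python
# Hypothetical structured recipe. In the chapter, this kind of information
# is to be extracted from recipe documents with manually compiled templates.
RECIPE = {
    "dish": "boiled scallop with lettuce and cream",
    "ingredients": ["lettuce", "scallop", "tomatoes"],
    "steps": {
        "lettuce": "Pour off the hot water after boiling it. Then cool it.",
        "tomatoes": "Remove seeds and dice them.",
    },
}


def recipe_to_dialogue(recipe):
    """Turn a recipe into A/B utterances using fixed templates."""
    turns = [
        ("A", f"Let's make {recipe['dish']}."),
        ("B", "Sounds good."),  # inserted word of agreement
        ("A", "What kinds of ingredients are needed?"),
        ("B", ", ".join(recipe["ingredients"]).capitalize() + "."),
    ]
    first = True
    for ingredient, instruction in recipe["steps"].items():
        # The first question names the action in full; later questions rely
        # on ellipsis ("How about ...?") because the context is established.
        if first:
            question = f"How will we cook {ingredient}?"
        else:
            question = f"How about {ingredient}?"
        turns.append(("A", question))
        turns.append(("B", instruction))
        first = False
    return [f"{speaker}: {text}" for speaker, text in turns]
```

Calling `recipe_to_dialogue(RECIPE)` returns eight utterances. The elliptical second question is a simple instance of using the conversational context to avoid repeating information that has already been stated.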

#### **6. Conclusion**

We introduced an approach for developing an information-providing system in order to support knowledge acquisition. The system can transfer knowledge to humans even while the person is doing something or is not concentrating on listening to the voice. Our approach does not create a summary of the key points of what is being read out, but focuses on the knowledge transferring method. Specifically, to provide knowledge efficiently, we consider what kinds of conversation are naturally retained in the brain, as such conversations may enable people to obtain knowledge more easily. We aim to construct an intelligent system which can create such conversations by applying natural language processing techniques.

#### **7. Acknowledgment**

We would like to thank Mr. Satoshi Watanabe of WANT Co. Ltd. for his support in tuning the speech synthesizer to our voices and generating all the speech data for our experiments.

#### **8. References**

A.L.I.C.E. The Artificial Linguistic Internet Computer Entity, http://alice.pandorabots.com

Artificial non-Intelligence UZURA, http://www.din.or.jp/~ohzaki/uzura.htm

Isahara, H.; Yamamoto, E.; Ikeno, A. & Hamaguchi, Y. (2005). Eliza's daughter. *Annual Meeting of Association for Natural Language Processing of Japan*, Japan

Kodansha International (1998). *The Kodansha Bilingual Encyclopedia about Japan* (in Japanese and English), Kodansha International Ltd., ISBN 978-477-0021-30-4, Tokyo, Japan

**4**

**Algorithm Selection: From Meta-Learning to Hyper-Heuristics**

Laura Cruz-Reyes<sup>1</sup>, Claudia Gómez-Santillán<sup>1</sup>, Joaquín Pérez-Ortega<sup>2</sup>, Vanesa Landero<sup>3</sup>, Marcela Quiroz<sup>1</sup> and Alberto Ochoa<sup>4</sup>

*<sup>1</sup>Instituto Tecnológico de Cd. Madero*

*<sup>2</sup>Centro Nacional de Investigación y Desarrollo Tecnológico*

*<sup>3</sup>Universidad Politécnica de Nuevo León*

*<sup>4</sup>Universidad de Ciudad Juárez*

*México*

#### **1. Introduction**

For a company to be competitive, it must manage its resources efficiently. This requirement gives rise to many complex optimization problems that need to be solved with high-performance computing tools. Moreover, because of the complexity of these problems, the most promising approach is considered to be solution with approximate algorithms, in particular heuristic optimizers. This category comprises basic heuristics, which are experience-based techniques, and metaheuristic algorithms, which are inspired by natural or artificial optimization processes.

A variety of approximate algorithms that have shown satisfactory performance on optimization problems have been proposed in the literature. However, no algorithm performs best in all possible situations; given the number of available strategies, it is necessary to select the one best suited to the problem. An important point is to know which strategy is the best for the problem and why it is better.

The chapter begins with the formal definition of the Algorithm Selection Problem (ASP), starting from its initial formulation. The following section describes examples of "Intelligent Systems" that use an algorithm selection strategy. After that, we present a review of the literature related to solving the ASP. Section four presents our research group's proposals for solving the ASP, which are based on machine learning, neural networks and hyper-heuristics. The section also presents experimental results from which we draw conclusions about the advantages and disadvantages of each approach. Because a fully automated solution to the ASP is an undecidable problem, Section five reviews a less rigid approach that intelligently combines different strategies: hybrid systems of metaheuristics.

