task in focus is player modeling or, more generally, user modeling. Seen from the data point of view, it is string mining. Seen from the viewpoint of the algorithms deployed, the task is pattern inference.

To achieve high expressiveness, the authors prefer a logical terminology powerful enough to approximately represent human goals, intentions, preferences, desires, fears, and the like. Seen this way, the task is theory induction, and the method is hypothesis refutation [9, 10].

2. From software tools to intelligent assistant systems

No doubt, digitalization pervades nearly every sphere of life. Humans are facing more and more digital systems at their workplaces, in everyday education, in their spare time, and in health care. With the US Food and Drug Administration's approval of aripiprazole tablets with sensors in November 2017 [11], digitalization has reached the inside of the human body.

Frequently, the process of digitalization places on humans the burden of learning about new digital systems and how to use them appropriately. More digital systems do not necessarily make human life easier. To use them effectively, users need to become acquainted with software tools, have to understand the interfaces, and have to learn how to wield the tools. "A tool is something that does not do anything by itself unless a user is wielding it appropriately. Tools are valuable for numerous simple tasks and in cases in which a human knows precisely how to operate the tool. Those tools have their limitations as soon as dynamics come into play. There are various sources of dynamics, such as a changing world or human users with different wishes, desires, and needs" (see [12], p. xii). As the present authors put it earlier, the digitalization process "bears abundant evidence of the need for a paradigmatic shift from digital tools to intelligent assistant systems" (see [7], p. 28).

As for human assistance, the most helpful assistants are those who have their own ideas, go their own ways, and, from time to time, surprise us with unexpected outcomes. This applies to digital assistant systems as well.

Approaches to intelligent system assistance are manifold (e.g., see [13, 14] and the references therein including the authors' contributions [15, 16]).

Digital assistants are programmed to behave differently in different conditions such as varying environmental or infrastructure contexts and varying human users with different prior knowledge, preferences, skills, needs, desires, fears, and the like. To adapt accordingly, assistant systems need to learn from the data available. In a sense, a digital assistant system has "to ask itself," so to speak, how to learn what the user needs from sparse information such as mouse clicks or swipes over the screen.

Seen in the right perspective, digital assistant systems face the problem of learning from incomplete information, sometimes called inductive inference [17]. Digital assistant systems are necessarily learning systems.

The purpose of the system's learning is to understand the context of interaction to which it must adapt. In this chapter, the authors confine themselves to understanding the human user.

3. Perspectives at inductive modeling and data mining

### 3.1. Approaches and opinions

48 Data Mining

For decades, the misconception of data mining as digging for golden nuggets has haunted the topical literature [25, 26]. Some authors believe that data mining means somehow squeezing insights out of the given data and express this opinion in words such as "visualization exploration is the process of extracting insight from data via interaction with visual depictions of that data" (see [27], p. 357).

Instead, data mining is a creative process of model formation based on incomplete information (see [7], p. 108). In brief, data mining is inductive modeling.

The details of the inductive modeling process depend substantially on the data, on the goal of modeling, on the algorithmic technologies in use, and on the underlying model concept including the syntax of representing models [28, 29]. For the details of the model concept, [30] is of particular interest.

The work in [30, 31] serves as the basis for a systematic framework of data mining design in which the model concept resides at the center. An outer frame, so to speak, consists of the application domain, the methods of model construction, and the methods of model use. Every concrete model depends (a) on the context of modeling, (b) on communities of practice, (c) on the purpose of modeling, and, possibly, (d) on models generated earlier (see [31], p. 37). This covers Fayyad's KDD process [32] and the CRISP data mining model [33].

Figure 1. Fayyad's KDD process according to [32] vs. the CRISP data mining model as in [33].

Given the difficulty of learning from incomplete information, process models of data mining allow for cyclic processing as shown in Figure 1. Consequently, data mining has to be seen as a process over time that does not result in a single model, but in an unforeseeably long sequence of successively generated hypothetical models. This closely resembles the learning systems perspective of [17].


Mining HCI Data for Theory of Mind Induction http://dx.doi.org/10.5772/intechopen.74400 51


In the authors' opinion, thinking about emerging sequences of hypotheses is badly underestimated in contemporary data mining investigations. Pondering model concepts is not sufficient. We need to put emphasis on the investigation of suitable spaces of model hypotheses.

### 3.2. Theories of mind

Throughout the rest of this chapter, spaces of hypotheses will be spaces of logical theories. The concept of a theory of mind is adopted and adapted from behavioral research in animals [34]. There is much evidence that certain animals reflect on the intentions and behaviors of other animals [35, 36]. Birds of the species Aphelocoma californica (the western scrub jay, in particular the California scrub jay) are food-caching. They cache not only food but also colorful and shiny objects such as plastic toys. If such a bird, call it A, is caching food or other treasures while being watched by another bird of its species, call it B, then A returns shortly afterwards to unearth the treasures cached before. The interpretation is, loosely speaking, that bird A thinks about the possibly malicious thoughts of bird B. It builds its own theory of mind. More generally speaking, thinking about another one's thoughts means building a theory of mind.

The authors aim at digital assistant systems able "to understand their human users" by hypothesizing theories of mind. Anthropomorphically speaking, digital assistant systems shall be enabled "to think about their human user's thoughts." The cornerstone has been laid in [1, 2]. Case studies as in [4, 5, 37] demonstrate that this is possible.

For this purpose, user models are seen as theories, namely formalizations of theories of mind, so that human user modeling becomes theory induction. The conceptual approach is called theory of mind modeling and induction [1, 2] based on human-computer interaction data.

### 3.3. Data mining as theory induction based on HCI data

What the system "knows" about its human user comes from an analysis of interaction data. [3] describes a study based on a commercial digital game. When playing the game, players may learn about pieces of legerdemain. They play successfully when they are able to script the necessary steps of a conjuring trick. Patterns of game playing behavior reveal the human players' success or failure. Instances of those patterns show up in recorded game play. It is the system's task to learn patterns from their instances. This approach is generalized toward theory induction.

The concept of a pattern in science dates back to work by Alexander in architecture [38, 39]. Angluin redefines the pattern concept for purposes of formal language investigations and, most importantly, provides algorithmic concepts for learning patterns from instances [40]. Exactly this is what an assistant needs to do when collecting sequences of interaction data.


In game studies such as [3, 5], interaction is abstractly represented by finite sequences over an alphabet A that contains identifiers of all possible activities of all engaged agents such as human players, non-player characters (NPCs), and other computerized components. A<sup>+</sup> denotes the set of all those strings, and given any particular game G, Π(G) is the subset of all strings that can occur according to the rules and mechanics of G. Angluin's pattern concept describes string properties of a certain type. If instances ω<sub>1</sub>, …, ω<sub>n</sub> ∈ Π(G) occur, the learning task consists in finding a pattern p that holds in all the observed strings. In a sense, p is a theory with {ω<sub>1</sub>, …, ω<sub>n</sub>} ⊨ p, where ⊨ denotes the logical consequence operator.

The authors generalize this approach to human-computer interaction in general, beyond the limits of game play, as undertaken in [6, 7].

The validity of the logical expression {ω<sub>1</sub>, …, ω<sub>n</sub>} ⊨ p means consistency of the hypothesis p with the set of observations Ω<sub>n</sub> = {ω<sub>1</sub>, …, ω<sub>n</sub>} it is built upon. In conditions more general than pattern inference according to [40], the choice of the logic is decisive for consistency. In the conventional case, the question of whether Ω<sub>n</sub> ⊨ p holds is recursively decidable but NP-hard. Concerning the background of computability and complexity, the authors rely on [41, 42]. For the moment, let us assume any suitable logic. Details will be discussed as soon as they become relevant.
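The consistency requirement can be illustrated with a minimal sketch. It assumes, purely for illustration, that a hypothesis p is given as a decidable Python predicate on a single interaction string; the names `is_consistent`, `omega_n`, and `ends_with_success` are hypothetical and do not come from the cited work, and the sketch says nothing about the complexity of the general problem.

```python
from typing import Callable, Iterable, Sequence

# One recorded interaction: a finite sequence of action identifiers.
Interaction = Sequence[str]
# Assumption: a hypothesis is modeled as a decidable predicate on one string.
Hypothesis = Callable[[Interaction], bool]


def is_consistent(observations: Iterable[Interaction], p: Hypothesis) -> bool:
    """Return True if p holds in every observed string."""
    return all(p(omega) for omega in observations)


def ends_with_success(omega: Interaction) -> bool:
    """Hypothetical hypothesis: the interaction ends with a successful wand use."""
    return len(omega) > 0 and omega[-1] == "mw"


# Hypothetical observations, for illustration only.
omega_n = [("mh", "sc", "mw"), ("mh", "sc", "mw-", "sc", "mw")]
```

With these definitions, `is_consistent(omega_n, ends_with_success)` holds, and a single further observation ending in a failed attempt would refute the hypothesis.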

A closer look at conventional data mining process models as in Figure 1 reveals that the original data appear somehow static. Both models on display show data represented by a drum icon. Emergence over time is beyond the limits of conventional perspectives. In contrast, human-computer interaction data emerge over time [3, 5, 7]. This leads to the learning task of processing sequences Ω<sub>1</sub> ⊆ Ω<sub>2</sub> ⊆ … ⊆ Ω<sub>n</sub> ⊆ … of growing finite data sets of observations. When learning patterns according to [39], the learner returns hypotheses p<sub>1</sub>, p<sub>2</sub>, …, p<sub>n</sub>, … such that every hypothesis p<sub>n</sub> is consistent with the underlying data set Ω<sub>n</sub>.

Consistency is a critical requirement and may be relaxed by approximations in different ways. In learning theory, it is known that algorithms that are allowed to temporarily return inconsistent hypotheses are more effective [17, 43, 44]. For reasons of space, the authors refrain from a detailed discussion of these effects.

Extending the abovementioned approach, one arrives at an understanding of mining HCI data as the induction of theories over emerging sequences Ω<sub>1</sub> ⊆ Ω<sub>2</sub> ⊆ … ⊆ Ω<sub>n</sub> ⊆ … of data. The result is a corresponding sequence of logical theories T<sub>1</sub>, T<sub>2</sub>, …, T<sub>n</sub>, … which, if possible, should converge to an ultimate explanation T of the observed human-computer interaction, i.e., Ω<sub>n</sub> ⊨ T for all sets of observations.

By way of illustration, [5] is based on a case study in which the sequence of theories begins with some default T0 and consists of 35 subsequent hypothetical theories. In this sequence, subsequent theories remain unchanged frequently. There are only nine changes of hypotheses. The final theory reasonably explains the overall human user's behavior.
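Counting how often the hypothesis changes along such a sequence is straightforward. The following sketch uses a made-up sequence of theory labels, not the data of the case study in [5]; the function name `count_hypothesis_changes` is hypothetical.

```python
def count_hypothesis_changes(theories):
    """Count the positions at which a theory differs from its predecessor."""
    return sum(1 for prev, cur in zip(theories, theories[1:]) if prev != cur)


# Made-up sequence starting with a default theory "T0", for illustration only.
example_sequence = ["T0", "T0", "T1", "T1", "T1", "T2"]
```

Here `count_hypothesis_changes(example_sequence)` yields 2, since the theory changes once from "T0" to "T1" and once from "T1" to "T2".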

Note that there are various other approaches to dealing with dynamics such as [45, 46]. The authors, however, stick to the logical approach for its declarativity and expressiveness. Unlike other approaches, they dovetail logical reasoning and inductive inference [17, 43]. In this way, logics and recursion theory underpin data mining on HCI data.

### 3.4. Formalization and operationalization of theory induction on HCI data

The logical background of the authors' approach includes reasoning about changes in time. This leads directly to temporal logics, which have been around for more than half a century [47, 48]. In those good old days, time was tense, but in conditions of digitalization, time became digital as well [49]. It was already known before that this makes a difference [50].

In the simple digital game case study [4, 5], it is sufficient to choose the Hilbert-style logic K (see [49], Section 1.6).

Which logic to choose depends on the particular domain of application. In particular, there is an indispensable need (i) to formalize background knowledge. The logic must allow for the representation of knowledge in such a way that it is easy (ii) to refute hypotheses [9, 10]. Below, we will come back to these two issues. Logics taken into account come from [49–56]. For the generic approach discussed in this section, however, the choice is of secondary importance.


When speaking about human-computer interaction with the intention of user modeling by theories of mind, the fundamental question is what to take into account. Interaction may be represented on widely varying levels of granularity [57], ranging from keystrokes and swipes over the screen through compound actions to activities on a task level (named quests in the world of digital games, where the approach originated). The authors are engaged in a joint project in which even the exact position of a document on the screen plays a role in mining HCI data and, thus, must be documented in interaction representations [58].

The (finite) set of actions of interest is denoted by A. It is considered an alphabet. As usual in theoretical computer science, A\* denotes the set of all finite strings over A including the empty string ε, and A<sup>+</sup> = A\* ∖ {ε}. If it makes sense, one may restrict A<sup>+</sup> to the set Π of only those sequences that can occur in practice, Π ⊆ A<sup>+</sup>. In conditions of strictly regulated interaction possibilities, Π is a formal language [59].
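To make the sets involved concrete, the following sketch enumerates a finite slice of the non-empty strings over a small alphabet and restricts it by an admissibility rule. The two-letter alphabet and the rule `wand_follows_script` are assumptions made for illustration; they are not the rules of any game studied in the text.

```python
from itertools import product


def strings_up_to(alphabet, max_len):
    """Yield all non-empty strings over the alphabet with length <= max_len."""
    for n in range(1, max_len + 1):
        for s in product(sorted(alphabet), repeat=n):
            yield s


def restrict(strings, admissible):
    """Keep only the strings that can occur according to the given rule."""
    return [s for s in strings if admissible(s)]


def wand_follows_script(s):
    """Hypothetical rule: every 'mw' is immediately preceded by 'sc'."""
    return all(i > 0 and s[i - 1] == "sc" for i, a in enumerate(s) if a == "mw")


A = {"sc", "mw"}
all_strings = list(strings_up_to(A, 2))
admissible_strings = restrict(all_strings, wand_follows_script)
```

The slice contains the two strings of length 1 and the four strings of length 2; the hypothetical rule leaves only those in which no wand action occurs without a preceding scripting step.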

Every string π ∈ Π abstractly represents some process of human-computer interaction such as a game play [3] or a session with a data analysis tool [7]. When trying "to understand the human user," π is subject to investigation. Sometimes, there is a finite subset of Π available. By way of illustration, see Figure 2 adopted and adapted from ([3], p. 89, Figure 6.3).

According to game magazines worldwide, the innovation of the commercial game studied in [3] allegedly consists in the unprecedented feature of players doing conjuring tricks. To investigate this in more detail, the alphabet A of actions contains player actions such as clicking on a magic head (denoted by mh in the strings on display in Figure 2), opening a grimoire, called a magic book in the game (mb), turning pages of the book when searching for an appropriate trick (tp), selecting a trick from the book (st), scripting the steps of the trick in preparation for the performance (sc), and presenting the trick by means of a magic wand either successfully (mw) or not (mw-). The digital game system's actions, indicated by boxes ranging from yellowish to reddish in Figure 2, are the presentation of a magic head (mh) indicating that there is an opportunity for witchcraft, a comment (co) in response to a human user's click to inform the player what to do, the opening of the magic book (mo) to allow for scripting tricks, and, in case the trick has been scripted correctly and the user has triggered its execution by clicking on the magic wand, a virtual execution (ex) of the trick by means of a cut scene and some response (re) to the player about the success of the performance.

Figure 2. Excerpts from recorded game play of 11 subjects striving to do a conjuring trick.
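The action codes named in the text can be collected in a small lookup structure. This is only a sketch: the dictionary layout, the wording of the descriptions, and the helper `unknown_actions` are assumptions, and codes that appear in Figure 2 without being explained in the text are deliberately left out.

```python
# Player actions as named in the text (descriptions paraphrased).
PLAYER_ACTIONS = {
    "mh": "click on a magic head",
    "mb": "open the magic book (grimoire)",
    "tp": "turn pages of the book",
    "st": "select a trick from the book",
    "sc": "script the steps of the trick",
    "mw": "present the trick with the magic wand, successfully",
    "mw-": "present the trick with the magic wand, unsuccessfully",
}

# System actions as named in the text (descriptions paraphrased).
SYSTEM_ACTIONS = {
    "mh": "present a magic head",
    "co": "comment in response to a click",
    "mo": "open the magic book for scripting",
    "ex": "virtual execution of the trick (cut scene)",
    "re": "response about the success of the performance",
}

# The code "mh" is used on both sides, so the union has 11 elements.
ALPHABET = set(PLAYER_ACTIONS) | set(SYSTEM_ACTIONS)


def unknown_actions(play):
    """Return the codes in a recorded play that are not in the alphabet."""
    return [a for a in play if a not in ALPHABET]
```

A helper such as `unknown_actions` is one simple way to spot codes in recorded game play that still need to be documented before mining.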

The (cutouts of) strings on display in Figure 2 have different properties that are indicators of the players' mastery of game play, in general, and of scripting tricks, in particular [3]. For a precise and readable treatment, actions in A are written in brackets such as [mh] and [ex]. […] abbreviates an action not of interest. Using this convention, the cutout of π<sub>1</sub> is [mh][cl][co][mb][tp][st][mo][sc][em][…][mh][cl][mo][sc][mw-][sc][mw-][sc][mw-]. Readers may easily recognize that the player has a problem. The substring [sc][mw-] indicates a failed effort of scripting and doing a conjuring trick. This will be discussed in some detail.

Suppose that ≼ denotes the substring relation. π<sub>1</sub> ≼ π<sub>2</sub> means that there are (possibly empty) strings π′ and π″ satisfying π′π<sub>1</sub>π″ = π<sub>2</sub>. In other words, π<sub>1</sub> occurs somewhere in π<sub>2</sub>.
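Written out as a formula, and using only the symbols introduced in the text, the substring relation reads as follows.

```latex
\pi_1 \preceq \pi_2
\;\iff\;
\exists\, \pi', \pi'' \in A^{*} :\; \pi' \, \pi_1 \, \pi'' = \pi_2
```

Since π′ and π″ may both be empty, every string is a substring of itself, and the empty string ε is a substring of every string.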

By way of illustration, the following two sample formulas φ<sub>2</sub> = [sc][mw-][sc][mw-] ≼ π and φ<sub>3</sub> = [sc][mw-][sc][mw-][sc][mw-] ≼ π describe certain string properties. This justifies logical expressions such as π ⊨ φ<sub>2</sub> and π ⊨ φ<sub>3</sub>, meaning that the string π satisfies the corresponding property. It is customary to say that π is a model of φ<sub>2</sub> or φ<sub>3</sub>, respectively. The intuitive meaning is quite obvious. When φ<sub>3</sub> holds for a string π describing human game play, the player appears to be groping in the dark. According to Figure 2, we have π<sub>1</sub> ⊨ φ<sub>3</sub>, π<sub>5</sub> ⊨ φ<sub>3</sub>, and π<sub>7</sub> ⊨ φ<sub>3</sub>. Properties of this type are called patterns. Patterns according to Angluin [40] are properties of strings that are decidable. This obviously applies to both φ<sub>2</sub> and φ<sub>3</sub> as well.
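The two sample patterns can be checked mechanically on the cutout of π1 quoted in the text. The sketch below is one possible way to do so; the helper names `parse`, `occurs_in`, and `satisfies` are hypothetical, and the parameter k generalizes the two patterns to k repeated failed attempts only for convenience.

```python
import re


def parse(bracketed: str) -> list[str]:
    """Split bracket notation such as '[sc][mw-]' into a list of action codes."""
    return re.findall(r"\[([^\]]*)\]", bracketed)


def occurs_in(needle, haystack) -> bool:
    """Substring relation on action sequences: needle occurs contiguously."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def satisfies(pi, k: int) -> bool:
    """True if pi contains k consecutive failed attempts [sc][mw-]."""
    return occurs_in(["sc", "mw-"] * k, pi)


# Cutout of pi_1 as quoted in the text; the elided action is kept as a token.
PI_1 = parse(
    "[mh][cl][co][mb][tp][st][mo][sc][em][…]"
    "[mh][cl][mo][sc][mw-][sc][mw-][sc][mw-]"
)
```

On this cutout, `satisfies(PI_1, 2)` and `satisfies(PI_1, 3)` both hold, in line with the statement that π1 is a model of the second pattern and hence also of the first.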

Because the information about the other eight strings of game play in Figure 2 is incomplete, we are not sure whether or not one of the patterns φ<sub>2</sub> and φ<sub>3</sub> is satisfied. With respect to the information available, all we know is that we cannot disprove either of these patterns. Needless to say, in computational logics, double negation cannot be removed [60, 61]. In other words, (¬¬p → p) is not a valid axiom of (propositional) computational logics.

approach weakens the requirement (see [37], p. 12): Patterns are logical theories that are co-semi-decidable. In other words, under the assumption of an underlying logic with (i) its consequence operator ⊨, (ii) the operator's implementation ⊢, (iii) background knowledge, and (iv) current observations, the implementation ⊢ may be used to find out in a uniform way whether any set of observations and any theory are inconsistent. Furthermore, according to scenarios of analyzing human experience of patterns in HCI data [69], patterns should have the property of locality. Informally, once a pattern instance has occurred, it does not disappear throughout subsequent interaction. In formal terminology, for any pattern φ and for any π<sub>1</sub>, π<sub>2</sub> ∈ A\*, the validity of π<sub>1</sub> ⊨ φ implies the validity of π<sub>1</sub>π<sub>2</sub> ⊨ φ.

To sum up, theory induction on HCI data is operationalized by constructing theories and sticking to them as long as they are not refuted. The underlying decisive knowledge forms an effectively enumerable space of hypotheses. In formal language learning, the appropriate technical term is an indexed family of formal languages [70]. For the purpose of theory induction, this concept has been slightly generalized. The authors coined the term indexed family of logical formulas [5]. Because logic in general is more expressive than formal languages are, there is a need for requirements that are weaker but still sufficient to allow for inductive learning.

Assume any logic that does not exceed the expressive power of first-order predicate calculus, to allow for a completeness theorem [71]. The logic brings with it its well-formed formulas, its consequence operator ⊨, and the operator's implementation ⊢ (due to completeness). Practically, refutation completeness is sufficient [67]. By way of illustration, the authors' recent application uses Horn logic and relies on the refutation completeness of Prolog [72].

Given domain-specific background knowledge BK, an indexed family F of logical formulas is defined by the following conditions. F = {φ<sub>n</sub>}<sub>n = 0, 1, 2, …</sub> such that the sequence of formulas φ<sub>n</sub> is effectively enumerable. Furthermore, for any two indices m and n with m < n, the formula that occurs later in the enumeration does not imply the earlier one, i.e., BK ⊭ (φ<sub>n</sub> → φ<sub>m</sub>).

Note that the sequence of formulas {ψ<sub>n</sub>}<sub>n = 0, 1, 2, …</sub> discussed in the context of PCP games above meets the conditions and, thus, is an example of an indexed family of logical formulas. The corresponding background knowledge comprises the rules of play including Peano arithmetic. Apparently, the authors' approach is a two-stage process above the granularity of the more conventional processes depicted in Figure 1. First, one selects an effectively enumerable space of hypotheses. Second, one performs identification by enumeration as the key learning methodology. Other conventional steps such as data selection, data preparation, and data preprocessing occur as well [8]. However, the latter are not in the focus of this chapter.

As the choice of spaces of hypothetical models (an issue ignored in conventional approaches) is decisive, it is worthwhile to take updates and revisions into account. The authors introduced a generalization for which they coined the term dynamic identification by enumeration [73].

### 3.5. Theory of mind model induction via identification by enumeration

Jantke [37, 62] has developed a family of games based on the Post correspondence problem (see [63], Section 2.6, pp. 88ff). Patterns that occur in game play are of higher complexity than those sketched above. Computational learning of these patterns, that is, theory of mind induction, is possible but considerably more involved. As illustrated in [64], a computer program may even learn skills the human player is not aware of. The authors confine themselves to a brief sketch of the essentials of PCP games.

A Post correspondence system (see [63], p. 89), called a pool in PCP games, is a finite set of pairs of strings that may be visualized as dominos. Some of these systems have solutions; others do not. Playing a PCP game means incrementally modifying a common pool according to some rules of play. The goal is to make the pool solvable and to prevent others from doing so. A player who makes the pool solvable and declares victory accordingly wins the game by showing a solution. If the player's demonstration fails, the game is lost.

Interestingly, the solvability of Post correspondence systems is algorithmically undecidable (for a comprehensive treatment of undecidability, [65] is recommended). As a consequence, a player might be unaware of being able to declare victory and to win the game accordingly. This is the phenomenon of missing a win, and it may occur repeatedly.
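To illustrate why a win can be missed, the following sketch searches a pool for a solution up to a fixed length. It is only a bounded search under stated assumptions, not the authors' game implementation: the names `find_solution` and `max_len` are hypothetical, and the sample pool is a standard textbook instance rather than one from the cited games. If the bound is too small, an existing solution goes unnoticed, and no finite bound suffices for all pools.

```python
from collections import deque
from typing import List, Optional, Tuple

Domino = Tuple[str, str]  # (top string, bottom string)


def find_solution(pool: List[Domino], max_len: int) -> Optional[List[int]]:
    """Breadth-first search for a sequence of at most `max_len` domino indices
    whose top and bottom concatenations coincide.

    Returning None only means that no solution was found within the bound;
    it does not prove that the pool is unsolvable.
    """
    # A state is the unmatched overhang (top_rest, bottom_rest); one is empty.
    queue = deque([("", "", [])])
    seen = set()
    while queue:
        top, bottom, seq = queue.popleft()
        if len(seq) == max_len:
            continue  # do not extend sequences beyond the bound
        for i, (u, v) in enumerate(pool):
            t, b = top + u, bottom + v
            # Prune: one string must be a prefix of the other.
            if not (t.startswith(b) or b.startswith(t)):
                continue
            common = min(len(t), len(b))
            t, b = t[common:], b[common:]
            new_seq = seq + [i]
            if not t and not b:
                return new_seq  # top and bottom match completely
            if (t, b) not in seen:
                seen.add((t, b))
                queue.append((t, b, new_seq))
    return None


# Textbook pool, used here purely for illustration.
pool = [("a", "baa"), ("ab", "aa"), ("bba", "bb")]

solution = find_solution(pool, max_len=4)
print(solution)                         # a list of indices into `pool`
print(find_solution(pool, max_len=3))   # None: the bound is too small
```

A player, or a program, that gives up after three dominos misses the win that becomes visible with four.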

Using elementary formalizations (see [62, 64] for details), one may write down formulas ψ<sub>n</sub> of first-order predicate calculus saying that a player never misses more than n wins in a game. Whether or not ψ<sub>n</sub> holds in recorded game play π is effectively undecidable. However, the problem is effectively enumerable (some call it semi-decidable).

Therefore, a computer program can watch a human playing PCP games. It can analyze strings describing the human-computer interaction for the occurrence of missing wins. The program's first hypothesis may be ψ<sub>0</sub>. In case a missing win is detected, the hypothesis is changed to ψ<sub>1</sub>. If ψ<sub>n</sub> is hypothesized, but one more missing win is diagnosed, the hypothesis is changed to ψ<sub>n+1</sub>. The underlying process is identification by enumeration [66].
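The hypothesis updates described here can be sketched in a few lines. The sketch assumes that some detector has already flagged, for each analyzed step of the interaction, whether a missing win was diagnosed; that detector is not shown, and the name `hypotheses` as well as the sample event list are hypothetical.

```python
from typing import Iterable, Iterator


def hypotheses(missed_win_events: Iterable[bool]) -> Iterator[int]:
    """Yield, after each observation, the index n of the current hypothesis psi_n."""
    n = 0  # initial hypothesis psi_0: the player never misses a win
    for missed in missed_win_events:
        if missed:
            n += 1  # psi_n is refuted, so move on to psi_{n+1}
        yield n


# Hypothetical diagnosis results for five analyzed steps of game play.
events = [False, True, False, False, True]
print(list(hypotheses(events)))  # [0, 1, 1, 1, 2]
```

The hypothesis changes only when the current one is refuted, so the sequence of indices never decreases.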

Let us take a quick, informal look at the principle of identification by enumeration from a logical viewpoint. A space of hypotheses is an effective enumeration T<sub>0</sub>, T<sub>1</sub>, T<sub>2</sub>, … of theories; in the PCP game setting above, these theories are the singleton sets {ψ<sub>n</sub>}. When sets of observations Ω<sub>1</sub> ⊆ Ω<sub>2</sub> ⊆ Ω<sub>3</sub> ⊆ … come in successively, learning means searching the given enumeration of hypotheses for the first theory that does not contradict the current information. Formally, a learner L fed with Ω<sub>n</sub> searches for k = μm [¬(Ω<sub>n</sub> ⊭ T<sub>m</sub>)] and hypothesizes L(Ω<sub>n</sub>) = T<sub>k</sub>. The symbol μ represents the minimum operator [41].
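The minimum search can be written down generically. The following is a minimal sketch, assuming a total and terminating refutation test `refuted_by(m, observations)` and assuming that some theory in the enumeration is never refuted; in the general case discussed in this chapter, refutation is not decidable, so this is an idealization. The names `learner`, `refuted_by`, and `omegas` are hypothetical.

```python
from itertools import count
from typing import Callable, FrozenSet


def learner(observations: FrozenSet[int],
            refuted_by: Callable[[int, FrozenSet[int]], bool]) -> int:
    """Return the least index m such that theory T_m is not refuted."""
    for m in count():  # m = 0, 1, 2, ... plays the role of the mu-operator
        if not refuted_by(m, observations):
            return m
    raise AssertionError("unreachable: count() is infinite")


def psi_refuted(m: int, observations: FrozenSet[int]) -> bool:
    """Illustrative instance: observations are the time steps at which a
    missing win was diagnosed; psi_m is refuted by more than m of them."""
    return len(observations) > m


# A growing chain of observation sets, invented for illustration.
omegas = [frozenset(), frozenset({3}), frozenset({3}), frozenset({3, 7})]
print([learner(omega, psi_refuted) for omega in omegas])  # [0, 1, 1, 2]
```

Restarting the search from index 0 for every Ω<sub>n</sub> mirrors the definition; an implementation could equally resume from the previous hypothesis, since the observation sets only grow.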

As explained much earlier [67], the key logical reasoning problem in learning from incomplete information is refutation. This is in line with related philosophical positions [11]. The crux is that ¬(Ω<sub>n</sub> ⊭ T<sub>k</sub>) is usually undecidable, as seen in the PCP game case study. This leads to the authors' original pattern concept. Whereas in [3, 68], adopted from [40], the assumption is that the validity of a pattern in a stream of HCI data is decidable, the ultimate approach weakens the requirement (see [37], p. 12): patterns are logical theories that are co-semi-decidable. In other words, under the assumption of an underlying logic with (i) its consequence operator ⊨, (ii) the operator's implementation ⊢, (iii) background knowledge, and (iv) current observations, the implementation ⊢ may be used to find out in a uniform way whether any set of observations and any theory are inconsistent.

Furthermore, according to scenarios of analyzing human experience of patterns in HCI data [69], patterns should have the property of locality. Informally, once a pattern instance has occurred, it does not disappear throughout subsequent interaction. In formal terminology, for any pattern φ and for any π<sub>1</sub>, π<sub>2</sub> ∈ A\*, the validity of π<sub>1</sub> ⊨ φ implies the validity of π<sub>1</sub>π<sub>2</sub> ⊨ φ.
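Locality can be illustrated with a small stream monitor. The sketch assumes the simple decidable case in which a pattern is a fixed block of tokens occurring contiguously in the interaction string; the class name `PatternMonitor`, the placeholder token `other`, and the sample stream are hypothetical.

```python
from typing import List, Sequence


class PatternMonitor:
    """Watch a token stream and report whether a fixed pattern has occurred."""

    def __init__(self, pattern: Sequence[str]) -> None:
        self.pattern: List[str] = list(pattern)
        self.window: List[str] = []
        # An empty pattern occurs trivially in every string.
        self.found: bool = not self.pattern

    def feed(self, token: str) -> bool:
        """Consume one token; return True once the pattern has been seen."""
        if not self.found:
            self.window.append(token)
            # Keep only as many recent tokens as the pattern is long.
            self.window = self.window[-len(self.pattern):]
            self.found = self.window == self.pattern
        return self.found  # locality: never switches back to False


monitor = PatternMonitor(["sc", "mw-"] * 3)
stream = ["sc", "mw-"] * 3 + ["other", "other"]
statuses = [monitor.feed(token) for token in stream]
print(statuses)
# [False, False, False, False, False, True, True, True]
```

Once the pattern holds for a prefix π<sub>1</sub>, it keeps holding for every extension π<sub>1</sub>π<sub>2</sub>, which is exactly the locality condition stated above.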

To sum up, theory induction on HCI data is operationalized by constructing theories and sticking to them as long as they are not refuted. The underlying decisive knowledge forms an effectively enumerable space of hypotheses. In formal language learning, the appropriate technical term is an indexed family of formal languages [70]. For the purpose of theory induction, this concept has been slightly generalized. The authors coined the term indexed family of logical formulas [5]. Because logic in general is more expressive than formal languages are, there is a need for requirements that are weaker but still sufficient to allow for inductive learning.

Assume any logic that does not exceed the expressive power of first-order predicate calculus, so that a completeness theorem is available [71]. The logic brings with it its well-formed formulas, its consequence operator ⊨, and the operator's implementation ⊢ (due to completeness). In practice, refutation completeness is sufficient [67]. By way of illustration, the authors' recent application uses Horn logic and relies on the refutation completeness of Prolog [72].

Given domain-specific background knowledge BK, an indexed family F of logical formulas is defined by the following conditions. F = {φ<sub>n</sub> | n = 0, 1, 2, …} such that the sequence of formulas φ<sub>n</sub> is effectively enumerable. Furthermore, for any two indices m and n with m < n, the formula that occurs later in the enumeration does not imply the earlier one, i.e., BK ⊭ (φ<sub>n</sub> → φ<sub>m</sub>).

Note that the sequence of formulas {ψ<sub>n</sub> | n = 0, 1, 2, …} discussed in the context of PCP games above meets the conditions and, thus, is an example of an indexed family of logical formulas. The corresponding background knowledge comprises the rules of play including Peano arithmetic.
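The non-implication condition can be checked on a drastically simplified model of the ψ<sub>n</sub>. The sketch below assumes that a game play is abstracted to its number of missed wins and that plays with every such number are possible under the rules; both are simplifying assumptions, and the names `psi` and `later_does_not_imply_earlier` are hypothetical.

```python
def psi(n: int, missed_wins: int) -> bool:
    """psi_n: the player misses at most n wins in the game."""
    return missed_wins <= n


def later_does_not_imply_earlier(m: int, n: int, max_missed: int) -> bool:
    """True if some play with at most `max_missed` missed wins satisfies
    psi_n but violates psi_m, i.e. witnesses that psi_n does not imply psi_m."""
    return any(psi(n, c) and not psi(m, c) for c in range(max_missed + 1))


N = 6
condition_holds = all(later_does_not_imply_earlier(m, n, N)
                      for n in range(N) for m in range(n))
print(condition_holds)  # True
```

For m < n, a play with exactly n missed wins serves as the witness: it satisfies ψ<sub>n</sub> but not ψ<sub>m</sub>.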

Apparently, the authors' approach is a two-stage process above the granularity of the more conventional processes depicted in Figure 1. First, one selects an effectively enumerable space of hypotheses. Second, one performs identification by enumeration as the key learning methodology. Other conventional steps such as data selection, data preparation, and data preprocessing occur as well [8]. However, the latter are not the focus of this chapter.

As the choice of spaces of hypothetical models, an issue ignored in conventional approaches, is decisive, it is worthwhile to take updates and revisions into account. The authors introduced a generalization for which they coined the term dynamic identification by enumeration [73].
