**Training of Procedural Tasks Through the Use of Virtual Reality and Direct Aids**

Jorge Rodríguez1, Teresa Gutiérrez2, Emilio J. Sánchez1, Sara Casado2 and Iker Aguinaga1
*1CEIT, Centro de Estudios e Investigaciones Técnicas de Gipuzkoa
2Tecnalia
Spain*

#### **1. Introduction**


A high percentage of human activities are based on *procedural tasks*, for example cooking a cake, driving a car, fixing a machine, etc. Reading an instruction book, watching a video or listening to the explanation of an expert have been the *traditional methods* to learn procedural tasks. However, most researchers agree that procedural tasks are learnt gradually as a result of practice through repeated exposure to the task.

Nowadays, *Virtual Reality (VR) technologies* can be used to improve the learning of a procedural task and, moreover, to evaluate the performance of the trainee. For example, a Virtual Environment (VE) can allow the trainees to interact physically with the virtual scenario by integrating a haptic device with the computer vision system, so that trainees can interact with and manipulate the virtual objects, feeling the collisions among them and, most importantly, practicing the task under the approach of *learning by doing*. In this way, trainees can practice in order to improve their abilities until they are proficient in the task.

This chapter introduces a **new Multimodal Training System (MTS)** that provides a **new interactive tool** for assisting trainees during the **learning of procedural tasks** involving assembly and disassembly operations. This MTS uses haptic feedback to simulate the real behaviour of the task and special visual aids to provide information and help trainees to undertake the task. One of the main advantages of this platform is that trainees can learn and practice different procedural tasks without the necessity of having physical access to real tools, components and machines. For example, trainees can assemble/disassemble the different components of a virtual machine as it would be done in real life. This system was designed and implemented as an activity in the context of the SKILLS project1.

During the *learning of a task*, most of the time, **trainees need to receive aids with information about how to proceed** with the task. From the point of view of the authors, these aids can be divided into two groups according to the type of information that they provide and how this information is provided: *direct aids and indirect aids*. The difference between both types of aids lies in the cognitive load required from the trainees to interpret and understand the information provided by the aid. In this way, *indirect aids* require that trainees are active in order to cognitively translate the information received into the actions that must be undertaken at that step, whereas with *direct aids* the information is so explicit that the cognitive load is almost null.

<sup>1</sup> IST FP6 ICT-IP-035005-2006



The *use of direct aids* is attractive for both trainers and trainees since they are easy to implement and they provide useful information to perform the task in an easy way. However, some research works, such as Yuviler-Gavish et al. (Yuviler-Gavish et al. (2011)), suggest that the use of direct aids can have adverse effects on training since they prevent trainees from actively exploring the task. In addition, trainees can become dependent on the training system, which impedes the transfer of skills to the real situation when these aids are no longer available. In this way, it would be useful to define **a strategy to provide direct aids that maintains the benefits of these aids but eliminates their negative effects.**

This chapter analyzes the **hypothesis** that if **direct aids are provided in a controlled way** during the training process, then **the transfer of knowledge will not be hindered.** It describes a test conducted to compare the performance obtained with a training strategy based on controlled direct aids against a training strategy based on indirect aids.
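To make the notion of "controlled" provision concrete, the following is a purely hypothetical sketch of an aid-selection policy (the function name, thresholds and policy are the editor's illustration, not the strategy evaluated in the experiment described later): the trainee first explores the step unaided, and the explicit direct aid is released only after repeated errors or when a time budget is exhausted.

```python
# Hypothetical aid-provision policy: withhold the direct aid until the trainee
# has actively tried the step, so active exploration is not suppressed.
def select_aid(errors, seconds_on_step, max_errors=3, time_budget=60.0):
    """Return which kind of aid to show for the current step.

    'none'     -> let the trainee explore on their own
    'indirect' -> a hint the trainee must interpret (higher cognitive load)
    'direct'   -> explicit instruction, released only as a last resort
    """
    if errors >= max_errors or seconds_on_step >= time_budget:
        return "direct"
    if errors >= 1:
        return "indirect"
    return "none"
```

Under such a policy the benefits of direct aids (the trainee is never permanently blocked) are kept, while the aid cannot become a crutch that replaces exploration.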

This chapter is organized as follows. The rest of this section provides a brief introduction to theoretical concepts about procedural tasks and their learning. Section 2 describes the use of virtual reality technologies for training, focusing mainly on multimodal systems, and Section 3 presents a new Multimodal Training System for learning assembly and disassembly procedural tasks. Section 4 describes the results of an experiment conducted to analyze the use of direct aids to train a procedural task. Lastly, Section 5 provides the conclusions derived from this chapter and proposes some future research challenges.

#### **1.1 What is a procedural task**

A *procedural task* involves the execution of an ordered sequence of operations/steps that need to be carried out to achieve a specific goal. In other words, this kind of activity requires following a procedure, and it is usually evaluated by the speed and accuracy of its performance. There are two main types of skills (abilities) that may be involved in each step:

1. Physical skill, called here *motor skill*, which entails the execution of physical movements, like unscrewing a nut.

2. Mental skill, called here *cognitive skill*, which entails the knowledge about how the task should be performed, like knowing what nut should be unscrewed in each step.

The cognitive skill has an important role in the process of learning a procedural task: it reflects the ability of humans to *obtain a good mental representation of the task organization* and to know *what* actions should be done, *when* to do them (appropriate time) and *how* to do them (appropriate method).
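The ordered-steps view above can be sketched in a few lines of code (an illustrative data structure only; the class and field names are the editor's assumptions, not part of the MTS described later). Each step records *what* to do and *how* to do it, while *when* is encoded by the position of the step in the sequence; performance of the task is then evaluated by checking that steps are executed in order:

```python
# Illustrative sketch: a procedural task as an ordered sequence of steps.
from dataclasses import dataclass

@dataclass
class Step:
    what: str   # the action, e.g. "unscrew nut A"   (cognitive: what)
    how: str    # the method, e.g. "10 mm wrench"    (cognitive: how)

class ProceduralTask:
    def __init__(self, steps):
        self.steps = list(steps)   # "when" is encoded by the order
        self.next_index = 0

    def perform(self, action):
        """Return True if `action` is the correct next step, else False."""
        if (self.next_index < len(self.steps)
                and action == self.steps[self.next_index].what):
            self.next_index += 1
            return True
        return False

    def completed(self):
        return self.next_index == len(self.steps)

# Toy two-step disassembly task
task = ProceduralTask([
    Step("unscrew nut A", "10 mm wrench"),
    Step("remove cover", "by hand"),
])
```

A training system built on such a structure can detect out-of-order actions immediately, which is the basis for giving feedback during practice.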

#### **1.2 Learning of procedural tasks**

From a dictionary, one can identify two primary meanings for the word learning. The first definition is the *active process* of acquiring information, knowledge, procedures and capabilities for performing operations and achieving goals to satisfy the requirements of a given task. The second one is *knowledge or ability* that is acquired by instruction or study. During the learning process, the learner selects and transforms information, constructs hypotheses, and makes decisions, relying on a cognitive structure (e.g. schemas and mental models) using their past/current knowledge and experiences (Dale (1969)).


There are many ways by which people can learn a new task. People can learn by reading instructions, by watching a video, by listening to an expert, and of course by practicing themselves. In the context of procedural tasks, the training involves the learning of the sequence of steps needed to accomplish the goal of the task, as well as the learning of new motor skills or the improvement of existing ones.

Edgar Dale (Dale (1969)) developed, from his experience in teaching and his observations of learners, the *cone of learning*. He showed that the least effective methods to learn and remember things are reading a manual and hearing a presentation, because after two weeks people remember 10% of what they read and 20% of what they hear. In contrast, the most effective methods involve direct learning experiences, such as doing the physical task or simulating the real task, because people are able to remember 90% of what they do. This approach is called *enactive learning* or *learning by doing*.

*Learning by doing* is the acquisition of knowledge or abilities through the *direct experience of carrying out a task*, as part of a training, and it is closely associated with practical experience (McLaughlin & Rogers (2010)). The reason to focus on guided enactive training is that it is direct, intuitive, fast, embodied in common action-perception behaviours, and it may be more efficient than other types of training methods for becoming an expert. Besides, if the learning-by-doing approach is supervised, trainees not only can practice the task but can also receive feedback about their actions along the training sessions in order to avoid errors and improve their performance.

In this way, *direct practice* under the guidance of an expert seems to be the preferable approach for acquiring both the procedural cognitive knowledge and the motor skills associated with procedural tasks. However, this preferred situation is expensive and often hard to achieve with the traditional training methods. Sometimes practice in the real environment is impossible due to safety, cost and time constraints; other times the physical co-location of the trainee and the expert is impossible. Consequently, new training tools are needed to improve the learning of procedural tasks. And here is where the VR technologies can play an important role.

#### **2. The use of VR technologies for training**

Virtual environments are increasingly used for teaching and training in a range of domains including surgery (Albani & Lee (2007); Howell et al. (2008)), aviation (Blake (1996)), anesthesia (Gaba (1991); Gaba & DeAnda (1988)), rehabilitation (Holden (2005)), aeronautics assembly (Borro et al. (2004); Savall et al. (2004)) and driving (Godley et al. (2002); Lee et al. (1998)). These simulated and virtual worlds can be used for acquiring new abilities or improving existing ones (Derossis et al. (1998); Kneebone (2003); Weller (2004)). In general, VR technologies can provide new opportunities compared to traditional training methodologies to capture the essence of the abilities involved in the task to be learnt, allowing them to be stored and transferred efficiently. The speed, efficiency and, mainly, transferability of training are three major requirements in the process of designing new virtual training systems.

#### **2.1 Advantages and risks of the VR technologies for training**

The use of VR technologies for training provides a range of benefits with respect to traditional training systems, mainly:

• It allows teaching a task under the approach of *learning by doing*, so the trainees can practice the task as many times as needed to achieve the requested level of proficiency.

• It eliminates the constraints on using the real environment, mainly availability, safety, time and cost constraints. For example, in medical domains there is no danger to the patients, and little need for valuable elements such as cadavers or animals. Similarly for aviation, there is no risk for the aircraft, nor fuel costs. In the case of maintenance tasks, VR systems can offer a risk-free environment in which technicians cannot damage the machines (Morgan et al. (2004); Sanderson et al. (2008)).

• It can provide extra cues, not available in the real world, that can facilitate the learning of the task. These cues can be based on visual, audio and/or haptic feedback. A combination of these cues is called *multimodal feedback*2. For example, to provide information about the motion trajectory that users should follow, the system could provide: visual aids (for example, displaying the target trajectory on the virtual scenario), haptic aids (for example, applying an extra force to constrain the motion of the user along the target trajectory) or audio aids (for example, emitting a sound when the user leaves the target trajectory).

• It allows simulating the task in a flexible way to adapt it to the needs of trainees and the training goal, for example removing some constraints of the task in order to emphasize only key aspects.

• It can provide enjoyment, increasing the motivation of trainees (Scott (2005)).

• It allows logging the evolution of the trainees along the training process.
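The trajectory example above can be sketched as follows (a toy illustration by the editor; the tolerance, the stiffness value and the spring-like corrective force are assumptions, not the aids implemented in the MTS). A single deviation measure from the target trajectory can drive all three cue channels at once:

```python
import math

def trajectory_cues(pos, target, tolerance=0.02, stiffness=200.0):
    """Map the deviation from a target trajectory point to multimodal cues.

    pos, target: (x, y, z) positions in metres.
    Returns (show_visual_aid, play_audio_alert, haptic_force_vector).
    """
    dev = [p - t for p, t in zip(pos, target)]
    dist = math.sqrt(sum(d * d for d in dev))
    visual = True                 # the target trajectory is always displayed
    audio = dist > tolerance      # beep only when the user leaves it
    # spring-like force pulling the hand back towards the trajectory
    force = tuple(-stiffness * d for d in dev)
    return visual, audio, force
```

Keeping the three channels driven by the same deviation measure is one simple way to guarantee the coherence among stimuli that multimodal feedback requires.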
On the other hand, the greatest potential danger of the use of VR systems is that learners become increasingly dependent on features of the system which may inhibit the ability to perform the task in the absence of the features.

Developing dependence, or at least reliance, on VR features that do not exist in the real environment or are very different from their real world counterparts can result in negative transfer to the real world. If *fidelity* cannot be preserved or is hard to achieve, it is much better to avoid the use of the VR instantiation, or alternatively, particular care must be taken to develop a training program that identifies the VR-real world mismatch to the trainee and provides compensatory training mechanisms. It may also be necessary to manipulate the relational properties between feedbacks, i.e., the congruency between visual, audio, and haptic stimulations, in order to favor cross-modal attention.

The experiment described in this chapter demonstrates that the controlled use of multimodal feedback does not damage the trainees' performance when they change from the VR system to the real world, and therefore it eliminates the main disadvantage of the use of VR reported in the literature: negative or no transfer.

#### **2.2 Multimodal systems for training**

Within the virtual reality systems we can find the multimodal systems. The term *multimodal* comes from the word multi, which refers to *more than one*, and the word modal, which refers to *the human sense that is used to perceive the information*, the sensory modality (Mayes (1992)). Therefore, in this chapter a *multimodal system* is defined as a virtual reality system that supports communication with the trainees through more than one sensory modality (see Figure 1), mainly visual, haptic and auditory. In order to support this multimodal communication in a proper way, multimodal systems need demanding hardware and software technologies. Nowadays, thanks to the latest advances in computer technologies, which have increased their capabilities for processing and managing diverse information in parallel, the use of multimodal systems is growing.

<sup>2</sup> It refers to the use of different human sensory channels to provide information to users.


Fig. 1. Multimodal system is a virtual system where the user interacts with the virtual scene through more than one sensory modality.

According to Figure 1, a multimodal system can provide information to the user through different types of interfaces, mainly:


Synergy3, redundancy, and increased bandwidth of information transfer are proposed benefits of multimodal presentation (Sarter (2006)). If information is presented in redundant multiple modalities, then various concurrent forms and different aspects of the same process are presented. Concurrent information may coordinate activities in response to the ongoing

Training of Procedural Tasks Through the Use of Virtual Reality and Direct Aids 49

Several studies have shown that using auditory and visual feedback together increase performance compared to using each of these feedbacks separately. For example, under noisy conditions, observing the movements of the lips and gestures of the speaker can compensate for the lost speech perception which can be equivalent to increasing the auditory signal-to-noise ratio (Sumby & Pollack (1954)). It was also demonstrated that reaction times for detecting a visual signal were slower in unimodal conditions, compared to a condition in which a sound was heard in close temporal synchrony to the visual signal (Doyle & Snowden (2001)). Handel and Buffardi (Handel & Buffardi (1969)) found that in a pattern judging task, performance was better using a combination of auditory and visual cues than either audition

On the other hand, adding haptic feedback to another modality was also found to improve performance compared to the use of each modality individually. Pairs of auditory and haptic signals delivered simultaneously were detected faster than when the same signals presented unimodally (Murray et al. (2005)). In a texture discrimination task, participants accuracy was improved (less errors) when they received bi-modal visual and haptic cues as compared to uni-modal conditions in which only the haptic or the visual cue was presented (Heller (1982)). Sarlegna et al. (Sarlegna et al. (2007)) demonstrated that feedback that was combined

of different sensory modalities improved the performance in reaching task.

information is presented using multimodal feedback.

channel to provide useful information for each one.

**2.4 State of the art on multimodal systems for procedural tasks**

<sup>3</sup> The merging information from multiple aspects of the same task.

In general, the advantages of the use of multimodal feedback are summarized below:

1. Various forms and different aspects of the same task can be presented simultaneosly when

2. The use of a second channel to provide information can compensate the lost of information

3. When one sensorial channel is overloaded, other channel can be used to convey additional

4. The reaction times are slower and the performance is increased with the use of multimodal

But, in order to provide all these benefits it is essential to assure the coherence and consistence among the different stimuli. Conflicts among the information provided by the different channels can deteriorate the performance of trainees. In addition, it is important to identify the different components of the task to be learnt in order to select and employ the most suitable

Some studies have tested and analyzed the use of the multimodal systems as a simulation tool or a training tool to teach and evaluate the knowledge of a trainee in procedural tasks. The review presented in this section is focused on procedural tasks related to the assembly and

processes.

or vision alone.

of the first one.

disassembly processes.

information.

feedback.

does. For simplicity, in this chapter, the haptic devices will denote the force feedback interfaces. Haptic devices are mechatronic input-output systems capable of tracking a physical movement of the users (input) and providing them force feedback (output). Therefore, haptic devices allow users to interact with a virtual scenario through the sense of touch. In other words, with these devices the users can touch and manipulate virtual objects inside of a 3D environment as if they were working in the real environment. These devices enable to simulate both unimanual tasks, which are performed with just one hand (for example grasping an object and insert it into another one), and bimanual tasks in which two hands are needed to manipulate an object or when two objects need to be dexterously manipulated at the same time (Garcia-Robledo et al. (2011)). Figure 2 shows some commercial haptic devices. Some relevant features of a haptic device are: its workspace, the number of degrees of freedom, the maximum level of force feedback and the number of contact points. The selection of a haptic device to be used in a MTS depends on several factors mainly, on the type of task to be learnt, the preferences/needs of trainees and cost constraints.

Fig. 2. Commercial haptic devices. On the left, PHANToM haptic devices with 6-DOF by SensAble. On the middle, the FALCON device with 3-DOF by NOVINT. On the right, the Quanser 5-DOF Haptic Wand, by Quanser

The sensorial richness of multimodal systems translates into a more complete and coherent experience of the virtual scenario and therefore the sense of being present inside of this VE is stronger (Held & Durlach (1992); Sheridan (1992); Witmer & Singer (1998)). The experience of being present is specially strong if the VE includes haptic (tactile and kinesthetic) sensations (Basdogan et al. (2000); Reiner (2004)).

In this chapter, a multimodal system combined with a set of suitable training strategies can be considered as a Multimodal Training System.

#### **2.3 Multimodal feedback in Virtual Training Systems**

The use of multimodal feedback can improve perception and enhance performance in a training process; for example, tasks trained with multiple feedback modalities can be performed better than tasks trained with a single modality (Wickens (2002)). Various studies have supported the multiple resource theory of attention. Wickens et al. (Wickens et al. (1983)) found that performance in a dual-task was better when feedback was manual and verbal than when feedback was manual only. Similarly, Oviatt et al. (Oviatt et al. (2004)) found that a flexible multimodal interface supported users in managing cognitive load.

Ernst and Bülthoff (Ernst & Bülthoff (2004)) suggested that no single sensory signal can provide reliable information about the structure of the environment in all circumstances.

Synergy<sup>3</sup>, redundancy, and increased bandwidth of information transfer are proposed benefits of multimodal presentation (Sarter (2006)). If information is presented in redundant multiple modalities, then various concurrent forms and different aspects of the same process are presented. Concurrent information may coordinate activities in response to the ongoing processes.

Several studies have shown that using auditory and visual feedback together increases performance compared to using each of these feedbacks separately. For example, under noisy conditions, observing the movements of the lips and gestures of the speaker can compensate for lost speech perception, which can be equivalent to increasing the auditory signal-to-noise ratio (Sumby & Pollack (1954)). It was also demonstrated that reaction times for detecting a visual signal were slower in unimodal conditions than in a condition in which a sound was heard in close temporal synchrony with the visual signal (Doyle & Snowden (2001)). Handel and Buffardi (Handel & Buffardi (1969)) found that in a pattern judging task, performance was better using a combination of auditory and visual cues than either audition or vision alone.

On the other hand, adding haptic feedback to another modality was also found to improve performance compared to the use of each modality individually. Pairs of auditory and haptic signals delivered simultaneously were detected faster than the same signals presented unimodally (Murray et al. (2005)). In a texture discrimination task, participants' accuracy improved (fewer errors) when they received bimodal visual and haptic cues as compared to unimodal conditions in which only the haptic or the visual cue was presented (Heller (1982)). Sarlegna et al. (Sarlegna et al. (2007)) demonstrated that feedback combining different sensory modalities improved performance in a reaching task.

In general, the advantages of the use of multimodal feedback are summarized below:

• A more complete and coherent experience of the virtual scenario, and a stronger sense of presence.
• Redundancy and an increased bandwidth of information transfer between the system and the trainee.
• Better performance and faster reaction times than with unimodal feedback.
• Better management of the cognitive load of the trainee.

However, in order to provide all these benefits, it is essential to ensure coherence and consistency among the different stimuli. Conflicts among the information provided by the different channels can deteriorate the performance of trainees. In addition, it is important to identify the different components of the task to be learnt in order to select and employ the most suitable channel to provide useful information for each one.

#### **2.4 State of the art on multimodal systems for procedural tasks**

Some studies have tested and analyzed the use of multimodal systems as simulation or training tools to teach and evaluate the knowledge of a trainee in procedural tasks. The review presented in this section is focused on procedural tasks related to assembly and disassembly processes.

<sup>3</sup> The merging of information from multiple aspects of the same task.


There are several multimodal systems that simulate assembly and disassembly tasks, although most of them are designed to analyze and evaluate the assembly planning during the design process of new machines, and only a few of them provide additional training modules. Below there is a description of some of these systems:

• VEDA: Virtual Environment for Design for Assembly (Gupta et al. (1997))

It is a desktop virtual environment in which the designers see a visual 2-D virtual representation of the objects; they are able to pick and place active objects, move them around, and feel the forces through haptic interface devices with force feedback. They also hear collision sounds when objects hit each other. The virtual models are 2-D in order to preserve interactive update rates. The simulation duplicated the weight, shape, size, and frictional characteristics of the physical task as faithfully as possible. The system simulates simple tasks, such as a peg-in-hole task.

• VADE: A Virtual Assembly Design Environment (Jayaram et al. (1999))

VADE was designed and implemented at the Washington State University. It is a VR-based engineering application that allows engineers to plan, evaluate, and verify the assembly of mechanical systems. Once the mechanical system is designed using a CAD system (such as Pro/Engineer), the system automatically exports the necessary data to VADE. The various parts and tools (screw driver, wrench, and so on) involved in the assembly process are presented to users in a VE. Users perform the assembly using their hands and the virtual assembly tools. VADE supports both one-handed and two-handed assembly. The virtual hand is based on an instrumented glove device, such as the CyberGlove, and a graphical model of a hand. VADE also provides audio feedback to assist novice users. The system lets users make decisions, make design changes, and perform a host of other engineering tasks. During this process, VADE maintains a link with the CAD system. At the end of the VADE session, users have generated design information that automatically becomes available in the CAD system.

• HIDRA: Haptic Integrated Dis/Reassembly Analysis. (McDermott & Bras (1999))

HIDRA is an application that integrates a haptic device into a disassembly/assembly simulation environment. Two PHANToM haptic interfaces provide the user with force feedback on a virtual index finger and thumb. The goal is to use HIDRA as a design tool, so designers must be able to use it in parallel with other CAD packages they may use throughout the design process. The V-Clip library is used to perform collision detection between objects inside the virtual scene. The system simulates simple scenarios, for example a simple shaft with a ring.

• MIVAS: A Multi-modal Immersive Virtual Assembly System (Wan et al. (2004))

MIVAS is a multi-modal immersive virtual assembly system developed at the State Key Lab of CAD&CG, Zhejiang University. This system provides an intuitive and natural way of assembly evaluation and planning using tracked devices for tracking both hand and head motions, a dataglove with force feedback for simulating the realistic operation of the human hand, voice input for issuing commands, sound feedback for prompts, and a fully immersive 4-sided CAVE as working space. CrystalEyes shutter glasses and emitters are used to obtain a stereoscopic view. Users can feel the size and shape of digital CAD models using the CyberGrasp haptic device (by Immersion Corporation). Since haptic feedback was only provided in gripping tasks, the application did not provide force information when parts collided. The system can simulate complex scenarios such as disassembling different components of an intelligent hydraulic excavator. To make the disassembly process easier, the parts that can be currently disassembled are highlighted in blue to help users make their selection.

• HAMMS: Haptic Assembly, Manufacturing and Machining System. (Ritchie et al. (2008))

HAMMS was developed by researchers at the Heriot-Watt University to explore the use of immersive technology and haptics in assembly planning. The hardware comprises a Phantom haptic device for interaction with the virtual environment, along with a pair of CrystalEyes® stereoscopic glasses for stereo viewing if required. Central to HAMMS is the physics engine, which enables rigid body simulations in real time. HAMMS logs data for each virtual object in the scene including devices that are used for interaction. The basic logged data comprises position, orientation, time stamps, velocity and an object index (or identifying number). By parsing through the logged data text files an assembly procedure can be automatically formulated.

• SHARP: Development of a Dual-handed haptic assembly system (Seth et al. (2008)).

SHARP is a dual-handed haptic interface for virtual assembly applications. The system allows users to simultaneously manipulate and orient CAD models to simulate assembly/disassembly operations. This interface provides both visual and haptic feedback; in this way, collision force feedback is provided to the user during assembly. Using VRJuggler as an application platform, the system can operate on different VR system configurations, including low-cost desktop configurations, a Power Wall, and four-sided and six-sided CAVE systems. Finally, different modules were created to address issues related to maintenance and training (record and play) and to facilitate collaboration (networked communication).

• HIIVR: A haptically enabled interactive virtual system for industrial assembly (Bhatti et al. (2008)).

This system is an interactive and immersive VR system designed to imitate the real physical training environments within the context of visualization and physical limitations. Head Mounted Displays are used for immersive visualization equipped with 6DOF trackers to keep the virtual view synchronized with the human vision, PHANTOM® devices are used to impose physical movement constraints. In addition, 5DT data gloves are used to provide human hand representation within the virtual world. The aim of the proposed system is to support the learning process of general assembly operators. Users can repeat their learning practices until they are proficient with the assembly task.

As can be seen, most of the existing multimodal systems in the domain of assembly and disassembly tasks focus mainly on simulating these tasks without explicitly addressing their training. In the next Section, the authors present a new multimodal training system (combining visual, audio and haptic feedback) for training assembly and disassembly procedural tasks. Figure 3 shows a comparison between the previous systems and the new multimodal training system described in the next Section.

#### **3. The new multimodal training system**

This section presents a controlled multimodal training system for learning assembly and disassembly procedural tasks. This platform supports the approach of *learning by doing* by means of an active multimodal interaction with the virtual scenario (visual, audio and haptic), eliminating the constraints of using the physical scenario (such as availability, time, cost, and safety constraints). In this way, the trainees can interact with and manipulate the components of the virtual scenario, sensing the collisions with the other components and simulating assembly and disassembly operations. In addition, the new system provides different multimodal aids and learning strategies that help and guide the trainees during their training process. One of the main features of this system is its flexibility to adapt itself to the task demands and to the trainees' preferences and needs. In this way the system can be used with different types of haptic devices and allows undertaking mono-manual and bimanual operations in assembly and disassembly tasks with or without tools.

Fig. 3. Comparison among the new multimodal training system and other systems. \*NIA: no information available.

#### **3.1 Set-up of the system**


The new multimodal training system consists of a screen displaying a 3-D graphical scene, one haptic device and the training software, which simulates and teaches assembly and disassembly procedural tasks. As it will be discussed later, thanks to a special interface the platform is flexible enough to use different types of haptic devices with different features (large or small workspace, one or two contact points, different DoFs, etc.) depending on the task demands.
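The following sketch illustrates the idea behind such a device-independent interface (the API–HAL described in Section 3.2): the training software talks only to an abstract device class, and each haptic device is a driver behind it. All class and method names here are hypothetical, not the actual API–HAL signatures:

```python
# Sketch of a haptic-device abstraction layer: the application depends on
# one interface, so large- and small-workspace devices are interchangeable.

from abc import ABC, abstractmethod

class HapticDevice(ABC):
    @abstractmethod
    def read_pose(self):
        """Return the current stylus position (x, y, z) in metres."""

    @abstractmethod
    def send_force(self, force):
        """Command a feedback force (fx, fy, fz) in newtons."""

class DesktopDevice(HapticDevice):
    """Stand-in for a small-workspace device; a real driver would talk to hardware."""
    def read_pose(self):
        return (0.0, 0.0, 0.0)

    def send_force(self, force):
        pass  # forwarded to the device driver in a real implementation

def training_step(device: HapticDevice):
    # The training logic never names a concrete driver, so changing the
    # system configuration is just a matter of passing a different device.
    pos = device.read_pose()
    device.send_force((0.0, 0.0, 0.0))
    return pos
```

Swapping in a large-workspace device then only requires another `HapticDevice` subclass; the training software itself is untouched.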

Figure 4 shows the training system in which trainees have to grasp the stylus of the haptic device with one hand to interact with the virtual pieces of the scene. Besides, trainees can use the keyboard to send commands to the application (e.g. grasp a piece, change the view, ...).

Fig. 4. Multimodal Training System. On the left, the system is configured to use a large workspace haptic device. On the right, the system is configured to use a desktop (small workspace) haptic device.

The 3D-graphical scene is divided into two areas. The first area, the *repository*, is a back wall that contains the set of pieces to be assembled (in the case of an assembly task). The second area, the *working area*, is where trainees have to assemble/disassemble the machine or model. On the right part of the screen there is a *tools menu* with the virtual tools that can be chosen to accomplish the different operations of the task. When the user is close to a tool icon, the user can "grasp" the corresponding tool and manipulate it. Figure 5 shows one example of a virtual assembly scene of the experimental task described in the next Section. Using the haptic device, the trainees must grasp the pieces from the repository (with or without a tool) and move them to place them in their correct positions in the model. Throughout the training session, the system provides different types of information about the task, such as: information about the "task progress", technical descriptions of the components/tools and critical information about the operations. Critical information can also be sent through audio messages. When trainees make an error during the task, the system also displays a message with the type of error and plays a sound to indicate the fault.
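The step-validation logic described above (grasping the expected piece, placing it within tolerance of its target position, and signalling errors) can be sketched as follows; the task model, tolerance value and piece names are illustrative assumptions, not the system's actual data structures:

```python
# Sketch of validating one assembly step and producing the error feedback
# (message plus sound cue) described in the text.

import math

TOLERANCE = 0.005  # metres: how close a piece must be to its target pose

def check_placement(piece_pos, target_pos, expected_piece, grasped_piece):
    """Return (ok, message) for one assembly step."""
    if grasped_piece != expected_piece:
        # Wrong piece grasped from the repository: report the error type.
        return False, f"Error: wrong piece '{grasped_piece}' for this step"
    if math.dist(piece_pos, target_pos) > TOLERANCE:
        return False, "Error: piece not placed in its correct position"
    return True, "Step completed"

# The correct piece placed 1 mm from its target position passes:
ok, msg = check_placement((0.100, 0.050, 0.000), (0.101, 0.050, 0.000),
                          expected_piece="bearing", grasped_piece="bearing")
```

On a `False` result, the real system would display the returned message and play the fault sound.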

2. The *simulation thread* validates the motion requested by the trainee according to the task, for example: detecting collisions of the model in movement, simulating an operation or

Training of Procedural Tasks Through the Use of Virtual Reality and Direct Aids 55

3. The *rendering thread* updates the graphical rendering of the virtual scene, sends the requested audio feedback at each time, processes the commands of the trainees (i.e. making a zoom in, rotating the view, etc.) entered by the keyboard or the switches of

• *Graphics Rendering Module*: it loads and displays the virtual scenario from a VRML file. Each object is represented by a triangle mesh and a transformation matrix. This module also includes some features to simulate the visual deformation of flexible objects as cables. • *Haptic Rendering Module:* it analysis the collisions between the user and the scene for exploration tasks and between the object in movement and the rest of the objects for manipulation tasks. This module also supports two haptic devices working

• *Operation Simulator:* it simulates some of the most common operations of a assembly/disassembly task such as: insert/remove a component in/from another one,

• *Virtual Fixtures Module*: the term *virtual fixture* is used to refer to a task dependent aid (visual, audio or haptic aid) that guides the motion and actions of users. This module provides both visual and haptic fixtures. Examples of *visual fixtures* are: changing the color of a virtual object, making a copy of an object and displaying it in a specific position, rendering auxiliary virtual elements such as arrows, trajectories, reference planes, points, etc. The output of the *haptic fixtures* is rendered in the form of forces dependent on the type of fixture: e.g. trainees can receive forces in order to constraint their hand-motion along a specific path or just attraction/repulsion forces with different levels of force and areas of influence (a surrounding area of a object or the whole-working space). This library of virtual fixtures is the basis of most of the training strategies provided by the system.

constraining the motion of trainees along a specific path, etc.

Fig. 6. Multi-thread solution implemented in the IMA-VR system.

The main components of the new training system are:

simultaneously to simulate bi-manual operations.

align two components along an axis, etc.

the haptic device and controls the training process.

Fig. 5. Virtual assembly scene. Trainees have to grasp the correct piece from the backwall and place it in its correct position in the model.

The platform also provides several utilities to configure the training process: for example, starting the training session at any step, selecting whether the order of the sequence of steps is constrained (the steps can be performed in a flexible order or must follow a fixed one), performing steps automatically so that trainees can skip the easy steps and focus on the complex ones, or even "undoing" steps in order to repeat difficult ones. The system logs information about the training sessions in order to analyze the performance of the trainee; this information can then be processed to extract different performance measures. At the end of the session, the system provides a performance report containing information about:

• *Step performance*: time, information about how the step was performed (correct, without errors but with aids, with errors but without aids, with errors and aids, not finished, or performed automatically), the number of times that help was needed to finish the step and the number of errors.

• *Overall performance*: total time, description of the sequence performed, total number of correct steps, total number of steps without errors but with aids, total number of steps with errors but without aids, total number of steps with errors and aids, total number of steps not finished, total number of steps performed automatically by the system, total number of times that help was needed, total number of errors, and total number of consecutive steps done correctly.

Fig. 5. Virtual assembly scene. Trainees have to grasp the correct piece from the backwall and place it in its correct position in the model.

#### **3.2 Architecture and components**

The multimodal interaction of the trainees with the virtual scene involves different activities, such as haptic interaction, collision detection, visual feedback, command management, etc. All these activities must be synchronized and executed at the correct frequency in order to provide the trainees with a consistent and smooth interaction. Therefore, a multi-threaded solution with three threads is employed (see Figure 6):

1. The *haptic thread* analyzes the user position, at the frequency requested by the haptic device (usually 1 kHz), in order to calculate the force that will be fed back to the trainee. It also manages the communication between the system and the haptic device through a special API (Application Programming Interface): the API–HAL (Haptic Abstract Library). This API makes the software application independent from the haptic device drivers, so the system configuration can easily be changed and different haptic devices used according to the task demands.

Fig. 6. Multi-thread solution implemented in the IMA-VR system.
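The decoupling between a fast haptic loop and a slower graphics loop can be sketched as follows. This is a minimal, hypothetical illustration, not the IMA-VR code (the API–HAL is not public): names such as `SharedState` and `spring_force` are our own. It shows only the rate separation and the shared state that both threads touch.

```python
import threading
import time

class SharedState:
    """State shared between the haptic and graphics threads."""
    def __init__(self):
        self.lock = threading.Lock()
        self.device_pos = (0.0, 0.0, 0.0)   # position read from the haptic device
        self.force = (0.0, 0.0, 0.0)        # force to send back to the device
        self.running = True

def spring_force(pos, target, k=200.0):
    """Simple penalty force pulling the device point towards a target."""
    return tuple(k * (t - p) for p, t in zip(pos, target))

def haptic_thread(state, rate_hz=1000):
    """Runs at the rate requested by the device (usually 1 kHz)."""
    period = 1.0 / rate_hz
    while state.running:
        with state.lock:
            state.force = spring_force(state.device_pos, (0.0, 0.0, 0.0))
        time.sleep(period)

def graphics_thread(state, rate_hz=60):
    """Redraws the scene at a much lower rate than the haptic loop."""
    period = 1.0 / rate_hz
    while state.running:
        with state.lock:
            _pos = state.device_pos   # render the proxy at this position
        time.sleep(period)
```

In a real implementation the 1 kHz loop would be driven by the device driver's scheduler callback rather than `time.sleep`; flipping `state.running` to `False` shuts both loops down cleanly.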

The main components of the new training system are:


• *Training Controller Module*: this module provides several training strategies with different degrees of interaction between the trainees and the system. At one extreme is the *observational learning* strategy, in which the trainees do not interact with the virtual scenario at all; they simply receive information about how to undertake the task in order to develop a *mental model*<sup>4</sup> of it. There are also strategies in which the trainees have to perform the virtual task while being able to receive different types of multimodal aids from the system (based on the fixtures explained before); one of these strategies is described in detail in the next section. Finally, there is the *learning test strategy*, in which the trainees have to perform the virtual task by themselves, without receiving any aid from the system. Usually, a training programme combines several of these strategies; their selection depends on the task complexity and the profile of the trainees.
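The three kinds of strategies could be modelled, in a much simplified form, as follows. This is a hypothetical sketch; the enum and helper names are ours, not the platform's API.

```python
from enum import Enum, auto

class Strategy(Enum):
    OBSERVATIONAL = auto()   # system demonstrates; the trainee only watches
    GUIDED = auto()          # trainee performs the task with multimodal aids
    LEARNING_TEST = auto()   # trainee performs the task without any aid

def aids_available(strategy):
    """Whether the training controller may display aids under a strategy."""
    return strategy is Strategy.GUIDED

def trainee_interacts(strategy):
    """Whether the trainee manipulates the virtual scene at all."""
    return strategy is not Strategy.OBSERVATIONAL
```

A training programme would then be a sequence of such strategies, chosen according to task complexity and trainee profile.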

#### **4. Experiment: the use of direct aids to train a procedural task**

During the learning of a task, trainees may need to receive information about how to proceed with it. As discussed in previous sections, the way of providing these aids is essential to ensure the effectiveness of the training. *Direct aids* provide the information needed to perform the task in an easy way; however, some research works suggest that their use can have adverse effects on training. This section describes the experiment conducted to analyze whether the **controlled use of direct aids** to train a procedural task **damages the transfer** of the involved skills.

#### **4.1 Experimental task**

The selected experimental task was learning how to build a LEGO™ helicopter model composed of 75 bricks. Participants were trained in the task using the multimodal training system described in the last section with the OMNI device by SensAble. In order to avoid the effect of short-term memory, the experiment was conducted on two consecutive days. On the first day, the participants had to learn the procedural task using the virtual training system. On the second day, they had to build the same model using the physical LEGO™ bricks, as shown in Figure 7.

#### **4.2 Learning conditions**

According to the goal of the experiment, two experimental groups were defined to compare the effectiveness of the controlled direct aids with that of a classical aid, an instruction manual (indirect aids). The participants of both groups trained in the task using the same multimodal platform, so all participants were able to interact with the bricks of the virtual LEGO™ model through the haptic device and build the virtual helicopter. The difference between the groups was the way of providing the information about the task:

• *Group 1 - Indirect aids*: the participants did not receive any aid from the training platform; to get information about the immediate action to perform, they had to consult an instruction book, and each consultation was logged as one aid. This book contained step-by-step diagrams of the 75 steps of the task. Each diagram included a picture of the LEGO™ brick needed for the current step and a view of the model at the end of the step (see Figure 8). This kind of aid was considered an **indirect aid**, because the **participants** had to **translate in a cognitive way the information** of each diagram **into the actions** that had to be undertaken at that step.

Fig. 8. Three pages of the instruction book, showing for each step the target brick and the final result of the step.

• *Group 2 - Direct aids*: the training platform provided **information about the immediate action** that **participants** had to **perform in a direct way**, through visual and haptic aids. Each step consists of two main operations: selecting the correct brick and placing it in its correct position on the model. For the first action, selecting the correct brick, the target brick was highlighted in yellow (visual aid) and the trainee received an attraction force towards it (haptic aid); see Figure 9 on the top. For the second action, placing the brick, a copy of the correct brick was rendered at its target position (visual aid) and an attraction force towards that position was applied as well (haptic aid); see Figure 9 on the bottom.

Fig. 7. Building a real LEGO™ helicopter after being trained in the task with a multimodal training system.

<sup>4</sup> They provide humans with information on how physical systems work. Scientists sometimes use the term *mental model* as a synonym for *mental representation, knowledge of the task or knowledge representation.*
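The attraction force used by the direct aids can be sketched as a spring-like pull towards the target brick, clamped to a maximum magnitude. The gain and the force cap below are illustrative assumptions, not the values used by the platform; impedance-type devices such as the OMNI can only exert a few newtons, so a clamp of this kind is typical.

```python
import math

def attraction_force(pos, target, k=150.0, f_max=3.0):
    """Spring-like pull towards the target brick, clamped so the
    haptic device is never asked for more than f_max newtons.
    k and f_max are illustrative values, not the platform's."""
    f = [k * (t - p) for p, t in zip(pos, target)]
    mag = math.sqrt(sum(c * c for c in f))
    if mag > f_max:
        f = [c * f_max / mag for c in f]
    return tuple(f)
```

Far from the target the trainee feels a constant gentle pull of `f_max`; near the target the force decreases linearly to zero, so the hand settles on the brick.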

From the point of view of the authors, the main **disadvantage of the direct aids** is that overusing them can **inhibit an active exploration of the task** and the **trainees can become dependent on these aids**. This could impede the transfer of the skills to the real situation, where these aids are no longer available. **To avoid this problem**, the authors proposed **a training strategy based on providing these direct aids in a controlled way**, reducing them along the training process. In this experiment, the training period consisted of four training sessions. During the first training session, in order to give an overview of the task, the aids were displayed automatically by the system, i.e. after finishing an action the trainee automatically received information for the next action. During the second training session, the aids were displayed only when the trainees requested them by pressing the help key. Finally, during the third and fourth sessions, the aids were displayed only when the trainees requested them, as in the second session, and only after they had made at least one attempt at selecting the correct brick; i.e. the trainees had to try to select the correct brick by themselves before being able to receive the aid. Each requested help was automatically logged as one aid.

Fig. 9. Direct aids provided by the platform. On the top: aids to select the correct brick. On the bottom: aids to place the brick in its correct position.
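The session-by-session gating of the aids described above can be summarized in a small decision function. This is a hypothetical sketch of the logic, not the platform's code.

```python
def aid_granted(session, requested, attempts):
    """Decide whether the platform shows the direct aid for the current action.

    session   -- training session number, 1..4
    requested -- True if the trainee pressed the help key
    attempts  -- selection attempts already made in this step
    """
    if session == 1:
        return True                      # session 1: aids shown automatically
    if session == 2:
        return requested                 # session 2: aids only on request
    return requested and attempts >= 1   # sessions 3-4: request + one attempt
```

Each call that returns `True` would also be logged as one aid, which is how the aid counts reported later were obtained.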

In both groups, when trainees selected an incorrect piece, the platform played a "beep" sound and displayed a message to indicate the mistake. To evaluate the performance of the trainees, the system also logged the number of incorrect pieces selected. Additionally, the order of the steps was fixed and numbered from 1 to 75, so both strategies followed the same building order.

#### **4.3 Participants**

The experiment was undertaken with 40 undergraduate students and staff from CEIT, the University of Navarra and TECNALIA. Following a between-participants design, each participant was assigned to one of the two experimental groups (20 participants in each). For the assignment, the participants first filled in a demographic questionnaire whose answers were used to distribute them between the two groups in a homogeneous way. In the direct aids group, 48% of the participants were female, the average age was 33 with a range from 24 to 50, and only 32% of the participants had no experience with LEGO™ models. In the indirect aids group, 53% of the participants were female, the average age was 36 with a range from 26 to 48, and only 37% of the participants had no experience with LEGO™ models. All participants reported a normal sense of touch and vision, and none had experience in using haptic technologies.

Before starting the experiment, participants signed an informed consent agreement and were informed that if they achieved the best performance of their group, they would receive a reward.

#### **4.4 Procedure**

The experiment was conducted on two consecutive days. On the first day, the participants were familiarized with the MTS and the corresponding training condition. They then had to use the training system to learn how to build the virtual helicopter model during four training sessions (i.e. they built the virtual helicopter four times), with a short break between sessions to allow the examiner to restart the system. The initial position of the bricks was generated randomly, so it was different in each training session. The evaluator was in the room with the participant, and information about the task performance (number of aids, number of errors and the training time) was logged automatically by the system for further analysis. At the end of the training, the participants of both groups were requested to fill in a questionnaire to evaluate their experience with the multimodal system.

On the second day, the participants had to build the real LEGO™ helicopter model (the same one as in the training sessions). Each participant was instructed to complete the model as rapidly as possible, but without mistakes or missing bricks. This time, the participants had the instruction book available to consult when they did not know how to continue with the task, and each consultation was recorded as one "aid". The session was recorded on video for further analysis.

#### **4.5 Performance measures**

The measures used to analyze the final performance of the trainees were: the training time, the real task performance (execution time, number of aids with the instruction book, number of non-solved errors and type of errors) and the evolution of the performance of the trainees along the training process and in the transition from the training system to the real task. The subjective evaluation of the platform was also analyzed.

#### **4.6 Results and discussion**

Before analyzing the results, it should be noted that although the experimental task is conceptually simple, its correct execution is difficult due to its strong cognitive component. The task required memorizing the exact position of 75 bricks, most of them without any functional or semantic meaning (just bricks of different sizes, colours and numbers of pins), and part of the assembly procedure was totally arbitrary (different combinations of the bricks could generate the same shape of the helicopter model, but only one specific combination was considered valid and the rest of the options were considered non-solved errors).

In general, there were no significant differences between the results obtained by the two groups. The statistical analysis was performed with independent-samples t-tests, equal variances assumed.

Figure 10 shows that the mean real task execution time for the direct aids group, 18.6 minutes, did not differ significantly from the mean time for the indirect aids group, 17.9 minutes (t36=0.36, p=0.721). Nevertheless, it is important to point out that during the building of the real LEGO™ model, the participants of the indirect aids group had a certain advantage over the others: they were used to consulting the instruction book from the training period, and in this way they could get the information more quickly.

Fig. 10. Real task execution time.
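For reference, the statistic used throughout this section — an independent-samples t-test with equal variances assumed — can be computed as below. This is a generic sketch using only the Python standard library; the experimental data themselves are not reproduced here.

```python
from statistics import mean, variance

def pooled_t(a, b):
    """Independent-samples t statistic, equal variances assumed
    (pooled variance, df = len(a) + len(b) - 2)."""
    na, nb = len(a), len(b)
    # Pooled estimate of the common variance of the two samples.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    # Standard error of the difference of means, then the t statistic.
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5
```

The resulting t value is then compared against the t distribution with the given degrees of freedom to obtain the p-values reported in the text.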

In order to analyze the real task performance obtained in each group, each step of the task (75 steps in total) was classified according to the following criteria:

• Correct step: the participants made the step correctly (without any non-solved error) and without the use of the instruction book.

• Step with aids: the participants made the step correctly (without any non-solved error), although they looked at the instruction book because they had some doubts about how to undertake the step, or they did not know how to continue or how to solve an error.

• Step with non-solved errors: the participants made some error in the step and did not solve it, either because they did not know how to solve it or because they were not aware of the error.

Taking into account this classification, Figure 11 shows that there were no significant differences between the real task performances obtained in each group. In both groups, the mean number of correct steps (i.e. steps performed without the use of the instruction book and without non-solved errors) was around 60% of the total steps, and the mean number of steps with non-solved errors was very small, less than 2% of the total steps. The typology of the non-solved errors was also similar in both groups: most of them were related to the missing or wrong assembly of small pieces. Figure 12 shows some examples.

Fig. 11. Comparison of the real task performance between the two strategies.

Fig. 12. Examples of a correct assembly, missing a piece and wrong position of a piece.

In more detail, Figure 13 shows the mean number of consultations of the instruction book during the real task, for which there were no significant differences between the two groups (t36=0.14, p=0.889). Although no participant was able to build the real LEGO™ model without consulting the instruction book, the mean number of aids was less than 29 for both strategies. Additionally, the percentage of errors corrected by the participants without using the instruction book (see Figure 14) was larger in the direct aids group (28.6%) than in the indirect aids group (20.7%), so it seems the participants of the direct aids group may have created a better mental model of the task.

Fig. 13. Mean number of consults to the instruction book during the real task.

Fig. 14. Percentage of errors corrected by the trainees without using the book during the real task.
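The step classification defined earlier in this section maps directly onto the logged data. A hypothetical helper (ours, not the analysis code used in the study) might look like:

```python
def classify_step(used_book, nonsolved_errors):
    """Classify one of the 75 assembly steps from the logged data.

    used_book        -- trainee consulted the instruction book in this step
    nonsolved_errors -- number of errors left unsolved at the end of the step
    """
    if nonsolved_errors > 0:
        return "step with non-solved errors"
    return "step with aids" if used_book else "correct step"
```

Counting the labels over the 75 steps of each participant yields the per-group percentages compared in Figure 11.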

Regarding the analysis of the evolution of the performance of the trainees along the training process and the transition from the training platform to the real scenario, Figures 15 and

in each one of the training trials. Figure 18 shows that in the trial 1, in which the participants of both groups had just to use the information about the task in order to build the virtual model, the training time of the direct aids group was almost the half of the time of the indirect aids group (12.8 minutes in comparison with 22.5 minutes). And in the trials 3 and 4, although the participants of the direct aids group had the constraint of having to make an attempt of grasping the correct brick before being able to receive the aid, their training time was very similar to the time of the indirect aids group, in which the participants did not have this restriction. All these considerations indicate that a suitable use of direct aids can increase the efficiency of the training process since it could reduce the training time and maintain the same

Training of Procedural Tasks Through the Use of Virtual Reality and Direct Aids 63

performance level.

Fig. 17. Total training time at each experimental group.

Fig. 18. Evolution of the time along the four training sessions

environment? (Extremely artificial = 1, Natural = 7).

At the end of the training sessions all participants filled in a questionnaire, which was a reduced version of Witmer & Singer's Presence Questionnaire (Witmer & Singer (1998)). Since the haptic technologies provide a new interaction paradigm, this set of questions was useful to get extra information about the experience with the haptic training platform. The questionnaire consisted of 8 main semantic differential items, each giving a score in the scale

• Q1. How natural was the mechanism which controlled movement through the

from 1 (worst) to 7 (best).The questionnaire was elicited by the following questions:

Fig. 15. Evolution of the performance of the trainees (percentage of steps without non-solved errors and without aids) in each group.

16 show that the percentage of correct steps (without non-solved errors and without aids) increased along the training trials. Moreover, in the direct aids group the performance obtained in the real task was a little better than in the last trial of the training process, even though the real task was performed the day after of the training sessions. Additionally, the statistical analysis demonstrates that there was not significant differences in the transition from the MTS to the real task between both groups (t36=0.11, p=0.911). This fact demonstrates that the controlled use of direct aids does not damage the trainees performance when they change from the training platform (virtual task) to the real world (physical task) and therefore it eliminates the main disadvantage of the use of direct aids reported in the bibliography (Yuviler-Gavish et al. (2011)). It means, the controlled direct aids did not impede the transfer of knowledge as it was hypothesised in this experiment.

Fig. 16. Detailed information about the performance of the trainees (percentage of correct steps without non-solved errors and without aids) in the last training trial and in the real task.
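The performance measure plotted in Figures 15 and 16 is simple to state as code. The sketch below is illustrative only; the per-step record format (`unsolved_error`, `aid_used` flags) is an assumption for this example, not the MTS's actual data model:

```python
def pct_correct_steps(steps):
    """Percentage of steps completed without a non-solved error and without
    any aid: the performance measure of Figures 15 and 16."""
    if not steps:
        return 0.0
    ok = sum(1 for s in steps if not s["unsolved_error"] and not s["aid_used"])
    return 100.0 * ok / len(steps)

# Hypothetical trial: 3 of 4 steps were clean, so the score is 75.0
trial = [
    {"unsolved_error": False, "aid_used": False},
    {"unsolved_error": False, "aid_used": True},
    {"unsolved_error": False, "aid_used": False},
    {"unsolved_error": False, "aid_used": False},
]
score = pct_correct_steps(trial)
```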

In relation to the last performance measure, Figure 17 shows the mean total training time for each group; the statistical analysis shows that there was no significant difference between the two groups (t36=-0.78, p=0.442). However, it is also interesting to analyze the training time


in each one of the training trials. Figure 18 shows that in trial 1, in which the participants of both groups only had to use the information about the task in order to build the virtual model, the training time of the direct aids group was almost half that of the indirect aids group (12.8 minutes versus 22.5 minutes). In trials 3 and 4, although the participants of the direct aids group had the constraint of having to attempt to grasp the correct brick before being able to receive the aid, their training time was very similar to that of the indirect aids group, whose participants did not have this restriction. All these considerations indicate that a suitable use of direct aids can increase the efficiency of the training process, since it could reduce the training time while maintaining the same performance level.
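The group comparisons quoted above (e.g. t36=-0.78, p=0.442 for total training time) are independent-samples t-tests with 36 degrees of freedom, i.e. two groups of 19. A minimal pooled-variance version is sketched below; the timing lists are made-up stand-ins, since the per-participant times are not published beyond the means given in the text:

```python
import math

def two_sample_t(a, b):
    """Student's t statistic (pooled variance) for two independent samples,
    with df = n1 + n2 - 2, as in the t36 values reported in the chapter."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    v1 = sum((x - m1) ** 2 for x in a) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in b) / (n2 - 1)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2

# Illustrative per-trial times in minutes (two groups of 19, so df == 36).
direct = [12.0 + 0.4 * i for i in range(19)]
indirect = [13.0 + 0.4 * i for i in range(19)]
t, df = two_sample_t(direct, indirect)  # df == 36
```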

Fig. 17. Total training time in each experimental group.

Fig. 18. Evolution of the training time along the four training sessions.

At the end of the training sessions all participants filled in a questionnaire, a reduced version of Witmer & Singer's Presence Questionnaire (Witmer & Singer (1998)). Since haptic technologies provide a new interaction paradigm, this set of questions was useful for getting extra information about the experience with the haptic training platform. The questionnaire consisted of 8 main semantic differential items, each giving a score on a scale from 1 (worst) to 7 (best). The questionnaire comprised the following questions:

• Q1. How natural was the mechanism which controlled movement through the environment? (Extremely artificial = 1, Natural = 7).

• Q2. How well could you concentrate on the assigned tasks or required activities rather than on the mechanisms used to perform those tasks or activities? (Not at all = 1, Completely = 7).

• Q3. How was the interaction with the virtual environment during the training session? (Difficult = 1, Easy = 7).

• Q4. How did you feel during the training session? (Not at all comfortable = 1, Very comfortable = 7).

• Q5. During the training sessions, did you learn the task? (Nothing 0% = 1, All 100% = 7).

• Q6. Are you able to repeat the task with the real bricks? (No, I can't do it = 1, Yes, I can do it = 7).

• Q7. How consistent was your experience in the virtual environment with respect to the real scenario? (Not at all consistent = 1, Very consistent = 7).

• Q8. What mark do you give to the Multimodal Training System as a training tool? (The worst = 1, The best = 7).

As shown in Figures 19 and 20, the questionnaire results did not show significant differences between the two groups. This seems logical, since the participants' interaction with the training system was similar in both groups; the only difference was the way in which the information about the task was provided. In general, the questionnaire results were quite positive.

Fig. 19. Level of usability of the Multimodal Training System.

Fig. 20. Evaluation of the system as a tool to learn procedural tasks.

Questions 1 to 4 (see Figure 19) measured the usability of the training platform. The results indicate that participants concentrated quite well on the task and that the platform was easy to use. Nevertheless, participants did not feel totally comfortable using the system. Some participants suggested eliminating the use of the keyboard for sending commands to the application by adding more buttons to the stylus of the haptic device. Moreover, they suggested increasing the duration of the familiarisation session in order to increase their confidence in using the training platform.

Questions 5 to 8 evaluated the system as a training tool. As shown in Figure 20, participants were quite confident about performing the real task and rated the system as a training tool with a score of 5.7 (direct aids group) and 5.9 (indirect aids group). Some of the participants in the direct aids group commented that they did not like being forced to attempt to grasp the correct brick before being able to receive the aid.

#### **5. Conclusions and future research**

This chapter presents a new Multimodal Training System for assembly and disassembly procedural tasks. Some features of this system are:

• It supports the approach of *learning by doing* by means of an active multimodal interaction (visual, audio and haptic) with the virtual scenario, eliminating the constraints of using the physical (real) scenario: mainly availability, time, cost and safety constraints. For example, it can provide training when the machine is still in the design phase, when it is operating and cannot be stopped, or when errors during training could damage the machine or injure the trainee.

• It provides different multimodal aids, not available in the real world, that help and guide the trainees during the training process.

• It is flexible enough to adapt to the demands of the task and to the preferences and needs of the trainees: flexibility in the available training strategies, in the sensory channel used to provide feedback, and in the supported haptic devices.

This work also described the characteristics, advantages and disadvantages of using VR technologies for training. One of the main drawbacks of Virtual Training Systems is that trainees can become increasingly dependent on the features of these systems, which may inhibit their ability to perform the task in their absence. This negative effect of virtual aids was analyzed in the experiment described in this chapter, and the findings suggest that a strategy based on providing direct aids in a controlled way does not damage the knowledge transfer from the virtual system to the real world.

This outcome contrasts with other research works that show negative effects from the use of direct aids. Moreover, in the authors' view, the use of direct aids could reduce the training time and therefore increase the efficiency of the training process. Further experiments should be run to analyze how the use of direct aids can decrease the training time without damaging the final performance of the trainees.

During the experiment described in this chapter, the authors detected three main behaviour patterns in the participants, which can be useful for defining design recommendations for the virtual aids:

1. Participants who like to try the next action by themselves but request help when they do not know how to continue. In this case it is not necessary to add any constraint on receiving the aid; it is enough for the aid to be provided on demand.



2. Participants who prefer performing the next action by themselves even when they do not know it, and so make many attempts to guess the correct option. In most cases these participants only spent more training time without improving their knowledge. It would be suitable to define a maximum number of attempts, after which the aid is provided automatically by the system. This maximum number of attempts should be configurable and could depend on the profile of the trainee or the target task.

3. Participants who want easy access to the aid when they do not know how to continue. These participants were annoyed at being forced to attempt to grasp the correct brick before being able to receive the aid, and in addition their training time increased without any benefit. In this case, when trainees do not know how to continue, it is important that they can receive the aid from the system easily.
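The aid-delivery strategies suggested by these patterns can be sketched as a small policy object. The class, names and defaults below are hypothetical illustrations, not part of the MTS described in this chapter:

```python
from enum import Enum

class AidPolicy(Enum):
    ON_DEMAND = "on_demand"          # pattern 1: aid whenever it is requested
    FORCED_ATTEMPT = "forced"        # strategy tested in the experiment
    AUTO_AFTER_LIMIT = "auto_limit"  # pattern 2: automatic after N failures

class AidController:
    """Hypothetical controller deciding when a training aid is shown."""

    def __init__(self, policy, max_attempts=3):
        self.policy = policy
        self.max_attempts = max_attempts   # configurable per trainee or task
        self.failed_attempts = 0           # failed grasps on the current step

    def record_failed_attempt(self):
        self.failed_attempts += 1

    def next_step(self):
        self.failed_attempts = 0           # reset when the trainee advances

    def should_show_aid(self, requested):
        if self.policy is AidPolicy.ON_DEMAND:
            return requested
        if self.policy is AidPolicy.FORCED_ATTEMPT:
            # at least one attempt to grasp the correct brick is required
            return requested and self.failed_attempts >= 1
        # AUTO_AFTER_LIMIT: shown on request, or automatically at the limit
        return requested or self.failed_attempts >= self.max_attempts
```

Under this sketch, pattern 3's complaint corresponds to `FORCED_ATTEMPT`, while the recommendation for pattern 2 corresponds to `AUTO_AFTER_LIMIT` with a configurable `max_attempts`.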

Lastly, a final recommendation for future implementations of virtual training systems is that the system should detect and evaluate the behaviour of the trainees throughout the training session, in order to display adequate information according to the evolution of their performance.

#### **6. Acknowledgment**

The authors would like to acknowledge the project SKILLS in the framework of European IST FP6 ICT-IP-035005-2006 and the "Ministerio de Ciencia e Innovación" (Spain).

#### **7. References**

Albani, J. & Lee, D. (2007). Virtual reality-assisted robotic surgery simulation, *Journal of Endourology* 21: 285–287.
Basdogan, C., Ho, C., Srinivasan, M. & Slater, M. (2000). An experimental study on the role of touch in shared virtual environments, *ACM Transactions on Computer-Human Interaction* 7: 443–460.
Bhatti, A., Khoo, Y., Bing, C., Douglas, A. J., Nahavandi, S. & Zhou, M. (2008). Haptically enabled interactive virtual reality prototype for general assembly, *Proceedings of the World Automation Congress WAC 08*, pp. 1–6.
Blake, M. (1996). The NASA advanced concepts flight simulator - a unique transport aircraft research environment, *The AIAA Flight Simulation Technologies Conference*.
Borro, D., Savall, J., Amundarain, A., Gil, J., García-Alonso, A. & Matey, L. (2004). A large haptic device for aircraft engine maintainability, *IEEE Computer Graphics and Applications*.
Dale, E. (1969). *Audio-Visual Methods in Teaching*, New York.
Derossis, A. M., Bothwell, J., Sigman, H. H. & Fried, G. M. (1998). The effect of practice on performance in a laparoscopic simulator, *Surgical Endoscopy* 12: 1117–1120.
Doyle, M. C. & Snowden, R. J. (2001). Identification of visual stimuli is improved by accompanying auditory stimuli: The role of eye movements and sound location, *Perception* 30: 795–810.
Ernst, M. O. & Bülthoff, H. H. (2004). Merging the senses into a robust percept, *Trends in Cognitive Science* 8: 162–169.
Gaba, D. M. (1991). Human performance issues in anesthesia patient safety, *Problems in Anesthesia* 5: 329–350.
Gaba, D. M. & DeAnda, A. (1988). A comprehensive anesthesia simulation environment: re-creating the operating room for research and training, *Anesthesiology* 69: 387–394.
Garcia-Robledo, P., Ortego, J., Ferre, M., Barrio, J. & Sanchez-Uran, M. (2011). Segmentation of bimanual virtual object manipulation tasks using multifinger haptic interfaces, *IEEE Transactions on Instrumentation and Measurement* 60(1): 69–80.
Gaver, W. W. (1994). Using and creating auditory icons, in *Auditory Displays: Sonification, Audification and Auditory Interfaces*, Addison Wesley.
Godley, S. T., Triggs, T. J. & Fildes, B. N. (2002). Driving simulator validation for speed research, *Accident Analysis & Prevention* 34: 589–600.
Gupta, R., Whitney, D. E. & Zeltzer, D. (1997). Prototyping and design for assembly analysis using multimodal virtual environments, *Computer-Aided Design* 29(8): 585–597.
Handel, S. & Buffardi, L. (1969). Using several modalities to perceive one temporal pattern, *Quarterly Journal of Experimental Psychology* 21: 256–266.
Held, R. & Durlach, N. (1992). Telepresence, *Presence: Teleoperators and Virtual Environments* 1: 109–112.
Heller, M. (1982). Visual and tactual texture perception: intersensory cooperation, *Perception & Psychophysics* 31: 339–344.
Holden, M. K. (2005). Virtual environments for motor rehabilitation: review, *CyberPsychology & Behavior* 8: 187–211.
Howell, J., Conatser, R., Williams, R., Burns, J. & Eland, D. (2008). The virtual haptic back: a simulation for training in palpatory diagnosis, *BMC Medical Education*.
Jayaram, S., Wang, Y., Jayaram, U., Lyons, K. & Hart, P. (1999). VADE: A virtual assembly design environment, *IEEE Computer Graphics and Applications* pp. 44–50.
Kneebone, R. (2003). Simulation in surgical training: educational issues and practical implications, *Medical Education* 37: 267–277.
Lee, W. S., Kim, J. H. & Cho, J. H. (1998). A driving simulator as a virtual reality tool, *IEEE International Conference on Robotics and Automation*.
Mayes, T. (1992). The M word: Multimedia interfaces and their role in interactive learning systems, in *Multimedia Interface Design in Education*, Berlin: Springer-Verlag, pp. 1–22.
McDermott, S. D. & Bras, B. (1999). Development of a haptically enabled dis/re-assembly simulation environment, *Proceedings of DETC'99: ASME Design Engineering Technical Conferences*.
McLaughlin, A. C. & Rogers, W. A. (2010). Learning by doing: understanding skill acquisition through skill acquisition, *Human Factors and Ergonomics Society Annual Meeting Proceedings* pp. 657–661.
Morgan, P. J., Cleave-Hogg, D., DeSousa, S. & Tarshis, J. (2004). High-fidelity patient simulation: validation of performance checklists, *British Journal of Anaesthesia* 92: 388–392.
Murray, M. M., Molholm, S., Michel, C., Heslenfeld, D., Ritter, W., Javitt, D. C., Schroeder, C. & Foxe, J. J. (2005). Grabbing your ear: rapid auditory-somatosensory multisensory interactions in low-level sensory cortices are not constrained by stimulus alignment, *Cerebral Cortex* 15: 963–974.
Oviatt, S., Coulston, R. & Lunsford, R. (2004). When do we interact multimodally? Cognitive load and multimodal communication patterns, *Proc. of International Conference on Multimodal Interfaces*, ACM Press, pp. 129–136.
Reiner, M. (2004). The role of haptics in immersive telecommunication environments, *IEEE Transactions on Circuits and Systems for Video Technology* 14: 392–401.

**4**

**The Users' Avatars Nonverbal Interaction in Collaborative Virtual Environments for Learning**

Adriana Peña Pérez Negrón1, Raúl A. Aguilar2 and Luis A. Casillas1 *1CUCEI-Universidad de Guadalajara 2Universidad Autónoma de Yucatán-Mathematics School Mexico*

**1. Introduction**

In a Collaborative Virtual Environment (CVE) for learning, an automatic analysis of collaborative interaction is helpful, either for a human or a virtual tutor, in a number of ways: to personalize or adapt the learning activity, to supervise the apprentices' progress, to scaffold learners or to track the students' involvement, among others. However, this monitoring task is a challenge that demands understanding and assessing the interaction in a computational mode.

In real life, when people interact to carry out a collaborative goal, they tend to communicate exclusively in terms that facilitate the task achievement; this communication goes through verbal and nonverbal channels. In multiuser computer scenarios, the graphical representation of the user, his/her avatar, is his/her means to interact with others, and it comprises the means to display nonverbal cues such as gaze direction or pointing.

Particularly in a computer environment with visual feedback for interaction, collaborative interaction analysis should not be based only on dialogue, but also on the participants' nonverbal communication (NVC), where the interlocutor's answer can be an action or a gesture.

Human nonverbal behavior has been broadly studied, but as Knapp and Hall pointed out in their well-known book (2007): *"…the nonverbal cues sent in the form of computer-generated visuals will challenge the study of nonverbal communication in ways never envisioned".*

Within this context, in a CVE each user action can be evaluated, in such a way that his/her nonverbal behavior represents a powerful resource for collaborative interaction analysis. On the other hand, virtual tutors are mainly intended for guiding and/or supervising the training task; that is, they are task-oriented rather than oriented to facilitate collaboration. With the aim of conducting automatic analyses intended to facilitate collaboration in small groups, the interpretation of the users' avatars nonverbal interaction during collaboration in CVEs for learning is discussed here. This scheme was formulated based on an NVC literature review in both face-to-face and Virtual Environments (VE). In addition, an empirical study conducted to understand the potential of this monitoring type based on nonverbal behavior is presented.
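As a rough illustration of the kind of automatic monitoring this chapter argues for, the sketch below tallies avatars' nonverbal events per participant as a crude involvement indicator. The event log format and event names are hypothetical, not taken from the study:

```python
from collections import Counter

# Hypothetical nonverbal event types an avatar can display in a CVE.
NVC_EVENTS = {"gaze", "point", "manipulate", "talk_turn"}

def tally_nonverbal(log):
    """Count nonverbal events per participant from a (participant, event)
    log; unknown event types are ignored."""
    counts = {}
    for participant, event in log:
        if event in NVC_EVENTS:
            counts.setdefault(participant, Counter())[event] += 1
    return counts

# Hypothetical session log for two participants.
log = [("ana", "gaze"), ("ana", "point"), ("luis", "talk_turn"),
       ("ana", "gaze"), ("luis", "wave")]  # "wave" is not tracked
involvement = tally_nonverbal(log)
```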

