**Mobile Mixed Reality System for Architectural and Construction Site Visualization**

Charles Woodward and Mika Hakkarainen *VTT Technical Research Centre of Finland, Finland* 

## **1. Introduction**

The Architecture, Engineering and Construction (AEC) sector is widely recognized as one of the most promising application fields for Augmented Reality (AR). Building Information Models (BIM), and in particular the Industry Foundation Classes (IFC) data format, are another main technology driver increasingly used for data sharing and communication purposes in the AEC sector (Koo & Fischer 2000). For example, the Finnish state-owned facility management company Senate Properties demands the use of IFC-compatible software and BIM in all their projects (Senate 2007).

At some advanced construction sites, 3D/4D Building Information Models are starting to replace paper drawings as reference media for construction workers. Thus, workers can check daily work tasks using BIM systems installed at site offices, sometimes with remote connections to BIM databases, and even annotate the virtual model with information relating to the construction site. However, the model data is mostly hosted on desktop systems in the site office, which is situated far away from the target location and not easily accessible. Combined with mobile Augmented Reality and time schedules, 4D BIMs could facilitate on-the-spot comparisons of the actual situation at the construction site with the building's planned appearance and other properties at the given moment.

Besides augmented visualization, the related camera tracking technologies open up further application scenarios, enabling mobile location-based feedback from the construction site to the CAD and BIM systems. Such feedback possibilities include adding elements of reality such as images, reports and other comments to the virtual building model, correctly aligned in both time and space. Our discussion thus addresses the complete spectrum of Mixed Reality as defined by Milgram and Kishino (1994), with the real world augmented with virtual model data, and digital building models augmented with real-world data.

Shin and Dunston (2008) evaluated 17 classified work tasks in the AEC industry. They concluded that eight of them (layout, excavation, positioning, inspection, coordination, supervision, commenting and strategizing) could potentially benefit from the use of AR. Additionally, related application areas would be communication and marketing prior to construction work, as well as building life cycle applications after the building is constructed.

Among previous work, the first mobile AR system was developed by Feiner et al. (1997). Their application was to present an AR view of campus information at Columbia University.

Gleue and Thaene (2001) presented the Archeoguide system to provide tourists with an AR view of historical and cultural sites. More recently, Reitmayr and Drummond (2006) presented a robust feature-based and hybrid tracking solution for outdoor mobile AR. Among the first to address practical AEC applications, Schall et al. (2008) presented a mobile handheld AR system, Vivente, for visualizing underground infrastructure. Their work was extended with state-of-the-art sensor fusion methods for outdoor tracking in (Schall et al. 2009). For further references on mobile AR with building construction models, see the thesis by Behzadan (2008) and the review article (Izkara et al. 2009).

However, little research has been done to integrate mobile AR with real world building models, often containing millions of triangles and being hundreds of megabytes in size. Integrating the time component to mobile AR solutions is another topic that is seldom addressed in previous literature. Among non-mobile solutions, however, let us note the impressive work (Goldparvar-Fard et al. 2010). They provide off-line still image based tools to compare the situation at construction site against 4D plans, based on 3D reconstruction of the construction site created from photographs taken of the site.

Our long-term research goal has been to prove the technical validity of bringing real world BIM models to the construction site, for augmenting with lightweight mobile devices. Our work on mobile AR dates back to 2003 with the client-server implementation on a PDA device (Pasman & Woodward 2003). The next generation implementation (Honkamaa et al. 2005) produced a marker-free UMPC solution by combining the building's location in Google Earth, the user's GPS position, optical flow tracking and user interaction for tracking initialization. This work led to the first version of the current system architecture (Hakkarainen et al. 2009), which handles arbitrary OSG formats and IFC (instead of just Google Earth's Collada), 4D models for construction time visualization (instead of just 3D), and mobile feedback from the construction site to the design system ("augmented virtuality"). The system was further extended in (Woodward et al. 2010) to cover more accurate map representations, mobile interaction, operation with data glasses, efficient client-server architecture, tracking methods, as well as discussion on photorealistic visualization for mobile AR.

This article gives an overall presentation of our software system, its background, current state and future plans. Among the most recent developments, we present: the client implementation on mobile phones, based on a lightweight optical tracking solution; results of our field trials in different pilot cases, including application during the construction work and comparing previous visualization results with the appearance of a partially ready building; as well as conclusions of the present status of the research.

The article is organized as follows. Section 2 explains the general implementation and functionality of the core software modules. The mobile phone implementation is discussed in Section 3. Our lightweight feature-based tracking solution is presented in Section 4. The photorealistic rendering functionality for mobile AR is described in Section 5. Results from our field trials are presented in Section 6. Items for future work are pointed out in Section 7 and concluding remarks are given in Section 8.

## **2. System overview**

This section presents the general implementation of the system. The discussion is given mainly from a functional point of view, while a more detailed discussion is provided in (Woodward et al. 2010).

#### **2.1 Software modules**

Our system is divided into three parts: 4DStudio, MapStudio and OnSitePlayer. The Studio applications fulfill the authoring role of the system and are typically used at the office, while OnSitePlayer provides the augmented reality view and mobile feedback interface at the construction site. OnSitePlayer can be operated either stand-alone or as a client-server solution, distributing heavy 3D computation to the OnSiteServer extension, and tracking and rendering to the OnSiteClient extension. See Figure 1.

Fig. 1. System architecture.

The tracking algorithms are based on our software library ALVAR – A Library for Virtual and Augmented Reality (VTT 2011), and the OpenCV computer vision library. The GUI is built using the wxWidgets framework. For rendering, the open-source 3D graphics library OpenSceneGraph (OSG) is used. The applications can handle all OSG-supported file formats via OSG's plug-in interface (e.g. OSG's internal format, 3DS, VRML). The TNO IFC Engine3 (TNO 2010) is used as a platform to process IFC building model files.

#### **2.2 4DStudio**

The 4DStudio application takes the building model (in IFC or some other format) and the construction project schedule (in MS Project XML format) as input. 4DStudio can then be used to link these into a 4D BIM. 4D IFC models defined with Tekla Structures can also be read directly by 4DStudio. Once the model has been defined, 4DStudio outputs the project description as an XML file.

4DStudio has a list of all the building parts and project tasks, from which the user can select the desired elements for visualization. For interaction, 4DStudio provides various tools to select elements for visualization, user-definable color coding, clip planes, and viewing the model along the time line. See Figure 2.

Fig. 2. Building model with construction schedule in 4DStudio.
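The linking of building elements to schedule tasks can be illustrated with a minimal Python sketch. It is not 4DStudio's actual data model: the element identifiers, task names and dates below are hypothetical, and the schedule is assumed to have already been parsed into start and finish dates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Task:
    """One schedule task with its planned start and finish dates."""
    name: str
    start: date
    finish: date


# Hypothetical link table: building element id -> schedule task.
LINKS = {
    "wall-01": Task("Ground floor walls", date(2010, 3, 1), date(2010, 3, 14)),
    "slab-01": Task("First floor slab", date(2010, 3, 15), date(2010, 3, 28)),
    "roof-01": Task("Roof", date(2010, 5, 1), date(2010, 5, 20)),
}


def elements_built_by(links, day):
    """Ids of elements whose task is planned to be finished by `day`."""
    return sorted(elem for elem, task in links.items() if task.finish <= day)


def elements_in_window(links, start, end):
    """Ids of elements whose task overlaps the time window [start, end]."""
    return sorted(
        elem for elem, task in links.items()
        if task.start <= end and task.finish >= start
    )
```

Viewing the model "along the time line" then corresponds to calling `elements_built_by` with a moving date, while a freely chosen visualization interval corresponds to `elements_in_window`.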

Feedback report items generated with the mobile AR system describe for example tasks or problems that have been observed at the construction site by workers. These can also be viewed with 4DStudio. Each item contains a title, a task description, a time and location of the task, and optionally one or several digital photos. Selecting a report item in the list takes the 4D building model to the time and location of the report item in question.
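As an illustration of what such a report item might look like when stored as XML, the following sketch builds one with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`report`, `title`, `location`, `photo`, and so on) are assumptions made for this example; the actual schema used by the system is not described here.

```python
import xml.etree.ElementTree as ET


def build_report(title, description, timestamp, position, photos=()):
    """Build a feedback report element (hypothetical schema).

    timestamp is an ISO 8601 string, position an (x, y, z) tuple.
    """
    report = ET.Element("report")
    ET.SubElement(report, "title").text = title
    ET.SubElement(report, "description").text = description
    ET.SubElement(report, "time").text = timestamp
    x, y, z = position
    ET.SubElement(report, "location", x=str(x), y=str(y), z=str(z))
    for path in photos:
        ET.SubElement(report, "photo", file=path)
    return report


def reports_sorted_by_time(reports):
    """Order reports chronologically; ISO 8601 strings of one format sort correctly."""
    return sorted(reports, key=lambda r: r.findtext("time"))
```

Selecting a report in a list would then read the `time` and `location` entries back and move the 4D model view accordingly.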

#### **2.3 MapStudio**

The MapStudio application is used to position the models into a geo coordinate system, using an imported map image of the construction site. The geo map can be imported from Google Earth or, for more accurate representations, from geospatial data formats such as GeoTIFF. The image import is done using the open source Geospatial Data Abstraction Library (GDAL).
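Georeferenced map images typically carry an affine transform between pixel and map coordinates. The sketch below uses the six-coefficient convention of GDAL's geotransform, written in plain Python so that it runs without GDAL installed. That MapStudio uses the coefficients in exactly this way is an assumption, and the coefficient values in the usage note are made up.

```python
def pixel_to_geo(gt, col, row):
    """Map pixel (col, row) to geo coordinates using a GDAL-style geotransform.

    gt = (x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height)
    """
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def geo_to_pixel(gt, x, y):
    """Inverse mapping: geo coordinates back to (col, row) in the map image."""
    det = gt[1] * gt[5] - gt[2] * gt[4]
    if det == 0:
        raise ValueError("geotransform is not invertible")
    dx, dy = x - gt[0], y - gt[3]
    col = (gt[5] * dx - gt[2] * dy) / det
    row = (-gt[4] * dx + gt[1] * dy) / det
    return col, row
```

For a north-up image with 0.5 m pixels, for instance `gt = (385000.0, 0.5, 0.0, 6672000.0, 0.0, -0.5)`, a mouse click at pixel (100, 200) corresponds to the map position returned by `pixel_to_geo(gt, 100, 200)`.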

The models are imported from 4DStudio, and can be any OSG compatible format or IFC format. The model can either be a main model or a so-called block model, which is used to enrich the AR view, or to mask the main model with existing buildings. The system can also be used to add clipping information to the models, for example the basement can be hidden in the on-site visualization.

The user can position the models on the map either by entering numerical parameters or by interactively positioning the model with the mouse (see Figure 3). Once all the model information has been defined, the AR scene information is stored as an XML based scene description, ready to be taken out for mobile visualization on site.

Fig. 3. Building placed in geo coordinates with MapStudio.
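Positioning a model by numerical parameters can be thought of as a rigid transform from model coordinates to map coordinates. The following sketch assumes, for illustration only, that the parameters are a 2D origin (easting, northing) and a counter-clockwise rotation about the vertical axis; the actual parameter set of MapStudio is not specified here.

```python
import math


def place_point(local_xy, origin_xy, rotation_deg):
    """Transform a model-local point to map coordinates.

    The point is rotated counter-clockwise by rotation_deg about the
    model origin and then translated to origin_xy.
    """
    angle = math.radians(rotation_deg)
    x, y = local_xy
    ox, oy = origin_xy
    return (
        ox + x * math.cos(angle) - y * math.sin(angle),
        oy + x * math.sin(angle) + y * math.cos(angle),
    )
```

Dragging the model with the mouse changes `origin_xy`, while rotating it changes `rotation_deg`; both interaction styles end up setting the same parameters.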

#### **2.4 OnSitePlayer**

OnSitePlayer is launched at the remote location by opening a MapStudio scene description, or by importing a project file containing additional information. The application then provides two separate views in tabs: a map layout of the site with the models, including the user location and viewing direction (see Figure 4), and an augmented view with the models displayed over the real-time video feed (see Figures 5 and 6).

The user is able to request different types of augmented visualizations of the model based on time, for example defining the visualization start-time and end-time freely, using clipping planes, and/or showing the model partially transparent to see the real and existing structures behind the virtual ones. OnSitePlayer also allows for storing augmented still images and video of the visualization, to be later reviewed at the office.

With OnSitePlayer, the user can also create mobile feedback reports consisting of still images annotated with text comments. Each report is registered in the 3D environment at the user's location, camera direction, and moment in time. The reports are attached to the BIM via XML files and are available for browsing with 4DStudio, as explained above.

Fig. 5. OnSitePlayer view showing viewfinder for the placemark.

Fig. 6. Building model augmented with OnSitePlayer, at two different locations.

#### **2.5 Interactive positioning**

As GPS positioning does not always work reliably (e.g. when indoors) or accurately enough, we provide the user with the option to indicate his/her location interactively. The system presents the user with the same map layout as used in the MapStudio application. The user is then able to zoom into the map and place the camera icon at his/her currently known location. Note that by using manual positioning, possible errors in the model's and user's positioning are aligned and thus eliminated from the model orientation calculation. Additionally, the user's elevation from ground level can be adjusted with a slider.

Fig. 4. User position and placemark shown in OnSitePlayer.

The compass (if any) does not always provide sufficient grounds for automatic tracking initialization. As a backup, interactive means are provided for model alignment. After the model is properly aligned, the system switches to feature-based tracking.

The interactive alignment of the video and the building models can be achieved in several ways (Woodward et al. 2010). As one option, block models that represent existing buildings can be used as a reference for the initial alignment. However, this approach requires modeling parts of the surrounding environment, which might not always be possible or feasible.

As a more generally applicable approach (Wither et al. 2006), known elements of the real world are marked in MapStudio as "placemarks" (see Figure 4). The mobile user then selects any of the defined placemarks with the "viewfinder" to initialize real time tracking (see Figure 5). Real time augmented view (Figure 6) is produced as the user "shoots" the placemark by pressing a button on the mobile device.
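Once the user's position and the placemark's position are both known in map coordinates, the camera direction at the moment of "shooting" follows from simple geometry. The sketch below assumes local metric (east, north, up) coordinates and a camera aimed exactly at the placemark; the function name and coordinate convention are illustrative and not taken from the system.

```python
import math


def heading_and_pitch(user, placemark):
    """Viewing direction from the user to a placemark.

    Both points are (east, north, up) in metres. Returns (heading, pitch)
    in degrees, heading measured clockwise from north in [0, 360).
    """
    d_east = placemark[0] - user[0]
    d_north = placemark[1] - user[1]
    d_up = placemark[2] - user[2]
    heading = math.degrees(math.atan2(d_east, d_north)) % 360.0
    pitch = math.degrees(math.atan2(d_up, math.hypot(d_east, d_north)))
    return heading, pitch
```

Because the direction depends only on the difference between the two positions, an error that shifts both by the same offset cancels out.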

We have developed altogether three vision based tracking methods to be used in different use cases. Two solutions were developed for the OnSiteClient application, one for PC and one for mobile phone. These solutions assume the user stands at one position, at least a few meters away from the target object, and explores the world by panning with the mobile device (camera). A separate solution was developed for the stand-alone OnSitePlayer on PC, allowing the user also to move freely while viewing. While the PC based tracking solutions

#### **2.6 Client-Server Implementation**

Virtual building models are often too complex and large to be rendered with mobile devices at a reasonable frame rate. This problem is overcome with the client-server extension for the OnSitePlayer application. The client extension, OnSiteClient, is used at the construction site while the server extension, OnSiteServer, is running at the site office or at some other remote location. Data communication between the client and server can be done using either WLAN or 3G.

The client and server applications were basically obtained with relatively small modifications to the OnSitePlayer code. The client and server share the same scene description as well as the same construction site geospatial information. The client is responsible for gathering position and orientation information, but instead of rendering the full 3D model, the client just passes the user location and viewing direction to the server. The server uses this information to calculate the correct model view, which is then sent to the client for augmenting on the mobile device.
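The exchange of position and viewing direction can be pictured as a small request message. The sketch below uses JSON purely for illustration; the field names and the encoding are hypothetical, and the actual wire format between OnSiteClient and OnSiteServer is not described here.

```python
import json


def encode_view_request(east, north, elevation, heading_deg, pitch_deg):
    """Client side: pack the user's pose into a UTF-8 JSON message."""
    message = {
        "position": {"east": east, "north": north, "elevation": elevation},
        "direction": {"heading": heading_deg, "pitch": pitch_deg},
    }
    return json.dumps(message).encode("utf-8")


def decode_view_request(payload):
    """Server side: recover (position, direction) tuples from the message."""
    message = json.loads(payload.decode("utf-8"))
    pos = message["position"]
    direction = message["direction"]
    return (
        (pos["east"], pos["north"], pos["elevation"]),
        (direction["heading"], direction["pitch"]),
    )
```

The request stays tiny compared with the model data, which is the point of the design: only the pose travels to the server, and only rendered views travel back.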

In our implementation, the view is represented as a textured spherical view of the virtual scene surrounding the user. The sphere is approximated by triangles. An icosahedron was chosen since it is a regular polyhedron formed from equilateral triangles, therefore simplifying the texture generation process. The icosahedron also provides a reasonable tradeoff between speed (number of faces) and accuracy (resolution of images).
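For reference, the geometry of an icosahedron can be generated with a few lines of Python. This is a generic construction from the golden ratio, not code from the system; faces are found by searching for vertex triples that are mutually one edge length apart, and their winding order is left unspecified.

```python
import itertools
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio


def icosahedron():
    """Return (vertices, faces) of an icosahedron inscribed in the unit sphere.

    vertices: 12 (x, y, z) tuples; faces: 20 triples of vertex indices.
    """
    verts = []
    for a, b in itertools.product((-1.0, 1.0), (-PHI, PHI)):
        # Cyclic permutations of (0, +-1, +-PHI).
        verts += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]
    radius = math.sqrt(1.0 + PHI * PHI)
    verts = [tuple(c / radius for c in v) for v in verts]
    edge = 2.0 / radius  # edge length after scaling to the unit sphere

    def adjacent(i, j):
        return math.isclose(math.dist(verts[i], verts[j]), edge, rel_tol=1e-9)

    faces = [
        tri for tri in itertools.combinations(range(len(verts)), 3)
        if all(adjacent(i, j) for i, j in itertools.combinations(tri, 2))
    ]
    return verts, faces
```

With 20 equilateral faces, at most 20 texture images are needed to cover the full sphere around the user.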

As the scene is rendered into the sphere representation, alpha values are used to indicate transparent parts of each texture image. If some image does not contain any part of the 3D model to be rendered, the whole image can be discarded and not sent to the client. See (Woodward et al. 2010) for further implementation details.
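The discarding step can be sketched as a simple alpha test over each face texture. The example assumes raw 8-bit RGBA pixel data, where every fourth byte is the alpha value; the dictionary layout keyed by face index is hypothetical.

```python
def has_visible_pixels(rgba):
    """True if any pixel in the raw 8-bit RGBA buffer has non-zero alpha."""
    return any(rgba[3::4])


def textures_to_send(face_textures):
    """Keep only the face textures that contain some part of the model.

    face_textures maps a face index to its raw RGBA bytes.
    """
    return {
        face: data
        for face, data in face_textures.items()
        if has_visible_pixels(data)
    }
```

When the model lies in one direction only, most faces are fully transparent and are never transmitted, which shortens the download considerably.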

The client augments the scene by aligning the sphere to the virtual camera coordinates according to the user's position and camera direction, and renders the alpha textured sphere over the video image. Camera tracking keeps the 2D visualization in place and the user may pan/tilt the view as desired.

The same sphere visualization can be used as long as the user remains at the same location. Our solution generally assumes that the user does not move about while viewing. This is quite a natural assumption, as viewing and interacting with a mobile device while walking would be quite awkward or even dangerous, especially on a construction site. The user is still free to rotate through the full 360°/360° range and view the entire sphere projection.
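One way to decide when a new sphere has to be requested is to compare the current position with the position for which the sphere was rendered. The sketch below is an assumption about how such a check could look; the class name and the tolerance value are hypothetical and not taken from the system.

```python
import math


class SphereCache:
    """Remembers where the current sphere was rendered (illustrative only)."""

    def __init__(self, tolerance_m=1.0):
        self.tolerance_m = tolerance_m  # hypothetical movement threshold in metres
        self.origin = None

    def needs_refresh(self, position):
        """True if no sphere is stored or the user has moved beyond the tolerance."""
        if self.origin is None:
            return True
        return math.dist(self.origin, position) > self.tolerance_m

    def store(self, position):
        """Record the position for which the current sphere was rendered."""
        self.origin = tuple(position)
```

Panning and tilting never change the position, so they never trigger a new request under this scheme.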

## **3. Mobile phone implementation**

In the PC-based client-server implementation (Woodward et al. 2010), the client and server extensions were obtained by direct modifications to the OnSitePlayer application. With the mobile phone implementation this was not feasible due to the difference of platforms. Also, to create as lightweight a solution as possible, we implemented a whole new client application for the Nokia N900 smart phone (see Figure 7).

The mobile phone client still supports the network connection and data stream provided by the original server on the PC. The application framework is built using Qt SDK 1.0 and Qt Mobility. Rendering is done with OpenGL ES 2.0. The network connection is ad-hoc WLAN.

Fig. 7. OnSiteClient running on N900 phone.

The functionality of our first mobile phone version is restricted to architect's visualization models, without time component or other advanced features. Positioning is done using the integrated GPS module, without any user interaction. On the other hand, the N900 does not have a compass so the user is responsible for defining the viewing direction.

All the user interactions are done via the touch screen. The viewing direction is defined with a slightly modified version of the PC based viewfinder approach. On the mobile phone we show all of the pre-defined viewfinder positions (authored in MapStudio) first in arbitrary direction. The user is then able to swipe the screen and choose the valid viewfinder(s) for the final aligning. After locking the model in the correct position, the viewfinder images are removed from the view and tracking is started.

Model rendering is based on the sphere projection method, as described above. The time needed for downloading the sphere images from the server depends on the number of images (triangles) required. Initializing a new sphere typically takes some 5 seconds, though in the worst case scenario (20 images, model all around the user) it takes up to 30 seconds. The initialization phase could be shortened (by up to some 50 %) by compressing the raw images and by packing multiple images into one texture. Alternatively, "hot spot" viewing positions can be defined at the office using OnSitePlayer. In this case the sphere images are stored beforehand in the OnSiteClient's scene description, and no downloads or even a connection to the server are required.

## **4. Tracking**


We have developed altogether three vision based tracking methods to be used in different use cases. Two solutions were developed for the OnSiteClient application, one for PC and one for mobile phone. These solutions assume the user stands at one position, at least a few meters away from the target object, and explores the world by panning with the mobile device (camera). A separate solution was developed for the stand-alone OnSitePlayer on PC, allowing the user also to move freely while viewing. While the PC based tracking solutions have been described in our previous article (Woodward et al. 2010), the implementation on mobile phone is new and is described in the following.

#### **4.1 Tracking on mobile phone**

Our light-weight markerless tracking solution designed for the mobile phone client application is based on rotation-invariant fast features (RIFF) (Takacs et al. 2010) and the FAST interest point detector (Rosten & Drummond 2006). The implementation follows closely the tracking logic of (Takacs et al. 2010) with the following modification. Instead of matching detected RIFF descriptors between two consecutive frames, we maintain a set of 3D features and assign one descriptor for each 3D feature. For each camera frame we select a sub-set of these 3D features by projecting the features using a predicted camera orientation and choosing features evenly across the image. To maintain real-time performance, only a limited number of features are selected. For each selected feature, matching descriptors are then searched around the projected feature positions. We use the same search radius of 8 pixels as in (Takacs et al. 2010).
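The feature selection step can be illustrated with a simple grid-based scheme: each stored 3D feature is projected with the predicted orientation, and at most one feature is kept per image cell. This is a minimal sketch under stated assumptions, not the authors' implementation: the pinhole model, the cell size, the feature limit and all names (`select_features`, the `direction` key) are hypothetical.

```python
def select_features(features, R, f, width, height, cell=40, max_features=50):
    """Pick at most one visible feature per grid cell, up to max_features.

    features: list of dicts with a 'direction' entry, a 3-vector pointing
              from the camera centre to the feature on the sphere.
    R:        predicted 3x3 world-to-camera rotation, as a list of rows.
    f:        focal length in pixels; the principal point is the image centre.
    Returns a list of (feature, (u, v)) pairs with predicted pixel positions.
    """
    cx, cy = width / 2.0, height / 2.0
    chosen = {}
    for feat in features:
        X = feat["direction"]
        x, y, z = (sum(R[r][i] * X[i] for i in range(3)) for r in range(3))
        if z <= 0.0:
            continue  # behind the camera
        u, v = f * x / z + cx, f * y / z + cy
        if not (0.0 <= u < width and 0.0 <= v < height):
            continue  # outside the image
        key = (int(u // cell), int(v // cell))
        if key not in chosen:  # one feature per cell spreads the selection
            chosen[key] = (feat, (u, v))
    return list(chosen.values())[:max_features]
```

Descriptor matching would then search around each returned position `(u, v)` within the 8 pixel radius stated above.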

Since descriptor matching gives correspondences between image corners and 3D features, the camera orientation is estimated simply by minimizing the re-projection error of the features. We use the Levenberg-Marquardt optimization routine for orientation estimation as in our previous implementation. We process each image pyramid level separately and the optimized orientation of the previous pyramid level is used as the initial camera orientation for the next pyramid level. For the first pyramid level, the final result of the previous frame is used instead.
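To make the optimization step concrete, the following sketch estimates a rotation-only camera pose by Levenberg-Marquardt minimization of the re-projection error for a single pyramid level. It is an illustration under assumptions and not the authors' code: the rotation-vector parametrization, the forward-difference Jacobian, the damping schedule and all function names are choices made for this example, and all points are assumed to stay in front of the camera.

```python
import math

def rodrigues(w):
    """Rotation matrix for a rotation vector w (axis times angle)."""
    theta = math.sqrt(sum(comp * comp for comp in w))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (comp / theta for comp in w)
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]

def residuals(w, points, observations, f):
    """Re-projection errors (pixels, principal point at the origin)."""
    R = rodrigues(w)
    res = []
    for X, (u, v) in zip(points, observations):
        x, y, z = (sum(R[r][i] * X[i] for i in range(3)) for r in range(3))
        res.extend((f * x / z - u, f * y / z - v))
    return res

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= factor * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

def estimate_orientation(w0, points, observations, f, iterations=20):
    """Levenberg-Marquardt refinement of the rotation vector, starting at w0."""
    w, lam, eps = list(w0), 1e-3, 1e-6
    cost = sum(r * r for r in residuals(w, points, observations, f))
    for _ in range(iterations):
        r0 = residuals(w, points, observations, f)
        cols = []  # Jacobian columns, one per rotation parameter
        for k in range(3):
            wk = list(w)
            wk[k] += eps
            rk = residuals(wk, points, observations, f)
            cols.append([(a - b) / eps for a, b in zip(rk, r0)])
        JtJ = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(3)]
               for i in range(3)]
        Jtr = [sum(a * b for a, b in zip(cols[i], r0)) for i in range(3)]
        A = [[JtJ[i][j] + (lam * JtJ[i][i] if i == j else 0.0) for j in range(3)]
             for i in range(3)]
        step = solve3(A, [-g for g in Jtr])
        trial = [wi + si for wi, si in zip(w, step)]
        trial_cost = sum(r * r for r in residuals(trial, points, observations, f))
        if trial_cost < cost:  # accept the step and trust the model more
            w, cost, lam = trial, trial_cost, lam * 0.1
        else:                  # reject the step and increase the damping
            lam *= 10.0
    return w
```

In a coarse-to-fine scheme such as the one described above, `estimate_orientation` would be called once per pyramid level, passing the result of the previous level as `w0`.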

Once all image pyramid levels have been processed, the set of 3D features is updated. First, outliers are detected from the residual re-projection errors. Feature quality values are increased for inliers and decreased for outliers. Once the quality value of a feature drops below a threshold, the feature is completely removed from the feature set. New 3D features are created by choosing strong FAST corners and back-projecting the corners onto the surface of a sphere centered at the camera. New features are created only in image regions where there are no existing features.
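The feature set maintenance can be sketched as follows. The outlier threshold, the quality increments and the removal threshold are placeholder values chosen for this example, since the text does not state them; the function and key names are likewise hypothetical.

```python
import math

def back_project(u, v, f, cx, cy, R):
    """Map an image corner to a unit direction on the camera-centred sphere.

    R is the current 3x3 world-to-camera rotation (list of rows), so the
    world direction is the transpose of R applied to the viewing ray.
    """
    ray = (u - cx, v - cy, f)
    norm = math.sqrt(sum(c * c for c in ray))
    ray = [c / norm for c in ray]
    return [sum(R[r][i] * ray[r] for r in range(3)) for i in range(3)]

def update_feature_set(features, errors_px, outlier_px=3.0, min_quality=0):
    """Raise quality for inliers, lower it for outliers, drop weak features.

    features:  list of dicts with an integer 'quality' entry.
    errors_px: residual re-projection error of each feature in pixels.
    """
    kept = []
    for feat, err in zip(features, errors_px):
        feat["quality"] += 1 if err <= outlier_px else -1
        if feat["quality"] >= min_quality:
            kept.append(feat)
    return kept
```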

Compared to our previous lightweight implementation (Woodward et al. 2010), the use of RIFF descriptors and FAST corners gives two clear benefits. Firstly, detecting FAST corners is much faster than the previously used interest point detector (Shi & Tomasi 1994). With a carefully optimized implementation we are able to reach a real-time performance of 30 FPS on the N900 mobile phone. Secondly, by tracking features using descriptor matching instead of the optical flow method of Lucas and Kanade (1981), we gain some ability for local recovery. The orientation of the camera is not updated if the tracker fails to match enough feature descriptors. In that case, the user can rotate the camera to bring more inlier features back into the camera view, thus restoring the previously found orientation.

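## **5. Rendering**

On-site visualization of architectural models differs somewhat from general purpose rendering (Klein & Murray 2008), (Aittala 2010) and the methods should be adapted to the particular characteristics of the application for optimal results. The following special characteristics typical for mobile architectural visualization were identified in (Woodward et al. 2010):

- Uneven tessellation of 3D CAD building models
- Shadow mapping methods, related to the previous
- Complex and constantly changing lighting conditions
- Aliasing problems with highly detailed building models
- Sharp computer graphics vs. web camera image quality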


We have experimented with the rendering and light source discovery methods described in (Aittala 2010) and integrated them into the OnSitePlayer application. Figure 8 shows an example of applying our rendering methods with a pilot project. The present implementation of the rendering methods covers: determining the sunlight direction based on GPS position, date and time of day; interaction with sliders to adjust daylight intensities; screen-space ambient occlusion; soft shadows based on shadow maps; and adjusting the rendered image quality to web camera aberrations.
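As an illustration of how a sunlight direction can be derived from location, date and time, the sketch below uses a common textbook approximation of the solar declination and hour angle. It is not the method of (Aittala 2010) or of OnSitePlayer. It assumes that local solar time is already known (converting clock time would additionally need the GPS longitude and the equation of time), and the function name `sun_direction` is hypothetical.

```python
import math

def sun_direction(latitude_deg, day_of_year, solar_hour):
    """Approximate sun elevation and azimuth in degrees.

    latitude_deg: geographic latitude, positive north (e.g. from GPS).
    day_of_year:  1..365.
    solar_hour:   local solar time in hours, 12.0 = sun at its highest.
    Azimuth is measured clockwise from north.
    """
    lat = math.radians(latitude_deg)
    # Simple cosine model of the solar declination over the year.
    decl = math.radians(-23.44) * math.cos(2.0 * math.pi / 365.0 * (day_of_year + 10))
    hour_angle = math.radians(15.0 * (solar_hour - 12.0))  # 15 degrees per hour
    sin_el = (math.sin(lat) * math.sin(decl)
              + math.cos(lat) * math.cos(decl) * math.cos(hour_angle))
    elevation = math.asin(max(-1.0, min(1.0, sin_el)))
    # East and north components of the sun vector in the local frame.
    east = -math.cos(decl) * math.sin(hour_angle)
    north = (math.sin(decl) * math.cos(lat)
             - math.cos(decl) * math.sin(lat) * math.cos(hour_angle))
    azimuth = math.degrees(math.atan2(east, north)) % 360.0
    return math.degrees(elevation), azimuth
```

The resulting elevation and azimuth would define the direction of the directional light used for shading and shadow mapping.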

Fig. 8. Photorealistic AR rendering with OnSitePlayer.

Automatic lighting acquisition from the real scene (Aittala 2010) has not been integrated into our system yet, and the current implementation has been done for the stand-alone OnSitePlayer system only. We plan to implement more advanced features also with the client-server solution, using separate feedback mechanisms for interaction and passing of lighting conditions of the real world scene to the server.

## **6. Field trials**

Several iterations of field trials have been performed with three pilot cases. The first mobile use experiments were done with a laptop PC device in summer 2009. We used the 4D model of the Koutalaki hotel in Lapland as an example and augmented it behind our Digitalo offices in Espoo. The experiment enabled us to verify that most of the intended functionality was already operational, including e.g. visualizing the building in various modes and along the timeline, masking the virtual model with the real one, creating and viewing of mobile feedback reports, etc. However, some problems were noticed with the user interface; in particular, the PC screen brightness was far from sufficient in bright daylight. Also, the poor accuracy of the compass as well as GPS was noticed to be a major problem in practice. This stimulated our decision to develop interactive positioning methods as backup for the sensors.

A second round of experiments was carried out in fall 2009 in the case of the Forchem oil refinery in Sweden, with the purpose of augmenting new equipment to be installed, using a Sony Vaio UX as the mobile device (see Figure 9). Video of these experiments is available in (VTT 2010a).

Fig. 9. Mobile AR view of Forchem factory on a UMPC.

In the Forchem case we relied completely on our 3D feature based tracking solution without sensors (Woodward et al. 2010). Tracking was initialized manually by having the user indicate point correspondences between the video image and the 3D model of the factory. As a hypothesis for future work, we believe this initialization step could be avoided by first roughly aligning the video and the model using compass information, and based on that, finding the actual point correspondences automatically.

Our most comprehensive field tests were conducted in a series of experiments with the new Skanska offices in Helsinki in 2010-2011. In summer 2010, before the building work started, we compared AR visualization of the planned building with different display devices: laptop PC on a podium, attached data glasses, and UMPC client. The first two devices were used in stand-alone mode while the UMPC was used in client-server mode. For rendering, we compared standard computer graphics without adjustments against our photorealistic rendering methods to account for light direction, intensity and other visual properties. See Figures 2-8 and video (VTT 2010b).

In October 2010, when the construction work had already started, we finally received the complete 4D model of the Skanska building (IFC model size 60 MB) and went out to try it at the construction site. We could then verify that our solution also worked in practice with this rather demanding experiment. With some user interaction, we were able to augment the complex model on site, and display the construction elements to be installed at different time frames and from various view points. With respect to tracking initialization, managing altitude information interactively was considered to be the biggest problem. The stand-alone laptop PC version was used in these experiments. See Figure 10 and video (VTT 2010b).

Fig. 10. Mobile AR during construction work.

A harsh winter interrupted our field tests for almost half a year. The most recent experiments with the Skanska pilot were done in May 2011, when the back part of the building was already completed and the first version of our mobile phone implementation was also ready. In these experiments we were able to verify that our mobile phone solution using the new tracking method and pre-defined placemarks on the scene provided a stable augmented view of the building (see Figure 7). Comparison of the OnSitePlayer view which we had computed nine months earlier (Figure 8) against the real situation at the site (Figure 11) also validated the quality of our photorealistic rendering methods.

Fig. 11. Photo of the Skanska building partly ready.

## **7. Future work**

For practical reasons, we still have a number of stand-alone OnSitePlayer features yet to be integrated in the client-server solution. Also, integration of our feature based tracking methods with sensor data as well as photorealistic rendering technology into the AR system is still under way. Some near term plans for interaction, tracking and rendering enhancements were discussed above, and previously in (Woodward et al. 2010). Positioning accuracy could also be improved by applying more accurate methods, e.g. differential GPS, Real Time Kinematics (RTK) and other measurement tools that are routinely employed at construction sites.

In the future, we also look forward to obtaining feedback from different user groups. The first formal user studies with the system will be performed in our next outdoors visualization project in September 2011. Handing out the system to actual end users will certainly bring up various proposals and wishes for improvements to the system. Instead of adding new functionality, however, we anticipate a general request to simplify the user interface and limit it to the most essential features.

## **8. Conclusions**

In this article, we have described a software system for mobile mixed reality interaction with complex 4D Building Information Models. Our system supports various native and standard CAD/BIM formats, combining them with time schedule information, fixing them to accurate geographic representations, using augmented reality with feature based tracking to visualize them on site, applying photorealistic rendering, with various tools for mobile user interaction and feedback. The client-server solution is able to handle complex models on mobile devices, and an efficient tracking solution enables implementation also on mobile phones.

While there is still some way to go until the technology is in daily use at real construction sites, and there are some general concerns for applicability such as weather conditions, we believe that we have proven the technical validity of the concept. In particular, mobile AR visualization of architectural models is already quite manageable with the present system. We look forward to evaluating our system with user tests in the future, and eventually to bringing our solutions to real production use.

## **9. Acknowledgements**

The main body of this work was conducted as part of the "AR4BC" project (2008-2010), with Skanska, Tekla, Pöyry, Buildercom, Adactive and Deskartes as industrial partners. The mobile phone implementation was done as part of the "DIEM3/MMR" project (2010-2011) with Nokia as industrial partner. The main funding for these projects was provided by Tekes (Finnish Funding Agency for Technology and Innovation).

Tuomas Kantonen was responsible for developing and implementing the feature-based tracking solution for mobile phones. Alain Boyer gave a valuable contribution in making the AR loop run efficiently on the Nokia N900 device. We also thank our colleagues Kari Rainio, Otto Korkalo and Miika Aittala who have been involved earlier in the implementation.

## **10. References**

Aittala M. (2010). Inverse lighting and photorealistic rendering for augmented reality. *The Visual Computer*. Vol 26, No. 6-8, pp. 669-678.

Behzadan A.H. (2008). *ARVISCOPE: Georeferenced Visualisation of Dynamic Construction Processes in Three-Dimensional Outdoor Augmented Reality*. PhD Thesis, The University of Michigan.

Feiner S., MacIntyre B., Höllerer T. and Webster A. (1997). A touring machine: prototyping 3D mobile augmented reality systems for exploring the urban environment. In *Proc. ISWC'97*, Cambridge, MA, USA, October 13, 1997.

Gleue T. and Daehne P. (2001). Design and implementation of mobile device for outdoor augmented reality in the Archeoguide project. *Proc. of the 2001 Conference on Virtual Reality, Archeology and Cultural Heritage*, ACM, Glyfada, Greece, pp. 161-168.

Golparvar-Fard M., Pena-Mora F. and Savarese S. (2010). D4AR – 4 dimensional augmented reality – tools for automated remote progress tracking and support of decision-enabling tasks in the AEC/FM industry. In *Proc. of The 6th Int. Conf. on innovations in AEC*, State College, PA, Jun 9-11, 2010.

Hakkarainen M., Woodward C. and Rainio K. (2009). Software Architecture for Mobile Mixed Reality and 4D BIM Interaction. In *Proc. 25th CIB W78 Conference*, Istanbul, Turkey, Oct 2009.

Honkamaa P., Siltanen S., Jäppinen J., Woodward C. and Korkalo O. (2007). Interactive outdoor mobile augmentation using markerless tracking and GPS. *Proc. Virtual Reality International Conference (VRIC)*, Laval, France, April 2007, pp. 285-288.

Izkara J. L., Perez J., Basogain X. and Borro D. (2009). Mobile augmented reality, an advanced tool for the construction sector. *Proc. 24th CIB W78 Conference*, Maribor, Slovakia, June 2009, pp. 453-460.

Klein G. and Murray D. (2008). Compositing for small cameras. *Proc. The 7th IEEE International Symposium on Mixed and Augmented Reality (ISMAR 2008)*, Cambridge, UK, September 2008, pp. 57-60.


**7**

**NeuAR – A Review of the VR/AR Applications in the Neuroscience Domain**

Pedro Gamito1,2,3, Jorge Oliveira1,2, Diogo Morais1,2, Pedro Rosa1,2,4 and Tomaz Saraiva1 *1Universidade Lusófona de Humanidades e Tecnologias, 2Centro de Estudos em Psicologia Cognitiva e da Aprendizagem, 3Clínica S. João de Deus, 4ISCTE-IUL/CIS, Portugal*

## **1. Introduction**

Computational applications based on virtual reality (VR) that aim at treating mental disorders and rehabilitating individuals with cognitive or motor disabilities have been around since the 1980s. They started off by focusing on simple phobias like acrophobia (Emmelkamp et al., 2002) and agoraphobia (Botella et al., 2004), fear of flying (Rothbaum, Hodges, Smith, Lee & Price, 2000), and evolved to fear of driving (Saraiva et al., 2007) or posttraumatic stress disorder (PTSD) (Gamito et al., 2010), schizophrenia (Costa & Carvalho, 2004) or traumatic brain injuries (Gamito et al., 2011a), among many others (Gamito et al., 2011b).

VR holds two chief properties that enable patients to experience the synthetic environment as being real: immersion and interaction. The first relates to the sensation of being physically present and perceptually included in the VR world. The second stands for the ability to change the world properties, i.e. the environment and its constituents react according to the participant's actions. Along with imagination, interaction and immersion concur to create the so called "sense of being there" or presence.

This characteristic of VR settings has been acknowledged by psychotherapists as a medium for exposing patients with anxiety disorders (AD) to anxiogenic cues within an ecologically sound and controlled environment. VR designed for therapeutic purposes can replicate any anxiogenic situation, enabling a better approximation to the anxiogenic world and inducing higher levels of engagement when compared to traditional imagination exposure (Riva et al., 2002). Hyperrealistic threatening stimuli provided by VR lead to higher attention and subsequent encapsulation, which means that once the fear system is activated, the participant perceives the synthetic world as being real (Hamm & Weike, 2005). Also, VR reduces the gap between reality and imagination by diminishing potential distraction from, or cognitive avoidance of, the threatening stimuli (Vincelli & Riva, 2000). These and other studies revealed that VR exposure therapy (VRET) may be an alternative to *in vivo* and imagination exposure.

