**2. Background concept and methodology**

The technical survey of a facility is a long and costly process that aims to build a digital model from geometric analysis, because modeling a facility as a set of vectors is insufficient in most cases. To address this problem, the International Alliance for Interoperability (IAI) developed a new standard over ten years: the Industry Foundation Classes (IFC) format (Vanland, et al., 2008). The specification is a neutral data format for describing, exchanging, and sharing the information typically used within the building and facility management industry. The standard treats building elements as independent objects, each characterized by a 3D representation and identified by a normalized semantic label. Consequently, architects and experts are no longer the only ones able to recognize the elements; anyone can, including the system itself. For instance, an IFC Signal is not just a collection of lines and geometric primitives recognized as a signal; it is an "intelligent" signal object whose attributes are linked to a geometric definition and a function. IFC files are made of objects and connections between these objects. Object attributes describe the "business semantics" of the object, and connections between objects are represented by "relation elements". This format and its semantics are the keystone of our solution.
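To make the idea of objects, attributes, and relation elements concrete, the following minimal sketch models them with plain Python data classes. It is an illustration only, not the IFC schema: the class names `Element` and `Relation`, the relation name `is_mounted_on`, and all attribute values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A building element: a semantic label, attributes, and a geometry."""
    label: str                                      # normalized semantic label
    attributes: dict = field(default_factory=dict)  # "business semantics"
    geometry: list = field(default_factory=list)    # e.g. (x, y, z) vertices


@dataclass
class Relation:
    """A 'relation element' that connects two objects."""
    kind: str        # hypothetical relation name, not an IFC entity
    source: Element
    target: Element


def related(relations, element, kind):
    """Return the targets linked to `element` by relations of the given kind."""
    return [r.target for r in relations
            if r.source is element and r.kind == kind]


# Hypothetical example data: a signal mounted on a mast.
signal = Element("Signal",
                 {"function": "stop indication", "height_m": 4.5},
                 [(0.0, 0.0, 0.0), (0.0, 0.0, 4.5)])
mast = Element("Mast", {"material": "steel"})
relations = [Relation("is_mounted_on", signal, mast)]
```

Because the label and the relations are explicit data rather than drawing conventions, a program can answer a question such as "what is this signal mounted on?" by calling `related(relations, signal, "is_mounted_on")`.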

The problem of 3D object detection and scene reconstruction that includes semantic knowledge has recently been treated in several domains: photogrammetry (Pu, et al., 2007), construction, robotics (Rusu, et al., 2009) and, more recently, knowledge engineering (Ben Hmida, et al., 2010). Modeling a survey, in which a low-level point cloud or surface representation is transformed into a semantically rich model, is done in three tasks. The first is data collection, in which dense point measurements of the facility are collected using laser scans taken from key locations throughout the facility. The second is data processing, in which the sets of point clouds from the collected scans are processed. The third is modeling the survey, in which the low-level point cloud is transformed into a semantically rich model. This is done by modeling geometric knowledge, qualifying topological relations and finally assigning an object category to each geometry (Boochs, et al., 2011). For geometry modeling, the goal is to create simplified representations of facility components by fitting geometric primitives to the point cloud data. The modeled components are labeled with an object category. Relationships between components must also be established, since they are useful in many scenarios; in addition, spatial relationships between objects provide contextual information that assists object recognition (Cantzler, 2003). The literature describes three main strategies to reach such a model. The first is based on human interaction with dedicated software for point cloud classification and annotation (Leica, 2011). The second relies on automatic data processing without any human interaction, using different segmentation techniques for feature extraction (Rusu, et al., 2009).
Finally, new techniques have emerged that improve on the previous ones by integrating semantic networks to guide the reconstruction process.
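Fitting a geometric primitive to point cloud data can be illustrated with a plane, the simplest primitive. The sketch below uses RANSAC, a common robust fitting technique chosen here for illustration; the cited works do not state that they use it. The function names, the distance threshold, the iteration count, and the synthetic points are all assumptions.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal, d) of the plane n.x = d, or None if collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]          # cross product u x v
    norm = sum(c * c for c in n) ** 0.5
    if norm < 1e-12:
        return None
    n = [c / norm for c in n]
    return n, sum(n[i] * p[i] for i in range(3))


def distance(plane, point):
    """Unsigned distance from a point to the plane."""
    n, d = plane
    return abs(sum(n[i] * point[i] for i in range(3)) - d)


def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Keep the candidate plane that has the most points within `threshold`."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        inliers = [p for p in points if distance(plane, p) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic data: 60 points on the vertical plane x = 2, plus 4 stray points.
wall = [(2.0, y * 0.5, z * 0.5) for y in range(10) for z in range(6)]
clutter = [(0.0, 1.0, 1.0), (0.5, 3.0, 0.2), (1.0, 0.3, 1.7), (3.5, 2.2, 0.9)]
plane, inliers = ransac_plane(wall + clutter)
```

The result is a purely geometric description (a normal vector and an offset); it carries no object category yet, which is why the labeling step described above is still needed.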

### **2.1 Manual survey model creation**

In current practice, the creation of a facility model is largely a manual process performed by service providers who are contracted to scan and model a facility. A project may take several months to complete, depending on the complexity of the facility and the modeling requirements. Reverse engineering tools excel at geometric modeling of surfaces but lack volumetric representations, while design systems cannot handle the massive data sets produced by laser scanners. As a result, modelers often shuttle intermediate results back and forth between different software packages during the modeling process, which creates a risk of information loss due to limitations of data exchange standards or errors in how the software tools implement those standards (Goldberg, 2005). Prior knowledge about component geometry, such as the diameter of a column, can be used to constrain the modeling process, and the characteristics of known components may be kept in a standard component library. Finally, the modeler determines the class of the detected geometry once the object is created. In some cases, relationships between components are established either manually or in a semi-automated manner.

### **2.2 Semi-Automatic and Automatic methods**

216 Semantics – Advances in Theories and Mathematical Models


The manual process for constructing a survey model is time-consuming, labor-intensive, tedious, subjective, and requires skilled workers. Even if modeling an individual geometric primitive can be fairly quick, modeling a facility may require thousands of primitives. The combined modeling time can be several months for an average-sized facility. Since the same types of primitives must be modeled throughout a facility, the steps are highly repetitive and tedious (Hajian, et al., 2009). These observations, among others, illustrate the need for semi-automated and automated techniques for facility model creation. Ideally, a system would take a point cloud of a facility as input and produce a fully annotated as-built model of the facility as output.

The first step of the automatic process is geometric modeling: constructing simplified representations of the 3D shape of survey components from point cloud data. In general, the shape representation is supported by Constructive Solid Geometry (CSG) (Corporation, 2006) or Boundary Representation (B-Rep) (CASCADE, 2000). The representation of geometric shapes has been studied extensively (Campbell, et al., 2001).

Once geometric elements are detected and stored in a specific representation, the final task of the facility modeling process is object recognition: labeling a set of data points, or geometric primitives extracted from the data, with a named object or object class. Whereas the modeling task would find a set of points to be a vertical plane, the recognition task would label that plane as a wall, for instance. Often, the knowledge describing the shapes to be recognized is encoded in a set of descriptors that implicitly capture object shape. Research on the recognition of facility-specific components is still in its early stages.
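The difference between modeling and recognition can be shown with a small rule-based labeler. The modeling step is assumed to have already produced a plane (its normal and mean height); the function below only assigns a category. This is a hedged sketch: `label_plane`, the floor and ceiling heights, and the tolerance are hypothetical values, not heuristics taken from the cited works.

```python
def label_plane(normal, mean_height, floor_z=0.0, ceiling_z=2.5, tol=0.1):
    """Assign an object category to a fitted plane.

    normal      -- (nx, ny, nz) of the plane, not necessarily unit length
    mean_height -- mean z coordinate of the points supporting the plane
    floor_z, ceiling_z, tol -- hypothetical thresholds for this sketch
    """
    nx, ny, nz = normal
    length = (nx * nx + ny * ny + nz * nz) ** 0.5
    if length == 0.0:
        raise ValueError("normal must be non-zero")
    vertical_component = abs(nz) / length

    # A horizontal normal means the plane itself is vertical.
    if vertical_component < tol:
        return "wall"
    # A vertical normal means a horizontal plane: decide by height.
    if vertical_component > 1.0 - tol:
        if abs(mean_height - floor_z) < tol:
            return "floor"
        if abs(mean_height - ceiling_z) < tol:
            return "ceiling"
    return "unknown"
```

For example, `label_plane((1.0, 0.0, 0.0), 1.2)` treats a plane with a horizontal normal as a wall, while a horizontal plane at an intermediate height stays `"unknown"`, which hints at why context and prior knowledge are needed for less obvious components.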
Methods in this category typically perform an initial shape-based segmentation of the scene, into planar regions, for example, and then use features derived from the segments to recognize objects. This approach is exemplified by Rusu et al., who use heuristics to detect walls, floors, ceilings, and cabinets in a kitchen environment (Rusu, et al., 2009). A similar approach was proposed by Pu and Vosselman to model facility façades (Pu, et al., 2009). To reduce the search space of object recognition algorithms, knowledge related to a specific facility can be exploited. For instance, Yue et al. overlay a design model of a facility with the as-built point cloud to guide the process of identifying which data points belong to specific objects and to detect differences between the as-built and as-designed conditions (Yue, et al., 2006). In such cases, the object recognition problem is reduced to a matching problem between the scene model entities and the data points. Another similar approach is presented in (Bosche, et al., 2008). Other promising approaches have only been tested on limited and very simple examples, and it is equally difficult to predict how they would fare when faced with more complex and realistic data sets. For example, the semantic network methods for recognizing components using context work well for simple examples of hallways and barren, rectangular rooms (Cantzler, 2003), but it is unclear how they would handle spaces with complex geometries and clutter.

From Unstructured 3D Point Clouds to Structured Knowledge - A Semantics Approach 219

handling the huge information that exists in the Web. The term "*Semantic Web*" has been defined numerous times. Though there is no formal definition of the Semantic Web, some of its most used definitions are "*The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. It is a source to retrieve information from the Web (using the Web spiders from RDF files) and access the data through Semantic Web Agents or Semantic Web Services. Simply Semantic Web is data about data or metadata*" (Lee, et al., 2001). "*A Semantic Web is a Web where the focus is placed on the meaning of words, rather than on the words themselves: information becomes knowledge after semantic analysis is performed. For this reason, a Semantic Web is a network of knowledge compared with what we have today that can be defined as a network of information*" (Huynh, et al., 2007). "*The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise and community boundaries*" (Decker, et al., 2000). In the next subsection, we discuss the different issues related to the definition of such a technology, focusing mainly on Description Logic (DL) theory and its impact on Semantic Web technology.

### **3.1 The description logics**

#### **3.1.1 The base languages**

Complex descriptions can be built up from the above-mentioned elementary descriptions of concepts and roles. These descriptions have been given different notations over time. The Attributive Language (*AL*) was introduced in 1991 as a minimal language of practical interest (Schmidt-Schauß, et al., 1991). It was later extended to the Attributive Concept Language with Complements (*ALC*), which allows any concepts or roles to be included, and not just the atomic concepts and atomic roles that were the previous elements of descriptions. *ALC* is an important notation for expressing Description Logics. Fig 2 illustrates the syntax rules for describing concepts.

The convergence of the formal foundations that description logic provides for extensible, semantically understood structures with the usability goals of the predecessors of DL and of the Web languages led to efforts such as the Ontology Inference Layer (OIL) (Fensel, et al., 2001). OIL was the first major effort to develop a language based on Description Logic, and was part of a broader project called On-To-Knowledge, funded by the European Union. It was the first time that ontology concepts were used explicitly within a Web-based environment. However, OIL did not abandon the primitives of frame-based languages; it included them in the language alongside the formal semantics and reasoning capabilities. The syntax of OIL is based on RDF and XML, which at that time were too limited to provide complete semantic foundations. Nevertheless, OIL started a trend of mapping description logic onto Web-based languages for the Semantic Web. It maps description logic through *SHIQ*. The derivation of *SHIQ* with respect to the naming convention of Description Logic is given as:

- *S*: Used for all *ALC* with transitive roles R⁺
- *H*: Role inclusion axioms R1 ⊑ R2 (is\_component\_of ⊑ is\_part\_of)
- *I*: Inverse role R⁻ (isPartOf = hasPart⁻)
- *Q*: Qualified number restrictions
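The role constructors named in the *SHIQ* derivation (transitive roles, role inclusion axioms, inverse roles) can be illustrated with a toy forward-chaining routine over role assertions. This is a sketch under stated assumptions, not a Description Logic reasoner: it ignores concepts and number restrictions, the function `saturate` and the individuals are hypothetical, and declaring `is_part_of` transitive is an assumption made for the example.

```python
def saturate(assertions, inclusions=(), inverses=(), transitive=()):
    """Close a set of (role, a, b) triples under three role rules.

    inclusions -- pairs (sub, sup): sub(a, b) implies sup(a, b)   [H]
    inverses   -- pairs (r, inv): r(a, b) iff inv(b, a)           [I]
    transitive -- roles r with r(a, b) and r(b, c) implying r(a, c)
    """
    facts = set(assertions)
    changed = True
    while changed:
        new = set()
        for role, a, b in facts:
            for sub, sup in inclusions:
                if role == sub:
                    new.add((sup, a, b))
            for r, inv in inverses:
                if role == r:
                    new.add((inv, b, a))
                if role == inv:
                    new.add((r, b, a))
            if role in transitive:
                for role2, c, d in facts:
                    if role2 == role and c == b:
                        new.add((role, a, d))
        changed = not (new <= facts)   # stop once nothing new is derived
        facts |= new
    return facts


# Hypothetical individuals; role names follow the examples in the text.
closed = saturate(
    {("is_component_of", "bolt", "signal"),
     ("is_part_of", "signal", "station")},
    inclusions=[("is_component_of", "is_part_of")],
    inverses=[("is_part_of", "has_part")],
    transitive={"is_part_of"},
)
```

Starting from two stated facts, the closure also contains `("is_part_of", "bolt", "station")` and `("has_part", "station", "bolt")`, which were never asserted explicitly. Deriving such implicit knowledge is the kind of inference that a DL-based language makes available to a system.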
