**1. Introduction**

Cave survey data is collected for several reasons: cave maps, 3D models, and parametric data for many disciplines of science can be created and extracted from the raw measurements. The amount of data collected during a cave survey is much larger today than it was decades ago, thanks to the evolution of surveying tools and methods [1]. This data is created and handled in several ways depending on the skills and intents of the surveyors, the processing staff, or the scientists involved, and sometimes, due to a lack of geographic information system (GIS) experience, the aims cannot be fully achieved [2]. Moreover, the resolution of the collected information does not always match the aims of the project, which may lead to improper conclusions in a scientific project or to inaccurate cave maps [3, 4]. In the classic times of speleological surveys, using measuring tape and compass, the collected data was often insufficient for scientific purposes; now, with the use of terrestrial laser scanning (TLS) stations, it is often too voluminous for many users to work with efficiently [5].

Methodological papers about the analysis of cave data usually concentrate on the theoretical rather than the practical aspects of data processing. For example, in the case of a new cave survey, the comparison of the newly collected data with the archive data is reported in several studies [3, 4], but the details of how the different data packages were incorporated into one system remain in the background. For many spelunkers, however, these details are potential sources of error due to a lack of experience with coordinate transformations or with the management of the data files and programs involved in this process. Such comparisons are often used to illustrate the higher accuracy of the new surveying methods, and while this is reasonable, the archive data often preserves notes and observations of scientific importance (e.g., locations of bat colonies, archeological specimens, water sampling sites, etc.). Moreover, archive data often preserves cave conditions that have changed since the time of the survey. In an optimally built GIS, both the new and the archive data are positioned correctly, and the databases of the surveys can be queried simultaneously.

The aim of this paper is to highlight cave data processing from the aspect of a geographic information system (GIS), and to demonstrate that such an information system can help scientific projects combine newly measured and archived data. Using a GIS means using database tables, data transformation tools, statistical and spatial analysis, filtering and extracting certain parameters from the raw data, and finally (but not necessarily) visualizing the results in 2D or 3D. To do this efficiently, one must possess knowledge about such processes, and this know-how is so complex that it has become a discipline of its own: GIScience.

## **2. The nature of cave survey data**

Every cave survey project starts with a plan to do something for a certain purpose. At this point, the surveyors' knowledge and the available instruments will determine the quality of their future data. Although contemporary surveyors have already upgraded their instruments, there is still a lot of data from predigital times. Most of the caves in the world were surveyed (at least partially) with measuring tape and compass, by speleologists progressing through the cave passages from station to target points [6]. The data consists of several records, each containing a distance, dip, and azimuth measurement that defines a 2–10 m long spatial vector. The sequentially joined vectors form a 3D network (**Figure 1**).

**Figure 1.** Typical line plot of a survey network from the Pál-völgy Cave (Budapest). The shades (colors) indicate different surveys carried out at various times. Alphanumeric characters indicate the identification codes of the measured stations.

Each node of this network can be represented with x, y, z coordinates. Although the survey produces its own coordinate system, with the origin at the survey base point (entrance station), the whole set can be placed in a real geographic position defined in a Euclidean geodetic reference system (X, Y, Z) if the cave entrance was measured with geodetic instruments (i.e., a total station or GPS). Additionally, the width and height of the cave passage can be measured at each station to provide data for 3D cave models [7, 8]. The accuracy of the measurements can be improved if the network is also measured backwards (back to the original station), or if loops are present in the cave, where the closing point of the loop is identical to one of the previously measured stations. The loop may also be measured in reverse, and the average error can be distributed among the stations of the loop [3, 6]. This method (*loop closure*), however, was not always followed by surveyors in the past, for several reasons. Most often, the backward measurements are missing, and the whole passage system was probably measured at different times by different people, which means that the condition of the data varies. Usually, the most problematic part of creating a consistent GIS is the harmonization of the several unit systems, data structures, content types, and coordinate systems involved.
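As a minimal sketch of how such station-target records are turned into node coordinates (the station IDs and measured values below are invented for illustration, not taken from any particular survey):

```python
import math

def shot_to_vector(distance_m, dip_deg, azimuth_deg):
    """Convert one tape-and-compass shot to a local (x, y, z) vector.

    Azimuth is measured clockwise from north, dip is the vertical angle;
    x points east, y north, z up.
    """
    dip = math.radians(dip_deg)
    azi = math.radians(azimuth_deg)
    horizontal = distance_m * math.cos(dip)
    return (horizontal * math.sin(azi),   # east
            horizontal * math.cos(azi),   # north
            distance_m * math.sin(dip))   # up

# Chain the shots from the entrance station.
# Each record: (station_id, target_id, distance, dip, azimuth).
records = [("0", "1", 8.2, -12.0, 145.0),
           ("1", "2", 5.6,   3.5,  98.0)]

coords = {"0": (0.0, 0.0, 0.0)}           # survey base point (entrance)
for station, target, d, dip, azi in records:
    sx, sy, sz = coords[station]
    dx, dy, dz = shot_to_vector(d, dip, azi)
    coords[target] = (sx + dx, sy + dy, sz + dz)

print(coords)  # node coordinates of the 3D network
```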

## **3. Inputs and outputs**

The condition of the survey data will determine how much time is necessary to build a working GIS. The GIS is always built to serve as a "tool" to achieve the aims of a certain project (even if further use is also anticipated), so the planning should consider the possible sources of the cave survey data. Three scenarios are the most frequent: (1) the data is salvaged from archives, (2) a new survey is done, and (3) partly new and partly archive data are processed. In all cases, the created data will be *processed*, *stored*, and *visually represented* in some way:

• The data *processing* tool is responsible for the digital recording of text (notes, reports), alphanumeric and logical (true/false) information, images (photos, scanned documents), and vector geometry (line plot maps, sections).

• Data storage devices provide secure data *storage* and availability of alphanumeric and logical data, images, and texts in digital or digitized documents. Digital data is located on mass storage devices in its appropriate format and accessed via database and file management tools.

• The tool that *represents* the data as a 2D map or a 3D model is a complex application, which not only visualizes the data but most often serves as the GIS environment itself. It makes it possible to access not only the visual representation of the data but all the collected information from the database.
These main functionalities are accessed through programs developed specifically for cave survey data management (Compass, Visual Topo, TopoRobot, and Therion), or through well-known applications developed for general data management (Excel, ArcGIS, AutoCAD, and many others). Either way, the application itself becomes an organic part of the information system of the survey project, with all of its data formats and processing algorithms. In some cases, the applications themselves merely invoke external programs or scripts [9], increasing the complexity of the whole system. Selecting the tools for these functionalities is one of the most crucial parts of planning the GIS. The primary consideration should be the aim of the project, but the usability and the spatial dimensions of the data should also be taken into account. The more we rely on archive data, the more we should involve non-cave-mapping applications to bring the data into an acceptable form.

### **3.1. Data types of an archive survey**

Using only archive data is appropriate when the project cannot afford a new survey because of time or financial limits, or when the cave is such an endangered environment that even a survey may cause serious damage. The spatial resolution of the data is usually suitable for morphometric analysis [10, 11], but realistic 3D models cannot be created. *Archive survey* data can be collected from paper-based documentation (reports, notebooks, maps, and published papers). In this case, the GIS will be composed of the digitized forms of these documents, and will always incorporate the following components: database, digital map, original documents (scanned), and tools (programs). In a general-purpose GIS, the main characteristics of these components are the following:

**1.** The *database* is built from the notebook records. Although the temptation is usually great to skip seemingly irrelevant information (e.g., the conditions of measuring) during digitization, it is always useful to fill the database with attributes like "CONDITION", or at least to put such remarks in a "NOTE" field. The database usually contains the following attributes: date of survey, station ID, target ID, distance between station and target, dip, azimuth (angle from north in clockwise rotation), and width and height of the passage at the station (a minimal table sketch is given after this list). If the entrance points are measured with GPS, the database can be completed with the latitude, longitude, and elevation of each point using Euclidean geometry and the vector data.

**2.** The *digital map* is created from scanned paper maps, usually to provide additional information about the morphology of the cave. Maps contain the outlines of the passage levels, indicating the characteristic morphology of the walls and the main artifacts. Additionally, point-like objects, names, and transversal and longitudinal sections are also displayed (**Figure 2**). Maps suitable for morphological analysis usually have a large scale (>1:1000) but rarely contain geographic or geodetic coordinates, so the first step is "georeferencing", which means that identifiable points on the map (e.g., the marked stations) are associated with their geographic coordinates. These coordinates can be calculated from the base point (a sketch of this calculation is given at the end of this subsection). The map processing starts with the determination of the data types that are selected to be digitized. Each map data type (points, texts, lines, polygons) will be stored as a graphical element associated with one or more files. The processing tool will determine whether the created files are suitable for GIS; thus, it is most appropriate to use a GIS program directly (e.g., ArcGIS, MapInfo, or QGIS). These programs are also suitable for doing the "georeferencing" with the help of the previously processed *database*.

**Figure 2.** Part of a typical cave map indicating transversal sections. The map [12] shows a small part of the Pál-völgy Cave System (Budapest).

**3.** The *digital archiving* of the original data is advised in the case of the notebooks and is necessary in the case of the maps. The scanned documents are usually in PDF or JPG format and are obviously stored on a hard drive. However, the location of the archived data is highly relevant from the aspect of the GIS, because the map processing tools record the name and the source (folder) of the map file during the digitization process, so it will cause problems if the folder structure or the file name is changed after the process has started. This is also true for the files created at the end of the digitization: we will face file access problems if their location or name is changed.

**4.** The *preferred tool* can be one of the cave surveying programs, but depending on the condition of the recorded measurements, other processing tools can also be appropriate. If the records are accurate (all the previously listed attributes are present), the surveying program can produce a representation too and can calculate most of the morphometric parameters. In this case, the maps and vertical sections, if digitized in a convenient format, can refine the results. If the records are incomplete (e.g., the passage width and height data are missing), the maps and sections can be used to complete them. This is more easily achieved in a standard GIS program (e.g., QGIS).
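The survey table mentioned in point 1 could be sketched as follows, here in SQLite (the table and column names are illustrative, not taken from any cave surveying program):

```python
import sqlite3

conn = sqlite3.connect("cave_survey.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS survey_shot (
        survey_date TEXT,     -- ISO date of the survey trip
        station_id  TEXT,     -- measured station
        target_id   TEXT,     -- target point of the shot
        distance_m  REAL,     -- tape distance, meters
        dip_deg     REAL,     -- vertical angle, degrees
        azimuth_deg REAL,     -- clockwise from north, degrees
        width_m     REAL,     -- passage width at the station
        height_m    REAL,     -- passage height at the station
        condition   TEXT,     -- e.g., 'muddy, poor visibility'
        note        TEXT      -- free-form remarks from the notebook
    )
""")
conn.execute(
    "INSERT INTO survey_shot VALUES (?,?,?,?,?,?,?,?,?,?)",
    ("1978-05-14", "0", "1", 8.2, -12.0, 145.0, 1.5, 2.0,
     "wet walls", "bat colony near station 1"),
)
conn.commit()
```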


The program and the file-folder structure will be part of the information system. If more programs and people are involved in the data processing, a unified nomenclature of files and a well-thought-out folder structure can help to avoid file access failures. The reliability of the resulting data depends mainly on the quality of the original survey, but due to the manual acquisition of data and the sketching of passage morphology, it is always biased compared to the new methods [13]. Furthermore, the digitization of archive data is also prone to transcription errors.
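The georeferencing mentioned in point 2 boils down to fitting a transformation between scanned-map pixel coordinates and geodetic coordinates of the control points. A minimal sketch fitting an affine transformation by least squares (the control point coordinates are invented; GIS programs such as QGIS do this internally):

```python
import numpy as np

# Control points: (pixel col, pixel row) on the scanned map ...
pixels = np.array([[120.0,  80.0], [940.0, 110.0], [500.0, 760.0]])
# ... and the matching geodetic coordinates from the survey database.
geodetic = np.array([[650210.2, 237405.8],
                     [650298.7, 237401.1],
                     [650252.4, 237332.6]])

# Affine model: X = a*col + b*row + c,  Y = d*col + e*row + f.
A = np.hstack([pixels, np.ones((len(pixels), 1))])
coeffs, *_ = np.linalg.lstsq(A, geodetic, rcond=None)

def map_to_geodetic(col, row):
    """Transform a scanned-map pixel into geodetic coordinates."""
    return np.array([col, row, 1.0]) @ coeffs

print(map_to_geodetic(120.0, 80.0))  # ~ first control point
```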

### **3.2. Data types of new surveys**

New surveys are still carried out mostly with the station-target approach, but with modern (fast and accurate) instrumentation. The accuracy and resolution of the collected data are usually much higher<sup>1</sup> than those of the traditional methods,<sup>2</sup> but the work still requires human expertise both in the data collecting and in the processing phase. The most widespread surveying tool today is the DistoX, which is based on the combination of a laser measuring tool and a handheld computer [15]. The two devices are connected via Bluetooth, and the mapping program on the mobile device (PDA, tablet, or smartphone) handles the database of the measurements, also providing a graphical user interface for on-site map compilation. The software running on the handheld computer automatically handles the loop closures if new survey tracks are measured, modifying the coordinates of the existing stations as well. The method is based on the algebraic minimization of the root mean square (rms) of the differences.

<sup>1</sup> Precision of the Leica DistoX is 2 mm within 10 m range, with an angular error of 0.5° RMS [14].

<sup>2</sup> The spatial accuracy of the traditional measuring method is 1% of the distance from the entrance point in good conditions, but it can reach 10% [6].
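The loop-closure adjustment described above can be illustrated with a minimal sketch: the misclosure of one loop is distributed among its legs in proportion to leg length (real programs minimize the rms error over the whole network at once; all numbers here are invented):

```python
import numpy as np

# Leg vectors of one loop (dx, dy, dz in meters); ideally they sum to zero.
legs = np.array([[ 5.0,  0.1, -0.2],
                 [ 0.2,  4.8,  0.1],
                 [-5.1,  0.2,  0.0],
                 [-0.3, -5.0,  0.2]])

misclosure = legs.sum(axis=0)            # error vector at the closing station
lengths = np.linalg.norm(legs, axis=1)
weights = lengths / lengths.sum()        # longer legs absorb more correction

adjusted = legs - np.outer(weights, misclosure)
print(adjusted.sum(axis=0))              # ~ [0, 0, 0] after adjustment
```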

Although this mapping system is a GIS application in itself, it is not designed for the post-survey processes (e.g., map making or morphometric analysis). For these purposes, several external programs are used that can import the surveying program's output file types. The output formats are common vector graphics (e.g., dxf, a simple text-type file describing shapes and geometry in a well-documented syntax [16]), while the text-type database with rows and columns is compatible with the usual cave survey managing programs. In the case of a DistoX survey, the raw data structure is quite similar to the traditional database introduced previously (**Figure 3**), with the advantage of being natively in digital form.
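To illustrate how simple the dxf syntax is, the sketch below writes survey legs as LINE entities into a bare-bones dxf file that most CAD and GIS tools should accept (the coordinates and layer name are invented):

```python
def write_dxf_lines(path, segments):
    """Write (x1, y1, z1, x2, y2, z2) segments as DXF LINE entities."""
    with open(path, "w") as f:
        f.write("0\nSECTION\n2\nENTITIES\n")
        for x1, y1, z1, x2, y2, z2 in segments:
            f.write("0\nLINE\n8\nsurvey\n")             # layer name: 'survey'
            f.write(f"10\n{x1}\n20\n{y1}\n30\n{z1}\n")  # start point
            f.write(f"11\n{x2}\n21\n{y2}\n31\n{z2}\n")  # end point
        f.write("0\nENDSEC\n0\nEOF\n")

# Two legs of a line plot, in local survey coordinates.
write_dxf_lines("lineplot.dxf",
                [(0.0, 0.0, 0.0, 4.6, -6.5, -1.7),
                 (4.6, -6.5, -1.7, 9.1, -3.2, -1.4)])
```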

In contemporary surveys, the ultimate aim is to increase the speed and accuracy of the measurements using digital data. With the DistoX, transcription errors can be bypassed by recording directly on the handheld device [15], but biases remain: (1) the laser beam is shot at only a few selected points in the spelunker's field of view, and (2) the cave morphology is manually generalized by drawing the map on-site. Although the process can be enhanced by shooting more and more targets, the resolution of the survey will always be lower than that of surveys done with a TLS.

The use of static terrestrial LiDAR instruments, despite their impractical nature in harsh environments, is on the rise [5, 18, 19]. These tools produce thousands of range and angular measurements within a few minutes, measured from the station's location. The target points, similarly to the DistoX, are measured with one laser beam, but in this case the instrument repeatedly shoots the beam at new targets, sweeping almost the whole field of view during one session. The point cloud of a scanning session at one station consists of nonoverlapping points forming a grid when using 3D polar coordinates (yaw, pitch, and range), or a data table when using Cartesian (x, y, z) coordinates [20, 21]. The former is considered raster-type data and can easily be fitted with panoramic photos shot from the same position; to do this, the scanning instrument should be equipped with an optical camera too. However, it is more common to export the scanned data with x, y, z coordinates in binary .las files [21]. Although other formats also exist, most point cloud processing programs (e.g., MeshLab, ReCap, Microstation, and CloudCompare) are able to import and export las-files.
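The relation between the two representations is a plain spherical-to-Cartesian conversion; a sketch for one scan (angle conventions vary by instrument, so the axis assignment here is an assumption):

```python
import numpy as np

def polar_grid_to_xyz(yaw_deg, pitch_deg, range_m):
    """Convert a scanner's polar records to Cartesian points.

    yaw: horizontal angle, pitch: vertical angle, both in degrees;
    x/y horizontal, z up, all relative to the station.
    """
    yaw = np.radians(yaw_deg)
    pitch = np.radians(pitch_deg)
    x = range_m * np.cos(pitch) * np.cos(yaw)
    y = range_m * np.cos(pitch) * np.sin(yaw)
    z = range_m * np.sin(pitch)
    return np.column_stack([x, y, z])

# Three shots of one session (invented values).
pts = polar_grid_to_xyz(np.array([0.0, 1.0, 2.0]),
                        np.array([10.0, 10.0, 10.0]),
                        np.array([4.2, 4.3, 4.1]))
print(pts)
```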

**Figure 3.** Screenshots of the PocketTopo program, which handles the data received from the DistoX tool [17]. The table view on the left shows the records of each measurement, the middle image is a map view displaying the connected stations, and the right one demonstrates the sketching options and the survey shots initiated from the stations.

Concerning the coordinates, the point cloud data is in a local reference system relative to the station. Data from multiple stations can be combined if the scanned surfaces overlap with each other, and if artificial backsight targets (i.e., regular-shaped objects) are placed into the common field of view of the subsequent scans. This process is done either automatically or manually within a desktop application after downloading the data from the TLS instrument. Both processes are based on "best-fitting" approximations defined mathematically in the algorithms of the processing tools. The error of the fitting depends on the method chosen for the fitting approximation, usually the least-squares method, and the range of error is usually documented in the programs' descriptions, although it also depends on several other factors. According to Lichti and Gordon [22], five error types are distinguishable in a TLS survey: (i) the placement of the survey stations and the backsight target object; (ii) instrument leveling and centering; (iii) backsight target centering; (iv) raw scanner observation noise; and (v) laser beamwidth.
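The least-squares fit of one scan onto another, given matched backsight-target centers, can be sketched with the classic SVD-based (Kabsch) solution; production software adds outlier rejection and refinement, so this shows only the core idea (all coordinates are simulated):

```python
import numpy as np

def fit_rigid(src, dst):
    """Least-squares rotation R and translation t with dst ~ src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # avoid a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t

# Matched target centers seen from two stations (invented coordinates).
scan_b = np.array([[1.0, 0.0, 0.5], [4.0, 2.0, 0.6], [2.5, 5.0, 1.9]])
theta = np.radians(30.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
scan_a = scan_b @ Rz.T + np.array([10.0, -3.0, 0.2])  # simulated station A view

R, t = fit_rigid(scan_b, scan_a)
print(scan_b @ R.T + t - scan_a)        # residuals ~ 0
```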

The precision parameters of the instrument and the method can be obtained from the documentation if we know the range of the shots (from which the beamwidth of the laser is calculated) and the magnitude of the de-noising algorithm (removing outliers from the point cloud) relative to the range. Yet at least two of the above-listed errors are not independent of the human factor during the survey: the placing and the leveling of the instrument. However, attempts to reduce the chance of human error have already been made in the TLS procedure too; some LiDAR tools do not require manual fitting of backsight target objects to position themselves at the subsequent stations, and the precise leveling of the instrument is also done with automatic sensors and motors. Sometimes, though, this is quite problematic because of the size of the TLS instrument, and the positioning is still the decision of the surveyor (**Figure 4**).
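For example, the laser footprint grows roughly linearly with range; a minimal sketch with assumed (not instrument-specific) exit diameter and divergence values:

```python
def beam_footprint_mm(range_m, exit_diameter_mm=3.0, divergence_mrad=0.2):
    """Approximate laser spot diameter at a given range (small-angle model)."""
    return exit_diameter_mm + range_m * divergence_mrad  # mrad * m = mm

for r in (1, 10, 50):
    print(f"{r:>3} m -> {beam_footprint_mm(r):.1f} mm spot")
```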

If the fitted sequence of the survey sessions contains at least one (but preferably two: entrance and exit) positions where the geographic coordinates are measured with GPS (or other geodetic instruments), the whole survey can be transferred from a local (x, y, z) to a geodetic coordinate reference system (X, Y, Z) and can be referenced to other data (i.e., maps) surveyed previously (**Figure 5**).
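With two known points (entrance and exit), the local frame can be rotated about the vertical axis and translated onto the geodetic frame; a sketch assuming both frames share the same vertical direction and scale (all coordinates invented):

```python
import numpy as np

def local_to_geodetic(points, local_a, local_b, geo_a, geo_b):
    """Georeference local (x, y, z) points using two common points A and B.

    Assumes both frames share the vertical axis and the same scale; any
    residual misfit at B then indicates measurement error.
    """
    da, db = local_b - local_a, geo_b - geo_a
    ang = np.arctan2(db[1], db[0]) - np.arctan2(da[1], da[0])
    c, s = np.cos(ang), np.sin(ang)
    Rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return (points - local_a) @ Rz.T + geo_a

# Entrance (A) and exit (B): local survey coordinates and GPS-derived ones.
local_a, local_b = np.array([0.0, 0.0, 0.0]), np.array([120.0, 40.0, -15.0])
geo_a = np.array([650210.0, 237400.0, 180.0])
geo_b = np.array([650330.8, 237365.2, 165.0])

stations = np.array([[10.0, 5.0, -2.0], [60.0, 20.0, -8.0]])
print(local_to_geodetic(stations, local_a, local_b, geo_a, geo_b))
```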

Leaving behind the station-target method, techniques of high-resolution mobile mapping, such as the Zebedee, are also emerging [4]. Two data types are generated by this approach: a point cloud and a trajectory. The point cloud data is a huge list of x, y, z coordinates, enriched with a set of attributes (the intensity of the reflected beam or the precision of the calculated coordinate) associated with each of the points. The trajectory data is a much smaller set of coordinates in a strict sequence, defining the movement of the surveyor in Euclidean space. This data is quite similar to the polygon network of the archive surveys, with the distinction that the vectors of the trajectory do not necessarily join in one single node at the branching points of the passages. The instrument is a lightweight handheld LiDAR station combined with an inertial measurement unit (IMU), which provides measurements of angular velocities and linear accelerations. The IMU also contains a three-axis magnetometer. Based on the incoming data from the measuring instruments, a portable computer calculates the trajectory of the surveyor and the position of the point cloud relative to this trajectory. With this instrument, several thousand points are collected within seconds, and the method that estimates the trajectory may obviously produce errors. To correct these errors, the comparison of the overlapping areas helps to minimize the differences, much like the loop closure method in traditional cave surveying. The software uses best-fitting algorithms to automatically localize similar patches of scanned cave parts [23]. The point cloud data, similarly to the data types of a TLS, is a .las-file or a compressed .laz-file.

**Figure 4.** A TLS station positioned as close as possible to the cave wall to "see" into a vertical chimney. The white spheres are the artificial backsight targets, kept in position until the next session is measured from a subsequent station.

**Figure 5.** Archive map (left) and TLS survey image (right) in a common reference system. The map is oriented to the north. The green arrow indicates the position of the TLS station, pointing along the line of sight of the intensity image shown in the right panel. Geodetic coordinates of the station and the cursor position are displayed. The figure was created with a web viewer designed for exploring the TLS data before thorough data processing [20].

### **3.3. Combined data types of archive and new surveys**

When a new survey is done with modern instrumentation, the subject is usually a cave where spelunkers have worked previously and produced several kinds of archive data. The newly measured and the archive data both provide valuable information for scientists; thus, they should be integrated with each other. The two datasets can be paired along well-defined spatial constraints, like identifiable morphology or artifacts (**Table 1**). For example, if some points of the archive survey are marked permanently in the cave, the installed artifacts can be identified in the LiDAR point cloud as regular-shaped objects. If the markings are too small, more apparent objects can be mounted temporarily on the cave wall where old markings are found (e.g., uniform-sized disks).
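Pairing the archive stations with targets detected in the point cloud is, at its core, a nearest-neighbor search; a sketch using SciPy (the coordinates and the acceptance tolerance are invented):

```python
import numpy as np
from scipy.spatial import cKDTree

# Target centers detected in the new point cloud (geodetic coordinates) ...
detected = np.array([[650215.1, 237398.7, 178.2],
                     [650260.4, 237380.2, 172.9],
                     [650301.0, 237371.5, 169.4]])
# ... and archive station positions computed from the old survey vectors.
archive = np.array([[650215.4, 237398.5, 178.0],
                    [650299.8, 237372.1, 169.8]])

tree = cKDTree(detected)
dist, idx = tree.query(archive)          # nearest detected target per station
for d, i, station in zip(dist, idx, archive):
    if d < 0.5:                          # accept pairs within 0.5 m tolerance
        print(f"station {station} -> target {detected[i]} ({d:.2f} m)")
```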

In some cases, the structure of the archive dataset (i.e., the column sequence in the data table) may be similar to that of the new one, while their reference systems are different. To avoid errors in later phases, the two sets should be checked at overlapping parts before the two databases are unified.

Archive data is not necessarily old data. Point clouds of several TLS surveys are sometimes given to scientists to process and to extract new information from, but the surveys may come from different groups who worked with different instruments. It is also possible that the point cloud data (las-files) is not accessible, only the 3D model of the cave derived from the point cloud. Such models can be created in several ways, basically using stochastic methods, and the generalization of the surface (i.e., the level of detail) depends mainly on the method used. If the method is unknown for some reason, the relation of the 3D model to the original point cloud data carries an uncertainty, and in practice, the correct position of the model can be achieved only if overlapping cave parts are present both in the model and in the newly measured point cloud.


**Table 1.** Typical settings of various archive and new data types, and the actions necessary to bring them into one data system.

Various documents may exist if a cave has been well known for a long time, and the overarching aim of a GIS is to integrate these data into a *common spatial context*. Depending on the type of the archive data, the integration takes a different amount of time. The process involves the same methodology as described in the case of the archive data (i.e., scanning and structuring), but it can reach better results due to the presence of the new survey. The most challenging task, however, is the positioning of archive photos, which requires deep knowledge of the subject cave. An archive photo can usually be assigned simply to one spatial position, but in some cases, both the subject's and the photographer's positions can be reconstructed. This information is stored in a separate data table or in the header of the image file, from which it can be extracted with a photo-editing program. Using this information, the photo can be draped onto the cave surface model.
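Reading such image-header metadata does not strictly require a photo-editing program; a sketch with the Pillow library (assuming the position was stored in the photo's EXIF tags, which is not guaranteed for scanned archive prints):

```python
from PIL import Image, ExifTags

img = Image.open("archive_photo.jpg")
exif = img.getexif()

# Print all top-level EXIF tags by name (DateTime, Make, Model, ...).
for tag_id, value in exif.items():
    print(ExifTags.TAGS.get(tag_id, tag_id), value)

# GPS data, if present, sits in a nested IFD (tag 0x8825).
gps = exif.get_ifd(0x8825)
for tag_id, value in gps.items():
    print(ExifTags.GPSTAGS.get(tag_id, tag_id), value)
```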

Archive photos are not the only subject of this systematic processing of cave-related data. Close-range photogrammetry is an emerging method of cave modeling besides TLS, or combined with it [19]. TLS is preferable if the morphology of a cave is the subject of the survey, whereas in the case of an archeological site, the texture of the cave wall is the primary aim of documentation. 3D models can even be created from the images alone, if abundant overlapping photos are taken with the same interior parameters (focal length, distortion parameters). This is achieved with programs (e.g., Photoscan) that can reconstruct the relative camera positions via the comparison of the textures of the images.
