*2.1.1. Environment representation*

The first environment representation presented here is the 2D satellite image map. In general, it is downloaded beforehand from an imagery map source, either as a single image or as many small tiles to be stitched later, and then used by the localization approaches for pose estimation [18, 21, 26, 31]. Although the research community has not explored this option much, it is also possible to fly the MAV over the region of interest, build a 2D image of the environment, and then use that image as the map for localization. A common choice among the approaches that adopt this kind of map is to point the MAV camera downwards, so that the MAV images can be compared to different patches of the satellite image map using various comparison methods. The advantages of using a 2D satellite image as a map are the free access to this kind of data through Google Maps or any geographic information system (GIS), the rich representation of the environment by color images, and the worldwide coverage. Usually, a GIS also provides the geographic coordinates of the satellite images, which makes it possible to infer the latitude and longitude of each pixel. Therefore, when a localization system estimates which pixel of the satellite image map corresponds to the MAV's pose, it also estimates the pose in relation to the world through the latitude and longitude associated with that pixel. In contrast, the disadvantages of this kind of map are the limited point of view (2D) and the fact that some places in the world are not often revisited by satellites, so their images are out of date. The comparison methods in the works mentioned above are designed to be robust against such differences between the outdated 2D satellite images and the MAV image [18, 21, 26, 31].
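As a rough illustration of the two steps described above (comparing a downward-looking MAV image against patches of the satellite map, then reading off the geographic coordinates of the best-matching pixel), the following minimal sketch may help. It is not the method of any of the cited works: it assumes grayscale images at the same scale and orientation, uses an exhaustive normalized cross-correlation search, and assumes a north-up map whose bounding box is known, with latitude and longitude varying linearly across the image (a reasonable approximation only for small areas). All function names and the bounding-box values are hypothetical.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized grayscale patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def locate_patch(sat_map, mav_image):
    """Slide mav_image over sat_map; return the top-left (row, col) of the best match and its score."""
    h, w = mav_image.shape
    best_score, best_rc = -np.inf, (0, 0)
    for r in range(sat_map.shape[0] - h + 1):
        for c in range(sat_map.shape[1] - w + 1):
            score = ncc(sat_map[r:r + h, c:c + w], mav_image)
            if score > best_score:
                best_score, best_rc = score, (r, c)
    return best_rc, best_score

def pixel_to_latlon(row, col, shape, north, south, west, east):
    """Linearly interpolate the coordinates of a pixel center in a north-up image (assumption)."""
    n_rows, n_cols = shape
    lat = north - (row + 0.5) / n_rows * (north - south)
    lon = west + (col + 0.5) / n_cols * (east - west)
    return lat, lon

# Synthetic data standing in for a satellite map and a noisy MAV view of one region.
rng = np.random.default_rng(0)
sat_map = rng.random((40, 60))
mav_image = sat_map[12:20, 30:38] + 0.05 * rng.standard_normal((8, 8))

(r, c), score = locate_patch(sat_map, mav_image)

# Geographic coordinates of the pixel at the center of the matched patch.
# The bounding box below is made up for illustration only.
lat, lon = pixel_to_latlon(r + 4, c + 4, sat_map.shape,
                           north=-22.000, south=-22.010,
                           west=-47.900, east=-47.885)
print((r, c), round(score, 3), lat, lon)
```

In practice, the exhaustive search would be replaced by a comparison method that tolerates changes in appearance, scale, and rotation between the outdated satellite image and the MAV image, which is precisely what the cited approaches aim for.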

The limited point of view of 2D satellite images motivated researchers to investigate the benefits of 3D maps [19, 20, 23, 30]. The authors argue that 3D maps make it possible to exploit the structures of the environment, in addition to the color of the map, to estimate the localization. Usually, the localization is estimated by 3D structure alignment or by point cloud matching. With a 3D map, the MAV camera can be set at different angles, exploring different views. As with 2D satellite images, this kind of 3D representation can be either built right before the localization estimation, as done in [20, 23], or downloaded from a GIS, as in [19]. Despite these advantages over 2D maps, 3D maps generally demand more computational resources than 2D ones, both for storage and for manipulation, and they are not as easy to find as 2D maps, which limits the places where the MAV's pose can be estimated.
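To make the idea of alignment-based localization more concrete, the sketch below shows one common building block of point cloud matching: the closed-form rigid alignment of two point sets (often called the Kabsch algorithm). It is an illustrative assumption, not the procedure of the cited works, and it presumes that point correspondences between the MAV observation and the 3D map are already known; iterative schemes such as ICP typically alternate this step with a correspondence search. The function name and the test data are hypothetical.

```python
import numpy as np

def kabsch(source, target):
    """Return (R, t) minimizing sum ||R @ source_i + t - target_i||^2 for paired N x 3 points."""
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (source - src_c).T @ (target - tgt_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ src_c
    return R, t

# Synthetic example: points seen in the MAV frame and the same points in the map frame.
rng = np.random.default_rng(1)
points_mav = rng.random((50, 3))

angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([1.0, 2.0, 3.0])
points_map = points_mav @ R_true.T + t_true

# The recovered transform is the MAV pose expressed in the map frame.
R_est, t_est = kabsch(points_mav, points_map)
print(np.round(R_est, 3), np.round(t_est, 3))
```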

It is important to highlight that, although flying the MAV before the localization estimation to build the map guarantees an up-to-date map, whether 2D or 3D, this option involves a trade-off. First, a human must pilot the MAV to gather 2D or 3D data from the environment, which is then submitted to a mapping approach. Second, the localization algorithm takes longer to start, since flying the MAV over an area takes more time than downloading a map from a GIS.

In contrast to these two types of maps, which represent the whole environment, other map options are simpler in terms of detail and of what is represented. Instead of illustrating all the obstacles, free spaces, and so on, these simple maps only show the positions of a few markers. In this case, the idea is to measure the distance between the MAV and all the markers within the map and then estimate the MAV's pose. The type of marker also varies considerably. Examples include WLAN access points [28], which are fixed at some spots in the environment and whose received signal strength is measured as part of the localization estimation, and ultraviolet LED markers [24], which emit light at frequencies that are less common in nature than visible light or infrared radiation, which in turn increases the precision of the distance measurement. In the latter work [24], the LED markers are not fixed: they are embedded in every MAV, and the MAVs perform mutual relative localization. In more detail, the pose of a MAV is estimated relative to another MAV rather than to the global coordinate system. Another marker worth mentioning is the tether [27]. The tether reel is fixed at a specific position, and the tether is attached to the MAV. In this case, the MAV is localized relative to the tether reel by using a mechanics model. In general, marker maps are adopted for indoor localization, since the sensors that measure the markers have a more limited range than the cameras used with the 2D or 3D maps presented earlier. The work that relies on ultraviolet LED markers is an exception to this indoor limitation, but only because the MAV's pose is estimated with respect to other MAVs, not to the environment.
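The distance-based idea behind marker maps can be sketched as follows. This is a simplified illustration under stated assumptions, not the estimator of any cited work: received signal strength is converted to distance with the generic log-distance path-loss model (the reference power and path-loss exponent below are made-up values that would need calibration), and the position is obtained by linearized least-squares multilateration from known marker positions. It recovers position only, not orientation. All names and numbers are hypothetical.

```python
import numpy as np

def rssi_to_distance(rssi_dbm, ref_power_dbm=-40.0, path_loss_exponent=2.0):
    """Log-distance path-loss model: ref_power_dbm is the assumed RSSI at 1 m."""
    return 10.0 ** ((ref_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))

def multilaterate(markers, distances):
    """Estimate a position from marker positions (N x D) and measured distances (N,).

    Subtracting the first range equation from the others removes the quadratic
    term and leaves a linear system A x = b, solved here in the least-squares sense.
    """
    markers = np.asarray(markers, dtype=float)
    distances = np.asarray(distances, dtype=float)
    p0, d0 = markers[0], distances[0]
    A = 2.0 * (markers[1:] - p0)
    b = (np.sum(markers[1:] ** 2, axis=1) - np.sum(p0 ** 2)
         - distances[1:] ** 2 + d0 ** 2)
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position

# Hypothetical marker map: five fixed markers (meters), not all in one plane.
markers = np.array([[0.0, 0.0, 0.0],
                    [10.0, 0.0, 0.0],
                    [0.0, 10.0, 0.0],
                    [0.0, 0.0, 5.0],
                    [10.0, 10.0, 3.0]])
true_position = np.array([3.0, 4.0, 1.5])

# Ideal, noise-free range measurements for the sake of the example.
distances = np.linalg.norm(markers - true_position, axis=1)

estimate = multilaterate(markers, distances)
print(np.round(estimate, 3))
print(rssi_to_distance(-60.0))
```

With noisy measurements, the linear solution is usually only a starting point that a nonlinear refinement or a filter would improve; it also shows why several well-spread markers are needed, since fewer than four non-coplanar markers leave the 3D system underdetermined.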

Besides the maps presented in this section, the research community has tested other types in the MAV localization problem. However, they are highly specific to one kind of sensor or configuration, and our goal here is to cover the most popular and recent ones. Each of the map types presented here has its advantages and disadvantages, as well as specific constraints that make it a better fit for some situations. For instance, the 2D satellite image is freely available online but is not suitable for indoor localization. On the other hand, the marker map is the option most used for indoor localization, but it usually requires many markers spread throughout the environment, and the markers can only be detected at short range. Given that the map is essential to the estimation in a MAV localization system, the type of MAV, the environment, and the embedded sensor must all be taken into account when choosing the type of map that best fits the constraints.
