**4. Building fault-tolerant cloud infrastructure**

This section describes the key fault tolerance mechanisms at the cloud infrastructure component level and introduces the concept of service availability zones. It also covers automated service failover across zones, along with zone configurations such as active/active and active/passive.

#### **4.1 Single points of failure (SPOF)**

*Network Function Virtualization over Cloud-Cloud Computing as Business Continuity Solution DOI: http://dx.doi.org/10.5772/intechopen.97369*

**Figure 7.** *Single Point of Failure.*

Highly available infrastructures are typically configured without single points of failure, as shown in **Figure 7**, to ensure that individual component failures do not result in service outages. The general method for avoiding single points of failure is to provide redundant components for each necessary resource, so that a service can continue with the available resources even if a component fails. A service provider may also create multiple service availability zones to avoid single points of failure at the data center level.
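The benefit of redundancy can be estimated with a standard reliability calculation. As a rough sketch, assume that each of $n$ redundant components has the same availability $a$ and that the components fail independently of one another (both are simplifying assumptions made here for illustration, not figures from this chapter). The service is then unavailable only when all $n$ components are down at the same time:

$$A_{\text{redundant}} = 1 - (1 - a)^{n}$$

For example, with $a = 0.99$ and $n = 2$, the combined availability is $1 - (0.01)^{2} = 0.9999$. Correlated failures, such as a shared power feed or a site-wide disaster, make the real figure lower than this estimate.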

Usually, each zone is isolated from others, so that the failure of one zone would not impact the other zones. It is also important to have high availability mechanisms that enable automated service failover within and across the zones in the event of component failure, data loss, or disaster.
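The automated failover described above can be illustrated with a minimal Python sketch. The `Zone` class, the `select_zone` function, and the zone names are hypothetical and exist only for this example; a real deployment would rely on the health checks and traffic routing offered by the cloud platform in use.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """A hypothetical availability zone hosting one copy of the service."""
    name: str
    healthy: bool = True


def select_zone(zones, current):
    """Return the zone that should serve traffic.

    Stay on the current zone while it is healthy; otherwise fail over
    to the first healthy zone in priority order.
    """
    if current.healthy:
        return current
    for zone in zones:
        if zone.healthy:
            return zone
    raise RuntimeError("no healthy zone available")


zones = [Zone("zone-a"), Zone("zone-b"), Zone("zone-c")]
serving = zones[0]

zones[0].healthy = False               # simulate a zone-level failure
serving = select_zone(zones, serving)  # automated failover, no operator step
print(serving.name)                    # zone-b
```

Because the zones are isolated, the failure of `zone-a` leaves `zone-b` unaffected, so the service can move there without manual intervention.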

N + 1 redundancy is a common fault tolerance mechanism that ensures service availability in the event of a component failure: a set of N components has at least one standby component. This is typically implemented as an active/passive arrangement, in which the additional component does not participate in service operations and becomes active only if one of the active components fails. N + 1 redundancy can also be implemented in an active/active configuration. In that case, the additional component takes part in service operations even when all other components are fully functional. For example, if an active/active configuration is implemented at the site level, a cloud service is fully deployed at both sites and its load is balanced between them. If one of the sites goes down, the remaining site handles the service operations and the entire workload.
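The two arrangements can be contrasted in a short Python sketch. The `NPlusOnePool` class, the `balance` function, and the node and site names are hypothetical, and the even load split is an assumption made for simplicity rather than a description of any particular load balancer.

```python
class NPlusOnePool:
    """N active components plus one standby (active/passive)."""

    def __init__(self, active, standby):
        self.active = list(active)
        self.standby = standby

    def fail(self, component):
        """Remove a failed component and promote the standby if one is left."""
        self.active.remove(component)
        if self.standby is not None:
            self.active.append(self.standby)
            self.standby = None


def balance(load, sites):
    """Active/active: split the load evenly across the sites that are up."""
    up = [name for name, is_up in sites.items() if is_up]
    if not up:
        raise RuntimeError("no site available")
    return {name: load / len(up) for name in up}


# Active/passive: the standby joins only after a failure.
pool = NPlusOnePool(active=["node-1", "node-2", "node-3"], standby="node-4")
pool.fail("node-2")
print(pool.active)            # ['node-1', 'node-3', 'node-4']

# Active/active: both sites carry load until one of them goes down.
sites = {"site-a": True, "site-b": True}
print(balance(100, sites))    # {'site-a': 50.0, 'site-b': 50.0}
sites["site-a"] = False
print(balance(100, sites))    # {'site-b': 100.0}
```

In the active/active case the surviving site must have enough capacity for the full workload, which is why the last call assigns all 100 units to `site-b`.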

#### **4.2 Avoiding single points of failure**

Single points of failure can be avoided by implementing fault tolerance mechanisms such as redundancy:


It is important to have high availability mechanisms that enable automated service failover.
