**1. Introduction**

A data center system fault is any decrease in total system performance relative to the minimum requirements of the system specification. A fault may arise from many causes, such as design errors, erroneous installation, machine malfunctions, device defects, misoperation, human error, operation beyond rated conditions, or a combination of these. If an error is not detected in a timely manner and met with the correct response, system failure may follow. Most data center downtime results from failures that cascade from devices to sub-systems and then to the whole system. Preventive and predictive mechanisms are therefore needed to detect an error before it becomes a failure. Under data center best practice, corrective maintenance alone is not acceptable. For instance, in 2015 a failed upgrade at the New York Stock Exchange caused about 4 hours of downtime, with a consequence of at least \$2.5 million per hour. Data center downtime is costly not only in financial compensation but also in damage to reputation, which sometimes cannot be quantified.

Research from the Ponemon Institute [1] reports that the average total cost of an unplanned data center outage rose by 38 percent, from \$505,502 in 2010 to \$740,357 in 2016. To avoid these downtime costs, data centers need more intensive training and operating procedures, modern maintenance strategies, and experienced data center operators.

*DOI: http://dx.doi.org/10.5772/intechopen.93945*

*Condition-Based Maintenance for Data Center Operations Management*

**2.2 Condition-based maintenance**

CBM is also known as predictive maintenance. It is a strategic approach to preventive maintenance that works together with monitoring and control of the conditions of critical devices and equipment parameters. The process operates in order to predict device failure, to assess the remaining useful lifetime (RUL), and to avoid system risks that could arise if minimum conditions are exceeded. This strategy offers economic savings over maintenance based on past experience or time-based preventive maintenance, because maintenance is executed only when it is warranted.

The RUL is defined under a maintenance policy for a single-unit deteriorating system in which all conditions are continuously monitored, with A-B-C analysis of device criticality building on early successful diagnostics. The A-B-C analysis diagnoses and categorizes the level of system maintenance into three groups: reactive maintenance with excessive repairs and failures; proactive maintenance; and excessive PM with no failures and no repairs [2], as presented in **Figure 1**.

As a predictive maintenance strategy, CBM carries out device or system maintenance based on preset conditions, performance, and parameter monitoring, and takes the subsequent actions before a device or system failure happens. CBM is a maintenance pattern that guides maintenance decisions using the data and information collected by condition monitoring processes. During operation, CBM acts as a monitoring application through sensing devices that gauge parameters across various monitored attributes, for example temperature, humidity, vibration, noise levels, contaminants, CO2 and CO levels, and lubricating oil concentration. The usefulness of CBM lies in applying the condition monitoring process, in which signals and data are monitored online using many types of sensors, both wired and wireless. The core of CBM is a real-time assessment of device and system conditions that analyzes all data to support decisions on maintenance conditions and solutions, while reducing planned and unplanned downtime and uneconomical costs from unpredicted system failure, and preserving the efficiency and effectiveness of running operations.
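The condition monitoring step described above can be illustrated with a minimal sketch. The attribute names, the limit values, and the `out_of_range` helper are all hypothetical and chosen only for illustration; real limits would come from equipment specifications, and the chapter does not prescribe this implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """Acceptable range for one monitored attribute (illustrative values only)."""
    low: float
    high: float


# Hypothetical limits, not taken from the chapter or from any standard.
LIMITS = {
    "temperature_c": Limit(18.0, 27.0),
    "humidity_pct": Limit(20.0, 80.0),
    "vibration_mm_s": Limit(0.0, 4.5),
}


def out_of_range(reading: dict) -> list:
    """Return the names of monitored attributes whose value is outside its limit."""
    flagged = []
    for name, value in reading.items():
        limit = LIMITS.get(name)
        # Attributes without a configured limit are ignored in this sketch.
        if limit is not None and not (limit.low <= value <= limit.high):
            flagged.append(name)
    return sorted(flagged)


# One snapshot of sensor readings for a single device.
reading = {"temperature_c": 29.5, "humidity_pct": 45.0, "vibration_mm_s": 5.1}
alerts = out_of_range(reading)
```

In this sketch a non-empty `alerts` list is the signal that would prompt a maintenance decision; a production system would add trend history and alarm handling on top of such a check.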

**Figure 1.**

*Total maintenance related costs.*

Downtime costs are part of operating expenditure (OPEX) and can include lawsuit or penalty costs resulting from an incident. Such legal penalties can be avoided through a preventive and predictive maintenance (PPM) approach, sometimes described as an insurance investment, which helps reduce total cost of ownership (TCO) over long-term operations. TCO is the sum of the operational and capital expenses involved in building and maintaining a data center. A PPM approach not only protects against downtime costs but also prevents reputational costs to the company, which may be impossible to estimate.

The traditional approach to avoiding downtime is to apply an action plan through time-based maintenance (TBM). The maintenance team plans maintenance or system upgrades by monitoring and controlling against a schedule of weeks, months, or years, based on the supplier's recommendations. TBM prevents system downtime by following these maintenance schedules. It involves regular inspection, is easy to deploy, and needs no condition monitoring; the decision-maker controls the maintenance age (or mean time between failures, MTBF), and maintenance is performed when the device reaches its MTBF. The condition-based maintenance (CBM) approach, on the other hand, relies on online or offline data collection and continuous measurement of the condition of devices or systems throughout their operation. Sensor devices and tools gather information that can be used to build a database for trend analysis, prediction, and estimation of the remaining useful lifetime (RUL) of a device or system. CBM takes action when a measured condition passes the point at which system performance is directly degrading or failure is most likely. A prognostic approach to online performance monitoring is needed throughout the degradation process, from the outset of system design through installation and operation until system failure. This approach differs from the scheduled intervals recommended under preventive maintenance.
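To make the idea of trend analysis and RUL estimation concrete, the following is a minimal sketch that fits a straight line to logged degradation readings and extrapolates to a failure level. The linear degradation model, the `estimate_rul` function, and the sample data are assumptions made for illustration; the chapter does not specify a particular RUL model.

```python
def estimate_rul(times, levels, failure_level):
    """Estimate remaining useful lifetime from a least-squares linear trend.

    times         -- measurement times (e.g. operating hours), increasing
    levels        -- degradation level observed at each time
    failure_level -- assumed level at which the device is considered failed

    Returns the time left after the last measurement, or None when the
    fitted trend is flat or improving (no failure time can be extrapolated).
    """
    n = len(times)
    if n < 2 or n != len(levels):
        raise ValueError("need at least two paired measurements")

    mean_t = sum(times) / n
    mean_y = sum(levels) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("measurement times must not all be equal")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, levels))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    if slope <= 0:
        return None

    # Time at which the fitted line crosses the failure level.
    t_fail = (failure_level - intercept) / slope
    return max(0.0, t_fail - times[-1])


# Hypothetical data: degradation grows by one unit per time step.
rul = estimate_rul([0, 1, 2, 3], [1.0, 2.0, 3.0, 4.0], failure_level=10.0)
```

With the hypothetical readings above, the fitted line reaches the failure level at time 9, so the estimate is 6 time units after the last measurement. Real degradation is rarely linear, so such a fit is only a first approximation.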

Since the start of the 21st century, technological advances have made the data-driven approach to PDS predictable and precise. For this reason, many data center outages can be avoided or mitigated with proper maintenance approaches and the deployment of sensing technologies. Predictive maintenance complements preventive maintenance. It is based on the device's working condition and on tracking the operating environment before a system breakdown happens. With an online condition monitoring system, predictive maintenance takes action when the deterioration level M is reached (decision variable: M, the threshold deterioration level).
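The threshold rule on the deterioration level M can be sketched as follows. The function name `first_cbm_trigger`, the sample series, and the threshold values are hypothetical and serve only to illustrate the decision variable; they are not taken from the chapter.

```python
from typing import Optional, Sequence, Tuple


def first_cbm_trigger(
    samples: Sequence[Tuple[float, float]], threshold_m: float
) -> Optional[float]:
    """Return the first time at which deterioration reaches the threshold M.

    samples     -- (time, deterioration level) pairs in chronological order
    threshold_m -- the decision variable M (threshold deterioration level)

    Returns None if the threshold is never reached in the given samples.
    """
    for time, level in samples:
        if level >= threshold_m:
            return time
    return None


# Hypothetical monitoring log: (operating hours, normalized deterioration).
log = [(0.0, 0.10), (100.0, 0.30), (200.0, 0.55), (300.0, 0.80)]
trigger_hour = first_cbm_trigger(log, threshold_m=0.5)
```

Choosing M is the design decision here: a lower M triggers maintenance earlier and more often, while a higher M accepts more risk of failure before action is taken.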

This research proposes preventive and predictive maintenance (PPM), which adopts CBM as a systematic strategy for data center operations and maintenance. Use case examples of data center PDS were examined to ensure their proper functionality and to reduce their deterioration rate. A PPM approach can ensure that devices, sub-systems, and systems operate safely, deliver their functional reliability and efficiency, reduce failure rates, and prevent unscheduled downtime.
