Increasing Complexities, Outages – Data Centers Need to Change

Uptime Institute is an advisory organization focused on improving the performance, efficiency, and reliability of business-critical infrastructure. It recently announced the key findings of its 10th annual Global Data Center Survey, which is the largest and most comprehensive in the data center industry.

Generally speaking, the results show a growing sector adapting to rapid change on multiple levels. In almost every area under discussion, including outages, there is considerable variation in the strategies being employed.

What the Study Entails

Uptime Institute has reported on data center outages for several years. That work includes surveying operators on their experiences with outages and closely tracking publicly recorded incidents. Although this data is difficult to collect and interpret, several clear trends have emerged.

Uptime Institute conducts its comprehensive global survey across the data center industry each year. The 2020 survey was conducted in March and April and includes responses from roughly 850 managers at organizations that own and operate data centers in more than 50 countries.

The Findings

Avoiding unplanned downtime continues to be a top technical and business challenge for all owners and operators. The 2018, 2019, and now 2020 surveys found that outages occur with disturbing frequency. The bigger outages are becoming more damaging and expensive, and the gains from improved processes and engineering have been partially offset by the challenge of maintaining ever more complex data centers and systems.

“Our 2020 survey results reflect a strong, growing sector facing increased change and complexity,” said Andy Lawrence, Executive Director of Research, Uptime Institute. “The growing complexity, along with the greater consequences of failure, creates the need for more vigilance and more sophisticated approaches to resiliency, performance, and operations.” Chuck Gaw, president of Gaw Technology, has seen an uptick in demand for high-capacity seismic data cabinets, which provide a more stable base for the technology they house.

What’s alarming is that almost half of all respondents had a significant (or greater) outage within the past three years. About 20 percent of these respondents reported that the outage was serious or severe. Such outages can cause substantial financial and reputational damage, with a tangible impact on the organization.

What this Means

The data shows that outages at data centers everywhere are not only becoming more frequent but have also grown more damaging and expensive over the years. Uptime Institute’s survey findings have borne this out for three years running. Over the previous three-year period, more than three-quarters of respondents said they had experienced such an outage.

Another important aspect of the 2020 survey is that Uptime Institute dove deeper into the impact of each outage, including the smaller ones that often go undocumented or unreported. Individually, these smaller outages may not have a significant impact, but they happen more frequently, which creates bigger problems in the long run.

What’s even more alarming is that three-quarters of the organizations admitted that their most recent significant outages were preventable. The good news is that the number of outages and their effects can be reduced with careful attention and investment. Power problems continue to be the largest single cause of major outages.

Transparent Clouds Good for Business

Organizations surveyed are increasingly embracing public cloud, which isn’t too surprising. As a venue for IT workloads, public cloud is expected to increase from 8 percent of all workloads today to 12 percent within two years. Public cloud providers should see this as a big growth opportunity. However, some roadblocks are standing in the way of adoption. For instance, organizations cite a lack of visibility, transparency, and accountability in public cloud services as major issues. The study finds that one-fifth of managers would be more likely to run their critical workloads in a public cloud if they had greater visibility into the operational resiliency of the service.

Conclusion & Additional Findings

Average site energy efficiency has flat-lined, while rack densities are rising. Because more work is now being done in big, efficient facilities, the overall energy efficiency of IT has improved. In 2020, the average PUE (power usage effectiveness) for a data center was 1.59, a slight improvement. Although densities are rising, the conclusion is that the increase is not enough to drive wholesale site-level changes in power distribution or cooling. A few more findings to consider:

· The enterprise data center is neither dead nor dying, while the Edge is still on the edge.
· Artificial intelligence will not take over, at least not yet.
· The data center staffing crisis is getting worse.
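
For readers less familiar with the PUE figure cited in the conclusion: PUE is the ratio of a facility's total energy use to the energy delivered to its IT equipment, so a value of 1.0 would mean no overhead at all. The short Python sketch below shows what an average of 1.59 implies. The function names and the sample meter readings are illustrative assumptions, not figures from the survey.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy divided by IT energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_share(pue_value: float) -> float:
    """Fraction of total facility energy that does not reach IT equipment."""
    if pue_value < 1:
        raise ValueError("PUE cannot be below 1.0")
    return 1 - 1 / pue_value


# Hypothetical meter readings chosen to reproduce the 1.59 average.
example_pue = pue(total_facility_kwh=1590.0, it_equipment_kwh=1000.0)
example_overhead = overhead_share(example_pue)

print(f"PUE: {example_pue:.2f}")
print(f"Share of energy spent on cooling, power distribution, and other overhead: {example_overhead:.1%}")
```

In other words, at a PUE of 1.59 a facility draws 0.59 kWh of overhead for every 1 kWh used by IT equipment, or roughly 37 percent of its total energy.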