High Availability – A Beginner’s Guide

Published On: 23rd June 2020

What is High Availability?

High Availability (HA) refers to a system’s ability to ensure uninterrupted operation and accessibility, typically measured as a percentage of uptime. Reduced downtime, the elimination of single points of failure, and the replication and distribution of data across multiple locations all contribute to a high availability architecture.

Availability is the metric that defines the “uptime” of an IT solution and is generally included in a Service Level Agreement (SLA). An ideal solution that is impervious to failure or downtime would receive a 100% availability score. Typically, availability is described by the number of nines (9s) that a system or application achieves. The table below shows example levels of availability and the downtime associated with each.

Availability (no. of 9s)    Availability (%)    Downtime Per Year    Downtime Per Month
1                           90%                 36.5 Days            72 Hours
2                           99%                 3.65 Days            7.2 Hours
3                           99.9%               8.76 Hours           43.2 Minutes
4                           99.99%              52.56 Minutes        4.32 Minutes
5                           99.999%             5.26 Minutes         25.9 Seconds
6                           99.9999%            31.5 Seconds         2.59 Seconds

Yearly figures assume a 365-day year; monthly figures assume a 30-day month.
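
The downtime allowed at each level follows from simple arithmetic: it is the unavailable fraction multiplied by the length of the period. The Python sketch below illustrates the calculation. The function names are illustrative, and the 365-day year and 30-day month are assumptions, so figures computed with a different month length will differ slightly.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumption: 365-day year
SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # assumption: 30-day month


def availability_for_nines(nines: int) -> float:
    """Return availability as a fraction, e.g. 3 nines -> 0.999."""
    return 1 - 10 ** -nines


def downtime_seconds(availability: float, period_seconds: int) -> float:
    """Maximum downtime permitted in a period at the given availability."""
    return (1 - availability) * period_seconds


def humanize(seconds: float) -> str:
    """Format a duration using the largest unit whose value is at least 1."""
    for unit, size in (("days", 86400), ("hours", 3600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.2f} {unit}"
    return f"{seconds:.2f} seconds"


if __name__ == "__main__":
    for nines in range(1, 7):
        a = availability_for_nines(nines)
        per_year = humanize(downtime_seconds(a, SECONDS_PER_YEAR))
        per_month = humanize(downtime_seconds(a, SECONDS_PER_MONTH))
        print(f"{nines} nines ({a:.4%}): {per_year} per year, {per_month} per month")
```

The same functions can be used to check an SLA target that does not fall on a whole number of nines, such as 99.95%.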

How Does High Availability Work?

In order to ensure that IT systems are highly available, it’s important to design and build the necessary levels of resiliency and redundancy into their architectures, from end to end.

Resiliency refers to a system’s ability to withstand or spring back from operational disruption, and it is achieved by building redundancy into a solution.

Redundancy describes the inclusion of extra components (i.e. hardware and software) within an infrastructure, and replication of data across locations. These practices ensure that a system continues to function, even in the event of a component failure. They also ensure that data can be accessed at any time and from any location.
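
To see why redundancy raises availability, consider a group of nodes in which any single node can carry the workload. The group is only unavailable when every node is down at once. The sketch below computes that combined figure. It assumes node failures are independent, which real deployments only approximate, and the function name is illustrative.

```python
def combined_availability(node_availability: float, nodes: int) -> float:
    """Availability of redundant nodes where any one node can carry the load.

    Assumes node failures are independent of each other.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The group is down only when every node is down at the same time.
    return 1 - (1 - node_availability) ** nodes


if __name__ == "__main__":
    for count in (1, 2, 3):
        print(count, f"{combined_availability(0.99, count):.6%}")
```

Under these assumptions, two nodes that are each 99% available combine to roughly 99.99%, which is why adding a second node has such a large effect.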

For instance, say that a manufacturing firm suffers a disk drive failure in one node of its highly available cluster. Because it has additional nodes in the cluster that contain exact copies of the data held on the failed node, the business is not impacted. Applications can be migrated or restarted on the remaining operational nodes without disruption, and the firm’s production lines continue to operate as if nothing had happened. The firm built a resilient system for high availability by providing redundant server nodes that maintain business continuity in the wake of a disruption.
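
The failover behavior in this scenario can be sketched as a toy model. The code below is not taken from any particular product. The class names, node names, and the "least-loaded survivor" placement rule are all assumptions made for illustration, and every node is assumed to hold a full copy of the data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    name: str
    online: bool = True
    apps: Set[str] = field(default_factory=set)


class Cluster:
    """Toy cluster: every node is assumed to hold an exact copy of the data."""

    def __init__(self, node_names: List[str]) -> None:
        self.nodes: Dict[str, Node] = {name: Node(name) for name in node_names}

    def start(self, app: str, node_name: str) -> None:
        self.nodes[node_name].apps.add(app)

    def fail(self, node_name: str) -> None:
        """Mark a node offline and restart its apps on the least-loaded survivor."""
        failed = self.nodes[node_name]
        failed.online = False
        survivors = [n for n in self.nodes.values() if n.online]
        if not survivors:
            raise RuntimeError("no online nodes left: the cluster is down")
        for app in sorted(failed.apps):
            target = min(survivors, key=lambda n: len(n.apps))
            target.apps.add(app)
        failed.apps.clear()

    def locate(self, app: str) -> Optional[str]:
        """Return the name of the online node running the app, if any."""
        for node in self.nodes.values():
            if node.online and app in node.apps:
                return node.name
        return None


if __name__ == "__main__":
    cluster = Cluster(["node-a", "node-b", "node-c"])
    cluster.start("production-line-control", "node-a")
    cluster.fail("node-a")
    print(cluster.locate("production-line-control"))
```

After `fail("node-a")`, the application is still found on an online node, which is the property the manufacturing example relies on.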

[Figure: High availability cluster with one node offline]

What Are the Benefits of High Availability?

Reduced Downtime

Nowadays, organizations are highly reliant on technology for day-to-day operations. When a server has to be taken offline for maintenance, updates, or repairs, business functions are often hindered. Adding high availability to your infrastructure is like adding an insurance policy that protects your organization from disruptive downtime. When one node fails or is taken offline, the others remain operational, enabling your employees to keep working. This helps you avoid not only loss of productivity, but also loss of revenue. Advanced hypervisor features, such as VMware’s Fault Tolerance, can also keep applications running with zero downtime when the node hosting them fails.

Business Continuity / Disaster Recovery

Implementing high availability architecture within your IT environment can help your business remain resilient in the face of physical disruptions and natural disasters. By eliminating single points of failure and adding redundancy to your infrastructure, your system is able to stay up and running even if one component, such as a server node, is taken offline. Stretch clusters are another option that helps organizations maintain high availability during disruptive circumstances. They enable organizations to install nodes across two or more different physical locations, so that if one is disrupted, the others remain operational. Your nodes can be separated across your office, campus, or even the entire city. Discover more about the benefits of stretch clusters in this white paper.

Performance

The nature of high availability architecture can allow for applications to be distributed across the nodes in a cluster. This improves compute performance, as your organization can utilize the extra resources available from multiple nodes while still ensuring storage high availability. A further configuration option is to build a high availability storage-only cluster which then presents the mirrored storage to compute-only nodes that handle the application workloads. Read more about storage-only clusters here.

High Availability Solutions and Edge Computing

Organizations operating at the network edge often have a large number of locations, are based in remote places, and / or are operating within environments that have problematic network connectivity. They do not typically have the IT personnel onsite to fix problems as they arise and have to wait hours or even days for repairs to be made, resulting in loss of productivity and revenue. Within these environments, high availability is important to keep IT systems up and running, and help businesses remain operational.

When you’re faced with these limitations, you want to make sure your IT infrastructure is as resilient and reliable as possible. For example, if a server is damaged or requires maintenance, you need an insurance policy (an extra node) in place so that the rest of your system stays up and running.

Here are two examples of edge environments that benefited from implementing a high availability architecture.

High Availability at Airports

How do airports ensure their IT systems are highly available? This infographic explores two examples.

High Availability for Wind Farms

One of the largest energy firms in the world has hundreds of wind farm facilities that require constant management. When there’s no wind and each turbine’s blades stop turning, the weight of the blades can cause expensive damage to the turbine shafts. The software required to ensure the blades keep turning, even when there’s no wind, must therefore remain online at all costs. Given their remote locations, it can take up to six days to have an engineer carry out repairs. To avoid long periods of downtime and prevent damage to their turbines, the company required a solution that enabled high availability. Learn more in our customer case study.

High Availability for Retail Chain

A well-known US retail chain was losing revenue as a result of system downtime. It was averaging 100 outages per year and 6 hours of downtime per outage, which was severely affecting the business. Every time a store’s system went down, the chain lost hours of productivity, customer loyalty, and revenue. It needed a high availability solution to eliminate downtime, remove the need for onsite support, and maintain business continuity across more than 2,000 stores. Read our customer case study for additional information.

High availability solutions are ideal for edge computing environments because they combat downtime and keep systems up and running. This is especially beneficial for businesses that don’t have the in-house IT staff needed to attend to their systems.

Interested in learning more about what defines an ‘edge’ environment? Explore our Beginner’s Guide on Edge Computing.

High Availability with SvSAN and SvKMS

StorMagic SvSAN and High Availability Storage

StorMagic SvSAN is a virtual SAN solution that creates highly available storage across two nodes or more. Through active-active synchronous mirroring between two servers, SvSAN ensures there is always an exact copy of data on each server. In the event that one server is taken offline for maintenance or suffers a failure, the remaining server continues to operate. SvSAN enables high availability by eliminating single points of failure and ensuring there is no downtime or disruption to service in the organization.

SvSAN’s ability to provide highly available shared storage on a minimum of just two nodes is unique, and is made possible by the use of its lightweight witness. The witness can be sited locally or remotely from the cluster, can provide quorum for hundreds of clusters at a time, and will run on as little as a Raspberry Pi.
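
The general role a witness plays in a two-node cluster can be sketched as follows. This is a generic tie-breaker model, not a description of SvSAN’s actual protocol. The class, the function, and the "first node to ask wins" rule are all assumptions made for illustration.

```python
from typing import Optional


class Witness:
    """Hypothetical tie-breaker: lets only one node win when the mirror link is lost."""

    def __init__(self) -> None:
        self.winner: Optional[str] = None

    def request_ownership(self, node: str) -> bool:
        # The first node to ask wins; later requests from the other node are refused.
        if self.winner is None:
            self.winner = node
        return self.winner == node

    def reset(self) -> None:
        # Called once both nodes are back in sync and mirroring has resumed.
        self.winner = None


def may_serve_storage(node: str, peer_reachable: bool,
                      witness: Optional[Witness]) -> bool:
    """Decide whether a node may keep serving I/O.

    Passing witness=None models a witness that this node cannot reach.
    """
    if peer_reachable:
        return True   # mirror is healthy, so both nodes serve (active-active)
    if witness is None:
        return False  # isolated node: stop rather than risk split-brain
    return witness.request_ownership(node)


if __name__ == "__main__":
    witness = Witness()
    # The link between the two servers fails, but both can still reach the witness.
    print(may_serve_storage("server-1", peer_reachable=False, witness=witness))
    print(may_serve_storage("server-2", peer_reachable=False, witness=witness))
```

The point of the third vote is that two nodes alone cannot tell a failed peer from a broken link. An arbiter lets exactly one of them continue, so the two copies of the data never diverge.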

Visit the StorMagic SvSAN page to learn more about the product, and download our SvSAN data sheet for more information about its unique features and capabilities.

StorMagic SvKMS and High Availability Architecture

StorMagic SvKMS is an encryption key management solution with flexible options for high availability. Customer applications require uninterrupted access to their encryption keys, and SvKMS maintains this access through a powerful, highly available architecture.

SvKMS supports both a unique active-passive two-node HA setup and an active-active 2-node+1 clustering configuration. These can create tiered levels of redundancy that scale for added assurance against any loss of encryption key access.

SvKMS HA uses shards to partition and replicate data, nodes to distribute those shards across multiple locations, and clustering to contain data and provide failover in the event that a node is disconnected from the network. It significantly reduces the possibility of a disruption in service that can result in a customer being unable to access their encryption keys.
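
The general idea of partitioning data into shards and replicating each shard across nodes can be sketched as below. This is not a description of SvKMS internals. The hashing scheme, the round-robin placement rule, and the node names are assumptions chosen only to illustrate the technique.

```python
import hashlib
from typing import Dict, List, Set


def shard_for(key_id: str, shard_count: int) -> int:
    """Map a key identifier to a shard using a stable hash."""
    digest = hashlib.sha256(key_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def place_replicas(shard_count: int, nodes: List[str],
                   replicas: int) -> Dict[int, List[str]]:
    """Assign each shard to `replicas` distinct nodes in round-robin order."""
    if not 1 <= replicas <= len(nodes):
        raise ValueError("replicas must be between 1 and the number of nodes")
    return {
        shard: [nodes[(shard + i) % len(nodes)] for i in range(replicas)]
        for shard in range(shard_count)
    }


def readable_nodes(key_id: str, placement: Dict[int, List[str]],
                   offline: Set[str]) -> List[str]:
    """Return the online nodes that hold a copy of the shard for this key."""
    shard = shard_for(key_id, len(placement))
    return [node for node in placement[shard] if node not in offline]


if __name__ == "__main__":
    placement = place_replicas(4, ["site-a", "site-b", "site-c"], replicas=2)
    print(readable_nodes("customer-key-001", placement, offline={"site-b"}))
```

With two replicas per shard, any single node can be disconnected and every key remains readable from at least one other node.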

Discover more about StorMagic SvKMS by visiting the SvKMS page. To learn more about SvKMS’s capabilities and product features, download our SvKMS data sheet.
