A more in-depth version of one of our previous posts – check out the new diagram!
Maximizing uptime of key systems in your IT infrastructure, whether it’s a server, email, or Internet access, is of utmost concern for IT departments. But, as with most IT buzzwords, there’s a lot to consider where uptime is concerned.
Maximizing uptime can also be viewed as minimizing downtime, so let’s begin by looking at the different types of downtime. Unplanned downtime is what most companies focus their energy on reducing. It is typically thought of as the result of hurricanes, earthquakes, or extended power outages at the macro level. At the micro level, however, server or operating system failures and human error are much more common causes. Planned downtime is incurred every time an OS patch is installed, a hardware component is upgraded, or a system is migrated. Typically, planned downtime is scheduled during an approved ‘Maintenance Window’ to minimize any potential impact on business operations.
Next, let’s look at protecting uptime from a higher level: Disaster Recovery vs. Business Continuity. These two terms are often used interchangeably, which is incorrect. Disaster recovery refers to the ability to return to normal operations (the same level of performance that existed prior to the disaster) after a disaster has been declared. Business continuity is the ability to maintain business operations (usually at a reduced level of service) throughout and in the direct aftermath of a disaster. Typically, business continuity involves failing over services to a contingency location, while disaster recovery involves failing back to the primary datacenter. In the ‘real world’, these two concepts can range from something as simple as using cell phones and a notebook after a disaster and restoring data from tapes once power is restored, to something as elaborate as multiple colocation datacenters.
Now, let’s review one means of protecting uptime: High availability (HA). HA is any method that ensures a certain component or service remains operational for an extended period of time. HA should really be thought of as accessibility within the same office and within one system, as depicted below. You can achieve this through replication, clustering, redundant components, etc.

Finally, keep in mind that uptime can also be architected into your site by leveraging virtualization and replication to maintain a standby location in case your primary site goes offline, as shown below.

There are many different ways to achieve uptime, most within a reasonable budget, but it all boils down to understanding the cost of downtime to your business (for example, $5,000 per hour) and determining the likelihood of a downtime event (the probability of a disaster, a server failure, or human error). Once these are estimated, the justification for an investment in uptime becomes clear.
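To make that estimate concrete, here is a minimal sketch of the arithmetic in Python. Only the $5,000 per hour figure comes from the example above; the event types, their yearly frequencies, and their outage durations are hypothetical placeholders you would replace with your own numbers.

```python
def expected_annual_downtime_cost(cost_per_hour, events):
    """Sum the frequency-weighted downtime cost over a list of event types.

    Each event is (name, expected occurrences per year, outage hours per occurrence).
    """
    total = 0.0
    for _name, annual_frequency, outage_hours in events:
        total += annual_frequency * outage_hours * cost_per_hour
    return total


COST_PER_HOUR = 5_000  # example figure from the text

# Hypothetical inputs, for illustration only.
EVENTS = [
    ("regional disaster", 0.02, 72),  # roughly once in 50 years, 3 days offline
    ("server failure", 0.5, 8),       # roughly once every 2 years, 8 hours offline
    ("human error", 1.0, 2),          # roughly once a year, 2 hours offline
]

annual_cost = expected_annual_downtime_cost(COST_PER_HOUR, EVENTS)
print(f"Expected annual downtime cost: ${annual_cost:,.0f}")
```

With these assumed inputs the estimate comes to about $37,200 per year, most of it from the everyday failures rather than the rare disaster. Comparing a figure like this against the yearly cost of an HA or standby-site solution is one simple way to frame the investment decision.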
