KB84
What is the InfraWare Disaster Recovery Plan?

Answer / Solution

InfraWare Disaster Recovery & Prevention

Overview

New and established customers alike inquire about the InfraWare Disaster Recovery (DR) plan. This is a wise line of inquiry because every information technology (IT) system is at risk from failed components. An organization that plans for contingencies can prevent downtime and minimize the impact when disaster strikes.

Solution

The actual InfraWare Disaster Recovery Plan cannot be released for customer or third-party consumption because it is a sensitive document that contains information that is both proprietary and security-sensitive. (Hackers could use it to attempt to compromise the platform.) This document outlines the information that can be published, in enough detail to establish confidence.

Disaster planning is about both recovery and prevention. The line between prevention and recovery is fluid. A well-designed server infrastructure can prevent many of the most common failures from becoming disasters that require recovery. Two basic examples:

Hard Disks

  • All computer systems contain hard disks on which software is installed and data is stored. In a system without redundancy, if a disk fails, the computer loses all the information on it. This is often known as a crash. The computer system cannot run again until the disk is replaced. Then the operating system must be reinstalled and a backup restored. This recovery process usually takes many hours of downtime. Top-tier server builders like IBM and HP offer a technology known as RAID (Redundant Array of Inexpensive Disks; the "inexpensive" part is a misnomer, because these disks are quite expensive compared to the hard disks installed in desktop PCs). RAID is available in many levels, such as RAID1 and RAID5. With this technology, data is mirrored (RAID1) or striped with parity (RAID5) across more than one disk drive. One drive can fail without the loss of any data and without the need for the server to be shut down. The best RAID systems use hot-plug drives, so failed drives can be replaced while the server keeps running. All production InfraWare servers incorporate RAID systems with hot-plug drives.
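To illustrate why a single drive can fail without data loss, here is a minimal sketch of the XOR parity idea behind RAID5. It is an illustration only, not a description of InfraWare's actual configuration: the block contents, the drive count and the helper name xor_blocks are all assumptions made for the example, and a real RAID5 controller also rotates the parity block across the drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per data drive in a single stripe.
data_blocks = [b"INFR", b"AWAR", b"E-DR"]

# The parity block for this stripe is stored on a fourth drive.
parity = xor_blocks(data_blocks)

# Simulate losing the second drive: XOR the surviving blocks with the
# parity block to rebuild the missing one.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data_blocks[1]
print(rebuilt.decode())  # prints AWAR
```

Because XOR is its own inverse, any one missing block in a stripe can be recomputed from the others. RAID1 takes the simpler route of keeping a complete copy of the data on a second drive.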

Power Supplies

  • Every computer has a device known as a power supply that converts regular AC electric power to the DC power required by the internal components. If that device fails, the computer cannot continue to run. Top-tier hardware providers offer servers with two redundant power supplies installed. If one power supply fails, the server continues to operate without missing a beat. With the best models, the failed unit can even be replaced without server downtime. Every production InfraWare server supporting the platform has dual, hot-plug power supplies. This way, a hardware failure can occur without an impact on platform uptime or performance. No disaster (or recovery) occurs. Replacing a power supply is a simple maintenance event.

There are many such hardware investments that go into prevention, and prevention goes further than investment in good assets. Administrative practices including backup, media rotation, password requirements, change control and network documentation are all part of the InfraWare operation.
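The benefit of doubling up a component can be estimated with simple arithmetic. The sketch below assumes that the units fail independently and that the server stays up as long as at least one unit works. The 99% figure is a hypothetical number chosen for illustration, not a measured InfraWare or vendor statistic, and redundant_availability is a name invented for this example.

```python
def redundant_availability(single_availability: float, units: int) -> float:
    """Availability of a group of independent, redundant units.

    The group is considered up while at least one unit is working, so it
    is down only when every unit is down at the same time.
    """
    return 1.0 - (1.0 - single_availability) ** units


# Hypothetical figure: a single power supply is working 99% of the time.
single = 0.99
dual = redundant_availability(single, 2)

print(f"single supply: {single:.2%} available")
print(f"dual supplies: {dual:.2%} available")
```

Under these assumptions, two supplies are both down only about one ten-thousandth of the time, compared with one hundredth for a single supply. Hot-plug replacement matters for the same reason: the sooner a failed unit is swapped, the less time the server spends relying on its one remaining supply.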

Data Center

ASP (Application Service Provider) companies that provide mission-critical services must locate their servers in hardened facilities called data centers. Top-tier data centers share many infrastructure characteristics that are designed to support high availability, or uptime.

InfraWare’s production servers are located in one such facility, n|Frame (nframe.com). This facility promotes uptime with many exceptional features including:

Electrical Power Redundancy

    • Connection to two independent electric power companies. Even if an entire power company or generating station goes down, power is still available from the second. Better than mere battery backup, power is protected by inertial flywheels. In addition, large-scale diesel generators stand by in the event that both power companies fail for any period of time.

Internet Connection Redundancy

    • Connection to multiple Internet Service Providers (ISPs) using BGP4 (Border Gateway Protocol version 4). This means that the IP addresses bound to InfraWare’s servers can be routed via at least two major Internet companies. In addition, those ISP connections run over separate fiber optic cables to protect against a failure caused by a break in a telecommunication conduit, whether from construction work or severe weather.

Cooling

    • Servers generate significant heat and are at risk of failure due to overheating. This facility maintains redundant cooling systems that move clean, cool air through each server cabinet from bottom to top to maintain ideal operating conditions.

Fire Protection

    • The facility employs early warning detectors and gas-based fire suppression systems to quickly contain any combustion.

Physical Access

    • Gaining physical access to InfraWare servers and other sensitive assets in the facility requires being on a registration list, presenting identification and passing three locks: a biometric scanner, an RF (radio frequency) key card and a mechanical key. (All three are required.)

Additional information about n|Frame’s data center is available at their website: nframe.com.

Firewall

Protection of servers from malicious connections, hackers, spyware and viruses is critical. This is accomplished with a state-of-the-art Cisco firewall and tightly authored communications algorithms that are encrypted. This area of the DR plan is sensitive because the details could be used by criminal hackers to attempt to compromise the system. Additional details cannot be disclosed.

Virtualization

The layers of redundancy described above work together to prevent downtime. A severe disaster could still destroy the entire data center, which would require restoration of services at another site. Virtualization is a powerful technology that supports fast migration of services to secondary and tertiary sites. InfraWare relies upon VMware (vmware.com) for this purpose.

Using the virtualization layer of abstraction, InfraWare keeps images of each production server that can be quickly started at different sites.

Conclusion

InfraWare maintains a significant investment in redundancy and high availability infrastructure assets to ensure uptime by minimizing the impact of typical systems failures. The company employs software and communications best practices to promote reliability and has mature plans to quickly restore services in the event of a serious disaster.

Disclaimer

The InfraWare network is fluid. The information contained herein was believed to be accurate at the time of publication, but it is not warranted.



Direct Link to This KB
http://kb.infraware.com//KB/?f=84

Last Updated
Wednesday, July 11, 2012

Tags
DR disaster recovery servers reliability uptime downtime KB84