Disaster Recovery and Active/Passive Replication Systems

Not all applications require the same level of business continuity protection. Less critical applications and data can tolerate longer recovery times and larger amounts of lost data, while highly critical applications may not be able to tolerate any downtime or data loss. To satisfy this range of needs, the Shadowbase business continuity product suite supports both high availability and continuous availability solutions.

Two parameters are used to measure the characteristics of a business continuity solution: Recovery Time Objective (RTO) and Recovery Point Objective (RPO). RTO is the time taken to fail over and resume business services following an outage. RPO is the amount of data lost as a result of an outage. The closer these parameters are to zero (faster recovery, less lost data), the more effective the business continuity solution.

As shown in the Business Continuity Continuum depicted in Figure 1, HPE Shadowbase solutions support a range of Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs). Availability improves and data loss decreases as you move from the lower left to the upper right: horizontally from an active/passive through a sizzling-hot-takeover to an active/active architecture, and vertically from asynchronous to synchronous data replication technology. RTOs of minutes to seconds are considered “High Availability,” while RTOs of seconds to sub-seconds are considered “Continuous Availability.”

Figure 1 — The HPE Shadowbase Business Continuity Continuum

*Please note that these products are planned future developments; any specifications are subject to change without notice and delivery dates are not guaranteed. If interested in learning more, please contact us to see our presentation on Shadowbase synchronous replication or speak with us about the features currently available.

The asynchronous HPE Shadowbase Active/Passive (A/P) business continuity solution offers recovery times ranging from several seconds to minutes or hours, and recovery of data in seconds to minutes. Although this approach is popular, it is riskier than the approaches discussed below and is not acceptable for mission-critical solutions or for many modern-day implementations. (For more details on why, please read the article Why an Active/Passive Business Continuity Solution is Not Good Enough.)

The asynchronous HPE Shadowbase Sizzling-Hot-Takeover (SZT) business continuity solution offers a more-reliable recovery time of seconds to minutes, and recovery of data in seconds to minutes, while the asynchronous HPE Shadowbase Active/Active (A/A) business continuity solution offers:

  • milliseconds to seconds in recovery time (some users will experience no outage at all),
  • less data loss than either the A/P or SZT architectures (e.g., when compared to a two-node A/P or SZT architecture, a two-node A/A architecture puts half as much data at risk when a failure occurs).
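
The "half as much data at risk" comparison follows from simple arithmetic, sketched below. This is an illustration only: it assumes that update load is spread evenly across the active nodes and that the data at risk on a failing node is roughly its transaction rate multiplied by the replication latency. The function name and the sample workload figures are hypothetical.

```python
def changes_at_risk(total_tps: float, replication_latency_s: float, active_nodes: int) -> float:
    """Estimate the changes not yet replicated when one active node fails.

    Assumption: load is evenly spread, so the failed node carried
    total_tps / active_nodes, and everything it committed during the
    last replication_latency_s seconds had not yet reached the target.
    """
    if active_nodes < 1:
        raise ValueError("active_nodes must be at least 1")
    per_node_tps = total_tps / active_nodes
    return per_node_tps * replication_latency_s


# Hypothetical workload: 1,000 transactions/s and 0.5 s of replication latency.
active_passive = changes_at_risk(1000, 0.5, active_nodes=1)  # A/P or SZT: one node takes all updates
active_active = changes_at_risk(1000, 0.5, active_nodes=2)   # A/A: two nodes share the updates
print(active_passive, active_active)
```

With these assumed figures, the two-node A/A case leaves half as many changes exposed as the single-updating-node case, because each node carries only half of the update traffic.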

The HPE Shadowbase product suite also supports synchronous replication for all three architectures (A/P, SZT, and A/A). While the availability characteristics are the same as for asynchronous technology, the big advantages of synchronous replication are (a) zero data loss and (b) the elimination of data collisions in an A/A environment. HPE Shadowbase Zero Data Loss (ZDL) provides zero data loss for A/P, SZT*, and A/A* architectures. HPE Shadowbase ZDL+* also offers data collision elimination for A/A environments.

HPE Shadowbase software provides this range of business continuity solutions by using data replication technology, as shown in Figure 2. The purpose of data replication is to keep a target database synchronized, in real time, with the changes that an application is making against a source database. (The source database is hosted on the source node and the target database is hosted on the target node.) As the application makes changes to the source database, the HPE Shadowbase data replication engine captures the data changes and sends them to the target system, where they are applied to the target database. In this way, the source and target databases are kept synchronized.
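
The capture, send, and apply flow described above can be pictured with a toy model. The sketch below is not the HPE Shadowbase API; the class, method, and field names are hypothetical, the "databases" are plain dictionaries, and the "channel" is an in-memory queue. It only illustrates how changes captured at the source are applied, in order, at the target.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Change:
    op: str            # "put" or "delete"
    key: str
    value: Any = None


class ToyReplicator:
    """Illustrative model only; this is not the HPE Shadowbase API."""

    def __init__(self) -> None:
        self.source: dict = {}
        self.target: dict = {}
        self._in_flight: deque = deque()  # captured but not yet applied

    @property
    def pending(self) -> int:
        """Number of changes the target has not yet seen."""
        return len(self._in_flight)

    def write(self, key: str, value: Any) -> None:
        """Application update: change the source and capture the change."""
        self.source[key] = value
        self._in_flight.append(Change("put", key, value))

    def delete(self, key: str) -> None:
        """Application delete: remove from the source and capture the change."""
        self.source.pop(key, None)
        self._in_flight.append(Change("delete", key))

    def replicate(self) -> int:
        """Deliver captured changes and apply them to the target in order."""
        applied = 0
        while self._in_flight:
            change = self._in_flight.popleft()
            if change.op == "delete":
                self.target.pop(change.key, None)
            else:
                self.target[change.key] = change.value
            applied += 1
        return applied


rep = ToyReplicator()
rep.write("acct-1", 100)
rep.write("acct-2", 250)
rep.delete("acct-1")
backlog = rep.pending        # changes still in the replication pipeline
applied = rep.replicate()    # target catches up with the source
```

In this model, the backlog held in the queue at any instant is what an asynchronous configuration would stand to lose if the source failed before `replicate()` ran.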


Figure 2 — The HPE Shadowbase Data Replication Engine

The two (or more) source and target nodes comprise a redundant distributed data processing system. The target database typically resides on another independent node that may be hundreds or thousands of miles away. (The distance should be sufficient to provide geographic fault-tolerance.) If a failure of the source system occurs, application processing can continue on the remote, unaffected target system, since HPE Shadowbase software is keeping the source and target databases synchronized.

Uni-directional (Active/Passive) HPE Shadowbase Replication for Disaster Recovery (High Availability)

Achieving high levels of service availability requires a backup node that can take over in subseconds or seconds in the event of an active node failure. Shadowbase replication provides this service by using uni-directional data replication, which is the simplest form of data replication (see Figure 2). Since an active node processes all transactions and replicates the database changes that it makes to a remote standby database, the two databases are in (or are nearly in) synchronization. If the active node fails, the backup (or passive) node is available with a current copy of the database, ready to take over processing.

A Shadowbase asynchronous uni-directional (active/passive) system has an RPO greater than zero; the exact value depends upon the replication latency of the data replication channel. If Shadowbase synchronous replication is used, no data is lost following a source node failure, and an RPO of zero is achieved. The RTO of an active/passive system is measured in minutes or longer because, following a failure of the active node, applications must be started, the databases mounted, and the network reconfigured. Additional recovery time is typically required for management to decide to fail over to the backup system and for testing to ensure that the backup is performing properly.
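
To see why the RTO of an active/passive system adds up to minutes or longer, the failover steps listed above can be totaled. The sketch below is a rough illustration: every duration is a hypothetical placeholder rather than a measured or vendor-supplied figure, and it assumes the steps run one after another.

```python
from datetime import timedelta

# Hypothetical durations for the failover steps named in the text; real
# values depend entirely on the installation and must be measured.
failover_steps = {
    "management decision to fail over": timedelta(minutes=10),
    "start applications": timedelta(minutes=3),
    "mount databases": timedelta(minutes=2),
    "reconfigure network": timedelta(minutes=4),
    "test backup before resuming service": timedelta(minutes=6),
}


def estimated_recovery_time(steps: dict) -> timedelta:
    """Sum the step durations, assuming the steps run sequentially."""
    return sum(steps.values(), timedelta())


total = estimated_recovery_time(failover_steps)
print(total)  # 0:25:00
```

Even with modest per-step figures, the decision and testing time dominates the total in this example, which is consistent with the point that technical restart time is only part of the recovery time.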

In an active/passive configuration, the passive system is typically idle as far as update processing is concerned. However, applications may also be up and running in read-only mode on the standby node, and the standby database may be actively used for query and reporting purposes. Shadowbase replication ensures that the target database is a consistent copy of the source database, though delayed by the replication latency. If the active node fails, the applications at the backup node can remount the database for read/write access and take over the role of the original active node. This process typically takes only a few minutes, leading to RTOs measured in minutes. Therefore, uni-directional architectures provide high availability: RTOs measured in minutes and RPOs measured in subseconds (or zero if synchronous replication is used).
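
The role change described above, from a read-only standby to a read/write active node, can be sketched as a small state change. The class and method names below are hypothetical and do not represent a Shadowbase interface; the sketch only shows that queries are served while the node is passive and that updates are accepted only after promotion.

```python
class StandbyNode:
    """Hypothetical standby node; not a Shadowbase interface."""

    def __init__(self, database: dict) -> None:
        self.database = database
        self.read_write = False  # passive: queries and reporting only

    def query(self, key):
        """Read from the replicated copy (allowed in either role)."""
        return self.database.get(key)

    def update(self, key, value) -> None:
        """Apply an application update; refused while the node is passive."""
        if not self.read_write:
            raise PermissionError("standby database is mounted read-only")
        self.database[key] = value

    def promote(self) -> None:
        """Failover: remount for read/write and take over the active role."""
        self.read_write = True


standby = StandbyNode({"acct-2": 250})   # replicated copy of the source
balance = standby.query("acct-2")        # reporting works while passive
try:
    standby.update("acct-2", 300)
    rejected = False
except PermissionError:
    rejected = True                      # updates are refused before failover

standby.promote()                        # the active node has failed
standby.update("acct-2", 300)            # now accepted
```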

This replication method is used for classic active/passive disaster recovery configurations. It supports applications that must be highly available but where some small data loss is tolerable. Corporate applications such as customer relationship management (CRM) and human resources (HR) are examples of this class of application, as are ATM transactions, which have a low value. If an ATM is down, the customer can go to a different ATM serviced by a different bank.

While an active/passive architecture certainly offers high availability, other Shadowbase business continuity solutions should also be considered (please read Active/Active—Why Choose HPE Shadowbase?). In particular, a Shadowbase SZT architecture (please read Continuous Availability and Active/Active Replication Systems) is only a small step up from an active/passive configuration but offers significant advantages.
