Failure

If the network from clients to the primary fails, and the network from the clients to the secondary is still functioning, this warrants a failover from the primary to the secondary.

If the network from clients to both the primary and secondary fails, the application is no longer available.

If the network between primary and secondary fails, no action is required to manage GT.M replication. The primary will continue to make the application available. The secondary will catch up with the primary when the network is restored.

If the network from the clients to the secondary fails, no action is required to manage GT.M replication, although it would be prudent to make the network operational as soon as possible.
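The four network cases above can be summarized as a small decision function. This is a minimal sketch, not part of GT.M: the names `Action` and `network_action` are hypothetical, and combinations the text does not discuss (for example, several links failing at once) simply fall out of applying the stated rules in order.

```python
from enum import Enum


class Action(Enum):
    FAIL_OVER = "fail over from primary to secondary"
    UNAVAILABLE = "application is unavailable"
    NONE = "no replication action required"
    RESTORE_SECONDARY_LINK = "no replication action; restore client-to-secondary network"


def network_action(clients_to_primary, clients_to_secondary, primary_to_secondary):
    """Map link health (True = working) to the response described in the text."""
    if not clients_to_primary:
        # Clients cannot reach the primary: fail over only if they can reach the secondary.
        return Action.FAIL_OVER if clients_to_secondary else Action.UNAVAILABLE
    if not clients_to_secondary:
        # The primary still serves clients; the standby path should be repaired soon.
        return Action.RESTORE_SECONDARY_LINK
    # primary_to_secondary does not change the outcome here: if the replication
    # link is down, the primary keeps serving and the secondary catches up later.
    return Action.NONE
```

For example, `network_action(False, True, True)` yields `Action.FAIL_OVER`, while `network_action(True, True, False)` yields `Action.NONE`.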

If the primary (A) fails, the secondary should take over; when the primary comes back up, it should come up as the new secondary.

  • The external control mechanism should detect that the primary has failed, and take action to switch the secondary to primary mode, and either route transactions to the new primary (former secondary) or notify clients to route transactions to the new primary.

  • If the former primary did not respond to certain transactions, one cannot be certain whether they were processed or whether the database updates were committed to the former secondary. These transactions must now be reprocessed on the new primary. When the former primary comes up as the new secondary, any transactions and database updates that it processed but did not replicate will be rolled off the former primary and must be reconciled.
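To know which transactions have an uncertain outcome after a failover, the client or routing layer needs some record of requests that never received a response. The following is a minimal sketch of such a record, assuming each transaction carries a unique identifier; the class `PendingTransactions` and its methods are hypothetical and not part of GT.M. Whether resubmitting is safe depends on the application, since an unanswered transaction may already have been committed and replicated.

```python
class PendingTransactions:
    """Hypothetical client-side record of transactions awaiting a response."""

    def __init__(self):
        # Transaction id -> payload; dicts preserve insertion (send) order.
        self._pending = {}

    def sent(self, txn_id, payload):
        """Record a transaction when it is submitted to the primary."""
        self._pending[txn_id] = payload

    def acknowledged(self, txn_id):
        """Forget a transaction once the primary has responded to it."""
        self._pending.pop(txn_id, None)

    def to_reprocess(self):
        """Return transactions whose outcome on the failed primary is unknown."""
        return list(self._pending.items())
```

After a failover, the entries returned by `to_reprocess()` are the candidates to submit to the new primary.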

On B:

  • Stop the Receiver Server.

  • Create new journal files.

  • Switch the Source Server from passive to active mode to start replicating to the new secondary (former primary) when it comes back up.

  • Start the application servers or, if they were already running in passive mode, activate them. The new primary is now ready to receive transactions from clients.

  • If the state of the database indicates that batch operations were in process, restart batch operations.
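The first three steps on B might be scripted roughly as follows. This is a sketch under stated assumptions: it takes the server stopped in the first step to be the Receiver Server on B, the function name `promote_secondary_commands` and the `host:port` value are hypothetical, and the MUPIP qualifiers shown vary between GT.M releases, so they should be checked against the documentation for the installed version. The function only builds the command lines; it does not run them.

```python
def promote_secondary_commands(new_secondary):
    """Return, in order, illustrative MUPIP commands for promoting Site B.

    new_secondary is the host:port of the former primary (Site A).
    """
    return [
        # 1. Stop receiving updates from the failed primary.
        ["mupip", "replicate", "-receiver", "-shutdown", "-timeout=0"],
        # 2. Switch every region to new journal files (before-image journaling assumed).
        ["mupip", "set", "-journal=enable,on,before", "-region", "*"],
        # 3. Switch the passive Source Server to active mode, targeting the former primary.
        ["mupip", "replicate", "-source", "-activate",
         f"-secondary={new_secondary}"],
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` so that a failing step stops the sequence before the application servers are started.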

On A:

  • When Site A comes back up, query Site B for the journal sequence number at which it became primary, and roll back the database on A to this point. Transmit the transactions that the rollback backed out of the database to the new primary (B) for reconciliation/reapplication.

  • Create new journal files.

  • Start the Source Server in passive mode.

  • Start the Receiver Server to resume replication as the new secondary. Dual-site operation is now restored.

  • As appropriate, start the passive application servers.
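The effect of the rollback on Site A can be pictured as splitting A's journal at the sequence number where B became primary. The sketch below is a simplified model, not GT.M's implementation: `roll_back` is a hypothetical name, journal records are reduced to `(seqno, update)` pairs, and treating the takeover sequence number itself as rolled off is an assumption about the boundary.

```python
def roll_back(journal, takeover_seqno):
    """Split (seqno, update) records at the sequence number where B became primary.

    Returns (kept, rolled_off): records A keeps, and records backed out of
    A's database that must be sent to the new primary for reconciliation.
    """
    kept = [rec for rec in journal if rec[0] < takeover_seqno]
    rolled_off = [rec for rec in journal if rec[0] >= takeover_seqno]
    return kept, rolled_off


# A committed updates 101-103, but B became primary at sequence number 103.
journal_a = [(101, "x"), (102, "y"), (103, "z")]
kept, rolled_off = roll_back(journal_a, 103)
```

Here `rolled_off` contains only `(103, "z")`, the update that must be reconciled on B before A resumes as the new secondary.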