Messaging

For online operations, application servers on the primary respond to messages that clients deliver over a network. Each message is assumed to result in zero (inquiry) or one (update) transaction on the server. The network that delivers messages must be robust: each message must either be delivered exactly once to an application server on the primary, or fail delivery with the client notified of the failure. It is recommended that the message delivery system be integrated with the logic that determines which system is primary and which is secondary at any time. The messaging system must be able to handle situations such as a failure of the primary after the client transmits the message but before the primary receives it.

Application servers will typically respond to client messages with a reply generated immediately after the TCOMMIT for a transaction. It is necessary to make provisions in the application and message architecture to handle the scenario in which the primary fails after the TCOMMIT, but before the system generates a reply and transmits it to the client. In this case, the client would eventually time out and retry the message.
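The client-side timeout and retry described above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of any product API: `send_with_retry`, `ReplyTimeout`, and the `transport` callable are hypothetical names, and the transport is assumed to raise `ReplyTimeout` when no reply arrives in time.

```python
class ReplyTimeout(Exception):
    """Hypothetical error: no reply arrived within the client's timeout."""


def send_with_retry(transport, message, max_attempts=3):
    """Send `message` until a reply arrives or the attempts run out.

    `transport` is an assumed callable that delivers one message and returns
    the reply, or raises ReplyTimeout. The same message is resent unchanged
    on every attempt, so the server sees a retry rather than a new request.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transport(message)
        except ReplyTimeout:
            if attempt == max_attempts:
                raise  # out of attempts: report the failure to the caller
```

Because the client cannot tell whether the server committed before failing, the retry is only safe if the server can recognize a message it has already processed.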

A logical dual-site application can handle this situation by giving the message structure a unique message identifier (MSGID) and by having the application record the MSGID in the database as part of the transaction, so that the same TCOMMIT commits both the update and the MSGID.
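As a rough sketch of recording the MSGID in the same transaction as the update, the example below uses Python's `sqlite3` module purely as a stand-in for the application database. The table names, the message fields (`msgid`, `account`, `amount`), and the functions `open_db` and `apply_deposit` are assumptions made for illustration; the `with db:` block plays the role of the transaction that ends with the TCOMMIT.

```python
import sqlite3


def open_db():
    """Create an in-memory stand-in database with one account (illustrative schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    db.execute("CREATE TABLE processed (msgid TEXT PRIMARY KEY)")
    db.execute("INSERT INTO accounts VALUES ('A', 100)")
    db.commit()
    return db


def apply_deposit(db, message):
    """Apply the update and record the MSGID in one transaction."""
    # `with db:` commits if the block succeeds and rolls back if it raises,
    # so the balance change and the MSGID are stored together or not at all.
    with db:
        db.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (message["amount"], message["account"]),
        )
        db.execute(
            "INSERT INTO processed (msgid) VALUES (?)",
            (message["msgid"],),
        )
```

In this sketch a second message with the same MSGID violates the primary key on `processed`, which raises `sqlite3.IntegrityError` and rolls back the balance change made in the same transaction.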

If the primary crashes after committing the transaction and the failover logic makes the former secondary the new primary, the retried message that has the same MSGID from the client will be received by the new primary. In this case, one of the following scenarios may occur:

  • The database shows that the transaction corresponding to the MSGID in the message has already been processed. The server could then reply that this transaction was processed. A more sophisticated approach would be to compute the response to the client within the transaction, and to store it in the database as part of the transaction commit. Upon receipt of a message that is a retry of a previously processed message, the server could return the previous response in the database to the client.

  • The database shows the transaction as unprocessed. In this case, the new primary will process the transaction. At this time, it is unknown whether or not the former primary processed the transaction before going down. If it was not processed, there is no issue. If it was processed, that transaction is rolled back when the former primary comes up as a secondary and must then be reconciled, either manually or automatically, from the rollback report, since the result of processing the first time may differ from the result of processing the second time.
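The two scenarios above, combined with the approach of storing the computed response in the database, might look like the following on the server side. This is a sketch only: `sqlite3` again stands in for the application database, and the schema, the message fields, and the `open_db` and `handle` functions are hypothetical. It assumes the account named in the message exists.

```python
import json
import sqlite3


def open_db():
    """Create an in-memory stand-in database (illustrative schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    db.execute("CREATE TABLE replies (msgid TEXT PRIMARY KEY, reply TEXT)")
    db.execute("INSERT INTO accounts VALUES ('A', 100)")
    db.commit()
    return db


def handle(db, message):
    """Process a message once; answer a retry with the stored reply."""
    row = db.execute(
        "SELECT reply FROM replies WHERE msgid = ?", (message["msgid"],)
    ).fetchone()
    if row is not None:
        # First scenario: the MSGID is already in the database, so return
        # the response computed by the original transaction.
        return json.loads(row[0])

    # Second scenario: the database shows the message as unprocessed.
    with db:  # one transaction: the update and its reply commit together
        db.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (message["amount"], message["account"]),
        )
        balance = db.execute(
            "SELECT balance FROM accounts WHERE id = ?", (message["account"],)
        ).fetchone()[0]
        reply = {"msgid": message["msgid"], "status": "ok", "balance": balance}
        db.execute(
            "INSERT INTO replies VALUES (?, ?)",
            (message["msgid"], json.dumps(reply)),
        )
    return reply
```

The sketch does not cover the reconciliation case, where the former primary committed a transaction that never reached the new primary; that still has to be resolved from the rollback report.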