Theory of Operation

GT.M Processes
Journal Pool
Source Server
Receiver Server
Server Shutdown
The Update Process
Application Instance
Filters
Statistics
Failover and Database Synchronization

GT.M database replication provides the ability to implement continuous application availability, using a primary and secondary system, in case of complete system failure in one or more of the following components:

The MUPIP utility program can enable or disable replication independently for each database region. Refer to the "MUPIP" chapter for more information. When replication is turned on for a database region, all updates to that region on a primary system replicate in near real-time on the database of a secondary system.

The following steps characterize database updates. The first two steps occur with or without replication:

  1. The journal file is written.

  2. The database is updated.

  3. The logical (M-level) journal file entry is delivered to a replication Source Server, which in turn delivers it to the secondary system.

Once the first step completes, the transaction is recoverable even if the primary system crashes.
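The ordering of the three steps can be pictured with a small toy model. The following Python sketch is illustrative only and is not GT.M code: the class `PrimarySketch`, its `commit` and `drain` methods, the in-memory `outbound` queue, and the `seqno` field are all hypothetical names introduced here. It assumes only what is stated above: the journal is written first, the database is updated second, and delivery toward the secondary happens afterwards without the commit waiting for it.

```python
from collections import deque


class PrimarySketch:
    """Toy model of the three update steps; not GT.M code."""

    def __init__(self):
        self.journal = []        # step 1: stands in for the journal file
        self.database = {}       # step 2: stands in for the database
        self.outbound = deque()  # step 3: records awaiting the Source Server

    def commit(self, key, value):
        record = {"seqno": len(self.journal) + 1, "key": key, "value": value}
        self.journal.append(record)   # 1. the journal is written first
        self.database[key] = value    # 2. the database is updated
        self.outbound.append(record)  # 3. queued for delivery; commit does not wait
        return record["seqno"]

    def drain(self, secondary):
        """Hypothetical stand-in for the Source Server shipping queued records."""
        while self.outbound:
            record = self.outbound.popleft()
            secondary[record["key"]] = record["value"]


primary = PrimarySketch()
secondary_db = {}
primary.commit("^acct(1)", 100)
primary.commit("^acct(2)", 250)
lag_before = len(primary.outbound)  # commits are done, secondary has seen nothing yet
primary.drain(secondary_db)
lag_after = len(primary.outbound)
```

In this model both commits return before anything reaches `secondary_db`, which mirrors the point that recoverability depends on step 1 rather than on delivery to the secondary.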

GT.M replicates the database by transporting the control records and M-level update journal records generated at the primary system to the secondary system and applying them there.

Since the secondary system may be very distant from the primary system, the GT.M database replication design allows the primary system to commit transactions before it receives acknowledgement from the secondary system. Therefore, transaction commits at the primary system and data transfers to the secondary system occur asynchronously. This asynchrony affects the design of applications used to exploit the benefits of logical dual-site operation. The resulting requirements are discussed later in this chapter.

If the secondary system or the communication link fails, the secondary system can lag behind the primary system until the two systems reestablish communication. GT.M then automatically causes the secondary system to catch up from the point of failure.
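As a rough illustration of catching up from the point of failure, the sketch below assumes that every replicated update carries a monotonically increasing sequence number and that the secondary remembers the last one it applied. Both the sequence numbers and the function `catch_up` are assumptions made for this example; the text above does not describe the actual mechanism.

```python
def catch_up(primary_journal, secondary_applied):
    """Return the journal records the secondary still needs.

    primary_journal: list of (seqno, key, value) tuples ordered by seqno.
    secondary_applied: highest seqno the secondary has already applied.
    """
    return [rec for rec in primary_journal if rec[0] > secondary_applied]


# The primary kept committing while the link was down.
journal = [(1, "a", 1), (2, "b", 2), (3, "a", 5), (4, "c", 9)]
secondary = {"a": 1, "b": 2}  # the link failed after seqno 2
last_applied = 2

# On reconnection, only the missing records are sent and applied in order.
for seqno, key, value in catch_up(journal, last_applied):
    secondary[key] = value
    last_applied = seqno
```

Once the loop finishes, a second call to `catch_up(journal, last_applied)` returns an empty list, meaning the secondary is current again.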

The M database is replicated via M-level journal records. The journal records are replicated as units related to a database transaction, i.e., within a transaction fence (TStart, TCommit, ZTStart, ZTCommit). Replication to the secondary system is asynchronous with the transaction on the primary system. This means that the primary system transaction will complete, creating appropriate journal records independent of the replication (movement) of the database updates to the secondary system.
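To make the idea of replicating in transaction units concrete, here is a minimal sketch that groups a flat stream of journal records into units. The `(operation, payload)` tuple layout and the function `group_by_fence` are hypothetical; the sketch assumes a well-formed, non-nested stream and uses only the TSTART/TCOMMIT fence for brevity.

```python
def group_by_fence(records):
    """Group a flat journal stream into replication units.

    Records between TSTART and TCOMMIT form one unit; an update outside
    any fence is treated as a unit of its own. Assumes fences are
    well-formed and not nested.
    """
    units, current = [], None
    for op, payload in records:
        if op == "TSTART":
            current = []               # open a new fenced unit
        elif op == "TCOMMIT":
            units.append(current)      # close the unit and emit it whole
            current = None
        elif current is not None:
            current.append(payload)    # update inside the fence
        else:
            units.append([payload])    # unfenced update stands alone
    return units


stream = [
    ("SET", "^x=1"),
    ("TSTART", None),
    ("SET", "^debit=50"),
    ("SET", "^credit=50"),
    ("TCOMMIT", None),
    ("KILL", "^tmp"),
]
units = group_by_fence(stream)
```

Here the two updates inside the fence travel together as one unit, so a receiver applying whole units never sees the debit without the matching credit.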

Of the three events that occur as a function of a database update transaction, completion of the first step ensures that the transaction is recoverable. The completion of the transaction is independent of the delivery of the journal records either to the primary system's replication Source Server or to the secondary system.

For optimum recovery, the replicated updates are moved to the secondary system at a rate as close to their creation rate as possible, since an update is protected against loss of the primary only once it has reached the secondary system. For optimum performance on the primary system, disk I/O related to the replication process should be zero (i.e., the replication Source Server process should be able to operate without reading from or writing to disk). To achieve this, the network connection and the software subsystem of the secondary must have adequate bandwidth for peak update rates on the primary.
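The bandwidth requirement can be made concrete with simple arithmetic. The sketch below tracks how many updates are waiting to be shipped when the update rate on the primary temporarily exceeds what the link and the secondary can absorb. The function `backlog_over_time` and all of the numbers are hypothetical and chosen only for illustration.

```python
def backlog_over_time(update_rates, link_rate):
    """Return the backlog of unshipped updates after each interval.

    update_rates: updates created on the primary in each interval.
    link_rate: updates the link and secondary can absorb per interval.
    """
    backlog, history = 0, []
    for rate in update_rates:
        # The backlog grows by the shortfall and can never go below zero.
        backlog = max(0, backlog + rate - link_rate)
        history.append(backlog)
    return history


# Two peak intervals at 1500 updates against a capacity of 1000 per interval.
history = backlog_over_time([500, 1500, 1500, 500, 500], link_rate=1000)
```

With these figures the backlog peaks at 1000 updates and takes two quieter intervals to clear. Until it clears, those updates exist only on the primary, which is why capacity is sized for peak rather than average update rates.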

Figure 7.1 illustrates the basic configuration and components of the GT.M recovery architecture.
Shaded areas denote additional system components in logical dual site operation of a GT.M application, not found in single site operation.

Figure 7.1. GT.M Recovery Architecture

Note

All GT.M processes accessing a replicated database must use the same Global Directory. M-extended references are not allowed.

Database replication is a general-purpose tool. Although the primary emphasis in this manual is logical dual-site operation, GT.M database replication can also be used to do the following:

  • Provide a real-time data feed from a database to another system where a different application runs. Restrictions on updating replicated regions on the secondary system do not apply if the secondary is not used as a backup for the primary.

  • Implement a logical, triple-site operation to provide for continuous application availability in scenarios where two sites can fail.