CS.Lectures: Commit Protocols- 2Phase & 3Phase (M3.1)

Commit protocols are used to ensure atomicity across sites: a transaction T that executed at several sites must either commit at all of them or abort at all of them. To ensure this property, the transaction coordinator of T (the component that manages distributed transactions) must execute a commit protocol.

Among the simplest and most widely used commit protocols is the two-phase
commit protocol (2PC).

An alternative is the three-phase commit protocol (3PC), which avoids certain disadvantages of 2PC but adds complexity and overhead.

Atomicity: An operation on the data is atomic if it is either executed completely or not executed at all. It must never stop partway through or take effect only partially.

Two-phase commit

The two-phase commit protocol has, as its name suggests, two phases. The first phase is called prepare. In this phase, a coordinator (either a separate node or the node that initiated the transaction) sends a request to all the participating nodes, asking whether they are able to commit the transaction. Each node responds yes if it can successfully commit the transaction, or no if it is unable to do so.

In the second phase, the coordinator decides, based on the votes, whether to send a commit or an abort request to the participating nodes. This phase is called the commit phase.

If all the participating nodes said yes, then a commit request is sent to all the nodes.
If any of the participating nodes said no, then an abort request is sent to all the nodes.
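
The two phases above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: there are no failures or timeouts, and each participant is modelled as a hypothetical function that returns True for yes and False for no. The name two_phase_commit is made up for this sketch.

```python
from typing import Callable, Dict

# Hypothetical model: a participant is a function that answers the
# prepare request with True ("yes") or False ("no").
Vote = Callable[[], bool]


def two_phase_commit(participants: Dict[str, Vote]) -> Dict[str, str]:
    """Return the request ("commit" or "abort") sent to each participant."""
    # Phase 1 (prepare): ask every participant whether it can commit.
    votes = {name: can_commit() for name, can_commit in participants.items()}
    # Phase 2 (commit): commit only if every vote is yes, otherwise abort.
    decision = "commit" if all(votes.values()) else "abort"
    return {name: decision for name in participants}
```

A single no vote is enough to make every node receive an abort request, which is what gives the protocol its all-or-nothing behaviour.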


Advantages

The two-phase commit protocol is a distributed algorithm that lets all sites in a distributed system agree on whether to commit a transaction.
The protocol results in either all nodes committing the transaction or all nodes aborting it, even in the case of site failures and message losses.

Disadvantages

The greatest disadvantage of the two-phase commit protocol is that it is a blocking protocol. If the coordinator fails permanently, some participants will never resolve their transactions: after a participant has sent an agreement message to the coordinator, it blocks until a commit or rollback is received.

Handling of Failures

The 2PC protocol responds in different ways to various types of failures:

Failure of a participating site

If the coordinator Ci detects that a site has failed, it takes one of these actions. If the site fails before responding with a ready T message to Ci, the coordinator assumes that it responded with an abort T message. If the site fails after the coordinator has received the ready T message from the site, the coordinator executes the rest of the commit protocol in the normal fashion, ignoring the failure of the site.

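When a failed participating site recovers, it must examine its log to determine the fate of the transactions that were in the midst of execution when the failure occurred. Let T be one such transaction. We consider each of the possible cases:

The log contains a <commit T> record. In this case, the site executes redo(T).
The log contains an <abort T> record. In this case, the site executes undo(T).
The log contains a <ready T> record. In this case, the site must consult Ci to determine the fate of T.
The log contains no control records (abort, commit, ready) concerning T. In this case, the site failed before it could respond to the coordinator's request, so Ci must have aborted T, and the site executes undo(T).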
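
The case analysis for a recovering site can be written as a small lookup over its log. The sketch below is illustrative only: the log is modelled as a list of (kind, transaction) pairs and recovery_action is a hypothetical name. The last branch assumes the usual treatment of the fourth case, in which a site with no control records for T never voted and can therefore safely undo T.

```python
from typing import List, Tuple

# Hypothetical log format: (record kind, transaction id), e.g. ("ready", "T").
LogRecord = Tuple[str, str]


def recovery_action(log: List[LogRecord], txn: str) -> str:
    """Decide what a recovering site does for transaction txn."""
    kinds = {kind for kind, t in log if t == txn}
    if "commit" in kinds:
        return "redo"             # <commit T> found: redo(T)
    if "abort" in kinds:
        return "undo"             # <abort T> found: undo(T)
    if "ready" in kinds:
        return "ask coordinator"  # only <ready T>: fate of T is unknown locally
    # No control records: assumed here to mean the site never voted,
    # so the coordinator cannot have committed T.
    return "undo"
```

The commit and abort checks come first because a site that logged <ready T> and later <commit T> has both records in its log.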

Failure of the coordinator

If the coordinator fails in the midst of the execution of the commit protocol for transaction T, then the participating sites must decide the fate of T.

We shall see that, in certain cases, the participating sites cannot decide whether to commit or abort T, and therefore these sites must wait for the recovery of the failed coordinator.

If an active site contains a <commit T> record in its log, then T must be committed.
If an active site contains an <abort T> record in its log, then T must be aborted.
If some active site does not contain a <ready T> record in its log, then the failed coordinator Ci cannot have decided to commit T, because a site that does not have a <ready T> record in its log cannot have sent a ready T message to Ci. However, the coordinator may have decided to abort T. Rather than wait for Ci to recover, it is preferable to abort T.
If none of the preceding cases holds, then all active sites must have a <ready T> record in their logs, but no additional control records (such as <abort T> or <commit T>). Since the coordinator has failed, it is impossible to determine whether a decision has been made, and if one has, what that decision is, until the coordinator recovers. Thus, the active sites must wait for Ci to recover. Since the fate of T remains in doubt, T may continue to hold system resources. For example, if locking is used, T may hold locks on data at active sites. Such a situation is undesirable, because it may be hours or days before Ci is again active. During this time, other transactions may be forced to wait for T. As a result, data items may be unavailable not only on the failed site (Ci), but on active sites as well. This situation is called the blocking problem, because T is blocked pending the recovery of site Ci.
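
The four cases that the active sites work through can be sketched as one function. This is a simplified model rather than a real recovery procedure: each active site's log is reduced to the set of control-record kinds it holds for T, and decide_without_coordinator is a hypothetical name.

```python
from typing import Dict, Set


def decide_without_coordinator(site_logs: Dict[str, Set[str]]) -> str:
    """Decide the fate of T from the logs of the active sites only.

    site_logs maps a site name to the control records it holds for T,
    drawn from {"ready", "commit", "abort"}.
    """
    all_records = set().union(*site_logs.values())
    if "commit" in all_records:
        return "commit"  # some active site has <commit T>
    if "abort" in all_records:
        return "abort"   # some active site has <abort T>
    if any("ready" not in records for records in site_logs.values()):
        return "abort"   # a site never voted, so no commit was decided
    return "wait"        # every site is ready: blocked until recovery
```

The "wait" result is the blocking problem: every active site has voted ready, and none of them can rule out either outcome.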

Network partition

When a network partitions, two possibilities exist:

1. The coordinator and all its participants remain in one partition. In this
case, the failure has no effect on the commit protocol.

2. The coordinator and its participants belong to several partitions. From
the viewpoint of the sites in one of the partitions, it appears that the
sites in other partitions have failed. Sites that are not in the partition
containing the coordinator simply execute the protocol to deal with
failure of the coordinator. The coordinator and the sites that are in the
same partition as the coordinator follow the usual commit protocol,
assuming that the sites in the other partitions have failed.

Thus, the major disadvantage of the 2PC protocol is that coordinator failure may result in blocking, where a decision either to commit or to abort T may have to be postponed until Ci recovers.

Two Phase Commit (2PC) is one of the failure recovery protocols commonly used in distributed database management systems. Its disadvantage is that it can block under certain circumstances. For example, assume that the coordinator of a particular transaction has failed after the participating sites have all sent a <READY T> message to it. The participating sites now have neither <ABORT T> nor <COMMIT T>. At this stage, no site can take a final decision on its own. The only solution is to wait for the coordinator site to recover. Hence, 2PC is a blocking protocol.

Three Phase Commit (3PC) Protocol

3PC is a protocol that eliminates this blocking problem, provided certain basic requirements are met:

No network partitioning

At least one site must be available

At most K simultaneous site failures are accepted

2PC has two phases, namely the voting phase and the decision phase. 3PC introduces a pre-commit phase (which serves as a buffer phase) as the third phase.

3PC works as follows:

Phase 1 (WAIT/VOTING):

The Transaction Coordinator (TC) of the transaction writes a BEGIN_COMMIT message in its log file, sends a PREPARE message to all the participating sites, and waits.

Upon receiving this message, if a site is ready to commit, then the site’s transaction manager (TM) writes READY in its log and sends VOTE_COMMIT to the TC.

If any site is not ready to commit, it writes ABORT in its log and responds with VOTE_ABORT to the TC.

Phase 2 (PRE-COMMIT):

If the TC receives VOTE_COMMIT from all the participating sites, then it writes PREPARE_TO_COMMIT in its log and sends a PREPARE_TO_COMMIT message to all the participating sites.

On the other hand, if the TC receives even one VOTE_ABORT message, it writes ABORT in its log, sends GLOBAL_ABORT to all the participating sites, and writes an END_OF_TRANSACTION message in its log.

On receiving the PREPARE_TO_COMMIT message, the TMs of the participating sites write PREPARE_TO_COMMIT in their logs and respond with a READY_TO_COMMIT message to the TC.

If they receive a GLOBAL_ABORT message instead, the TMs of the sites write ABORT in their logs, abort that particular transaction locally, and acknowledge the abort.

Phase 3 (COMMIT/DECIDING):

If all responses are READY_TO_COMMIT, then the TC writes COMMIT in its log and sends a GLOBAL_COMMIT message to the TMs of all the participating sites. The TM of each site then writes COMMIT in its log and sends an acknowledgement to the TC. Finally, the TC writes END_OF_TRANSACTION in its log.
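
The coordinator's side of the three phases can be traced with a short sketch. It assumes a failure-free run in which every site that voted to commit also answers READY_TO_COMMIT and acknowledges the final decision. The function name three_phase_commit and the representation of votes as a dictionary of booleans are made up for illustration; the log and message names are the ones used above.

```python
from typing import Dict, List, Tuple


def three_phase_commit(votes: Dict[str, bool]) -> Tuple[List[str], List[str]]:
    """Return (TC log records, messages the TC broadcasts), in order.

    votes maps a site name to True (VOTE_COMMIT) or False (VOTE_ABORT).
    """
    # Phase 1 (voting): log the start and ask every site to prepare.
    log = ["BEGIN_COMMIT"]
    sent = ["PREPARE"]
    if not all(votes.values()):
        # At least one VOTE_ABORT: abort globally and close the transaction.
        log += ["ABORT", "END_OF_TRANSACTION"]
        sent.append("GLOBAL_ABORT")
        return log, sent
    # Phase 2 (pre-commit): announce the intention to commit.
    log.append("PREPARE_TO_COMMIT")
    sent.append("PREPARE_TO_COMMIT")
    # Phase 3 (commit): assumes every site replied READY_TO_COMMIT.
    log.append("COMMIT")
    sent.append("GLOBAL_COMMIT")
    # Assumes every site acknowledged GLOBAL_COMMIT.
    log.append("END_OF_TRANSACTION")
    return log, sent
```

Timeouts and recovery, which are what the extra pre-commit phase is for, are deliberately left out of this sketch.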

Advantages of 3PC Protocol

The blocking problem found in 2PC can be avoided in certain situations, in particular when no more than K sites fail.

Disadvantages of 3PC Protocol

Network partitions (network segments) can still cause the blocking problem, especially if more than K sites end up in other partitions.
Latency is long because of the number of messages that must be exchanged between sites before a decision is reached. The protocol has three phases, and all three involve communication between sites.
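
As a rough illustration of the message overhead, the sketch below counts only the message types named in this post on the commit path, with one message per participating site for each type. The tuples and the message_count function are hypothetical, and a 2PC implementation that also acknowledges the final decision would add one more round to its total.

```python
# Message types on the commit path, as named in the descriptions above.
MESSAGES_2PC = ("prepare request", "yes/no vote", "commit request")
MESSAGES_3PC = (
    "PREPARE",
    "VOTE_COMMIT",
    "PREPARE_TO_COMMIT",
    "READY_TO_COMMIT",
    "GLOBAL_COMMIT",
    "acknowledgement",
)


def message_count(message_types: tuple, n_sites: int) -> int:
    """Total messages if each type is exchanged once per participating site."""
    return len(message_types) * n_sites


# With 4 participating sites, under the assumptions stated above:
count_2pc = message_count(MESSAGES_2PC, 4)  # 3 types * 4 sites
count_3pc = message_count(MESSAGES_3PC, 4)  # 6 types * 4 sites
```

Under these assumptions 3PC exchanges twice as many messages as 2PC for the same number of sites.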

Sunday, July 4, 2021
