Concurrency Control Protocols in DBMS

Introduction

Concurrency Control in management System is a procedure of managing simultaneous operations without conflicting with one another. It ensures that Database transactions are performed concurrently and accurately to supply correct results without violating data integrity of the respective Database. Concurrent access is sort of easy if all users are just reading data. There is no way they will interfere with each other. Though for any practical Database, it might have a mixture of READ and WRITE operations and hence the concurrency may be a challenge. DBMS Concurrency Control is employed to deal with such conflicts, which mostly occur with a multi-user system. Therefore, Concurrency Control is that the most vital element for correct functioning of a management System where two or more database transactions are executed simultaneously, which require access to the same data.


Concurrent Execution in DBMS

  • In a multi-user system, multiple users can access and use an equivalent database at just one occasion , which is understood because the multiprogramming of the database. It means that the same database is executed simultaneously on a multi-user system by different users.
  • While performing on the database transactions, there occurs the need of using the database by multiple users for performing different operations, and therein case, multiprogramming of the database is performed.
  • The thing is that the simultaneous execution that's performed should be wiped out an interleaved manner, and no operation should affect the opposite executing operations, thus maintaining the consistency of the database. Thus, on making the multiprogramming of the transaction operations, there occur several challenging problems that require to be solved.


Potential problems of Concurrency

Here, are some issues which you'll likely to face while using the DBMS Concurrency Control method:

  • Lost Updates occur when multiple transactions select an equivalent row and update the row supported the worth selected
  • Uncommitted dependency issues occur when the second transaction selects a row which is updated by another transaction as dirty read
  • Non-Repeatable Read occurs when a second transaction is trying to access an equivalent row several times and reads different data whenever.
  • Incorrect Summary issue occurs when one transaction takes summary over the worth of all the instances of a repeated data-item, and second transaction update few instances of that specific data-item. In that situation, the resulting summary doesn't reflect an accurate result.

Why use Concurrency method ?

Reasons for using Concurrency control method is DBMS:

  • To apply isolation through mutual exclusion method between conflicting database transactions
  • To resolve read-write also as write-write conflict issues
  • To preserve the database consistency through the constantly preserving execution obstructions
  • The system needs to control the interaction between the concurrent transactions. This control is achieved using concurrent-control schemes.
  • Concurrency control helps to make sure serializability

Example
Assume that two people that attend electronic kiosks at an equivalent time to shop for a movie ticket for an equivalent movie and therefore the same show time. However, there's just one seat left certain the movies therein particular theatre. Without concurrency control in DBMS, it's possible that both moviegoers will find yourself purchasing a ticket. However, concurrency control method doesn't allow this to happen. Both viewers can still access information written within the movie seating database. But concurrency control only provides a ticket to the moviegoer who has completed the transaction process first.


DBMS Concurrency Control Protocols

Different concurrency control protocols offer different benefits between the quantity of concurrency they permit and therefore the amount of overhead that they impose. Following are the Concurrency Control techniques in DBMS:

  • Lock-Based Protocols
  • Two Phase Locking Protocol
  • Timestamp-Based Protocols
  • Validation-Based Protocols

1. Lock-based Protocols

Lock Based Protocols in DBMS is a system wherein an exchange can't Read or Write the information until it gets a proper lock. Lock based conventions help to kill the simultaneousness issue in DBMS for concurrent exchanges by locking or detaching a distinct exchange to a one client. A lock might be an information variable which is related with an information thing. This lock implies that activities which will be performed on the information thing. Secures DBMS help synchronize admittance to the data set things by simultaneous exchanges. All lock demands are made to the simultaneousness control chief. Exchanges continue just once the lock demand is conceded.

Binary Locks: A Binary lock on an item can either locked or unlocked states.

Shared/exclusive: This sort of locking mechanism isolates the secures DBMS dependent on their uses. If a lock is acquired on an item to perform a write operation, its called an exclusive lock.

1. Shared Lock (S):

A shared lock is additionally called a Read-only lock. With the common lock, the information thing can be divided among exchanges. This is on the grounds that you'll never have consent to refresh information on a thing. For instance, consider a situation where two exchanges are perusing the record equilibrium of an individual. The information base will allow them to peruse by setting a common lock. However, if another transaction wants to update that account's balance, shared lock prevents it until the reading process is over.

2. Exclusive Lock (X):

With the Exclusive Lock, an information thing can be perused just as composed. This is selective and can't be held simultaneously on a similar information thing. X-lock is mentioned utilizing lock-x guidance. Exchanges may open the information thing in the wake of completing the 'compose' activity. For instance, when an exchange needs to refresh the record equilibrium of an individual. You can permit this exchange by putting X lock on it. Subsequently, when the subsequent exchange needs to peruse or compose, selective lock forestalls this activity.

3. Simplistic Lock Protocol

This kind of lock-based conventions permits exchanges to get a lock on each item prior to starting activity. Transactions may unlock the item after finishing the 'write' operation.

4. Pre-claiming Locking

Pre-claiming lock convention assists with assessing tasks and make a rundown of required information things which are expected to start an execution cycle. In the circumstance when all locks are without a doubt, the exchange executes. From that point forward, all locks discharge when the entirety of its tasks are finished.

Fig. Pre-claiming locking protocol

Reference: https://www.javatpoint.com/dbms-lock-based-protocol

Starvation

Starvation is the situation when a transaction needs to wait for an indefinite period to acquire a lock.

Following are the reasons for Starvation:

  • When waiting scheme for locked items is not properly managed
  • In the case of resource leak
  • The same transaction is chosen as a victim repeatedly
Deadlock

A deadlock is a condition that occurs when two or more different database tasks are waiting for each other and none of the task is willing to give up the resources that other task needs. It is an undesirable circumstance that may result when at least two exchanges are each hanging tight for locks held by the other to be delivered. In stop circumstance, no undertaking at any point gets completed and is in holding up state until the end of time.


Reference: https://www.tutorialride.com/dbms/deadlock-in-databases.htm

In the above diagram, Process P1 holds Resource R2 and waits for resource R1, while Process P2 holds resource R1 and waits for Resource R2. So, the above process is in deadlock state. There is the only way to break a deadlock, is to abort one or more transactions. Once, a transaction is aborted and rolled back, all the locks held by that transaction are released and can continue their execution. So, the DBMS should automatically restart the aborted transactions.

Deadlock Conditions

  1. Mutual Exclusion: In Mutual prohibition expresses that in any event one asset can't be utilized by more than each interaction in turn. The assets can't be divided among measures. 
  2. Hold and Wait: Hold and Wait expresses that a cycle is holding an asset, mentioning for extra assets which are being held by different cycles in the framework.
  3. No Preemption: No Preemption expresses that an asset can't be coercively taken from a cycle. Just an interaction can deliver an asset that is being held by it.
  4. Circular Wait: Circular Wait expresses that one cycle is sitting tight for an asset which is being held by second interaction and the subsequent interaction is hanging tight for the third interaction, etc. and the last interaction is hanging tight for the main cycle. It makes a circular chain of pausing.

Deadlock Prevention

  1. No Mutual Exclusion: No Mutual Exclusion implies eliminating every one of the assets that are sharable. 
  2. No Hold and Wait: Removing hold and wait condition should be possible if a cycle secures every one of the assets that are required prior to beginning
  3. Allow Preemption: Allowing preemption is just about as great as eliminating common prohibition. The lone need is to reestablish the condition of the asset for the acquired cycle as opposed to giving it access simultaneously as the preemptor.
  4. Removing Circular Wait: The circular wait can be taken out just if the assets are kept up in a chain of command and interaction can hold the assets in expanding the request for priority. 

Deadlock Avoidance

  • Deadlock Avoidance helps in avoiding the rolling back clashing transactions.
  • It is not acceptable way to abort a transaction when a deadlock occurs.
  • Rather deadlock avoidance should be utilized to detect any deadlock situation in advance.


2. Two Phase Locking Protocol

Two Phase Locking Protocol also known as 2PL protocol is a strategy for simultaneousness control in DBMS that guarantees serializability by applying a lock to the exchange information which blocks different exchanges to get to a similar information at the same time. Two Phase Locking convention assists with wiping out the simultaneousness issue in DBMS. 

This locking mechanism separates the execution phase of a transaction into three different parts.

  • In the first phase, when the transaction starts to execute, it requires authorization for the locks it needs.
  • The second part is where the exchange acquires every one of the locks. At the point when an exchange delivers its first lock, the third stage begins. 
  • In this third phase, the exchange can't request any new locks. All things considered, it just deliveries the gained locks.

Fig. Two phase locking protocol

Reference: https://www.guru99.com/dbms-concurrency-control.html

Two-Phase Locking convention allows each transaction to make a lock or unlock request in two stages:

  • Growing Phase: In this phase transaction may acquire locks but may not delivery any locks. 
  • Shrinking Phase: In this phase, an exchange may deliver locks yet not get any new lock 
The facts confirm that the 2PL convention offers serializability. In any case, it doesn't guarantee that stops don't occur. In the above-given chart, you can see that neighborhood and worldwide stop finders are looking for halts and tackle them with continuing exchanges to their underlying states.

Strict Two-Phase Locking Method

Strict-Two phase locking system is almost similar to 2PL. The only difference is that Strict-2PL never delivers a lock subsequent to utilizing it. It holds every one of the locks until the submit point and deliveries every one of the locks at one go when the cycle is finished.

Fig. Strict two phase locking protocol

Reference: https://www.javatpoint.com/dbms-lock-based-protocol

Centralized 2PL

In Centralized 2 PL, a single site is liable for lock the executives interaction. It has just one lock director for the whole DBMS.

Primary copy 2PL

Primary copy 2PL mechanism, many lock managers are conveyed to various destinations. From that point forward, a specific lock chief is answerable for dealing with the lock for a bunch of information things. At the point when the essential duplicate has been refreshed, the change is engendered to the slaves.

Distributed 2PL

In this kind of two-phase locking mechanism, Lock supervisors are dispersed to all locales. They are liable for overseeing locks for information at that site. In the event that no information is imitated, it is comparable to essential duplicate 2PL. Correspondence expenses of Distributed 2PL are very higher than essential duplicate 2PL.


3. Timestamp-based Protocols

Timestamp based Protocol in DBMS is an algorithm which uses the System Time or Logical Counter as a timestamp to serialize the execution of simultaneous exchanges. The Timestamp-based convention guarantees that each clashing peruse and compose activities are executed in a timestamp request. The more established exchange is constantly given need in this strategy. It utilizes framework time to decide the time stamp of the exchange. This is the most usually utilized simultaneousness convention. Lock-based conventions assist you with dealing with the request between the clashing exchanges when they will execute. Timestamp-based conventions oversee clashes when an activity is made.

Example:

Suppose there are their transactions T1, T2, and T3.

T1 has entered the system at time 0010

T2 has entered the system at 0020

T3 has entered the system at 0030

Transaction T1 will be execute first, then transaction T2 and lastly Transaction T3.

Advantages:

  • Schedules are serializable very much like 2PL protocols
  • No waiting for the transaction, which removes the possibility of deadlocks!
Disadvantages:
  • Starvation is possible if the same transaction is restarted and continually aborted


4. Validation-based Protocols

Validation based Protocol in DBMS also known as Optimistic Concurrency Control Technique is a strategy to stay away from simultaneousness in exchanges. In this convention, the neighborhood duplicates of the exchange information are refreshed as opposed to the actual information, which brings about less impedance while execution of the exchange. 

The Validation based Protocol is executed in the following three phases:

  • Read Phase
  • Validation Phase
  • Write Phase
Read Phase
In the Read Phase, the data values from the database can be read by a transaction but the write operation or updates are only applied to the local data copies, not the actual database.

Validation Phase
In Validation Phase, the data is checked to ensure that there is no violation of serializability while applying the transaction updates to the database.

Write Phase
In the Write Phase, the updates are applied to the database if the validation is successful, else; the updates are not applied, and the transaction is rolled back.

Here each phase has the following different timestamps:

Start(Ti): It contains the time when Ti began its execution.

Validation(Ti): It contains the time when Ti completes its read phase and starts its validation phase.

Finish(Ti): It contains the time when Ti completes its write phase.

  • This protocol is used to determine the time stamp for the exchange for serialization utilizing the time stamp of the approval stage, as it is the real stage which decides whether the exchange will submit or rollback.
  • Hence TS(T) = validation(T).
  • The serializability is resolved during the validation process. It can't be decided in advance.
  • While executing the transaction, it guarantees a greater degree of concurrency and furthermore less number of conflicts.
  • In this way it contains transactions which have less number of rollbacks.


Concurrency Control Problems

There are multiple problems that can arise in concurrent transaction scenarios. Some of the common problems are:

1. Dirty Read

Dirty read or temporary update problems occur while there is an incomplete transaction. In this situation, the data element or item got updated by one transaction and filed before completing it. Also another transaction attempts to access the data element before it is modified or rolled back to its last value.

Transaction T1Transaction T2
Read(X)

X=X-n

Write(X)

operation Failed

Read(X)

# X Value X-n

X=X+n1

Write(X)

Reference: https://www.educba.com/concurrency-control-in-dbms/?source=leftnav

Explanation: As shown in the table the transaction T1 reading a data item X as Read(X) operation and performs some arithmetic equation on the Value X  using a numeric value n with Write (X-n) operation. While the write operation in action, it got interrupted and not yet reverted to the databases, The Transaction T2 attempts to read the Value X through Read(X) operation that represents the value as X-n. This results in a dirty read problem.


2. Unrepeatable Read

Unrepeatable Read is the situation where at least two read operations read the same variable as different values and that value is modified by a alternate transaction by writing operations.

Transaction T1Transaction T2
Read(X)

X=X-n

Write(X)

Read(X)

Read(X)

Reference: https://www.educba.com/concurrency-control-in-dbms/?source=leftnav

Explanation: The table defines two transactions T1 and T2 where T1 reading the X variable and performs an arithmetic equation as X-n with numeric value n, At the Same time T2 reads the value X and captures the initial value of X. Next T1 executes a Write(X) operation and modified the value of X in the database. Thereafter T2 reads the X values again and this time it finds a different value of X due to the T1. This outcomes in an unrepeatable read problem.


3. Phantom Read

Phantom read problem refers to the situation where the Transaction reads a variable once and when it attempts to read the variable again it gets an error showing the variable doesn't exist, as the variable is deleted by another transaction.

Transaction T1Transaction T2
Read(X)

Delete(X)

Read(X)

Read(X)

Reference: https://www.educba.com/concurrency-control-in-dbms/?source=leftnav

Explanation: The Table shows T1 reads Variable X, simultaneously T2 reads X. The T1 Deletes X with Delete(X) operation, without T2 acknowledgment. While, T2 tried to read the variable X again, it not able to find the variable. This results in the phantom read problem.


4. Lost updates

Lost updates are the concurrency problem situation where modification to the variable done by a transaction is lost due to write operation by another transaction.

Transaction T1Transaction T2
Read(X)

X=X+n

X=X+n1

Write(X)

Reference: https://www.educba.com/concurrency-control-in-dbms/?source=leftnav

Explanation: The Table shows the T1 reads the variable X and changes the values by adding a arithmetical value n in the operation X=X+n statement. However, T2 performs X=X+n1 statement that overwrites the T1 arithmetic equation. This outcomes in a lost update problem for the T1 transaction.


5. Incorrect Summary

An incorrect Summary problem in concurrency control situation appears while a transaction applies an aggregate function to some of the variables while another transaction tries to modify the variable.

Transaction T1Transaction T2
Read(X)

Sum=0

Sum=Sum+X

Read(X)

X=X+n

Write(X)

Reference: https://www.educba.com/concurrency-control-in-dbms/?source=leftnav

Explanation: The tables show Transaction T1 reads the variable X and utilizations the Value of X to produce the aggregate value of Sum=Sum+X, whereas T2 reading the value of X, Modifies it by X=X+n statement and writes it to the database using Write(X) operation. It outcomes in an incorrect summary problem in T1.


Characteristics of Good Concurrency Protocol

Concurrency Control DBMS mechanism has the following objectives:

  • Must be versatile to site and communication failures.
  • It allows the equal execution of transactions to achieve maximum concurrency.
  • Its storage components and computational strategies should be modest to minimize overhead.
  • It must implement some constraints on the structure of atomic actions of transactions.

Conclusion

  • Concurrency Control in DBMS is a very useful technique to maintain mutually exclusive transactions for database operations. It handles the requests and streamlines the operations where more systems or processes trying to get the same database resource. It manages data integrity across systems and avoids the occurrence of transaction conflicts.
  • Lost Updates, dirty read, Non-Repeatable Read, and Incorrect Summary Issue are problems occurred due to lack of concurrency control.
  • Lock-Based, Two-Phase, Timestamp-Based, Validation-Based are various types of Concurrency handling protocols in DBMS
  • The lock could be either Shared (S) or Exclusive (X)
  • Two-Phase locking protocol which is also called as a 2PL protocol requires transaction should obtain a lock after it releases one of its locks. It has 2 stages growing and shrinking.
  • The timestamp-based algorithm utilizes a timestamp to serialize the execution of concurrent transactions. The protocol utilizes the System Time or Logical Count as a Timestamp.

Comments

Post a Comment