Volume 1, Issue 1 ISSN: 2320-5288
International Journal of Engineering Technology & Management Research
Journal homepage: www.ijetmr.org

Concurrency Problems in Databases

Preeti Sharma, Parul Tiwari, Paritosh Deore, Richa Gupta

Abstract
The need for database maintenance is growing rapidly. Databases are required at every level of an organization, and their maintenance and usability are of central importance: the data stored in a database is the core of the organization and its processing. Database models are used to preserve data integrity while allowing users to make changes. Concurrency is the sharing of resources by multiple users, and in multi-user applications that have a database on the back end, concurrency issues are common. Detecting and handling these issues is not only important but also quite complicated. A database can have multiple transactions running at the same time, known as concurrent transactions or processes.

Keywords: database, serializable, transactions.

1. INTRODUCTION
Databases all over the world serve multi-user applications, which allow any user to create or update data. It is therefore only a matter of time before two separate users or processes try to update the same piece of data simultaneously: there will be occasions when two users both read the same data into memory and then write back conflicting changes. Concurrency issues can occur between human users, between automated processes, or in a combination of the two. They are more likely with human users, because a human read/update/write cycle takes much longer. That said, the same issues can occur between automated processes, and they are then harder to resolve, because in the case of an update by a human you can at least ask what the user intended.
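The read/update/write cycle described above can be made concrete with a minimal, self-contained Python simulation. The in-memory "database" here is just a dict standing in for a real store; the names and values are purely illustrative.

```python
# Minimal simulation of the read/update/write cycle described above.
# Two "users" each read the same record into memory, modify their own
# copy, and write it back; the second write silently discards the first.
# (Hypothetical in-memory "database" for illustration only.)

database = {"quantity": 100}

# User A and user B both read the current value into memory.
a_copy = database["quantity"]   # A reads 100
b_copy = database["quantity"]   # B reads 100

# Each updates a private copy, unaware of the other.
a_copy -= 10                    # A sells 10 units
b_copy -= 25                    # B sells 25 units

# Both write back. B's write overwrites A's: A's update is lost.
database["quantity"] = a_copy   # database now holds 90
database["quantity"] = b_copy   # database now holds 75, not 65

print(database["quantity"])     # 75: the combined effect (65) was lost
```

The final value records only B's change; the 10 units A sold have vanished, which is exactly the silent overwrite the following sections analyse.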
As multiple users access the same data, there is always the possibility that one user's changes to a specific piece of data will be unwittingly overwritten by another user's changes. When this happens, the accuracy of the information in the database is compromised, which can render the data useless or, even worse, misleading. At the same time, the techniques used to prevent this type of loss can dramatically reduce the performance of an application system, as users wait for other users to complete their work before continuing. This kind of performance problem cannot be solved by increasing the resources available to an application, because it is caused by contention for a piece of data, not by any lack of horsepower in the system that's handling the data.

International Journal of Engineering Technology & Management Research Vol 1 Issue 1 February 2013 166

Fig. 1 Concurrency Problems

2. TYPES OF CONCURRENCY PROBLEM
There are four categories of concurrency problems:

A. Lost Update Problem
Consider two users about to change the same document in some data store. Suppose user A retrieves a row first, and user B then retrieves the same row; B, however, writes his update immediately, and in particular before A writes. The changes made by user B are then silently overwritten by the update performed by user A. This is known as the lost update problem. In general, lost updates occur when two or more transactions select the same row and then update it based on the value originally selected. Each transaction is unaware of the others, and the last update overwrites the updates made by the earlier transactions, which results in lost data. For example, two editors make an electronic copy of the same document. Each editor changes the copy independently and then saves the changed copy, thereby overwriting the original document. The editor who saves last overwrites the changes made by the first editor. The problem could be avoided if the second editor could not make changes until the first editor had finished; such problems can also be avoided by writing the queries differently.

B. Dirty (Uncommitted) Read Problem
The dirty read problem, also called the uncommitted dependency problem, is caused when a transaction (say A) retrieves, or even worse updates, a row updated by another uncommitted transaction (say B); two cases therefore arise, the dirty read proper and the dirty update. Quite often in database processing, we come across a situation where one transaction can change a value, and another transaction can read this value before the original change has been committed or rolled back.
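This change-then-read interleaving can be simulated in a few lines of plain Python. The small Transaction class below is a hypothetical stand-in for a real DBMS, used only to make the timing of the uncommitted write explicit; it is not any real database API.

```python
# Toy simulation of a dirty read: transaction B reads a value that
# transaction A has changed but not yet committed; A then rolls back,
# leaving B holding a value that never officially existed.
# (The class and function names here are illustrative, not a real DBMS.)

committed = {"balance": 500}

class Transaction:
    """Holds uncommitted writes until commit() or rollback()."""
    def __init__(self, store):
        self.store = store
        self.uncommitted = {}

    def write(self, key, value):
        self.uncommitted[key] = value     # visible only as a dirty value

    def rollback(self):
        self.uncommitted.clear()          # discard pending changes

    def commit(self):
        self.store.update(self.uncommitted)
        self.uncommitted.clear()

def dirty_read(store, other_txn, key):
    # Returns the other transaction's uncommitted write if one exists.
    return other_txn.uncommitted.get(key, store[key])

a = Transaction(committed)
a.write("balance", 900)                       # A updates, no commit yet
seen_by_b = dirty_read(committed, a, "balance")  # B reads 900 (dirty)
a.rollback()                                  # A aborts: 900 never existed

print(seen_by_b, committed["balance"])        # 900 500
```

Transaction B is left acting on a balance of 900 even though the committed value was, and remains, 500.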
This is known as a dirty read scenario because there is always the possibility that the first transaction will roll back its change, leaving the second transaction having read an invalid value. While you can easily instruct a database to disallow dirty reads, doing so usually degrades the performance of your application because of the increased locking overhead; disallowing dirty reads also decreases system concurrency.

C. Non-Repeatable Read Problem
The non-repeatable read problem, also known as the inconsistent analysis problem, occurs when a transaction sees two different values for the same row within its lifetime. For example, assume transaction A retrieves a row from a table, after which transaction B updates the same row and commits its change. Immediately afterwards, transaction A retrieves the same row from the same table again. As transaction B has updated the row in the meantime, transaction A now sees a value different from the one it retrieved earlier. We can see that transaction A cannot repeat its read operation (actually, the
results of its read operation do not repeat). Hence the name non-repeatable read.

Fig. 2 Dirty Read and Non-Repeatable Read. A dirty read occurs when one transaction updates a database item and then fails; the problem, also known as the uncommitted data problem, is a consequence of reading updates made by a transaction before it has successfully finished. A non-repeatable read typically occurs when a transaction computes some aggregate function over a set; the problem, also known as inconsistent analysis, is a computational anomaly associated with the interleaved execution of transactions.

D. Phantom Read Problem
The phantom read problem, also known as the phantom insert problem, is a special case of the non-repeatable read problem; it occurs when one transaction performs inserts while another transaction is still querying. A phantom read occurs when, in the course of a transaction, two identical queries are executed and the collection of rows returned by the second query differs from that returned by the first. This can happen when range locks are not acquired. In the case of Fig. 3, the sequence of events is as follows: at time t1, transaction A reads some rows of a table; at time t2, transaction B inserts a new row into the table and performs a COMMIT operation; when A repeats its query, the new "phantom" row appears.

Fig. 3 Phantom Read Problem

Recall that a typical update cycle consists of a sequence of actions: read data into memory, update the data in memory, and write the data back to the database. While a high degree of isolation between transactions is generally desirable, running many applications in serializable mode can seriously compromise application throughput. Complete isolation of concurrently running transactions could mean that one transaction cannot perform an insertion into a table being queried by another transaction. In short, real-world considerations usually require a compromise between perfect transaction isolation and performance.

E. Serializable Isolation
Serializable isolation is intended for environments with large databases and short transactions that update only a small number of tuples, where the chance that two concurrent transactions will modify the same rows is low, and where relatively long-running transactions are primarily read-only. Serializable isolation allows a concurrent transaction to make only those database modifications it could have made if the transactions had been scheduled to run one after another. When a serializable transaction fails with a "cannot serialize access" error, the application can take any of several actions: commit the work executed to that point; execute additional (but different) statements, perhaps after rolling back to a savepoint established earlier in the transaction; or roll back the entire transaction.

3. DATABASE TRANSACTION: ACID RULE
A database transaction is a unit of interaction with a database management system, or a similar system, that is handled in a coherent and reliable way, independent of other transactions. In general, a database
transaction must be atomic: it must either complete entirely or be aborted. Ideally, a database system guarantees the properties of Atomicity, Consistency, Isolation and Durability (ACID) for every transaction; in practice, these properties are often relaxed to provide better performance. In database products, the ability to handle transactions allows the user to ensure that the integrity of the database is maintained. A transaction may involve several queries, each reading or writing information in the database, and it is usually important that the database not be left with only some of the queries' outcomes applied. For instance, when doing a money transfer, if money is debited from one account, it is important that it also be credited to the receiving account. Transactions should also not interfere with each other; the ACID properties above summarize these requirements.

4. CONTROL MECHANISM
A database system optimized for in-memory storage can support much higher transaction rates than conventional systems; however, standard concurrency control methods cannot scale to the transaction rates such systems make achievable. Two efficient concurrency control methods have been designed specifically for main-memory databases. Both use multiversioning to separate read-only transactions from updates, but they differ in how atomicity is ensured: one is optimistic and one is pessimistic. To avoid expensive context switching, transactions never block during normal processing, though they may have to wait before commit to ensure the agreed serialization order. A main-memory-optimized version of single-version locking has also been implemented for comparison. Experimental results show that while single-version locking works well when transactions are short and contention is low, its performance degrades under more demanding conditions. The multiversion schemes have higher overhead but are much less sensitive to hotspots and to the presence of long-running transactions.
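The optimistic, version-based flavour of concurrency control mentioned above can be sketched in plain Python. This is only a minimal illustration under simplifying assumptions: a dict stands in for a main-memory store, and every name here (store, read, try_update) is hypothetical, not any engine's real API.

```python
# Hedged sketch of optimistic concurrency control with version numbers.
# Each row carries a version; an update commits only if the version is
# unchanged since the transaction read it, otherwise the caller retries.
# (Illustrative only; real multiversion engines are far more involved.)

store = {"row1": {"value": 100, "version": 0}}

def read(key):
    row = store[key]
    return row["value"], row["version"]

def try_update(key, new_value, expected_version):
    """Commit new_value only if no other commit happened in between."""
    row = store[key]
    if row["version"] != expected_version:
        return False                # conflict: validation failed
    row["value"] = new_value
    row["version"] += 1             # publish a new version
    return True

# Transactions A and B both read version 0 of the same row.
a_val, a_ver = read("row1")
b_val, b_ver = read("row1")

assert try_update("row1", a_val + 10, a_ver)      # A commits first
assert not try_update("row1", b_val - 5, b_ver)   # B's validation fails

# B retries against the fresh version instead of silently losing data.
b_val, b_ver = read("row1")
assert try_update("row1", b_val - 5, b_ver)

print(store["row1"]["value"])   # 105: both updates applied exactly once
```

Compare this with the lost update example in the introduction: there, B's blind write destroyed A's change, while here B's stale write is rejected and retried, so both updates take effect.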
The purpose of concurrency control is to prevent two different users (or two different connections by the same user) from trying to update the same data at the same time. Concurrency control can also prevent one user from seeing out-of-date data while another user is updating it.

5. CONCLUSION
Concurrency control that overlaps the commit period of an earlier transaction with the execution of later transactions works well provided that the workload is made up of single-partition transactions, the abort rate is low, and only a few transactions involve multiple rounds of communication. Concurrency control is a problem that arises whenever multiple processes are involved in any part of a system. The classical notions of serializability and the study of two-phase locking were discussed above, and we continue to learn of new ideas such as flexible transactions, value dates, prewrites, degrees of commitment and view serializability [9]. In large-scale systems that must perform on the order of 10,000 transactions per second, it is difficult to block access to database objects for the duration of a transaction.
REFERENCES
[1] Atul Kahate, Introduction to Database Management Systems, 2012.
[2] Raghu Ramakrishnan, Database Management Systems, 3rd rev. ed., Tata McGraw-Hill Education, 2009.
[3] Seema Kedar, Database Management System, Technical Publications.