The Java Memory Model


Presented by: Aaron Tomb, April 10

1 Introduction

1.1 Memory Models

As multithreaded programming and multiprocessor systems gain popularity, it is becoming crucial to define a set of rules that uniformly describes data sharing between multiple threads (and, by extension, between multiple processors). Put simply, the set of rules that defines how memory accesses from various threads interact is known as a memory model. A memory model should specify:

- How the reads and writes in the program will be executed, so that the programmer can correctly predict what value each read operation will return.
- The optimizations or transformations that can safely be applied at each level of the system (compiler, virtual machine and hardware).
- The possible outcomes for various situations in programs, which help the programmer identify which designs are legal for communicating between threads.

1.2 The early Java Memory Model

Java was touted as a programming language with built-in support for multithreading (unlike C and C++, which use external libraries to implement it), and it was defined with a memory model. However, this early memory model was extremely cryptic and misunderstood by most. This resulted in violations of its rules, due to which the integrity of data could not be guaranteed, and common compiler optimizations involving reordering of instructions had to be prohibited.

The new Java Memory Model, incorporated in Java 5.0, provides a simple interface for correctly synchronized programs. It clearly defines a boundary for incorrectly synchronized code based on causality, which ensures safety and integrity of data and at the same time leaves flexibility for legal compiler optimizations.
2 Evolution of the memory model

The easiest and most intuitive way of deciding how instructions are to be executed is to specify that memory actions must appear to execute one at a time in a single total order, and that actions of the same thread must appear in the same order in which they appear in the program. This is known as a sequentially consistent model, and it is illustrated in Figure 1.

Figure 1: (a) a sample code sequence; (b) the possible orders in which its lines can be executed while maintaining sequential consistency.

This model is consistent with a programmer's intuition as to how the execution should occur.
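The definition of sequential consistency can be made concrete with a small sketch. Figure 1 is not reproduced in this text, so the two-thread program below (x = 1; r1 = y on one thread, y = 1; r2 = x on the other) is an assumed stand-in, and the class and method names are hypothetical. The sketch enumerates every total order of the four actions that keeps each thread's actions in program order.

```java
import java.util.ArrayList;
import java.util.List;

// Enumerates every sequentially consistent ordering of two threads' actions:
// a single total order in which each thread's actions keep their program order.
class ScInterleavings {

    static List<List<String>> interleavings(List<String> t1, List<String> t2) {
        List<List<String>> out = new ArrayList<>();
        merge(t1, 0, t2, 0, new ArrayList<>(), out);
        return out;
    }

    // Depth-first merge: at each step take the next action of either thread.
    private static void merge(List<String> t1, int i, List<String> t2, int j,
                              List<String> prefix, List<List<String>> out) {
        if (i == t1.size() && j == t2.size()) {
            out.add(new ArrayList<>(prefix));
            return;
        }
        if (i < t1.size()) {
            prefix.add(t1.get(i));
            merge(t1, i + 1, t2, j, prefix, out);
            prefix.remove(prefix.size() - 1);
        }
        if (j < t2.size()) {
            prefix.add(t2.get(j));
            merge(t1, i, t2, j + 1, prefix, out);
            prefix.remove(prefix.size() - 1);
        }
    }

    public static void main(String[] args) {
        // Assumed example program, not the one from Figure 1.
        List<String> thread1 = List.of("x = 1", "r1 = y");
        List<String> thread2 = List.of("y = 1", "r2 = x");
        for (List<String> order : interleavings(thread1, thread2)) {
            System.out.println(order);
        }
    }
}
```

With two actions per thread there are six such orders. In each of them at least one of the two reads comes after the other thread's write, so under sequential consistency the outcome r1 == 0 and r2 == 0 cannot occur for this stand-in program.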

The glaring disadvantage of this model is that very common compiler optimizations involving reordering have to be restricted, even when there are no data or control dependencies, since some reorderings can destroy sequential consistency. Because this restriction on optimizations (in both compilers and hardware) is expensive, the main goal in defining the new JMM was to allow the transformations and optimizations used in current compilers. Since these transformations directly violate sequential consistency, the JMM cannot use the sequentially consistent memory model.

An alternative model was formulated that emphasizes correct synchronization, thereby avoiding data races. This data-race-free approach is used by the new Java memory model: as long as the code is correctly synchronized, the new JMM guarantees sequential consistency.

Another possible (futuristic) compiler optimization is to emulate the speculative execution of writes, as done in hardware. The JMM should address this possibility as well, in particular to guarantee that speculative write execution does not produce some secret value out of thin air and cause a serious security violation.

2.1 What is synchronized and data-race-free code?
To define synchronization and data races, a few definitions are needed.

Conflicting accesses: An access is either a read of or a write to a variable. Two accesses to the same variable conflict if at least one of them is a write.

Synchronization actions: The actions used to ensure synchronization, i.e. locks, unlocks, and reads and writes of volatile variables.

Synchronization order: In an execution, there is a total order over all synchronization actions. This order must be consistent with program order, and a read of a volatile variable must return the value of the most recent write to that variable that precedes it in the synchronization order.

Synchronizes-with order: A relation between two synchronization actions that pairs a release with a later acquire. For example, an unlock of a monitor synchronizes-with every subsequent lock of that monitor, and a write to a volatile variable synchronizes-with every subsequent read of that variable.

Happens-before order: A relation between two actions that establishes the order in which they must appear to execute. It is the transitive closure of program order and the synchronizes-with order.

With the help of these definitions, it is easy to define a data race. Two accesses x and y form a data race if and only if they are from different threads, they conflict, and they are not ordered by happens-before. A program is said to be data-race-free if and only if all sequentially consistent executions of the program are free of data races.

2.2 Happens-before memory model

Figure 2: Sample Code 2

Using the definitions listed above, a simple model can be outlined: the happens-before memory model. In this model, the synchronization actions are ordered in a total order, and within each thread the actions are ordered in program order. In addition, a read of a variable may see any write to that variable, unless that write is ordered after the read by happens-before, or another write to the same variable is ordered between that write and the read by happens-before.
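The definitions of synchronization actions, happens-before and data races can be illustrated with a minimal sketch. It is not taken from the source figures; the class name, field names and the value 42 are assumptions chosen for illustration. The writes and reads of the volatile field ready are synchronization actions, so the write to data happens-before the read of data and the two conflicting accesses do not form a data race.

```java
// A volatile flag orders two conflicting accesses to a plain variable.
// All names here are hypothetical and exist only for this sketch.
class VolatileHandoff {
    static int data;               // plain shared variable
    static volatile boolean ready; // its reads and writes are synchronization actions

    static int runOnce() {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread writer = new Thread(() -> {
            data = 42;    // ordinary write
            ready = true; // volatile write, after the write to data in program order
        });
        Thread reader = new Thread(() -> {
            while (!ready) {        // volatile read; exits once it sees true
                Thread.onSpinWait();
            }
            seen[0] = data;         // ordered after "data = 42" by happens-before
        });
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

If ready were not declared volatile, the write and the read of data would conflict without being ordered by happens-before. The program would then contain a data race, and the reader would not be guaranteed to see 42 or even to leave its loop.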

Consider the sample code shown in Figure 2. In this case, r1 can assume a value of either 2 or 4. The other three values are not possible if we follow the happens-before model.

While the happens-before model seems to satisfy our requirements, there are cases in which it permits behavior that violates the sequential consistency promised to correctly synchronized programs. Consider the sample code shown in Figure 3.

Figure 3: Sample Code 3

The code in Figure 3 is correctly synchronized, so the only result that should be allowed is r1 = r2 = 0. However, if compilers were able to speculate the write, an out-of-thin-air value could be produced, which can compromise data security. This tells us that additional conditions must be added to the model to handle such situations.

2.3 Causality

If we carefully observe the sample code in Figure 2 and Figure 3, we notice that each time, the actions that allowed the illegal writes to occur were themselves caused by those writes. In Figure 2, the value written by the write is used to justify the value it writes, while in Figure 3, the occurrence of the write is used to justify the fact that the write will execute. This is referred to as circular causality, and this behaviour should be disallowed.

However, it is not possible to label all instances of circular causality as illegal. For example, look at the code shown in Figure 4. This is not an incorrect instance of circular causality, since a valid compiler optimization is possible: replace the redundant read of a with r2 = r1. Now r1 == r2 is always true, and the conditional branch can be eliminated.

2.4 Integrating Causality with the JMM

The main challenge is now to identify a means of classifying causality, so that valid outcomes (e.g. Figure 4) are allowed and violations (e.g. Figure 2 and Figure 3) are prevented.
The first step is to identify the (subtle) difference between the valid and the invalid code. If we execute the code in Figure 4 sequentially, we end up with the result b = 42, whereas the same is not the case with Figure 2 or Figure 3. Put simply, an action or a sequence of actions may be committed if there is some valid sequential execution that contains the committed actions. Refining this observation further, the authors conclude that the early execution of an action does not result in an undesirable causal cycle if its occurrence does not depend on a read returning a value from a data race.

Figure 4: Redundant read elimination

The new Java memory model has been defined within the framework of the happens-before memory model, with the additional clause of causality as defined above.
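The informal test from section 2.4, whether a write occurs in some sequentially consistent execution, can be sketched as a small search over interleavings. The figures are not reproduced in this text, so both programs below are assumed stand-ins: one with writes that are control-dependent on each other (in the spirit of the Figure 3 discussion) and one with a redundant read (in the spirit of Figure 4). The state layout, class name and method names are hypothetical, and the sketch covers only this first informal check, not the full causality rules of the JMM.

```java
import java.util.List;
import java.util.function.Consumer;

// Searches all sequentially consistent executions of a two-thread program
// and reports whether the write under test executes in at least one of them.
class CausalityCheck {
    // Assumed state layout: s[0], s[1] shared variables; s[2..4] thread-local
    // registers; s[FLAG] is set to 1 when the write under test executes.
    static final int FLAG = 5;

    static boolean writeInSomeScExecution(int[] init,
            List<Consumer<int[]>> t1, List<Consumer<int[]>> t2) {
        return explore(init.clone(), t1, 0, t2, 0);
    }

    private static boolean explore(int[] s, List<Consumer<int[]>> t1, int i,
                                   List<Consumer<int[]>> t2, int j) {
        if (s[FLAG] == 1) return true; // the write has executed on this path
        if (i < t1.size()) {
            int[] next = s.clone();
            t1.get(i).accept(next);
            if (explore(next, t1, i + 1, t2, j)) return true;
        }
        if (j < t2.size()) {
            int[] next = s.clone();
            t2.get(j).accept(next);
            if (explore(next, t1, i, t2, j + 1)) return true;
        }
        return false;
    }

    // x = s[0], y = s[1], initially 0.
    // Thread 1: r1 = x; if (r1 != 0) y = 42;   Thread 2: r2 = y; if (r2 != 0) x = 42;
    static boolean controlDependentWrites() {
        List<Consumer<int[]>> t1 = List.of(
            s -> s[2] = s[0],
            s -> { if (s[2] != 0) { s[1] = 42; s[FLAG] = 1; } });
        List<Consumer<int[]>> t2 = List.of(
            s -> s[3] = s[1],
            s -> { if (s[3] != 0) { s[0] = 42; s[FLAG] = 1; } });
        return writeInSomeScExecution(new int[] {0, 0, 0, 0, 0, 0}, t1, t2);
    }

    // a = s[0] initially 0, b = s[1] initially 1.
    // Thread 1: r1 = a; r2 = a; if (r1 == r2) b = 42;   Thread 2: r3 = b; a = r3;
    static boolean redundantReads() {
        List<Consumer<int[]>> t1 = List.of(
            s -> s[2] = s[0],
            s -> s[3] = s[0],
            s -> { if (s[2] == s[3]) { s[1] = 42; s[FLAG] = 1; } });
        List<Consumer<int[]>> t2 = List.of(
            s -> s[4] = s[1],
            s -> s[0] = s[4]);
        return writeInSomeScExecution(new int[] {0, 1, 0, 0, 0, 0}, t1, t2);
    }

    public static void main(String[] args) {
        System.out.println(controlDependentWrites()); // prints false
        System.out.println(redundantReads());         // prints true
    }
}
```

In the first program no sequential execution ever performs a write of 42, so committing such a write early could only be justified by the write itself. In the second program the write b = 42 occurs when thread 1 runs both reads back to back, which is why executing it early does not create an undesirable causal cycle.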
