Unit Testing in Java with an Emphasis on Concurrency Corky Cartwright Rice and Halmstad Universities Summer 2013
Software Engineering Culture: Three Guiding Visions. Data-driven design. Test-driven development. Mostly functional coding (no gratuitous mutation). Codified in the Design Recipe taught in How to Design Programs by Felleisen et al. (available for free online: www.htdp.org [first edition], www.ccs.neu.edu/home/matthias/htdp2e/draft [second edition]) and in Elements of Object-Oriented Design (available online). The target languages are Scheme and Java.
Moore's Law
Extrapolate the Future
Timeliness: CPU clock frequencies stagnate. Multi-core CPUs provide additional processing power, but multiple threads are needed to use multiple cores. Writing concurrent programs is difficult!
Tutorial Outline: Introduce unit testing in a single-threaded (deterministic) setting using lists. Demonstrate problems introduced by concurrency and their impact on unit testing. Show how some of the most basic problems can be overcome by using the right policies and tools.
(Sequential) Unit Testing: Unit tests test parts of the program (including the whole!). Integrate with program development; commits to the repository must pass all unit tests. Automate testing during the maintenance phase. Serve as documentation. Prevent bugs from reoccurring. Help keep the code repository clean. Effective with a single thread of control.
Universal Test-Driven Design Recipe: Analyze the problem: define the data and determine the top-level operations. Give sample data values. Define type signatures, contracts, and headers for all top-level operations. In Java, the type signature is part of the header. Give input-output examples, including critical boundary cases, for each operation. Write a template for each operation, typically based on structural decomposition of the primary argument (the receiver in OO methods). Code each method by filling in the templates. Test every method (using the I/O examples!) and ascertain that every method is tested on a sufficient set of examples. White-box testing matters!
Sequential Case Studies: Functional Lists and Bi-Lists. A List<E> is either Empty<E>() or Cons<E>(e, l), where e is an E and l is a List<E>. A BiList<E> is a mutable data structure containing a possibly empty sequence of objects of type E that can be traversed in either direction using a BiListIterator<E>.
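The List<E> data definition above maps directly onto a small Java class hierarchy. The following is only a sketch under assumptions: the names FList, Empty, Cons, length, and append are chosen here for illustration (FList avoids a clash with java.util.List) and are not taken from the course code. Each method follows the structural template: one case for the empty list, one for a Cons.

```java
// Hypothetical names; a sketch of the functional list definition, with no mutation.
interface FList<E> {
    int length();                      // number of elements
    FList<E> append(FList<E> other);   // elements of this list followed by other
}

final class Empty<E> implements FList<E> {
    public int length() { return 0; }
    public FList<E> append(FList<E> other) { return other; }
    @Override public String toString() { return "()"; }
}

final class Cons<E> implements FList<E> {
    private final E first;
    private final FList<E> rest;

    Cons(E first, FList<E> rest) {
        this.first = first;
        this.rest = rest;
    }

    public int length() { return 1 + rest.length(); }

    public FList<E> append(FList<E> other) {
        // Rebuild this list's cells in front of other; neither input is changed.
        return new Cons<E>(first, rest.append(other));
    }

    @Override public String toString() { return "(" + first + " " + rest + ")"; }
}

class FListDemo {
    public static void main(String[] args) {
        FList<Integer> empty = new Empty<Integer>();
        FList<Integer> oneTwo = new Cons<Integer>(1, new Cons<Integer>(2, empty));
        // Input-output examples, including the boundary case of the empty list.
        System.out.println(empty.length());
        System.out.println(oneTwo.length());
        System.out.println(oneTwo.append(oneTwo));
    }
}
```

Because the lists are immutable, each input-output example is deterministic and can be turned into a unit test unchanged.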
Review: Elements of Sequential Unit Testing. Unit tests depend on deterministic behavior: known input, expected output. Success means correct behavior; failure means flawed code. The outcome of a test is meaningful if the test is deterministic.
Problems Due to Concurrency: Thread scheduling is nondeterministic and machine-dependent. Code may be executed under different schedules. Different schedules may produce different results. Known input, expected output(s?). Success means correct behavior in this schedule; the code may be flawed in another schedule. Failure means flawed code. Success of a unit test is meaningless.
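To make "different schedules may produce different results" concrete, the sketch below simulates two threads that each increment a shared counter in two steps (read the counter, then write back the value plus one) and enumerates every interleaving of those steps. It is an illustrative simulation, not real threading; the class and method names are assumptions made for this example.

```java
import java.util.Set;
import java.util.TreeSet;

class ScheduleOutcomes {
    /** Runs one schedule; schedule[i] is the id (0 or 1) of the thread that takes step i. */
    static int run(int[] schedule) {
        int counter = 0;
        int[] local = new int[2];   // each simulated thread's private copy of the counter
        int[] step = new int[2];    // next step per thread: 0 = read, 1 = write
        for (int id : schedule) {
            if (step[id] == 0) {
                local[id] = counter;        // read the shared counter
            } else {
                counter = local[id] + 1;    // write back the incremented private copy
            }
            step[id]++;
        }
        return counter;
    }

    /** Final counter values over all interleavings of two threads with two steps each. */
    static Set<Integer> outcomes() {
        Set<Integer> results = new TreeSet<Integer>();
        // A 4-bit mask with exactly two 1-bits assigns two of the four steps to thread 1.
        for (int mask = 0; mask < 16; mask++) {
            if (Integer.bitCount(mask) != 2) continue;
            int[] schedule = new int[4];
            for (int i = 0; i < 4; i++) schedule[i] = (mask >> i) & 1;
            results.add(run(schedule));
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(outcomes());   // prints [1, 2]
    }
}
```

Only the two schedules that run one thread to completion before the other yield 2; the four overlapping schedules lose an update. A unit test that happens to run under a "good" schedule would pass even though the code is flawed.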
Recommended Resources on Concurrent Programming in Java. Explicit concurrency: Comp 402 web site from 2009; Brian Goetz, Java Concurrency in Practice (available online at this website). Coping with multicore: emerging parallel extensions of Java/Scala that guarantee determinism (in a designated subset), do not require explicit synchronization, and avoid JMM issues: Habanero Java, Habanero Scala.
Problems Due to the Java Memory Model: The JMM is MUCH weaker than sequential consistency. Writes to shared data may be held pending indefinitely unless the target is declared volatile or is shielded by the same lock as subsequent reads. Why not always use locking (synchronized)? Significant overhead. Increases the likelihood of deadlock. It is extremely difficult to reason about program execution for specific inputs because so many schedules are allowed. A model that accommodates compiler writers rather than software developers.
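The role of volatile can be illustrated with a small hand-off between two threads. This is a sketch with hypothetical names (VolatileFlagDemo, handOff): the writer stores a payload in a plain field and then sets a volatile flag; the reader spins on the flag and then reads the payload. Because the flag is volatile, the reader is guaranteed to see the flag eventually, and the payload written before it. Without volatile, the JMM would allow the reader to spin forever.

```java
class VolatileFlagDemo {
    // volatile: a write by one thread is guaranteed to become visible to other threads.
    private static volatile boolean ready = false;
    // Plain field; it is published safely because it is written before the volatile write.
    private static int payload = 0;

    /** Hands the value 42 from the calling thread to a reader thread and returns what the reader saw. */
    static int handOff() {
        ready = false;
        payload = 0;
        final int[] seen = new int[1];
        Thread reader = new Thread(new Runnable() {
            public void run() {
                while (!ready) {
                    Thread.yield();      // spin until the volatile write becomes visible
                }
                seen[0] = payload;       // ordered after the writer's payload store
            }
        });
        reader.start();
        payload = 42;    // ordinary write
        ready = true;    // volatile write publishes the payload
        try {
            reader.join();               // join makes seen[0] visible to the caller
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(handOff());   // prints 42
    }
}
```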
Hidden Pitfalls in Using JUnit to Test Concurrent Java. JUnit is completely broken for concurrent code units: It fails to detect exceptions and failed assertions in threads other than the main thread (!). It fails to detect if an auxiliary thread is still running when the main thread terminates; all execution is aborted when the main thread terminates. It fails to ensure that all auxiliary threads were joined by the main thread before termination. (In Habanero Java, all programs are implicitly enclosed in a comprehensive join called finish(), but not in Java.)
Possible Solutions to Concurrent Testing Problems. Programming language features: Ensure that bad things cannot happen; perhaps ensure determinism (reducing testing to sequential semantics!). May restrict programmers. Comprehensive testing: Test whether bad things happen in any schedule. All schedules may be too stringent for programs involving GUIs. Does not limit the space of solutions, but the testing burden is greatly increased. Good testing tools are essential.
Coping with the Java Memory Model Avoid using synchronized and minimize the size of synchronized blocks to reduce likelihood of deadlock. Identify all classes that can be shared and make all fields in such classes either final or volatile. Ensures sequential consistency (almost). Array elements are still technically a problem because they cannot be marked as volatile. The ConcurrentUtilities library includes a special form of array with volatile elements.
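As one concrete way to get array elements with volatile access semantics, the standard library offers java.util.concurrent.atomic.AtomicIntegerArray, whose get and set behave like volatile reads and writes. Whether this is the class the slide has in mind is an assumption; the sketch below simply follows the policy above (a final field referring to an array whose elements are accessed with volatile semantics). The class SharedCounters and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

class SharedCounters {
    // The field is final; element reads and writes have volatile semantics.
    private final AtomicIntegerArray slots;

    SharedCounters(int n) {
        slots = new AtomicIntegerArray(n);
    }

    void increment(int i) {
        slots.incrementAndGet(i);   // atomic read-modify-write, stronger than a volatile write
    }

    int get(int i) {
        return slots.get(i);        // volatile read
    }

    /** Two threads each increment slot 0 perThread times; returns the final value. */
    static int countWithTwoThreads(final int perThread) {
        final SharedCounters counters = new SharedCounters(1);
        Runnable work = new Runnable() {
            public void run() {
                for (int k = 0; k < perThread; k++) counters.increment(0);
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return counters.get(0);
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(10000));   // prints 20000
    }
}
```

No synchronized block is needed here, which is in line with the advice to avoid locking where possible.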
Improvements to JUnit: ConcJUnit, developed by my former graduate student Mathias Ricken, fixes all of the problems with JUnit, including uncaught exceptions and failed assertions that are not caught in child threads. Developed for Java 6; Java 7 not yet supported. Mathias developed some other tools to help test concurrent programs, but none of them has yet reached production quality (e.g., random delays/yields). Research idea: JVM from Hell.
Sample JUnit Tests
    import junit.framework.TestCase;
    public class Test extends TestCase {
        public void testException() {
            throw new RuntimeException("booh!");
        }
        public void testAssertion() {
            assertEquals(0, 1);   // in effect: if (0 != 1) throw new AssertionFailedError();
        }
    }
Both tests fail.
Problematic JUnit Tests
    import junit.framework.TestCase;
    public class Test extends TestCase {
        public void testException() {
            new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");   // thrown in the child thread
                }
            }).start();
        }
    }
Timeline: the main thread spawns the child thread; the child's exception goes uncaught; the test ends in the main thread and reports success. Uncaught exception: the test should fail but does not!
Problematic JUnit Tests
    import junit.framework.TestCase;
    public class Test extends TestCase {
        public void testFailure() {
            new Thread(new Runnable() {
                public void run() {
                    fail("this fails!");   // failed assertion in the child thread
                }
            }).start();
        }
    }
Uncaught exception in the child thread: the test should fail but does not!
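The effect can be reproduced without JUnit. The sketch below uses a deliberately naive stand-in for a test runner (naiveRun is a hypothetical helper, not JUnit's API): it reports a failure only if the test body throws in the runner's own thread. An exception thrown in a child thread never reaches the try/catch, even when the body joins the child.

```java
class LostExceptionDemo {
    /** Naive runner: returns true ("pass") unless the body throws in the calling thread. */
    static boolean naiveRun(Runnable testBody) {
        try {
            testBody.run();
            return true;
        } catch (Throwable t) {
            return false;
        }
    }

    /** The exception is thrown in the runner's thread, so the failure is detected. */
    static boolean throwsInSameThread() {
        return naiveRun(new Runnable() {
            public void run() {
                throw new RuntimeException("booh!");
            }
        });
    }

    /** The exception is thrown in a child thread, so the runner never sees it. */
    static boolean throwsInChildThread() {
        return naiveRun(new Runnable() {
            public void run() {
                Thread child = new Thread(new Runnable() {
                    public void run() {
                        throw new RuntimeException("booh!");
                    }
                });
                child.start();
                try {
                    child.join();   // joining does not rethrow the child's exception
                } catch (InterruptedException e) {
                    throw new RuntimeException(e);
                }
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(throwsInSameThread());    // prints false: failure detected
        System.out.println(throwsInChildThread());   // prints true: failure missed
    }
}
```

The child's stack trace still appears on standard error through the default uncaught exception handler, but the runner's verdict is unaffected.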
Thread Group for JUnit Tests
    import junit.framework.TestCase;
    public class Test extends TestCase {
        public void testException() {
            new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            }).start();
        }
    }
The test runs in its own Test thread inside a thread group (TestGroup) that has an uncaught exception handler. Timeline: the main thread spawns the Test thread and waits; the Test thread spawns the child thread; the child's uncaught exception invokes the group's handler; at the end of the test, the main thread resumes and checks the group's handler: failure!
Improvements to JUnit: Uncaught exceptions and failed assertions are not caught in child threads. Remedy: a thread group with an exception handler. The JUnit test runs in a separate thread, not the main thread. Child threads are created in the same thread group. When the test ends, check whether the handler was invoked. Result: detection of uncaught exceptions and failed assertions in child threads that occurred before the test's end. Past tense: occurred!
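The thread-group scheme can be sketched with the standard ThreadGroup class, which serves as the fallback uncaught exception handler for its threads. This is an illustration only, not ConcJUnit's implementation; GroupCheckedRunner, RecordingGroup, and runInGroup are names invented here. The child is joined inside the test body so that its exception is certain to have occurred before the test ends.

```java
import java.util.concurrent.atomic.AtomicReference;

class GroupCheckedRunner {
    /** Thread group that remembers the first uncaught throwable raised by any of its threads. */
    static final class RecordingGroup extends ThreadGroup {
        final AtomicReference<Throwable> failure = new AtomicReference<Throwable>();

        RecordingGroup() {
            super("test-group");
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            failure.compareAndSet(null, e);
        }
    }

    /** Runs the body in a separate thread inside a RecordingGroup; true means "test passed". */
    static boolean runInGroup(Runnable testBody) {
        RecordingGroup group = new RecordingGroup();
        Thread testThread = new Thread(group, testBody, "test");
        testThread.start();
        try {
            testThread.join();   // the calling thread waits for the end of the test
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return group.failure.get() == null;   // was the group's handler invoked?
    }

    /** Test body that spawns one child thread, which throws if childThrows is true. */
    static boolean runWithChild(final boolean childThrows) {
        return runInGroup(new Runnable() {
            public void run() {
                // The child inherits the thread group of the thread that creates it.
                Thread child = new Thread(new Runnable() {
                    public void run() {
                        if (childThrows) throw new RuntimeException("booh!");
                    }
                });
                child.start();
                try {
                    child.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(runWithChild(true));    // prints false: failure detected
        System.out.println(runWithChild(false));   // prints true
    }
}
```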
Child Thread Outlives Parent
    import junit.framework.TestCase;
    public class Test extends TestCase {
        public void testException() {
            new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            }).start();
        }
    }
Good schedule: the main thread spawns the Test thread and waits; the Test thread spawns the child thread; the child's uncaught exception invokes the group's handler before the end of the test; the main thread resumes and checks the group's handler: failure! Bad schedule: the test ends and the main thread checks the group's handler before the child throws: success! The child's uncaught exception invokes the group's handler only afterwards. Too late!
Enforced Join
    import junit.framework.TestCase;
    public class Test extends TestCase {
        public void testException() throws InterruptedException {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            });
            t.start();
            t.join();   // the test does not end until the child thread has ended
        }
    }
Diagram: the Test thread waits for the Child thread.
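One way to notice a missing join is to look for threads of the test's thread group that are still alive once the test thread has ended. The sketch below does this with ThreadGroup.enumerate; it is an assumption-laden illustration (JoinCheck and its methods are hypothetical), not a description of how ConcJUnit performs the check. A CountDownLatch keeps the unjoined child alive so that the outcome is deterministic.

```java
import java.util.concurrent.CountDownLatch;

class JoinCheck {
    /** Runs the body in its own thread group and counts threads of that group still alive afterwards. */
    static int strayThreadsAfter(Runnable testBody) {
        ThreadGroup group = new ThreadGroup("test-group");
        Thread testThread = new Thread(group, testBody, "test");
        testThread.start();
        try {
            testThread.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        // activeCount() is only an estimate, so leave some slack in the array.
        Thread[] found = new Thread[group.activeCount() + 8];
        int n = group.enumerate(found);
        int alive = 0;
        for (int i = 0; i < n; i++) {
            if (found[i].isAlive()) alive++;
        }
        return alive;
    }

    /** The test body starts a child and returns without joining it. */
    static int withoutJoin() {
        final CountDownLatch release = new CountDownLatch(1);
        int stray = strayThreadsAfter(new Runnable() {
            public void run() {
                new Thread(new Runnable() {
                    public void run() {
                        try {
                            release.await();   // stay alive until the check is done
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                }).start();
            }
        });
        release.countDown();   // let the child finish
        return stray;
    }

    /** The test body joins its child before returning. */
    static int withJoin() {
        return strayThreadsAfter(new Runnable() {
            public void run() {
                Thread t = new Thread(new Runnable() {
                    public void run() { }
                });
                t.start();
                try {
                    t.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(withoutJoin());   // prints 1: a child outlived the test
        System.out.println(withJoin());      // prints 0
    }
}
```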
Testing Using ConcJUnit: Available as a replacement for junit.jar or as a plugin JAR for JUnit 4.7; compatible with Java 6 (not 7 or 8). Available as binary and source at http://www.concutest.org/. Results from DrJava's unit tests: A child thread for communication with the slave VM was still alive in a test. Several reader and writer threads were still alive in a low-level test (calls to join() missing). DrJava currently does not use ConcJUnit: its tests are based on a custom-made class extending junit.framework.TestCase, which does not check if join() calls are missing.
Conclusion: Improved JUnit now detects problems in other threads, but only in the chosen schedule; it needs schedule-based execution. Annotations ease documentation and checking of concurrency invariants. Open-source library of Java API invariants. Support programs for schedule-based execution.
Future Work: Adversary scheduling using delays/yields (JVM from Hell). Schedule-based execution (impractical?). Replay stored schedules. Generate representative schedules. Dynamic race detection (what races are bugs?). Randomized schedules (JVM from Hell). Support annotations from Floyd-Hoare logic. Declare and check contracts (preconditions and postconditions for methods). Declare and check class invariants.
Extra Slides
Tractability of Comprehensive Testing: Test all possible schedules. Concurrent unit tests become meaningful again. Number of schedules (N), where t is the number of threads and s is the number of time slices per thread.
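Extra: Number of Schedules. N is a product of s-combinations. For thread 1: choose s out of ts time slices. For thread 2: choose s out of ts - s time slices. And so on. For thread t-1: choose s out of 2s time slices. For thread t: choose s out of s time slices. Hence $N = \binom{ts}{s}\binom{ts-s}{s}\cdots\binom{2s}{s}\binom{s}{s} = \frac{(ts)!}{(s!)^t}$.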
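To get a feeling for how quickly N grows, the product of s-combinations described above can be evaluated directly. The sketch below uses BigInteger because the values overflow long for even modest t and s; the class and method names are chosen for this example.

```java
import java.math.BigInteger;

class ScheduleCount {
    /** Binomial coefficient C(n, k), computed with the multiplicative formula. */
    static BigInteger choose(int n, int k) {
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // After step i, result equals C(n - k + i, i), so the division is exact.
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    /** Number of schedules for t threads with s time slices each. */
    static BigInteger schedules(int t, int s) {
        BigInteger n = BigInteger.ONE;
        // Thread 1 chooses s of t*s slices, thread 2 chooses s of the remaining (t-1)*s, and so on.
        for (int remainingThreads = t; remainingThreads >= 1; remainingThreads--) {
            n = n.multiply(choose(remainingThreads * s, s));
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(schedules(2, 2));   // prints 6
        System.out.println(schedules(3, 2));   // prints 90
        System.out.println(schedules(4, 4));   // prints 63063000
    }
}
```

Already four threads with four time slices each give about 63 million schedules, which is why reducing the number of switch points matters.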
Tractability of Comprehensive Testing: If the program is race-free, we do not have to simulate all thread switches. Threads interfere only at "critical points": lock operations, shared or volatile variables, etc. Code between critical points cannot affect the outcome. Simulate all possible arrangements of blocks delimited by critical points. Run dynamic race detection in parallel: lockset algorithm (e.g., Eraser by Savage et al.).
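The core idea of a lockset algorithm fits in a few lines: for every shared variable, keep the set of locks that were held at every access so far, and report a potential race when that set becomes empty. The sketch below is a heavily simplified illustration with hypothetical names; it omits the state machine that Eraser uses to tolerate initialization and read-only sharing, so it flags a first access made with no locks held.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LocksetChecker {
    // Candidate locks per variable: locks that were held at every access seen so far.
    private final Map<String, Set<String>> candidates = new HashMap<String, Set<String>>();

    /**
     * Records an access to a variable by a thread holding locksHeld.
     * Returns false if no single lock has protected all accesses so far.
     */
    boolean access(String variable, Set<String> locksHeld) {
        Set<String> c = candidates.get(variable);
        if (c == null) {
            c = new HashSet<String>(locksHeld);   // first access: every held lock is a candidate
            candidates.put(variable, c);
        } else {
            c.retainAll(locksHeld);               // intersect with the locks held now
        }
        return !c.isEmpty();
    }

    /** Convenience helper for building a set of lock names. */
    static Set<String> locks(String... names) {
        return new HashSet<String>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        LocksetChecker checker = new LocksetChecker();
        System.out.println(checker.access("x", locks("m")));        // prints true
        System.out.println(checker.access("x", locks("m", "n")));   // prints true: m still common
        System.out.println(checker.access("x", locks("n")));        // prints false: no common lock
    }
}
```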
Critical Points Example [Diagram: Thread 1 and Thread 2 each execute lock, access, unlock sequences on a shared variable; all accesses to the shared variable are protected by the lock. Each thread also has its own local variable; local variables don't need locking.]
Fewer Schedules: There are fewer critical points than thread switches, which reduces the number of schedules. Example: two threads, but no communication: N = 1. Unit tests are small, which also reduces the number of schedules. Hopefully comprehensive simulation is tractable. If not, heuristics are still better than nothing.
Limitations: The improvements only check the chosen schedule. A different schedule may still fail. Requires comprehensive testing to be meaningful. May still miss uncaught exceptions when a thread specifies an absolute parent thread group, not a relative one. Cannot detect uncaught exceptions in a program's uncaught exception handler (JLS limitation).
Extra: Limitations. May still miss uncaught exceptions when a thread specifies an absolute parent thread group, not a relative one (rare). Koders.com: 913 matches for ThreadGroup vs. 49,329 matches for Thread. Cannot detect uncaught exceptions in a program's uncaught exception handler (JLS limitation). Koders.com: 32 method definitions for the uncaughtException method.
Extra: DrJava Statistics
    Year  Unit tests  passed  failed  not run  Invariants  met    failed  % failed  KLOC  event thread
    2004  736         610     36      90       5116        4161   965     18.83%    107   1
    2006  881         881     0       0        34412       30616  3796    11.03%    129   99