Extended Abstract
David Park,
Ulrich Stern, and
David Dill
Computer Science Department, Stanford University, Stanford, CA 94305
{parkit,uli,dill}@cs.stanford.edu
The Java programming language is becoming increasingly popular for writing multi-threaded applications. In particular, many Internet servers are written in Java. Since Java has multi-threading built in among other advantages, we expect it to gain popularity in other areas in which multi-threading is useful, like embedded systems.
Developing multi-threaded programs is notoriously difficult, however. Subtle program errors can result from unforeseen interactions among multiple threads. In addition, these errors can be very hard to reproduce since they often depend on the non-deterministic behavior of the scheduler and the environment.
It is hence desirable to provide tools for software developers that automatically detect errors due to multi-threading. The tools should also provide detailed error traces to help the developer during the debugging phase. We have developed such a tool based on model checking.
The main challenge in model checking is the state explosion problem -- the fact that the number of states in the model is frequently so large that model checkers exceed the available memory and/or the available time. We have incorporated some techniques that we found effective in alleviating the state explosion problem into our tool, and are in the process of adding further optimizations.
Our Java model checker explores all reachable states of a given Java program. The two sources of nondeterminism during this state exploration are the choice of the next thread to run (scheduling) and the input values from the environment. The model checker currently detects deadlocks and assertion violations.
Java allows dynamic creation of threads and objects. (In fact, a Java thread is an object of class java.lang.Thread.) To allow dynamic object creation, the model checker supports arrays whose size can vary dynamically.
The model checker is based on the SAL (Symbolic Analysis Laboratory) Intermediate Language. Using an intermediate language has two advantages. First, languages other than Java can be translated into the intermediate representation, reducing the effort to develop a model checker for each new language. Second, we are developing other analysis tools that accept SAL as their input language, with the advantage being that these tools can readily be used to analyze Java as well.
The model checker employs the following languages:
| Language | Comments |
| Java | input to our model checker |
| Java Byte-code | can also be used as input to our model checker |
| Jimple | McGill's 3-address representation for optimization |
| SAL Level 1 | generated by our Jimple to SAL translator |
| SAL Level 0 | translated into C++ by our SAL to C++ translator |
| C++ | compiled with our C++ model checking core to get the model checker |
The SAL language is used to model the Java (Jimple) program as a state transition system. SAL has state variables, nondeterministic inputs, an initialization function, and a transition function that is a collection of guarded command (or rules).
The state of the Java program is modeled in SAL using
the following state variables:
(1) each thread contains a program counter (PC), stack (array
of frames), and stack pointer (CurrentFrame)
(2) each object contains a class id, the fields, a counter for locking, etc.
To select the next thread to execute, we use a nondeterministic
input (TID).
Each Jimple statement is translated into a guarded command. For example, the Jimple statement
i0 = 1
is translated into
(PC[TID] = label_0) -->
next(Stack)[TID][CurrentFrame].localVariables.i0 = 1;
next(PC)[TID] = label_1;
where (PC[TID] = label_0) is the guard condition,
followed by the SAL statements than can be executed in case the
condition is true.
The translator has to deal with all "advanced" features of the Jimple language like inheritance, overriding, overloading, dynamic method lookup, exception handling, etc. To model exception handling, for example, the translator generates exception tables, subclass relationship tables, and rules that search the exception tables in the SAL language.
The SAL description of the program is translated into a C++ source file that is #included in the model checking code, and the result is then compiled into an executable that outputs a trace if any deadlock or assertion failure exists. The included source file contains:
As an initial step in combatting state explosion, we adopt the general strategy of executing a sequence of rules local to a thread atomically and interleaving threads only when a global operation is performed. We assume that accesses to shared variables are always protected by locks. The particular technique we use is adapted from [Brue99] and proceeds as follows.
Given some state, we generate its successors by taking each enabled thread in turn and executing it until a lock or unlock operation is performed. In Figure 1, the sequence of statements between Unlock(A) and Lock(B) must only modify variables local to the thread since we assume that accesses to shared variables must be protected by locks. Hence, it is unnecessary to interleave such statements with statements from other threads. Similarly, the synchronized region delimited by Lock(B) and Unlock(B) can be executed atomically since no other thread can access B while its lock is acquired.
synchronized(A) { | Lock(A)
... | ...
} | Unlock(A)
... | ...region X...
synchronized(B) { | Lock(B)
... | ...region Y...
} | Unlock(B)
|
Figure 1. Synchronized regions and atomic blocks
T1: T2: ========= ========= Lock(A) Lock(B) Lock(B) Lock(A) ... ... Unlock(B) Unlock(A) Unlock(A) Unlock(B) |
Figure 2. An example of a lock-cycle deadlock
Our algorithm actually treats the entire region between Unlock(A) and Unlock(B) as one atomic block. In other words, we execute threads atomically until some unlock operation is performed (or until the thread dies). The only problem this poses is that certain deadlocks can be missed, as illustrated in Figure 2. Our algorithm will never execute the schedule in which T2 locks B immediately after T1 locks A but before T1 locks B, since the sequence of statements from Lock(A) to Unlock(B) in T1 is executed atomically.
However, since larger atomic blocks result in fewer unnecessary interleavings, and thus fewer states, we delineate atomic blocks by unlocks only and detect lock-cycle deadlocks by analyzing states for the existence of a cycle in the hold-wait relationship between threads (cf. [Brue99]).
In the two tables below we give the examples on which we tried our model checker and the results we obtained for the examples. Note that we achieved larger state reductions for the larger examples. Since the current version of our model checker does not implement hash compaction and state descriptors are several thousand bytes, we could only enumerate 140,000 states before we ran out of memory.
| Example | Description | Primitives | # Prim. |
| Bruening's SplitSync | two threads access shared variable | synchronized | 2 |
| CS193k ReaderWriter | two reader and two writer threads | synchronized, wait/notify | 4 |
| CS193k TurnDemo | two threads synchronize using a semaphore | synchronized, wait/notify | 6 |
| NASA's classic | two threads communicate using events (deadlocks) | synchronized, wait/notify | 4 |
| NASA's ksu_pipe | 2-stage pipeline | synchronized, wait/notify | 4 |
| Example | Algorithm | States | Rules | Time | State Reduction |
new time |
| Bruening's SplitSync | noatomic lockunlock unlock |
1763 57 37 |
4090 77 43 |
|
1.0 53.1 95.1 |
|
| CS193k ReaderWriter | noatomic lockunlock unlock |
261838 848 528 |
1030130 2184 1356 |
>201s 1.91s 1.54s |
1.0 309 496 |
442s 3.5s 3.1s |
| CS193k TurnDemo | noatomic lockunlock unlock |
26145 385 166 |
68715 617 236 |
22.2s 0.64s 0.42s |
1.0 67.9 158 |
|
| NASA's classic | noatomic lockunlock unlock |
45924 322 143 |
118047 554 234 |
36.0s 0.63s 0.48s |
1.0 143 321 |
|
| NASA's ksu_pipe | noatomic lockunlock unlock |
3990883 28357 4991 |
14022723 92334 15762 |
6401s 51.5s 11.8s |
1.0 141 800 |
Finally, we intend to augment our model checker with adaptations of general partial order reduction methods, as well as with techniques from symmetry reduction and hash compaction. We plan to analyze how these conventional optimizations perform and interact within an environment of dynamic process instantiation and large data structures.