The current dominant paradigm in synchronizing multi-threaded programs
-based. One associates a lock with a piece of data, and by convention, any thread that wishes to read or modify this data must acquire this lock. The lock provides the guarantee of mutual exclusion
: only one thread may have acquired the lock at any point in time. All other threads that wish to do so must wait. Lock-based parallel programming comes with a number of problems
of which have been explored elsewhere
. With chip designers, for the time being, having given up on increased clock speed
and opted instead for increased parallelism
, it has become increasingly important to find a better paradigm for concurrent programming.
Software Transactional Memory (STM) is one such alternative paradigm that in recent years has come into favor within the research community. STM, to first approximation, is to lock-based concurrent programming what garbage collection is to manual dynamic memory management. It abstracts away the complexities of synchronization into the programming language or run-time system, and the sole responsibility of the programmer is to designate certain sections of code that access shared memory as atomic. The run-time system then executes this section as a transaction, ensuring that these sections execute in isolation from other threads, and that they either take effect (i.e., are visible o the rest of the world) all at once, or not at all. Typically, this is accomplished by executing the atomic section on either local copies of shared objects, or by using behind-the-scenes locking. However, it is important in either case that a transaction can be aborted, i.e. have its changes un-done and restarted, so that deadlock is avoided. Typically, a transaction will run until it comes into conflict with another transaction. This happens when the read or write set of one transaction intersects with the write set of another. At this point, one transaction must be aborted in favor of the other and restarted.
The question of nesting in transactional memory may seem at first glance to be purely academic: after all, why would anyone write one atomic section inside another? But it is important to note that the nesting of transactional scopes is dynamic rather than static, and there may exist atomic sections at different levels of abstraction. For example, the author of a STM-based hash table may use an STM-based linked list internally. Then, the atomic insert method of the hash table will use the atomic insert method of the linked list, and one has nested transactions.
The easiest way to deal with nested transactions is flat nesting
. In this scenario, the inner transaction is merged entirely with the outer one. The advantage of this option is its simplicity, as in most implementations of STM it merely involves eliding the transaction-specific actions of the inner atomic section. The disadvantage is that no additional parallelism is exploited from the extra information the programmer has provided: if the inner transaction gets aborted, the entire transaction stack is aborted and started again.
This variant is somewhat more complex. With closed nesting
, the beginning of each inner atomic section essentially acts as a checkpoint. When a transaction conflicts with a closed-nested transaction, only the actions taken in the nested transaction are rolled back and retried. For example, consider the previously mentioned atomic hash table. Suppose two threads try to insert two objects that hash to the same bucket. They would conflict and one would abort when they both tried to insert into the same linked list, but rather than the aborted transaction retrying from the beginning of the outer insert method, it would only need to retry the insert on the linked list. Clearly this involves less wasted work than flat nesting, it is semantics-preserving, and it is usually not difficult to implement either.
is the most complex variant, both in terms of implementation and in terms of the programming model presented; as such, it is mostly recommended for use by library writers who know very, very well what they are doing. STMs that allow for open nesting almost always have closed nesting as the default model. An open nested transaction that commits does so as though it were a top-level transaction: its actions become visible to the rest of the world. Obviously, this is a problem if its parent transaction aborts: its actions must be un-done, but it has already committed! The up-shot is that this can often allow for greatly increased parallelism. However, the STM must also provide certain tools to the programmer to allow him to maintain correctness. The essential idea is to break physical serializability
(the idea that every transaction is seen to happen all at once at one atomic point in time) when the programmer knows that he can still maintain abstract serializability
using the other tools provided by open nesting.
At a minimum, these tools must include compensation actions and abstract locks. A compensation action is the abstract way of undoing a transaction. When a normal transaction aborts, its physical memory writes are rolled back. When an open nested transaction aborts, the same thing happens; but when its parent aborts, it's already been committed, so instead, the programmer provides a compensation action that is run when the parent aborts to undo the effects of the committed child. For example, an open nested insert action on a hash table might have, as its compensation action, code that executes a remove on the hash table. Typically, these compensation actions are represented as part of a larger set of handlers that are run on various STM-related events. For example, the open nesting STM of Ni, Menon, et al., provides handlers for
onvalidation, which execute when the transaction's parent aborts, the transaction's parent commits, the top-level transaction commits, and the transaction's parent is validated, respectively. The
onabort handler is clearly where compensation actions go; the others have various uses in maintaining correctness as well (e.g. the
onvalidation handler can be used to veto a transaction based on abstract locking violations, which are soon to be explained).
Compensation actions are used to undo the effects of erroneously committed open nested transactions; what about other transactions that might have seen (and acted on) these effects in between? This is where abstract locking comes in. Abstract locks are used to prevent committed data that might yet be rolled back from being used inappropriately. To see an example, let us once again turn to our atomic hash table that uses an atomic linked list for each bucket -- in this case, an open atomic linked list (the hash table can be open or closed -- it doesn't matter). So when an object is inserted into one of the linked lists, the insert happens open atomically, and in the absence of any abstract locking, that insert is visible to the rest of the world. But this should not be the case; other transactions should not be able to see the object in the list, get the length of the list (because that leaks information about the commit), or delete the object from the list. To enforce this, the insert operation on the list will specify that abstract locks must be obtained during the course of this transaction on either the list object or the inserted object with various lock modes that indicate what other transactions can and cannot do; if these locks are already held, the transaction must abort. For example, both inserting
o and deleting
o will mandate that an abstract lock in exclusive mode be obtained on
o; this ensures that once
o has been inserted, it cannot be deleted before the lock is released. Both abstract locks and handlers are inherited by their transaction's parent and held until it commits or aborts (at which point it is inherited further up the tree); however, an open nested transaction that commits discards the handlers and locks of child transactions by default, although the programmer can choose to keep them.
If this sounds like harkening back to the bad old days of locks, that intuition is somewhat correct, which is why the proponents of open nesting stress that it is only meant for use by expert programmers. Open nested transactions explicitly trade some of the guarantees of transactional memory for increased performance, and anyone who is not prepared to or capable of spending a lot of time reasoning about concurrency and correctness should stay well clear. Open nesting researchers advocate strong barriers between the different open layers of abstraction to help ensure correctness, which is why the default behavior is to discard the locks and handlers of child transactions; the idea is that any data structures you use as part of your open data structure should only be accessible through your open data structure, so abstract rollback and locking should be done at the abstraction level of your data structure, not the one it uses.
With all this complexity, one would be forgiven for wondering why it is worth it. To see how open nesting enables additional parallelism, consider the case where two transactions execute an insert on our hash table with keys that map to the same bucket, i.e. the same linked list. If the list were closed-nested, the two inserts would likely modify the same physical memory locations and thus conflict; only one could commit, and the system may even livelock trying to arbitrate between the two. However, with open nesting, once one has committed, the other can freely modify memory locations that the other touched in order to insert its own element. Though inserts and deletes on the same object have an abstract conflict (represented by an abstract lock to that effect), inserts and deletes of separate objects can occur with no conflict for abstract serializability, allowing for increased concurrency -- and indeed, preliminary results do show that data structures built with open nesting can be significantly faster than similar closed nested data structures.
Ni Y, Menon V, Adl-Tabatabai A-R, Hosking AL, Hudson RL, Moss JEB, Saha B, Shpeisman T. "Open nesting in software transactional memory." Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP):68-78 (San Jose, California, March 2007).