Here is a brief summary of locking in Kaffe. There are several layers to the locking abstractions in Kaffe. Ideally these layers are collapsed by the compiler. Java Object Locks ================= files: kaffe/kaffevm/locks.[hc] lockObject(H_java_lang_Object*) unlockObject(H_java_lang_Object*) These should only be used for Java object monitors. They just take a pointer to the Java object. The intent is that if we ever create a facility to log the execution of all monitorenter/exit bytecodes, entries and exits of synchronized methods and JNI MonitorEnter/Exit primitives, we'll be able to do it by tweaking this function. These functions are implemented as a thin layer over the VM-internal locks (See the next section). NOTE: these are functions, and must remain so, since their addresses are taken in some engines. The wait and signal functions on Java objects map directly to the wait and signal functions in the VM-internal locks (see the next section). XXX There should probably be waitObject(), signalObject(), and broadcastObject() interfaces. Currently the waitCond(), signalCond() and broadcastCond() interfaces are just invoked directly. XXX should be a slowUnlockObjectIfHeld(), too (for stack unwinding due to an exception). NOTE: Two other functions are part of this interface: slowLockObject(H_java_lang_Object*, void*); slowUnlockObject(H_java_lang_Object*, void*); these are used by some engines when the inlined "fast lock" fails. See the "Kaffe Fast Locking" section, below. There need not be "slow" versions of the wait/signal/broadcast functions. VM-internal object locks ======================== files: kaffe/kaffevm/locks.[hc] types: iLock lockMutex lockStaticMutex unlockMutex unlockStaticMutex waitCond waitStaticCond signalCond signalStaticCond broadcastCond broadcastStaticCond There are two flavors of VM-internal locking interfaces. One for dynamic locks (left column) and one for static locks (right column). Dynamic locks are generally used to protect VM-internal dynamic structures (e.g., classes). Static locks are generally global locks (e.g., the gc lock or finalizer lock) or locks that need a specific out-of-order initialization. The VM must use its own internal locks for most internal state, to avoid race conditions with arbitrary Java code that uses object locks. (For example on threads and classes, there are both logical Java-level locks, and VM-internal locks). There is one instance of using the VM-internal interfaces to lock a Java-level object: when JIT-compiling a method, a lock on its Class is held. Logging such a lock via the lockObject() interface would be unexpected: it would only be logged when the JIT was in use. Moreover, it is not mandated nor directed by any of the Java monitor facilities. Therefore, this case uses the VM variant, and so should similar cases that ever come up. These locks are recursive. Despite the disjoint names, the same iLock object supports both the lock/unlock and wait/signal/broadcast interfaces. The non-static interfaces are a bit odd in that the argument passed in must be a pointer to a struct with a "lock" field. The lock field must be an iLock pointer. The static interfaces take a pointer to a iLock pointer. To avoid the use of building heavy locks you must only lock once. If you exceed that limit a heavy lock is allocated and kept in memory until the lock is destroyed. The new implementation(*) heavy lock is actually not that slow as it must only acquire the heavy lock by putting a 1 into some field and then increment the lock counter when called recursively. (*) As of 2005-03-11 Threading System Synchronization Primitives =========================================== The VM-internal locking scheme is implemented on top of the primitives provided by the Kaffe threading system. The underlying threading system has two choices for implementing the lowest layer of locking in Kaffe. First is the Ksem interface which provides a semaphore primitive on which the Kaffe run-time bases its "heavy locks". (See the "Kaffe Fast Locking" section below.) The second option is to implement the "jmutex/jcondvar" interface (which is just used to fill out a default Ksem implementation.) A. The Ksem interface: --------------------- files: kaffe/kaffevm/ksem.h, threading system lock files types: Ksem ksemInit(sem) ksemGet(sem,timeout) ksemPut(sem) ksemDestroy(sem) The Ksem is a binary semaphore, and need not be recursive. The Ksem is initialized to 0 (in semaphore-speak this means the resource is not available). ksemGet() can timeout, thus it returns a boolean indication of whether it acquired the semaphore or not. The timeout is specified in milliseconds. As a side-effect of Kaffe's fast locking scheme, there is a one-to-one mapping between Ksem and threads. Also, all Ksem's are dynamically allocated when a thread is created and all are initialized before being used. B. The jmutex/jcondvar interface: -------------------------------- files: kaffe/kaffevm/ksem.h, threading system lock files types: jmutex, jcondvar jmutex_initialize jmutex_lock jmutex_unlock jmutex_destroy jcondvar_initialize jcondvar_signal jcondvar_wait jcondvar_destroy At this layer, mutexes are distinct from condition variables, though the condition variable wait and signal are provided with an associated mutex. A jmutex should not be recursive. Kaffe guarantees that a jmutex and jcondvar are used in a strict pair (see the implementation of Ksem). Broadcast is never used on the condition variable. Kaffe Fast Locking ================== Kaffe has a fast locking scheme that minimizes the cost of uncontended locks and non recursive locks. The fast locking scheme is implemented between VM-internal locking layer and the threading system's locking primitives. The basic idea is that a "heavy" lock is not allocated unless there is some contention for it. Note that once an heavy lock has been allocated the lock remains heavy until it is destroyed. The implementation takes into account one observation: all pointers to heap-allocated objects in Kaffe are at least 4-byte aligned, thus making the bottom two bits of every pointer available. However, we cannot rely on the stack frame to see recursive locks as some (un)locking may happen at different level. For example, JNI requires that we have JNI_MonitorEnter and JNI_MonitorExit: those two functions are called by an external library to (un)lock an object however it is absolutely not guaranteed that the stack pointer will be the same for the two calls. So recursion cannot be fully optimized. Here is some weak pseudo-code to explain the heavy lock algorithm. if lock field is NULL set lock field to current thread's "id" else /* we are going to acquire an heavy lock */ if lock field bottom bit is not set /* This is a thin lock */ if it is a static lock try to acquire the heavy lock success => initialize it and put it in the lock field failure => exit the outer if to "a". else it is not a static lock /* It is a dynamic lock and we must allocate the * heavy */ Allocate and initialize a new heavy lock try to put it in the lock field endif endif a: /* lock field is an heavy lock */ b: try to acquire the heavy lock if it fails wait a semaphore event loop to "b". When the heavy lock is released the algorithm increase the semaphore counter only if some thread is waiting for the semaphore. The checking is done using an atomic counter which gives the number of threads waiting on the given semaphore. Jason Baker <jbaker@cs.utah.edu> Jun 29, 1999 updated by Alexandre Oliva <oliva@lsd.ic.unicamp.br> Sept 18, 1999 updated by Patrick Tullmann <tullmann@cs.utah.edu> Mar 17, 2000 updated by Guilhem Lavaux <guilhem@kaffe.org> Mar 3, 2005