Sophie: kaffe-devel-0:1.1.8-0.20060723.1mdv2007.0 i586

kaffe-devel-1.1.8-0.20060723.1mdv2007.0.i586.rpm

Here is a brief summary of locking in Kaffe.  

There are several layers to the locking abstractions in Kaffe.
Ideally these layers are collapsed by the compiler.

Java Object Locks
=================

	files: kaffe/kaffevm/locks.[hc]

	lockObject(H_java_lang_Object*)
	unlockObject(H_java_lang_Object*)

These should only be used for Java object monitors.  They just take a
pointer to the Java object.  The intent is that if we ever create a
facility to log the execution of all monitorenter/exit bytecodes,
entries and exits of synchronized methods and JNI MonitorEnter/Exit
primitives, we'll be able to do it by tweaking this function.  These
functions are implemented as a thin layer over the VM-internal locks
(See the next section).

NOTE: these are functions, and must remain so, since their addresses
are taken in some engines.

The wait and signal functions on Java objects map directly to the wait
and signal functions in the VM-internal locks (see the next section).
XXX There should probably be waitObject(), signalObject(), and
broadcastObject() interfaces.  Currently the waitCond(), signalCond()
and broadcastCond() interfaces are just invoked directly.  
XXX should be a slowUnlockObjectIfHeld(), too (for stack unwinding due
to an exception).

NOTE: Two other functions are part of this interface:
	slowLockObject(H_java_lang_Object*, void*);
	slowUnlockObject(H_java_lang_Object*, void*);
these are used by some engines when the inlined "fast lock" fails.
See the "Kaffe Fast Locking" section, below.  There need not be
"slow" versions of the wait/signal/broadcast functions.


VM-internal object locks
========================

	files: kaffe/kaffevm/locks.[hc]

	types: iLock

	lockMutex	lockStaticMutex
	unlockMutex	unlockStaticMutex
	waitCond	waitStaticCond		
	signalCond	signalStaticCond	
	broadcastCond	broadcastStaticCond	

There are two flavors of VM-internal locking interfaces.  One for
dynamic locks (left column) and one for static locks (right column).
Dynamic locks are generally used to protect VM-internal dynamic structures
(e.g., classes).  Static locks are generally global locks (e.g., the
gc lock or finalizer lock) or locks that need a specific out-of-order
initialization.

The VM must use its own internal locks for most internal state, to
avoid race conditions with arbitrary Java code that uses object locks.
(For example on threads and classes, there are both logical Java-level
locks, and VM-internal locks).

There is one instance of using the VM-internal interfaces to lock a
Java-level object: when JIT-compiling a method, a lock on its Class is
held.  Logging such a lock via the lockObject() interface would be
unexpected: it would only be logged when the JIT was in use.
Moreover, it is not mandated nor directed by any of the Java monitor
facilities.  Therefore, this case uses the VM variant, and so should
similar cases that ever come up.

These locks are recursive.  Despite the disjoint names, the same iLock
object supports both the lock/unlock and wait/signal/broadcast
interfaces.

The non-static interfaces are a bit odd in that the argument passed in
must be a pointer to a struct with a "lock" field.  The lock field
must be an iLock pointer.  The static interfaces take a pointer to a
iLock pointer.  

To avoid the use of building heavy locks you must only lock once. If
you exceed that limit a heavy lock is allocated and kept in memory
until the lock is destroyed. The new implementation(*) heavy lock is
actually not that slow as it must only acquire the heavy lock by
putting a 1 into some field and then increment the lock counter when
called recursively.
(*) As of 2005-03-11

Threading System Synchronization Primitives
===========================================

The VM-internal locking scheme is implemented on top of the primitives
provided by the Kaffe threading system.  The underlying threading
system has two choices for implementing the lowest layer of locking in
Kaffe.  First is the Ksem interface which provides a semaphore
primitive on which the Kaffe run-time bases its "heavy locks".  (See
the "Kaffe Fast Locking" section below.)  The second option is to
implement the "jmutex/jcondvar" interface (which is just used to fill
out a default Ksem implementation.)

A. The Ksem interface:
---------------------

	files: kaffe/kaffevm/ksem.h, threading system lock files

	types: Ksem

	ksemInit(sem)
	ksemGet(sem,timeout)
	ksemPut(sem)
	ksemDestroy(sem)

The Ksem is a binary semaphore, and need not be recursive.  The Ksem
is initialized to 0 (in semaphore-speak this means the resource is not
available).  ksemGet() can timeout, thus it returns a boolean
indication of whether it acquired the semaphore or not.  The timeout
is specified in milliseconds.

As a side-effect of Kaffe's fast locking scheme, there is a one-to-one
mapping between Ksem and threads.  Also, all Ksem's are dynamically
allocated when a thread is created and all are initialized before
being used.


B. The jmutex/jcondvar interface:
--------------------------------

	files: kaffe/kaffevm/ksem.h, threading system lock files

	types:	jmutex, jcondvar

	jmutex_initialize
	jmutex_lock
	jmutex_unlock
	jmutex_destroy
	
	jcondvar_initialize
	jcondvar_signal
	jcondvar_wait
	jcondvar_destroy

At this layer, mutexes are distinct from condition variables, though
the condition variable wait and signal are provided with an associated
mutex.  A jmutex should not be recursive.  Kaffe guarantees that a
jmutex and jcondvar are used in a strict pair (see the implementation
of Ksem).  Broadcast is never used on the condition variable.


Kaffe Fast Locking
==================

Kaffe has a fast locking scheme that minimizes the cost of uncontended
locks and non recursive locks. The fast locking scheme is implemented between VM-internal
locking layer and the threading system's locking primitives.  The
basic idea is that a "heavy" lock is not allocated unless there is
some contention for it.  Note that once an heavy lock has been
allocated the lock remains heavy until it is destroyed. The
implementation takes into account one observation: all pointers to heap-allocated objects in
Kaffe are at least 4-byte aligned, thus making the bottom two bits of
every pointer available. However, we cannot rely on the stack frame to
see recursive locks as some (un)locking may happen at different
level. For example, JNI requires that we have JNI_MonitorEnter and
JNI_MonitorExit: those two functions are called by an external library
to (un)lock an object however it is absolutely not guaranteed that the
stack pointer will be the same for the two calls. So recursion cannot
be fully optimized.

Here is some weak pseudo-code to explain the heavy lock algorithm.

if lock field is NULL
	set lock field to current thread's "id"
else
	/* we are going to acquire an heavy lock */
	if lock field bottom bit is not set
		/* This is a thin lock */
		if it is a static lock
		   try to acquire the heavy lock
		       success => initialize it and put it in the lock
		       		   field
		       failure => exit the outer if to "a".
		else it is not a static lock
		   /* It is a dynamic lock and we must allocate the
		    * heavy */
		   Allocate and initialize a new heavy lock
		   try to put it in the lock field
		endif
	endif
a:

		/* lock field is an heavy lock */
		b: try to acquire the heavy lock
		   if it fails wait a semaphore event
		   loop to "b".


When the heavy lock is released the algorithm increase the semaphore
counter only if some thread is waiting for the semaphore. The checking is done
using an atomic counter which gives the number of threads waiting on the given
semaphore.

Jason Baker <jbaker@cs.utah.edu>
Jun 29, 1999
updated by Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Sept 18, 1999
updated by Patrick Tullmann <tullmann@cs.utah.edu>
Mar 17, 2000
updated by Guilhem Lavaux <guilhem@kaffe.org>
Mar 3, 2005