This directory contains sample code for an extension module that
bridges Gauche and C++ libraries.

The two files, mqueue.h and mqueue.cpp, are our hypothetical
external C++ library.  It implements a simple message queue and
is independent of Gauche.  Our mission here is, given this
mqueue library, to write a Gauche binding for it.
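
The library source is not reproduced in this README.  For orientation,
here is a minimal sketch of what such a library might look like.  Only
MQueue, popMessage() and the 'reason' member of MQueueException appear
in the text below; pushMessage(), empty() and all other details are
assumptions made for illustration, not the actual mqueue.h.

```cpp
// Hypothetical stand-in for mqueue.h; not the actual library source.
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Thrown on queue errors.  The 'reason' member is the one used by the
// stub code later in this document.
class MQueueException {
public:
    std::string reason;
    explicit MQueueException(const std::string& r) : reason(r) {}
};

// A simple FIFO message queue.
class MQueue {
public:
    // Appends a message and returns the new queue length (assumed API).
    std::size_t pushMessage(const std::string& msg) {
        q_.push_back(msg);
        return q_.size();
    }
    // Removes and returns the oldest message; throws if the queue is empty.
    std::string popMessage() {
        if (q_.empty()) throw MQueueException("queue is empty");
        std::string msg = q_.front();
        q_.pop_front();
        return msg;
    }
    bool empty() const { return q_.empty(); }
private:
    std::deque<std::string> q_;
};

// Small self-check: messages come out in the order they went in, and
// popping an empty queue throws.
inline bool mqueue_sketch_demo() {
    MQueue mq;
    mq.pushMessage("first");
    mq.pushMessage("second");
    bool ok = mq.popMessage() == "first" && mq.popMessage() == "second";
    try {
        mq.popMessage();              // the queue is empty now
        return false;
    } catch (MQueueException& e) {
        return ok && e.reason == "queue is empty";
    }
}
```

Nothing in this sketch refers to Gauche, which is the point: the
binding is written entirely outside the library.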

The following files are needed for our extension:

  Makefile.in
  configure.in
  mqueue_glue.cpp    - a bridge (glue) between C++ library and Gauche
  mqueue_glue.h      - ditto
  mqueue_lib.stub    - Scheme binding definition
  test.scm           - unit test
  example/mqueue-cpp.scm - Scheme module definition

Skeletons of these files can be generated with the
'gauche-package' command, like this:

  % gauche-package generate mqueue-cpp example.mqueue-cpp

I renamed the generated mqueue_cpp.h, mqueue_cpp.c, mqueue_cpplib.stub
to more descriptive names, mqueue_glue.h, mqueue_glue.cpp,
and mqueue_lib.stub, respectively.


[Build process]

In order to compile in C++, you have to tweak configure.in and
Makefile.in.

The default extension build process obtains the C compiler name
from gauche-config, which knows what C compiler was used to build
Gauche.  Unfortunately it knows nothing about C++, so we have to
tell the build process about it manually.

In this example, I added the AC_PROG_CXX macro to configure.in to
find the system's C++ compiler.

    NOTE: the C++ compiler has to be 'compatible' with the C
    compiler used to build Gauche, e.g. it must accept the
    same set of command-line options to generate a shared
    library.  gcc and g++ are one such compatible pair.

In Makefile.in, I added the following line to receive the C++
compiler name that configure finds.

    CXX      = @CXX@

Then, when building mqueue_cpp.so, I give the --cc=$(CXX) option to
the gauche-package script to have it use the C++ compiler instead
of the default C compiler.

    mqueue_cpp.$(SOEXT): $(mqueue_SRCS) $(mqueue_HDRS)
        $(GAUCHE_PACKAGE) compile --cc=$(CXX) --verbose mqueue_cpp $(mqueue_SRCS)


[Glue code]

Now, let's take a look at the 'glue' code.

The header file mqueue_glue.h begins by including both the Gauche
interface and the library interface.

    #include <gauche.h>
    #include <gauche/extend.h>
    #include "mqueue.h"

It is important to include <gauche.h> *before any other system
header files*, since gauche.h includes some configuration
information (from gauche/config.h) that may affect the definitions
in the standard header files.  One example is _FILE_OFFSET_BITS,
which is 32 on most 32-bit Linux systems by default; Gauche sets it
to 64, which makes the system library calls use their 64-bit versions.
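
The effect can be observed without Gauche.  The sketch below defines
_FILE_OFFSET_BITS by hand, standing in for what gauche/config.h does;
it assumes a glibc-style system where that macro selects the width of
off_t.  off_t_bytes() is a made-up helper used only here.

```cpp
// Must precede every system header, just as <gauche.h> must.
#define _FILE_OFFSET_BITS 64

#include <sys/types.h>
#include <cassert>
#include <cstddef>

// With the macro in effect, off_t is 64 bits wide even on 32-bit Linux.
static_assert(sizeof(off_t) == 8, "off_t should be 64-bit");

// Hypothetical helper: reports the width of off_t in bytes.
inline std::size_t off_t_bytes() { return sizeof(off_t); }
```

If the #define came after <sys/types.h>, the header would already have
fixed the type, and on a 32-bit system the glue code and Gauche could
disagree about the size of off_t.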

We need a Gauche class that represents the C++ MQueue object.
We're going to use Gauche's foreign pointer feature to implement it.
The pointer to the Scheme <mqueue> class is held in this external
variable, which is set in the extension initialization routine.

    extern ScmClass *MQueueClass;

Some convenience macros.  The <mqueue> object in the Scheme world
is wrapped in a ScmForeignPointer.  'Unboxing' retrieves the C++
MQueue* object from it, and 'boxing' wraps a C++ object in a
ScmForeignPointer.

    #define MQUEUE_P(obj)      SCM_XTYPEP(obj, MQueueClass)
    #define MQUEUE_UNBOX(obj)  ((MQueue*)(SCM_FOREIGN_POINTER_REF(obj)))
    #define MQUEUE_BOX(ptr)    Scm_MakeForeignPointer(MQueueClass, ptr)

Next comes the declaration of the extension initialization routine.
It is important that it be declared and defined with "C" linkage.
The macros SCM_DECL_BEGIN and SCM_DECL_END ensure this (they are the
same as 'extern "C" {' and '}').

    extern void Scm_Init_mqueue_cpp();
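
For readers unfamiliar with linkage specifications, the sketch below
shows the pattern in isolation.  DEMO_DECL_BEGIN, DEMO_DECL_END,
Demo_Init_mqueue_cpp and demo_init_count are made-up names used only
here; they stand in for SCM_DECL_BEGIN, SCM_DECL_END and the real
initialization routine.

```cpp
#include <cassert>

// Hypothetical equivalents of SCM_DECL_BEGIN / SCM_DECL_END.
#ifdef __cplusplus
#define DEMO_DECL_BEGIN extern "C" {
#define DEMO_DECL_END   }
#else
#define DEMO_DECL_BEGIN
#define DEMO_DECL_END
#endif

DEMO_DECL_BEGIN
// Declared with C linkage, so the C++ compiler does not mangle the
// symbol name and it can be looked up by its plain name.
extern void Demo_Init_mqueue_cpp(void);
DEMO_DECL_END

static int demo_init_count = 0;

// The definition picks up C linkage from the declaration above.
void Demo_Init_mqueue_cpp(void) { ++demo_init_count; }
```

Only the entry point needs C linkage; the rest of the glue code can use
ordinary C++ linkage.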

The source file mqueue_glue.cpp implements the bridging machinery.
The important part is this statement:

    MQueueClass =
        Scm_MakeForeignPointerClass(mod, "<mqueue>",
                                    mqueue_print,
                                    mqueue_cleanup,
                                    SCM_FOREIGN_POINTER_KEEP_IDENTITY|SCM_FOREIGN_POINTER_MAP_NULL);

This creates a new Scheme class named "<mqueue>" as a foreign pointer
class and stores a pointer to it in MQueueClass.  The Scheme variable
<mqueue> is bound to the newly created class in the module MOD.

The third argument, mqueue_print, is a print procedure, called
whenever an instance of <mqueue> is printed.

The tricky part here is the 'cleanup' procedure.  If you pass
a cleanup procedure, it is called when the Scheme <mqueue> object
is about to be GC-ed, so that you can free the resources in the C++
world.  Although it sounds simple, it requires more care than
it first appears to.  If you do something wrong here,
you'll get a nasty bug that is very hard to track down.

Typically, there are a few cases to consider with regard to foreign
resource management.

Case 1: You allocate the foreign object via Gauche's allocator
        (SCM_NEW etc.).  In this case, Gauche's GC can take care
        of deallocation, so you don't need to specify a 'cleanup'
        procedure unless you have other resources to free (e.g. file
        descriptors).

Case 2: The foreign object is allocated by the foreign library, and
        it is only pointed to from the Scheme world.  In this case,
        you are responsible for freeing the foreign object when there's
        no reference to it from the Scheme world.  You can use the
        cleanup procedure to do so.

        Note that the cleanup procedure is called when there's
        no reference to this particular Scheme <mqueue> object, but
        that doesn't necessarily mean the C++ MQueue object it points
        to can be freed.  What if another Scheme <mqueue> object
        points to the same C++ MQueue object?  Gauche's GC can't detect
        such a case, since the C++ MQueue object is outside the scope
        of Gauche's memory management.

        There's no universal solution for it, but Gauche provides a
        convenience mechanism that works in typical cases.  If you
        give the flag SCM_FOREIGN_POINTER_KEEP_IDENTITY to
        Scm_MakeForeignPointerClass, then Gauche guarantees that
        Scm_MakeForeignPointer returns exactly the same Scheme object
        for the same pointer (internally, it uses a weak hash table to
        map the void* pointer value to the created Scheme object).
        If your code never assigns the wrapped foreign pointer to
        other Scheme objects, then you can be sure that whenever the
        Scheme <mqueue> object is GC-ed, the C++ MQueue object it
        points to can also be destroyed.

Case 3: The foreign object is allocated by the foreign library, and
        it may be pointed to from both the Scheme and C++ worlds.  In
        this case, you cannot destroy the foreign object even when the
        Scheme object is GC-ed, since there might be a pointer in the
        C++ world that points to the same foreign object.  Gauche's GC
        cannot track it.

        This is a very difficult problem indeed.  Usually, in this
        situation, the foreign library provides some sort of resource
        management infrastructure such as reference counting.  If so,
        you can decrement the reference count in the cleanup routine,
        provided that you increment it in the boxing routine.

        (If you use a reference counting scheme, you don't need
        SCM_FOREIGN_POINTER_KEEP_IDENTITY, since there can be more than
        one Scheme object that points to the same foreign object.)

        Unfortunately this doesn't always work.  There are cases where
        you also have to pass a Scheme object to the foreign library
        (e.g. closures for callbacks), and sometimes they form a
        reference cycle spanning both the Scheme and the foreign
        library worlds.  That is beyond the scope of this document,
        and I'll leave further discussion for some other time.
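
The reference-counting arrangement described in Case 3 can be sketched
in plain C++ with no Gauche calls.  RefCounted, Box, box(), cleanup(),
retain() and release() are all hypothetical names: box() plays the part
of the boxing routine and cleanup() the part of the cleanup procedure
that would be called at GC time.

```cpp
#include <cassert>

// Hypothetical foreign object with an intrusive reference count.
struct RefCounted {
    int refs = 1;             // the C++ side starts with one reference
    static int live;          // number of objects not yet destroyed
    RefCounted()  { ++live; }
    ~RefCounted() { --live; }
};
int RefCounted::live = 0;

inline void retain(RefCounted* p)  { ++p->refs; }
inline void release(RefCounted* p) { if (--p->refs == 0) delete p; }

// Stand-in for the Scheme-side wrapper object.
struct Box { RefCounted* ptr; };

// Boxing takes a reference on behalf of the Scheme world.
inline Box box(RefCounted* p) { retain(p); return Box{p}; }

// Cleanup drops that reference; the object dies only once the C++
// side has let go as well.
inline void cleanup(Box& b) { release(b.ptr); b.ptr = nullptr; }

inline bool refcount_demo() {
    RefCounted* obj = new RefCounted;   // held by C++, refs == 1
    Box a = box(obj);                   // refs == 2
    Box b = box(obj);                   // a second wrapper, refs == 3
    cleanup(a);                         // "GC" of the first wrapper
    cleanup(b);                         // "GC" of the second wrapper
    bool alive_after_gc = (RefCounted::live == 1 && obj->refs == 1);
    release(obj);                       // C++ drops its own reference
    return alive_after_gc && RefCounted::live == 0;
}
```

Because every wrapper holds its own reference, two wrappers for the
same object are harmless here, which is why the identity flag is not
required in this scheme.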


SCM_FOREIGN_POINTER_KEEP_IDENTITY is convenient even in Case 1
or 3, since it lets you use eq? to test the pointer identity
of the foreign object.  However, using it incurs bookkeeping
overhead.
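
The idea behind SCM_FOREIGN_POINTER_KEEP_IDENTITY can be modelled in a
few lines.  This is not Gauche's implementation; it is a sketch that
uses std::weak_ptr in place of a weak hash table and std::shared_ptr
lifetime in place of GC.  Box, registry() and box_identity() are
hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>

// Stand-in for the Scheme wrapper object.
struct Box {
    void* ptr;
    explicit Box(void* p) : ptr(p) {}
};

// Weak table from raw pointer to the wrapper created for it.
// Entries do not keep the wrapper alive.  (Expired entries are never
// purged in this sketch.)
inline std::unordered_map<void*, std::weak_ptr<Box>>& registry() {
    static std::unordered_map<void*, std::weak_ptr<Box>> table;
    return table;
}

// Returns the existing wrapper for p if one is still alive,
// otherwise creates and registers a new one.
inline std::shared_ptr<Box> box_identity(void* p) {
    auto& slot = registry()[p];
    if (auto existing = slot.lock()) return existing;
    auto fresh = std::make_shared<Box>(p);
    slot = fresh;
    return fresh;
}

inline bool identity_demo() {
    int x = 0, y = 0;
    auto a = box_identity(&x);
    auto b = box_identity(&x);   // same pointer: same wrapper
    auto c = box_identity(&y);   // different pointer: different wrapper
    return a == b && a != c;     // wrapper identity, as eq? would test
}
```

The table lookup on every boxing operation is the bookkeeping overhead
mentioned above.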

The other flag, SCM_FOREIGN_POINTER_MAP_NULL, is a convenience flag.
If you specify this flag, Scm_MakeForeignPointer returns SCM_FALSE
if NULL is given.  It is handy if you call a foreign routine that
may return NULL; with this flag, Scheme sees such a case as if
the foreign routine had returned #f.
(Be careful, however: when you use this flag, you cannot assume
that the value returned by Scm_MakeForeignPointer(KLASS, PTR) is an
instance of KLASS.  If you work with the returned value in C code,
you have to check its type.)
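
The caution in parentheses can be illustrated with a small model.
Value, box_map_null() and safe_unbox() are hypothetical stand-ins for
ScmObj, for boxing with the MAP_NULL behaviour, and for a checked
unboxing step; none of them is a Gauche API.

```cpp
#include <cassert>

// Hypothetical tagged value standing in for ScmObj.
struct Value {
    enum Kind { FALSE_VALUE, FOREIGN } kind;
    void* ptr;                     // meaningful only when kind == FOREIGN
};

// Models boxing with the MAP_NULL behaviour: NULL becomes "false".
inline Value box_map_null(void* p) {
    if (p == nullptr) return Value{Value::FALSE_VALUE, nullptr};
    return Value{Value::FOREIGN, p};
}

// C code that consumes the boxed value has to test the type first
// (the role MQUEUE_P plays) before unboxing.
inline bool safe_unbox(const Value& v, void** out) {
    if (v.kind != Value::FOREIGN) return false;
    *out = v.ptr;
    return true;
}

inline bool map_null_demo() {
    int x = 1;
    void* out = nullptr;
    bool got_pointer = safe_unbox(box_map_null(&x), &out) && out == &x;
    bool got_false   = !safe_unbox(box_map_null(nullptr), &out);
    return got_pointer && got_false;
}
```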


[Stub code]

The mqueue_lib.stub file binds Scheme procedures to the foreign
library functions.

The define-type directive tells the stub generator about the <mqueue>
class you create in this extension.

    (define-type <mqueue> "MQueue*" "mqueue"
      "MQUEUE_P" "MQUEUE_UNBOX" "MQUEUE_BOX")

The arguments are: Scheme name, C type name, description (used in
error messages), C function or macro to check the type, an unboxer, 
and a boxer.

    NOTE: I'm planning to have a better way to define foreign types.
    Consider this define-type a temporary solution.

The define-cproc directive defines the foreign function interface:

  (define-cproc SCHEME-NAME (ARGSPEC ...) ::RETURN-TYPE
    CLAUSE ...)

SCHEME-NAME becomes the name of the Scheme function.  Each ARGSPEC
specifies an argument and its type.

  ARGSPEC := name::type

To see the exact meaning of each type and how it is mapped to the
C world, peek at the source of src/genstub (search for "Type handling").
You can use &rest, &optional and &keyword a la Common Lisp.

::RETURN-TYPE specifies what type of value the function body
returns.  If the function doesn't need to return a value,
specify ::<void>; the Scheme function then returns #<undef>.
If the function returns a ScmObj value, you can omit ::RETURN-TYPE
or give ::<top>.  If you give another type, such as ::<int>,
the stub generator generates code to convert the C int to a
Scheme integer.

CLAUSE gives the information about how to call the C code.
There are quite a few clauses you can put here, but the
simplest ones are the following:

  (call C-function-name)

  (expr C-expression)

You must specify one of these.

The 'CALL' clause generates code that calls the function given by
C-function-name, with the arguments specified in ARGSPEC.
If the Scheme calling convention matches the C function, this
is the easiest way.  If return-type is given, the returned value
is boxed accordingly.  If return-type is omitted, the C function
must return ScmObj, which is returned from the Scheme function as is.
If the C function doesn't return a value, return-type must be ::<void>.

The 'EXPR' clause lets you specify a C expression instead.
The result of the expression is boxed accordingly if return-type
is specified.

For more details, check out the stub files included in the
Gauche source tree.

Another clause type worth mentioning here is the 'catch' clause.
It handles C++ exceptions.

Gauche has its own exception handling system, and although both
can coexist, a nonlocal exit of one system must not jump across the
dynamic environment of the other.  For example, suppose you call C++
function (2) from Scheme (1), which in turn calls Scheme function (3),
which calls C++ function (4), which calls C++ function (5),
and C++ function (5) throws an exception.

   Scheme(1) --> C++(2) --> Scheme(3) --> C++(4) --> C++(5)
                                                        throw!

It is fine as long as the exception is caught within (5) or (4).

   Scheme(1) --> C++(2) --> Scheme(3) --> C++(4) --> C++(5)
                                           catch <----- throw


However, it is not allowed to catch the exception in function (2),
since that would jump over Gauche's exception frame set up for
Scheme (3).

   Scheme(1) --> C++(2) --> Scheme(3) --> C++(4) --> C++(5)
                   catch <----------------------------- throw

This means that if you call a C++ function that may throw
an exception, you have to catch it within the stub routine.
The typical way is to convert the caught exception to a Gauche
exception, which can then be caught by Scheme's 'guard' form.

   Scheme(1) --> C++(2) --> Scheme(3) --> C++(4) --> C++(5)
                                            catch <---- throw
                              guard <----- raise

For convenience, you can write a 'catch' spec within the stub
description, such as this:

    (define-cproc mqueue-pop! (mq::<mqueue>) ::<const-cstring>
      (expr "mq->popMessage().c_str()")
      (catch ("MQueueException& e"
              "Scm_Error(\"mqueue-pop!: %s\", e.reason.c_str());")))

The 'catch' spec causes the body of the stub function to be
surrounded by try, and the appropriate catch clauses to be generated.
The generated code will look roughly like this:

    try {
      const char* result;
      result = mq->popMessage().c_str();
      return SCM_MAKE_STR_COPYING(result);
    }
    catch (MQueueException& e) {
      Scm_Error("mqueue-pop!: %s", e.reason.c_str());
    }
    catch (std::exception& e) {
      Scm_Error("mqueue-pop!: %s", e.what());
    }
    catch (...) {
      Scm_Error("C++ exception is thrown in mqueue-pop!");
    }
                            
Note that the mere existence of a 'catch' spec causes the last two
catch clauses (std::exception and ...) to be generated.
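
To see which clause handles which exception, the following sketch
reproduces the same three-tier catch structure in plain C++.  Instead
of calling Scm_Error it returns the message as a std::string.
MQueueException here is a minimal stand-in for the library's type, and
translate() is a hypothetical helper, not something the stub generator
emits.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Minimal stand-in for the library's exception type.
struct MQueueException {
    std::string reason;
};

// Runs body and converts any C++ exception into an error message,
// much as the generated stub converts it into a Gauche error.
inline std::string translate(const std::function<void()>& body) {
    try {
        body();
        return "ok";
    }
    catch (MQueueException& e) {       // from the user's 'catch' spec
        return "mqueue-pop!: " + e.reason;
    }
    catch (std::exception& e) {        // generated automatically
        return std::string("mqueue-pop!: ") + e.what();
    }
    catch (...) {                      // generated automatically
        return "C++ exception is thrown in mqueue-pop!";
    }
}

inline bool translate_demo() {
    return translate([] { throw MQueueException{"empty"}; })
               == "mqueue-pop!: empty"
        && translate([] { throw std::runtime_error("boom"); })
               == "mqueue-pop!: boom"
        && translate([] { throw 42; })
               == "C++ exception is thrown in mqueue-pop!";
}
```

No exception leaves translate(), whatever its type; that is the
property the stub needs so that nothing propagates into Gauche's
frames.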


[Caveats]

Always be careful about the ownership of resources.  If you're
within Gauche's world, most things are taken care of by its garbage
collector.  But once you step into foreign land, it's up
to you again to make sure all resources are managed.

In particular, make sure the memory owned by Gauche is always visible
to Gauche.  For example, if you specify <const-cstring> as an
argument type, Gauche converts the Scheme string to a NUL-terminated
C string, but Gauche still owns the resulting string.  So suppose
the foreign function retains the passed pointer, as in
this fictitious code:

   /* foreign code */
   static const char *ss;

   void foo(const char *s)
   {
     ss = s;
   }

Then it is wrong to write a stub function like this:

   ;; stub function
   (define-cproc foo (s::<const-cstring>) ::<void>
     (call "foo"))

It compiles, but the string passed to "foo" is stored
in a location that Gauche doesn't know about, so the string body
may later be GC-ed, leaving the foreign pointer dangling.
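
One possible way out, sketched below under the assumption that the
foreign library itself cannot be changed, is to hand foo() a copy that
lives outside the GC-managed heap.  foo_with_copy() is a hypothetical
wrapper, not part of the example library; for brevity it also frees
the previous copy itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

/* foreign code, as in the example above */
static const char* ss;
void foo(const char* s) { ss = s; }

// Hypothetical wrapper: gives foo() a private copy, so the pointer
// it retains no longer refers to memory owned by the collector.
void foo_with_copy(const char* s) {
    static char* owned = nullptr;     // copy currently retained by foo()
    std::size_t n = std::strlen(s) + 1;
    char* copy = static_cast<char*>(std::malloc(n));
    if (copy == nullptr) return;      // out of memory: leave ss as is
    std::memcpy(copy, s, n);
    foo(copy);
    std::free(owned);                 // release the previous copy
    owned = copy;
}

inline bool copy_demo() {
    char buf[] = "hello";
    foo_with_copy(buf);
    buf[0] = 'j';                     // simulate the original going away
    return std::strcmp(ss, "hello") == 0;
}
```

A stub would then name such a wrapper instead of foo().  The cost is
that ownership of the copy becomes your responsibility.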

A more subtle case is passing a foreign object pointer, which
itself is allocated via the foreign allocator, to a foreign function
that retains it.  If we adopt the Case 2 scheme described above,
the foreign object is destroyed when its wrapping Scheme object is
GC-ed, even though the foreign object itself isn't allocated by
Gauche, so the retained pointer ends up dangling just the same.