This directory contains sample extension module code that bridges
Gauche and a C++ library.

The two files, mqueue.h and mqueue.cpp, are our hypothetical external
C++ library.  It implements a simple message queue and works without
Gauche.

Our mission here is, given this mqueue library, to write a Gauche
binding for it.

The following files are needed for our extension:

  Makefile.in
  configure.in
  mqueue_glue.cpp         - a bridge (glue) between the C++ library and Gauche
  mqueue_glue.h           - ditto
  mqueue_lib.stub         - Scheme binding definition
  test.scm                - unit test
  example/mqueue-cpp.scm  - Scheme module definition

The skeleton of those files can be generated, as usual, by the
'gauche-package' command, like this:

  % gauche-package generate mqueue-cpp example.mqueue-cpp

I renamed the generated mqueue_cpp.h, mqueue_cpp.c, and
mqueue_cpplib.stub to the more descriptive names mqueue_glue.h,
mqueue_glue.cpp, and mqueue_lib.stub, respectively.


[Build process]

In order to compile in C++, you have to tweak configure.in and
Makefile.in.  The default extension build process obtains the C
compiler name from gauche-config, which knows what C compiler was
used to build Gauche.  Unfortunately it knows nothing about C++,
so we have to tell the build process about it manually.

In this example, I added the AC_PROG_CXX macro to configure.in to
find the system's C++ compiler.

NOTE: the C++ compiler has to be 'compatible' with the C compiler
used to build Gauche, e.g. it must accept the same set of command-line
options to generate a shared library.  Gcc and g++ are such a
compatible pair.

In Makefile.in, I added the following line to receive the C++ compiler
name that configure finds.

  CXX = @CXX@

Then, when building mqueue_cpp.so, I give the --cc=$(CXX) option to the
gauche-package script to have it use the C++ compiler instead of the
default C compiler.

  mqueue_cpp.$(SOEXT): $(mqueue_SRCS) $(mqueue_HDRS)
          $(GAUCHE_PACKAGE) compile --cc=$(CXX) --verbose mqueue_cpp $(mqueue_SRCS)


[Glue code]

Now, let's take a look at the 'glue' code.

The header file mqueue_glue.h begins by including both the Gauche
interface and the library interface.

  #include <gauche.h>
  #include <gauche/extend.h>
  #include "mqueue.h"

It is important to include <gauche.h> *before any other system header
files*.  This is because gauche.h includes some configuration
information (from gauche/config.h) that may affect the definitions in
the standard header files.  One example is _FILE_OFFSET_BITS, which is
32 on most 32-bit Linux systems by default; Gauche sets it to 64, which
causes the system library calls to use their 64-bit versions.

We need a Gauche class that represents the MQueue object of the C++
world.  We're going to use Gauche's foreign pointer feature to
implement it.

The pointer to the Scheme <mqueue> class is held in this external
variable.  It is set in the extension initialization routine.

  extern ScmClass *MQueueClass;

Next come some convenience macros.  The <mqueue> object in the Scheme
world is wrapped in a ScmForeignPointer.  'Unboxing' retrieves the C++
MQueue* object from it, and 'boxing' wraps a C++ object in a
ScmForeignPointer.

  #define MQUEUE_P(obj)     SCM_XTYPEP(obj, MQueueClass)
  #define MQUEUE_UNBOX(obj) ((MQueue*)(SCM_FOREIGN_POINTER_REF(obj)))
  #define MQUEUE_BOX(ptr)   Scm_MakeForeignPointer(MQueueClass, ptr)

Next comes the declaration of the extension initialization routine.
It is important for it to be declared/defined in "C" linkage scope.
The macros SCM_DECL_BEGIN and SCM_DECL_END ensure this (they are the
same as 'extern "C" {' and '}').

  extern void Scm_Init_mqueue_cpp();

The source mqueue_glue.cpp implements the bridging machinery.
The important part is this statement:

  MQueueClass =
    Scm_MakeForeignPointerClass(mod, "<mqueue>",
                                mqueue_print,
                                mqueue_cleanup,
                                SCM_FOREIGN_POINTER_KEEP_IDENTITY|SCM_FOREIGN_POINTER_MAP_NULL);

This creates a new Scheme class, named "<mqueue>", as a foreign pointer
class and stores its pointer in MQueueClass.  The Scheme variable
<mqueue> is bound to the newly created class in the module MOD.
The third argument is a print procedure, called whenever an instance
of <mqueue> is printed.

The tricky part here is the 'cleanup' procedure.  If you pass a cleanup
procedure, it is called when the Scheme <mqueue> object is about to be
GC-ed, so that you can free the resources in the C++ world.
It sounds simple, but it requires more care than it appears to.
If you do something wrong here, you'll get a nasty bug that is very
hard to track down.

Typically, there are a few cases with regard to foreign resource
management.

Case 1: You allocate the foreign object via Gauche's allocator
   (SCM_NEW etc.)

   In this case, Gauche's GC can take care of deallocation, so you
   don't need to specify a cleanup procedure unless you have other
   resources to free (e.g. file descriptors).

Case 2: The foreign object is allocated by the foreign library, and
   it is only pointed to from the Scheme world.

   In this case, you are responsible for freeing the foreign object
   when there's no reference to it from the Scheme world.  You can use
   the cleanup procedure to do so.

   Note that the cleanup procedure is called when there's no reference
   to this particular Scheme <mqueue> object, but that doesn't
   necessarily mean that the C++ MQueue object it points to can be
   freed.  What if there is another Scheme <mqueue> object that points
   to the same C++ MQueue object?  Gauche's GC can't detect such a
   case, since the C++ MQueue object is outside the scope of Gauche's
   memory management.

   There's no universal solution for it, but Gauche provides a
   convenience mechanism that works in typical cases.  If you give
   the flag SCM_FOREIGN_POINTER_KEEP_IDENTITY to
   Scm_MakeForeignPointerClass, then Gauche guarantees that
   Scm_MakeForeignPointer returns exactly the same Scheme object for
   the same pointer (internally, it uses a weak hash table to map the
   void* pointer value to the created Scheme object).
   If your code never assigns the wrapped foreign pointer to other
   Scheme objects, then you can be sure that whenever the Scheme
   <mqueue> object is GC-ed, the C++ MQueue object it points to can
   also be destroyed.

Case 3: The foreign object is allocated by the foreign library, and
   it may be pointed to from both the Scheme and C++ worlds.

   In this case, you cannot destroy the foreign object even when the
   Scheme object is GC-ed, since there might be a pointer in the C++
   world that points to the same foreign object.  Gauche's GC cannot
   track it.

   This is a very difficult problem indeed.  Usually, if this is the
   case, the foreign library provides some sort of resource management
   infrastructure such as reference counting.  If so, you can decrement
   the reference count in the cleanup routine, provided that you
   increment the reference count in the boxing routine.  (If you use a
   reference counting scheme, you don't need to use
   SCM_FOREIGN_POINTER_KEEP_IDENTITY, for there can be more than one
   Scheme object that points to the same foreign object.)

   Unfortunately this sometimes won't work; there are cases where you
   also have to pass a Scheme object to the foreign library (e.g.
   closures for callbacks), and sometimes they form a reference loop
   spanning both the Scheme and foreign library worlds.  That is
   beyond the scope of this document, and I'll leave further
   discussion for some other time.

The use of SCM_FOREIGN_POINTER_KEEP_IDENTITY is convenient even for
case 1 or 3, since you can use eq? to test the pointer identity of the
foreign object.  However, using it incurs bookkeeping overhead.

The other flag, SCM_FOREIGN_POINTER_MAP_NULL, is a convenience flag.
If you specify this flag, Scm_MakeForeignPointer returns SCM_FALSE if
NULL is given.  It is handy if you call a foreign routine that may
return NULL; with this flag, such a case is seen from Scheme as if the
foreign routine returned Scheme #f.
(Be careful, however: when you use this flag, you cannot assume that
the value returned by Scm_MakeForeignPointer(KLASS, PTR) is an
instance of KLASS.  If you work with the returned value in C code,
you have to check its type.)


[Stub code]

The mqueue_lib.stub file binds the Scheme procedures to the foreign
library functions.

The define-type directive tells the stub generator about the <mqueue>
class you create in this extension.

  (define-type <mqueue> "MQueue*" "mqueue"
    "MQUEUE_P" "MQUEUE_UNBOX" "MQUEUE_BOX")

The arguments are: the Scheme name, the C type name, a description
(used in error messages), a C function or macro to check the type,
an unboxer, and a boxer.

NOTE: I'm planning to have a better way to define foreign types.
Consider this define-type a temporary solution.

The define-cproc directive defines the foreign function interface:

  (define-cproc SCHEME-NAME (ARGSPEC ...) CLAUSE ...)

SCHEME-NAME becomes the name of the Scheme function.

ARGSPEC specifies the arguments and their types.

  ARGSPEC := name::type

To see the exact meanings of the types and how they are mapped to the
C world, look at the source of src/genstub (search for "Type
handling").  You can use &rest, &optional and &keyword a la Common
Lisp.

CLAUSE gives the information about how to call the C code.  There are
quite a few clauses you can put here, but the most important ones are
the following:

  (call [return-type] C-function-name)
  (expr [return-type] C-expression)
  (body [return-type] C-code-fragment ...)

You have to have exactly one of these.

The 'CALL' clause generates code that calls the function given by
C-function-name, with the arguments specified in ARGSPEC.  If the
Scheme calling convention matches the C function, this is the easiest
way.  If return-type is given, the returned value is boxed
accordingly.  If return-type is omitted, the C function must return
ScmObj, which is returned from the Scheme function as is.
You can also put <void> as the return-type, in which case the return
value of the C function is discarded and the Scheme function returns
#<undef>.

The 'EXPR' clause allows you to specify a C expression instead.
The result of the expression is boxed accordingly if return-type is
specified.

The 'BODY' clause allows more complicated processing.  You can write
any C code in C-code-fragment.  Unless return-type is <void>, you
have to assign to the variable SCM_RESULT within the C-code-fragment.
SCM_RESULT is declared with a suitable type that matches return-type.
After C-code-fragment is executed, the value of SCM_RESULT is boxed
and returned from the Scheme function.

For more details, check out the stub files included in the Gauche
source tree.  I'll write up a more complete specification by the 0.9
release.

Another clause type worth mentioning here is the 'catch' clause.  It
handles C++ exceptions.

Gauche has its own exception handling system, and although both can
coexist, a nonlocal exit of one system is not allowed to jump across
the dynamic environment of the other.

For example, suppose you call C++ function (2) from Scheme (1),
which in turn calls Scheme function (3), which calls C++ function (4),
which calls C++ function (5).  And C++ function (5) raises an
exception.

  Scheme(1) --> C++(2) --> Scheme(3) --> C++(4) --> C++(5)
                                                    throw!

It is fine as long as the exception is caught within (5) or (4).

  Scheme(1) --> C++(2) --> Scheme(3) --> C++(4) --> C++(5)
                                         catch <----- throw

However, it is not allowed to catch the exception in function (2),
since that would jump across the Gauche exception frame set up for
Scheme (3).

  Scheme(1) --> C++(2) --> Scheme(3) --> C++(4) --> C++(5)
                catch <----------------------------- throw

This means that if you call a C++ function that may throw an
exception, you have to catch it within the stub routine.  The typical
way is to convert the caught exception to a Gauche exception.  The
Gauche exception can then be caught by Scheme's 'guard' form.

  Scheme(1) --> C++(2) --> Scheme(3) --> C++(4) --> C++(5)
                                         catch <---- throw
                           guard <----- raise

For convenience, you can write a 'catch' spec within the stub
description, such as this:

  (define-cproc mqueue-pop! (mq::<mqueue>)
    (expr <const-cstring> "mq->popMessage().c_str()")
    (catch ("MQueueException& e"
            "Scm_Error(\"mqueue-pop!: %s\", e.reason.c_str());")))

The 'catch' spec causes the body of the stub function to be surrounded
by try, and the appropriate catch clauses to be generated.  The
generated code will be roughly like this:

  try {
    const char* result;
    result = mq->popMessage().c_str();
    return SCM_MAKE_STR_COPYING(result);
  } catch (MQueueException& e) {
    Scm_Error("mqueue-pop!: %s", e.reason.c_str());
  } catch (std::exception& e) {
    Scm_Error("mqueue-pop!: %s", e.what());
  } catch (...) {
    Scm_Error("C++ exception is thrown in mqueue-pop!");
  }

Note that the mere existence of a 'catch' spec causes the last two
catch clauses (std::exception and ...) to be generated.


[Caveats]

Always be careful about the ownership of resources.  While you're
within Gauche's world, most things are taken care of by its garbage
collector.  But once you step into the foreign land, it's up to you
again to make sure all resources are managed.

In particular, make sure the memory owned by Gauche is always visible
from Gauche.  For example, if you specify <const-cstring> as an
argument type, Gauche converts the Scheme string to a NUL-terminated C
string, but Gauche still owns the resulting string.  So, if the
foreign function retains the passed pointer, as in this fictitious
code:

  /* foreign code */
  static const char *ss;

  void foo(const char *s)
  {
     ss = s;
  }

then it is wrong to write a stub function like this:

  ;; stub function
  (define-cproc foo (s::<const-cstring>)
    (call <void> "foo"))

It compiles, but the string passed to "foo" is stored in a location
that Gauche doesn't know about, so the string body may later be
GC-ed, leaving the foreign pointer dangling.
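
If you control the foreign side, one way around the dangling pointer
is to have the foreign function copy the characters into storage it
owns, instead of retaining the pointer Gauche passed in.  The
following is only a minimal sketch of that idea; saved_message and
foo_saved are hypothetical names made up for this illustration and
are not part of the mqueue example.

```cpp
#include <cassert>
#include <string>

/* Hypothetical foreign code: it owns a copy of the message, so it no
   longer depends on the lifetime of the caller's buffer. */
static std::string saved_message;

void foo(const char *s)
{
    assert(s != nullptr);
    saved_message = s;      /* deep copy of the NUL-terminated string */
}

/* Hypothetical accessor, added only so the copy can be inspected. */
const char *foo_saved()
{
    return saved_message.c_str();
}
```

With a foo written this way, the stub shown above no longer leaves a
pointer into Gauche-owned memory behind, because nothing outside
Gauche refers to the converted string once foo returns.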

A more subtle case is passing a foreign object pointer, which is
itself allocated by the foreign allocator, to a foreign function.
If we adopt the Case 2 scheme described above, the foreign object
is destroyed when its wrapping Scheme object is GC-ed, even though
the foreign object itself isn't allocated by Gauche.
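
If the foreign library offers reference counting, the Case 3 approach
described earlier avoids this: the boxing routine takes a reference,
the cleanup procedure drops it, and a foreign function that keeps the
pointer takes its own.  The sketch below only simulates that
bookkeeping in plain C++.  Counted, retain, release, keep, drop_kept,
on_box and on_cleanup are all hypothetical names, and the actual
Gauche calls (Scm_MakeForeignPointer and the cleanup callback) are
left out.

```cpp
#include <cassert>

/* Hypothetical reference-counted foreign object (not part of mqueue). */
struct Counted {
    int refs = 1;           /* the creator holds the first reference */
    static int live;        /* number of objects currently alive */
    Counted()  { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

void retain(Counted *p)  { ++p->refs; }
void release(Counted *p)
{
    assert(p->refs > 0);
    if (--p->refs == 0) delete p;   /* last reference gone: destroy */
}

/* Foreign function that keeps the pointer; it takes its own reference. */
static Counted *kept = nullptr;
void keep(Counted *p) { retain(p); kept = p; }
void drop_kept()      { if (kept) { release(kept); kept = nullptr; } }

/* What a boxing routine and a cleanup procedure would do around the
   Gauche calls, which are omitted in this simulation. */
Counted *on_box(Counted *p)  { retain(p); return p; }
void on_cleanup(Counted *p)  { release(p); }
```

In this arrangement the object outlives the GC of its Scheme wrapper
for as long as the foreign side still holds a reference, and it is
destroyed only when the last holder releases it.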