Notes on the Foreign Function Interface (ffi) - 12 Feb 2001


This release includes a partial implementation of the Haskell foreign
function interface definition:
  
  http://www.haskell.org/hdirect/ffi.html
  http://www.haskell.org/hdirect/ffi-a4.ps.gz
  http://www.haskell.org/hdirect/ffi-letter.ps.gz
  http://www.haskell.org/hdirect/ffi-a4.dvi.gz
  http://www.haskell.org/hdirect/ffi-letter.dvi.gz

with two minor caveats (excruciating details appended at the end):

o "foreign export static" is not implemented but, fortunately, this is
  one of the least used parts of the ffi and can be worked around.

o "foreign export dynamic" is implemented only for the x86
  architecture, but it should be easy for any experienced
  assembly language programmer to port.



Suppose you have some C functions in test.c and some ffi
declarations for those functions in Test.hs.  You can use them with
Hugs as follows:

  # Generate Test.c (note that it is _not_ test.c)
  #
  # [For every Haskell file loaded which contains ffi declarations,
  # this will generate a .c file _in the current working directory_.]
  hugs +G Test.hs

  # Compile and partially link Test.c and test.c putting the
  # result in Test.so.
  #
  # Details on how to partially link files vary from one platform to
  # another.

  # Most Unixen:
  cc -shared -I/usr/local/share/hugs/include Test.c test.c -o Test.so

  # MacOS X:
  cc -bundle -I/usr/local/share/hugs/include Test.c test.c -o Test.so

  # Run Hugs as normal - when Test.hs is loaded, it will load Test.so
  hugs Test.hs

  # And now try using the imported or exported functions.
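
For concreteness, a test.c for the steps above could be as small as
the sketch below.  The function name add_ints is purely illustrative
and does not come from the Hugs distribution; Test.hs would need a
matching ffi declaration for it.

```c
/* test.c - hypothetical example for the build steps above.
 * add_ints is an illustrative name, not part of Hugs. */
int add_ints(int a, int b)
{
    /* An ordinary C function; nothing Hugs-specific is needed here. */
    return a + b;
}
```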



Enjoy!

--
Alastair Reid        reid@cs.utah.edu        http://www.cs.utah.edu/~reid/




Known limitations:

o foreign export static is not implemented.

  You can code around this by writing:
  
    foreign import dynamic foo_dynamic :: Addr -> (A -> B -> C)
    foreign label foo_addr :: Addr
    foo = foo_dynamic foo_addr
  
  instead of:

    foreign import foo :: A -> B -> C

  Ideally Hugs would do this for you but there are some tricky
  interactions between ffi and type classes which baffle me.  Sorry.

o foreign export dynamic is only implemented for the x86 architecture.

  The following information is intended for those brave souls who try 
  to port the implementation to other architectures and can be safely 
  ignored by everyone else.

  To make foreign export dynamic work for other architectures, you
  have to modify the function mkThunk in hugs98/src/builtin.c to
  generate a short sequence of machine code (and then send your
  fix to hugs-bugs@haskell.org for inclusion in the next release).

  The goal of the code is (more or less) to implement this C function:

    rty f(ty1 a1, ... tym am) {
      return (*app)(s,a1, ... am);
    }
 
  where rty, ty1, ... tym are C types, app is an "apply" function
  generated by running "hugs +G" and "s" is a "stable pointer" to the 
  Haskell function being exported.  The function is written in
  machine code for the following reasons:

  o For foreign export dynamic, the function has to be generated
    dynamically and neither ANSI C nor any extensions we know of let
    you generate C functions at runtime.  The alternative of 
    invoking the C compiler and loader at runtime is not attractive.

  o The code has to be placed next to a data structure in memory.
    The data structure has this type:
      
      struct thunk_data {
          struct thunk_data* next;
          struct thunk_data* prev;
          HugsStablePtr      stable;
          char               code[16];
      };

    The next and prev pointers are used to implement a doubly-linked list 
    used by the garbage collector to keep track of all dynamically 
    exported functions.

    The stable field stores a stable pointer to the Haskell function being
    exported.  This is used by the garbage collector.

    The code field stores the machine code.  It is expected that the size
    will have to be changed for other architectures.
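
    The list bookkeeping described above might look roughly like the
    following sketch.  The names all_thunks, link_thunk and
    unlink_thunk are hypothetical and are not taken from builtin.c,
    and the typedef of HugsStablePtr as int is an assumption made so
    that the fragment compiles on its own.

```c
#include <stddef.h>

typedef int HugsStablePtr;   /* assumption: the real typedef is in the Hugs headers */

struct thunk_data {
    struct thunk_data* next;
    struct thunk_data* prev;
    HugsStablePtr      stable;
    char               code[16];
};

/* Hypothetical head of the list of all dynamically exported functions. */
static struct thunk_data* all_thunks = NULL;

/* Insert a freshly allocated thunk at the front of the list. */
static void link_thunk(struct thunk_data* t)
{
    t->prev = NULL;
    t->next = all_thunks;
    if (all_thunks != NULL)
        all_thunks->prev = t;
    all_thunks = t;
}

/* Remove a thunk from the list, e.g. just before it is freed. */
static void unlink_thunk(struct thunk_data* t)
{
    if (t->prev != NULL)
        t->prev->next = t->next;
    else
        all_thunks = t->next;
    if (t->next != NULL)
        t->next->prev = t->prev;
    t->next = NULL;
    t->prev = NULL;
}
```

    Keeping both next and prev means a thunk can be unlinked in
    constant time given only its own address.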

  o By writing in assembly/machine code, it is possible to use the
    same code sequence no matter what the function type is.  This
    works because the C calling convention on most machines has the
    stack looking something like this (the stack grows downwards in
    this picture)
    
         |  ...   |
         +--------+
         |  argm  |
         +--------+
            ...  
         +--------+
         |  arg2  |
         +--------+
         |  arg1  |
         +--------+
         |ret_addr|
         +--------+
    
    This calling convention is more or less imposed by the need to 
    support vararg functions in C. 

    To implement the above function, all we need to do is adjust the
    stack to look like this:


         |  ...   |
         +--------+
         |  argm  |
         +--------+
            ...  
         +--------+
         |  arg2  |
         +--------+
         |  arg1  |
         +--------+
         |   s    |
         +--------+
         |ret_addr|
         +--------+
    
    and jump to (tailcall) the start of app.

    On the x86, you can do this with the following code sequence:
    
      pushl (%esp)      ; move the return address "up"
      movl  $s,4(%esp)  ; stick the stable pointer "under" it
      jmp   app         ; tail call app

    On machines with very different architectures, you can
    (hopefully) get things working by passing the stable pointer in a
    global variable or, perhaps, a callee-saves register and tweaking
    the "app" function (which is generated by implementForeignExport
    in ffi.c) to expect "s" in that variable instead of on the stack.

  o It is machine code instead of assembly code because we don't want
    to invoke an assembler and linker/loader at runtime.  

    Having determined which assembly code sequence to use, use 
    "as -a" (or equivalent) to view the corresponding machine code and
    then write C code which will insert that code into the code field 
    of a thunk.  

    For the x86, the code looks like this.  

      #if defined(__i386__)
          /* 3 bytes: pushl (%esp) */
          *pc++ = 0xff; *pc++ = 0x34; *pc++ = 0x24;  
      
          /* 8 bytes: movl $s,4(%esp) */
          *pc++ = 0xc7; *pc++ = 0x44; *pc++ = 0x24; *pc++ = 0x04; 
          *((HugsStablePtr*)pc)++ = s;
      
          /* 5 bytes: jmp app */
          *pc++ = 0xe9;
          *((int*)pc)++ = (char*)app - ((char*)&(thunk->code[16]));
      #else
          ...
      #endif
           
    This code contains a copy of the stable pointer because it is
    convenient to do this on the x86.  On architectures such as the
    Sparc where 32-bit immediate loads are more painful, it may be
    easier to load the copy of the stable pointer stored in the 
    thunk - this is stored at a fixed offset from the code.
    Likewise, it may be convenient to add a copy of "app" to the
    thunk struct.
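
The x86 byte sequence shown above relies on using a cast as an lvalue,
which not every C compiler accepts.  As a rough sketch only, the same
16 bytes can be built with explicit byte stores.  The helper names
put_le32 and emit_x86_thunk are hypothetical, HugsStablePtr is assumed
to be a 32-bit value (as the 8-byte movl implies), and addresses are
passed as plain 32-bit integers so that the fragment compiles on any
host.  It only fills the buffer; it does not execute it.

```c
#include <stdint.h>

typedef int32_t HugsStablePtr;   /* assumption: 32-bit handle on x86 */

/* Store a 32-bit value in little-endian order (the x86 byte order). */
static unsigned char* put_le32(unsigned char* p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
    return p + 4;
}

/* Hypothetical helper: fill code[16] with
 *     pushl (%esp); movl $s,4(%esp); jmp app
 * code_addr is the address the code field will have at run time and
 * app_addr is the address of the "apply" function. */
static void emit_x86_thunk(unsigned char code[16], uint32_t code_addr,
                           uint32_t app_addr, HugsStablePtr s)
{
    unsigned char* pc = code;

    /* 3 bytes: pushl (%esp) */
    *pc++ = 0xff; *pc++ = 0x34; *pc++ = 0x24;

    /* 8 bytes: movl $s,4(%esp) */
    *pc++ = 0xc7; *pc++ = 0x44; *pc++ = 0x24; *pc++ = 0x04;
    pc = put_le32(pc, (uint32_t)s);

    /* 5 bytes: jmp app.  The displacement is relative to the end of
     * the jmp instruction, which is also the end of the code field. */
    *pc++ = 0xe9;
    put_le32(pc, app_addr - (code_addr + 16));
}
```

The three instructions add up to 3 + 8 + 5 = 16 bytes, which is why
the code field has that size on the x86.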