Sophie: parrot-docs-3.6.0-2.fc15 noarch

parrot-docs-3.6.0-2.fc15.noarch.rpm

=head1 NAME

Parrot Subroutines

=head1 VERSION

$Revision$

=head1 ABSTRACT

This document describes how to define, call, and return from Parrot subroutine
objects and other invokables.

=head1 DESCRIPTION

Parrot comes with different subroutine and related classes which implement CPS
(Continuation Passing Style) and PCC (Parrot Calling Conventions)
F<docs/pdds/pdd03_calling_conventions.pod>.

=head2 Class Tree

These are all of the built-in classes that are directly callable, or
"invokable":

  Sub
    Closure
    Coroutine
    Eval
  Continuation
    ExceptionHandler

By "invokable" we mean that they can be supplied as the first argument to the
C<invoke>, C<invokecc>, or C<tailcall> instructions.  Generally speaking,
invokable objects are divided into two subtypes:  C<Sub> and classes that are
built on it create a new context when invoked, and C<Continuation> classes
return control to an existing context that was captured when the
C<Continuation> was created.

There are (of course) two classes that straddle this distinction:

=over 4

=item 1.

Invoking a C<Closure> object creates a new context for the sub it refers to
directly, but it also captures an "outer" context that provides bindings for
the immediately-enclosing lexical scope (and, if that context is itself is for
a C<Closure>, the subsequent scopes working outwards).

[add a C<newclosure> example?  -- rgr, 6-Apr-08.]

=item 2.

A C<Coroutine> acts like a normal sub when called initially, and can also
return normally, but acts like a continuation when exited via the C<yield>
instruction and re-entered by re-invoking.

[need a reference to a C<coroutine> example.  -- rgr, 6-Apr-08.]

=back

=head1 SYNOPSIS

=head2 Creating subs

Subs are created by IMCC (the PIR compiler) via the B<.sub> directive.  Unless
the C<:anon> pragma is included, they are stored in the constant table
associated with the bytecode and can be fetched with the B<get_hll_global> and
B<get_root_global> opcodes.  Within the PIR source, they can also be put in
registers with a C<.const 'Sub'> declaration:

=begin PIR_FRAGMENT

    .const 'Sub' rsub = 'random_sub'

=end PIR_FRAGMENT

This uses C<find_sub_not_null> under the hood to look up the sub named
"random_sub".

Here's an example of fetching a sub from another namespace:

=begin PIR

    .sub main :main
	get_hll_global $P0, ['Other'; 'Namespace'], "the_sub"
	$P0()
	print "back\n"
    .end

    .namespace ['Other'; 'Namespace']

    .sub the_sub
	print "in sub\n"
    .end

=end PIR

Note that C<the_sub> could be defined in a different bytecode or PIR source
file from C<main>.

=head2 Program entry point

One subroutine in the first executed source or bytecode file may be flagged as
the "main" subroutine, where execution starts.

=begin PIR

  .sub the_main_event :main
     # ...
  .end

=end PIR

In the absence of a B<:main> entry Parrot starts execution at the first
statement.  Any C<:main> directives in a subsequent PIR or bytecode file that
are loaded under program control are ignored.

Note that if the first executed source or bytecode file contains more than one
sub flagged as C<:main>, Parrot currently picks the I<last> such sub to start
execution.  This is arguably a bug, so users should not depend upon it.

=head2 Load-time initialization

If a subroutine is marked as B<:load> this subroutine is run, before the
B<load_bytecode> opcode returns.

e.g.

=begin PIR

  .sub main :main
     print "in main\n"
     load_bytecode "library_code.pir"
     print "back to main\n"
  .end

  # library_code.pir

  .sub _my_lib_init :load
     print "initializing library\n"
  .end

=end PIR

If a subroutine is marked as B<:init> this subroutine is run before the
B<:main> or the first subroutine in the source file runs.  Unlike B<:main>
subs, B<:init> subs are also run when compiling from memory.  B<:load> subs
are run only in any source or bytecode files loaded subsequently.

These markers are called "pragmas", and are defined fully in
L<docs/pdds/pdd19_pir.pod>.  The following table summarizes the behavior
of the five pragmas that cause Parrot to run a sub implicitly:

                ------ Executed when --------
                compiling to    -- loading --
  Sub Pragma    disk  memory    first   after
  ==========    ====  ======    =====   =====
   :immediate   yes   yes       no      no        
   :postcomp    yes   no        no      no        
   :load        no    no        no      yes       
   :init        no    yes       yes     no        
   :main        no    no        yes     no        

The same load-time behavior applies regardless of whether the loaded file is
PIR source or bytecode.  Note that it is possible to mark a sub with both
B<:load> and B<:init>.

=head2 Defining subs

A sub is defined by a block of code starting with C<.sub> and ending with
C<.end>. Parameters which the sub can be called with are defined by C<.param>:

=begin PIR

    .sub do_something
      .param pmc a_pmc
      .param string some_string
      #do something
    .end

=end PIR

The set of C<.param> instructions are converted to a single C<get_params>
instruction. The compiler will decide which registers to use.

=begin PIR_FRAGMENT
    
    get_params '(0,0)', $P0, $S0

=end PIR_FRAGMENT

A parameter can be declared optional with the C<:optional> command. If an
optional parameter is followed by parameter declared C<:opt_flag>, this
parameter will store an integer indicating whether the optional parameter
was used.

=begin PIR_FRAGMENT

    .param string maybe :optional
    .param int has_maybe :opt_flag
    unless has_maybe goto no_maybe
    #do something with maybe
    no_maybe:
    #don't use maybe

=end PIR_FRAGMENT

A sub can accept an arbitrary number of parameters by declaring a C<:slurpy>
parameter.  This creates a pmc containing an array of all parameters passed to
the sub, these can be accessed like so:

=begin PIR_FRAGMENT

    .param pmc all_params :slurpy

    $P0 = all_params[0]
    $S0 = all_params[1]

=end PIR_FRAGMENT

A slurpy parameter can also be defined after a set of positional parameters, in
which case it will only hold any additional parameters passed.

A parameter may also be declared C<:named>, giving them a string which can be
used when calling the sub to explicitly assign a parameter, ignoring position.

=begin PIR_FRAGMENT

    .param int counter :named("counter")

=end PIR_FRAGMENT

This can be combined with C<:optional> as well as C<:opt_flag>, so that the
parameter need only be passed when necessary.

If a parameter is declared with C<:slurpy> and C<:named> (with no string), it
creates an associative array containing all named parameters which can be
accessed like so:

=begin PIR_FRAGMENT

    .param pmc all_params :slurpy :named
    $S0 = all_params['name']
    $I0 = all_params['counter']

=end PIR_FRAGMENT

=head2 Calling the sub

PIR sub invocation syntax is similar to HLL syntax:

=begin PIR_FRAGMENT

    $P0 = do_something($P1, $S3)

=end PIR_FRAGMENT

This is syntactic sugar for the following four bytecode instructions:

=begin PIR_FRAGMENT

    # Establish arguments.
    set_args '(0,0)', $P1, $S3
    # Find the sub.
    $P8 = find_sub_not_null "do_something"
    # Establish return values.
    get_results '(0)', $P0
    # Call the sub in $P8, implicitly creating a return continuation.
    invokecc $P8

=end PIR_FRAGMENT

The sub name could be replaced with a PMC register, in which case the
C<find_sub_not_null> instruction would not be needed.  If the return values
from the sub were ignored (by dropping the C<$P0 => part), the C<get_results>
instruction would be omitted.  However, C<set_args> is emitted even in the
case of a call without arguments.

The first operands to the C<set_args> and C<get_results> instructions are
actually placeholders for an integer array that describes the register types.
For example, the '(0,0)' for C<set_args> is replaced internally with C<[2,
1]>, which means "two arguments, of type PMC and string".  Note that return
values get the same register type coercion as sub parameters.  This is all
described in much more detail in L<docs/pdds/pdd03_calling_conventions.pod>.

Named parameters can be explicity called in one of two ways:

=begin PIR_FRAGMENT

    $P5 = do_something($I6 :named("counter"), $S4 :named("name"))
    #or equivalently
    $P5 = do_something("counter" => $I6, "name" => $S4)

=end PIR_FRAGMENT

To receive multiple values, put the register names in parentheses:

=begin PIR_FRAGMENT

    ($P10, $P11) = do_something($P1, $S3)

    ($P10, $P11) = do_something($P1, $S3)

=end PIR_FRAGMENT

To test whether a value was returned, declare it C<:optional>, and follow it
with an integer register declared C<:opt_val>:

=begin PIR_FRAGMENT_INVALID

    ($P10 :optional, $I10 :opt_val) = do_something($P1, $S3)

=end PIR_FRAGMENT_INVALID

A C<:slurpy> value can be declared, as in parameter declarations, to catch an
arbitrary number of return values:

=begin PIR_FRAGMENT

    ($P12, $P13 :slurpy) = do_something($P1, $S3)

=end PIR_FRAGMENT

Note that the parameters stored in a C<:slurpy>, or C<:slurpy> C<:named> array
can be used as parameters for another call using the C<:flat> declaration:

=begin PIR_FRAGMENT

    ($P14, $P15) = do_something($P13 :flat)

=end PIR_FRAGMENT

Subs may also return C<:named> values, which can be explicitly accessed similar
to parameter declarations:

=begin PIR_FRAGMENT

    ($I11 :named("counter"), $S4 :named("name")) = do_something($P1, $S3)

=end PIR_FRAGMENT

All of these affect only the signature provided via C<get_results>.

[not sure what this is for, leaving it alone for now -aninhumer]

=begin PIR_FRAGMENT

    # Call the sub in $P8, with continuation (created earlier) in $P9.
    invoke $P8, $P9

=end PIR_FRAGMENT

=head2 Returning from a sub

PIR supports a convenient syntax for returning any number of values from a sub
or closure:

=begin PIR

    .sub main 
      .return ($P0, $I1, $S3)
    .end

=end PIR

Integer, float, and string constants are also accepted.  This is translated
to:

=begin PIR_FRAGMENT

    set_returns '(0,0,0)', $P0, $I1, $S3
    returncc	# return by calling the current continuation

=end PIR_FRAGMENT

As for C<set_args>, the '(0,0,0)' is actually a placeholder for an integer
array that describes the register types; it is replaced internally with C<[2,
0, 1]>, which means "three arguments, of type PMC, integer, and string".

All of the declarations allowed for calls to a sub can also be used with
return values. (C<:named>, C<:flat>)

Another way to return from a sub is to use tail-calling, which calls a new sub
with the current continuation, so that the new sub returns directly to the
caller of the old sub (i.e. without first returning to the old sub).  This
passes the three values to C<another_sub> via tail-calling:

=begin PIR

    .sub main
      .tailcall another_sub($P0, $I1, $S3)
    .end

=end PIR

This is translated into a C<set_args> instruction for the call, but with
C<tailcall> instead of C<invokecc>:

=begin PIR_FRAGMENT

    set_args '(0,0,0)', $P0, $I1, $S3
    $P8 = find_sub_not_null "another_sub"
    tailcall $P8

=end PIR_FRAGMENT

As for calling, the sub name could be replaced with a PMC register, in which
case the C<find_sub_not_null> instruction would not be needed.

If needed, the current continuation can be extracted and called explicitly as
follows:

=begin PIR_FRAGMENT

    ## This is what defines .INTERPINFO_CURRENT_CONT.
    .include 'interpinfo.pasm'
    ## Store our return continuation as exit_cont.
    .local pmc exit_cont
    exit_cont = interpinfo .INTERPINFO_CURRENT_CONT
    ## Invoke it explicitly:
    invokecc exit_cont
    ## ... or equivalently:
    tailcall exit_cont

=end PIR_FRAGMENT

To return values, use C<set_args> as before.

=head2 All together now

The following complete example illustrates the typical call/return pattern:

=begin PIR

    .sub main :main
	print "in main\n"
        the_sub()
	print "back to main\n"
    .end

    .sub the_sub
	print "in sub\n"
    .end

=end PIR

Notice that we are not passing or returning values here.

[example of passing values.  this could get pretty elaborate; look for other
examples first.  -- rgr, 6-Apr-08.]

If a short subroutine is called several times, for instance inside a loop, the
creation of the return continuation can be done outside the loop:

=begin PIR_INVALID

    .sub main :main
	    ## Initialize the sub and the return cont.
	    .local pmc cont
	    cont = new 'Continuation'
	    set_addr cont, ret_label
	    .const .Sub rsub = 'random_sub'
	    ## Loop initialization.
	    .local int loop_max, i
	    loop_max = 1000000
	    i = 0

	    ## Main loop.
    again:
	    set_args '(0)', i
	    invoke rsub, cont
    ret_label:
	    ## This is where "cont" returns.
	    inc i
	    if i < loop_max goto again
    .end

    .sub random_sub
	    .param int foo
	    ## do_something
    .end

=end PIR_INVALID

If the sub returns values, the C<get_results> must be B<after> C<ret_label> in
order to receive them.

Since this is much more obscure than the PIR calling syntax, it should only be
done if there is a measurable performance advantage.  Even in this trivial
example, calling "rsub(i)" is only about a third slower on x86.

=head1 FILES

F<src/pmc/sub.pmc>, F<src/pmc/closure.pmc>,
F<src/pmc/continuation.pmc>, F<src/pmc/coroutine.pmc>, F<src/sub.c>,
F<t/pmc/sub.t>

=head1 SEE ALSO

F<docs/pdds/pdd03_calling_conventions.pod>
F<docs/pdds/pdd19_pir.pod>

=head1 AUTHOR

Leopold Toetsch <lt@toetsch.at>

=cut

__END__
Local Variables:
  fill-column:78
End: