Sophie

Sophie

distrib > Mageia > 1 > i586 > by-pkgid > d92aa75c2d384ff9f513aed09a46f703 > files > 55

parrot-doc-3.1.0-2.mga1.i586.rpm

# Copyright (C) 2001-2006, Parrot Foundation.

=head1 NAME

docs/dev/pmc_freeze.pod - Freeze/Thaw Design Notes

=head1 VERSION

This document describes freeze/thaw internals version 0.1. This is not the
final implementation.

=head1 Overview

Freezing or serializing arbitrary PMCs is an interesting problem.  Aggregates
can hold other aggregates and can be deeply nested, so so a recursive approach
could easily blow the stack, especially on embedded systems. Also, aggregates
can be self-referential -- they can hold pointers to themselves -- so that
working on such structures could create infinite loops.

=head1 Coverage

Although the file is named F<pmc_freeze.c> it ultimately will deal with every
kind of operation that deeply traverses an arbitrary data structures.  For
example:

=over 4

=item freeze

Called from user code to serialize the state of a PMC into some (possibly
binary) representation held in a STRING.

=item freeze_at_destruct

A variant of C<freeze>, possibly called from an exception handler or on
resource shortage before interpreter shutdown, to save some data before dying.
It must not consume any additional resources.

=item thaw

The opposite of C<freeze>: reconstruct all PMCs to generate an identical copy
of the original frozen PMC. As with C<freeze>, can be called from user code.

=item dclone

Deeply clone an aggregate. C<dclone(p)> is basically the same as
C<thaw(freeze(p))>.

=item dump, pretty_print

Create a visual representation of an aggregate.

=item destruction ordering

Find the logical dependencies of a collection of PMCs, so that they can be
destroyed in an appropriate order. This is also called on interpreter shutdown.

=item mark

Mark all objects as being live by calling B<Parrot_gc_mark_PObj_alive> called from GC.
While the functionality is the same, it will not be implemented on top of this
general scheme for performance reasons. This leads to some code duplication,
but GC is run permanently and deserves all the speed it can get.

=back

=head1 Description

The basic scheme of operation looks like this:

  info = init()
  push todo_list, pmc
  while (todo_list)
      current = shift todo_list
      current->visit(info)
  done.

=head2 The visit_info structure

This structure holds all necessary information and function pointers specific
to the desired functionality. It gets passed on to all vtable functions and
callback functions.

=head2 Working loop

=over 4

=item todo list

A B<List> called B<todo> holds items still to be worked on. This method is
slower and consumes more resources, but doesn't interfere with GC runs and is
thread-safe.

=back

=head2 Putting items on the todo list

This is done by a callback function inside the B<visit_info> structure called
B<visit_pmc_now>. It gets called initially to put the first item on the list
and is called thereafter from all PMCs for contained PMCs inside the B<visit>
vtable function.


=head2 The visit() vtable

The general scheme above shows that this method is called for all items on the
B<todo_list>. B<visit> has to call B<visit_pmc_now> for all contained PMCs,    
which then get visited until all is done.

=head2 The visit_pmc_now() callback

The basic operation is:

  (seen, id) = was_already_seen(pmc)
  do_specific_action(pmc, seen, id)
  if (!seen)
     pmc->visit_action()

=head2 Avoiding duplicates

As stated in the introduction structures can be self-referential, they can
contain (at an arbitrary depth) PMCs, that were already processed. Just
following these PMCs would lead to endless loops. So already B<seen> PMCs have
to be remembered.

=over 4

=item The B<seen> hash

Using a B<Hash> is one method to avoid duplicates. The B<seen> hash holds keys
being the address of the PMC and values being a PMC B<id>, which is unique for
this PMC. While this is straight forward, it consumes 16 bytes per PMC (plus
overhead, 32-bit system assumed). Hash lookups also take a considerable
amount of time.

=back

=head2 The actual action

So after all we finally arrived at the point to actually perform the desired
functionality. First the PMC-specific part is done inside F<pmc_freeze.c> then
the specific vtable function B<freeze>, B<thaw>, whatever, is called, again
via a function pointer called B<visit_action>.

=head1 Freeze and thaw

As stated PMCs are currently processed inside the core, PMC-specific parts are
done by calling the PMCs vtable function. This parts could of course be moved to
F<default.pmc> too, so that it's simpler to override the functionality.

=head2 Serializer interface

During initialization the B<visit_info>s B<image_io> data pointer is filled
with an object having B<vtable> functions that remarkably look like a PMCs
vtable. So B<io-E<gt>vtable-E<gt>push_integer> spits out an INTVAL to the
frozen B<image>, while B<shift_integer> gets an INTVAL from the frozen stream.

This simplifies final changes when B<image_io> becomes just a PMC of some
serializer class. There are currently two serializers:

=over 4

=item Plain text

This serializer is mainly intended for testing. Having a readable
representation of the image simplifies debugging a lot.

=item Parrot Byte Code

We already have a platform-independent way of reading and writing opcodes,
string, and number-constants. So this serializer uses functionality of the
pack-file routines. The produced image isn't as dense as it could be though,
because all data are aligned at B<opcode_t> boundaries.

=back

=head2 Image data format

PMC B<id>s ranging from 1 to N-PMCs are shifted left by two, so that the 2 lo
bits can serve as flags:

  id + 0x1   ... PMC was seen
  id + 0x2   ... PMC has same type as previous PMC
  id + 0x3   ... escape flag

A PMCs image generally looks like:

  <id><type><pmc-specific-data>

The text representation of the array

  P0 = [P1=666, P2=777, P0]

may look like:

  0xdf4 30 3 0xdf8 33 666 0xdf2 777 0xdf5

  0xdf4 ... PMC id (with "0x" in front for clarity)
  30    ... enum_class_ResizablePMCArray
  3     ... elements count
  0xdf8 ... id of first element
  33    ... enum_class_Integer
  666   ... value
  0xdf2 ... id of second element, same type as prev element
  777   ... value
  0xdf5 ... id of array itself with lo bit set

The escape flag marks places in the image, where additional data will follow.
After the escape flag is an int defining the kind of the following data, passed
on in B<extra_flags>.  During B<thaw> the PMCs vtable is called again, to
restore these data. So a PMCs B<thaw> vtable has to check B<extra_flags> if
normal or extra data have to be shifted from the image.

This is e.g. needed for PMC properties or arrays containing sparse holes, to
set the array index of the following data.

A Integer(666) with a property hash ("answer"=>42) thus looks like:

  0xdfc 33 666 0xdff 2 0xdf4 32 1 answer 0xdf8 33 42

B<0xdff> is the escape mark for the PMC B<0xdfc> followed by the constant
B<EXTRA_IS_PROP_HASH>.

[ To be continued ]

=head1 FILES

F<src/pmc_freeze.c>, F<pf/pf_items.c>

=head1 Author

Leopold Toetsch C<lt@toetsch.at>

=cut

# vim: expandtab shiftwidth=2 tw=70: