%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%W  pargap1.tex            ParGAP documentation            Gene Cooperman
%%
%H  $Id: pargap1.tex,v 1.8 2001/11/16 15:57:00 gap Exp $
%%
%Y  Copyright (C) 1999-2001  Gene Cooperman
%Y    See included file, COPYING, for conditions for copying
%%

\pretolerance=500 % Will tolerate badness of 500 before trying hyphenations%
\tolerance=1600 % Will tolerate stretching line up to badness of 1600%
\hbadness=4000 % Seems to affect overfull boxes reported by TeX%
\hfuzz=5pt % If still no good break, can stick out into margin by 5 pt.%
\overfullrule=0pt % Lines sticking out more than 10 pt should not%
                  % contain the black box marking it.%

\Chapter{Writing Parallel Programs in GAP Easily}

\indextt{ParGAP}
The {\ParGAP}  (Parallel  {\GAP})  package  provides  a  way  of  writing
parallel programs using the {\GAP} language. Former names of the  package
were \package{ParGAP/MPI} and \package{GAP/MPI}; the word <MPI> refers to
<Message Passing  Interface>,  a  well-known  standard  for  parallelism.
{\ParGAP} is based on the MPI standard, and this distribution includes  a
subset implementation of MPI, to provide a portable  layer  with  a  high
level interface to BSD sockets. Since knowledge of MPI  is  not  required
for use of  this  software,  we  now  refer  to  the  package  as  simply
{\ParGAP}. For more information visit the author's  {\ParGAP}  home  page
at:
\URL{http://www.ccs.neu.edu/home/gene/pargap.html}

For some background reading, see~\cite{Coo95} and \cite{Coo97}.

This first chapter is intended to help a new user set  up  {\ParGAP}  and
run through some quick examples: see

\beginlist%unordered

\item{$\bullet$}
Section~"Overview of ParGAP" for an overview of the features of {\ParGAP}
and a general discussion of how it's implemented;

\item{$\bullet$}
Section~"Installing ParGAP" for how to install {\ParGAP};

\item{$\bullet$}
Section~"Running ParGAP"  for  how  to  run  {\ParGAP}  (*not*  by  using
`RequirePackage'); and

\item{$\bullet$}
Section~"Extended Example"  for  some introductory {\ParGAP} examples.

\endlist

The later chapters present detailed explanations  of  the  facilities  of
{\ParGAP}. Because parallel programming is  sufficiently  different  from
sequential programming, this author  recommends  printing  out  at  least
Chapters~1 through~"MasterSlave Tutorial",  and  skimming  through  those
chapters for areas of interest, before returning to the terminal  to  try
out   some   of   the   ideas.   This   document   can   be   found    in
`.../pkg/pargap/doc/manual.dvi' of the  software  distribution.  You  may
also want to print the index at the end of `manual.dvi'.  In  particular,
the heading `example' in the index, or `??example'  from  within  {\GAP},
should be useful. If you prefer postscript, the UNIX command `dvips' will
convert that file to postscript form.

The development of {\ParGAP} was partially supported by National  Science
Foundation grants CCR-9509783 and CCR-9732330.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Overview of ParGAP}

{\ParGAP} is currently functional only on UNIX installations. (Cygwin for
Windows is also an option, if you would like to port it.)  {\ParGAP}  can
be  installed  on  top  of   an   existing   {\GAP}   installation.   See
Section~"Installing  ParGAP"  for   instructions   on   installation   of
{\ParGAP}. At the time that {\ParGAP} is invoked, a special ``procgroup''
file must be available to tell {\ParGAP}  which  processors  to  use  for
slave processors. See sections~"Installing ParGAP" and~"Extended Example"
for instructions on invoking {\ParGAP}. If there are  questions  or  bugs
concerning {\ParGAP}, please write to: \Mailto{gene@ccs.neu.edu}

If one wishes only to try out the parallel features, the first five pages
of this manual (through the section on the slave listener) will suffice
for installing and using {\ParGAP}. For the more advanced user who wishes
to design new parallel algorithms or port old sequential code to a
parallel environment, it is strongly recommended to also read the sections
following on from Section~"Basic Concepts for the TOP-C model
(MasterSlave)".

{\ParGAP} should be invoked via the script `bin/pargap.sh' created by the
installation process which invokes `<GAP_ROOT_DIR>/bin/<ARCH>/pargapmpi',
where <ARCH> depends on your system but is the same  directory  in  which
the `gap' binary is  found.  MPI  and  the  higher  layers  will  not  be
available if the binary is invoked in the standard way as `gap'. This  is
a feature, since a single binary and source distribution serves both  for
the standard {\GAP} and for {\ParGAP}.

{\ParGAP} is implemented in three layers: 1)~MPI, 2)~Slave~Listener,  and
3)~Master~Slave (TOP-C abstraction). Most users will find  that  the  two
highest layers (Slave Listener and Master Slave) meet all their needs.

\beginitems
`1) MPI:'&
    The lowest layer is MPI. Most users can ignore this layer. MPI  is  a
    standard for message-based parallel  computation.  A  subset  of  the
    original MPI commands is provided. The syntax is  modified  from  the
    original C binding  to  make  a  {\GAP}  binding  in  an  interpreted
    environment more convenient. This includes default arguments,  useful
    return  values,  and  `Error'  break  in  the  presence  of   errors.
    `MPI_Init()'       (see~"MPI_Init")       and        `MPI_Finalize()'
    (see~"MPI_Finalize") are invoked automatically by {\ParGAP}.

`'& The MPI layer is not documented, since most users will not  be  using
    it. From {\GAP} level, you can type: `MPI_<tab><tab>' to see all
    implemented MPI functions and variables. Typing the symbol name
    alone (e.g.: `MPI_Send;') causes the calling syntax to be displayed.
    The same information is displayed after an incorrect call.
    The  return  value  is  typically  obvious.  MPI  is  implemented  in
    `src/pargap.c'. The  standard  distribution  uses  a  simple,  subset
    implementation of MPI in `pkg/pargap/mpinu/', which is implemented on
    top of a standard sockets interface. It  is  possible  to  substitute
    other implementations of MPI.

\atindex{MPI!standard}{@MPI!standard}
`'& For those who wish to directly use the MPI interface, the meanings of
    the MPI calls are best found from the standard MPI documentation:

`'&MPI Forum: \URL{http://www.mpi-forum.org/}

`'&MPI Standard (version 1.1):
   \URL{http://www.mpi-forum.org/docs/mpi-11-html/mpi-report.html}

`'&UNIX style man pages: \URL{http://www-c.mcs.anl.gov/mpi/www/}

`2) Slave Listener:'&
    This  layer   provides   basic   message   passing   facilities   for
    communication among multiple {\ParGAP} processes in a  form  that  is
    more convenient for programming than the lower MPI layer.  This  will
    be the most useful entry point to {\ParGAP} for most users.  This  is
    the default mode for {\ParGAP}. Each remote (slave) process is  in  a
    receive-eval-send loop, in which the slave receives a {\GAP}  command
    from the local or master, the slave evaluates the {\GAP} command, and
    the slave then sends the result  back  to  the  master  as  a  {\GAP}
    object.

`'&
    Almost all commands in the slave listener are  of  the  form  `*Msg*'
    e.g.  `SendMsg()'   (see~"SendMsg"),   `RecvMsg()'   (see~"RecvMsg"),
    `ProbeMsg()'   (see~"ProbeMsg").   Since   the   slave   is   in    a
    receive-eval-send loop, every `SendMsg(<cmd>)' on the master must  be
    balanced by a later `RecvMsg()'. `SendRecvMsg()'  (see~"SendRecvMsg")
    is provided to combine these steps. A few parallel utilities are also
    included, such as `ParRead()' ("ParRead"),  `ParList()'  ("ParList"),
    `ParEval()' ("ParEval"), etc.

`'& Messages are arbitrary {\GAP} objects. Note  that  arguments  to  any
    {\GAP} function are evaluated before being passed  to  the  function.
    Hence, any argument to `SendMsg()' or `ParEval()' would be  evaluated
    locally before being  sent  across  the  network.  For  this  reason,
    arguments can also be given as strings,  to  delay  evaluation  until
    reaching the destination process. Hence, real strings must be quoted:
    `ParEval("x:=\"abc\";");' Additionally, multiple commands are  valid,
    and the final ```;''' of the string is optional. So, one can write:

\begintt
BroadcastMsg("x:=\"abc\"; Print(Length(x), \"\\n\")");;
\endtt

`'& A full description is contained in Chapter~"Slave Listener".

`3) Master Slave:'&
    The Master Slave  facility  is  provided  both  for  writing  complex
    parallel software, and as an easier way to  parallelize  previous  or
    ``legacy''  sequential  code.  While  the  Slave  Listener   may   be
    sufficient for simple parallel requirements,  more  complex  software
    requires a higher level abstraction. The fundamental abstractions  of
    the master slave layer are the *task* and the *shared data*.

\beginlist
\itemitem{`1)'}
    The task typically corresponds to the procedure or inner  body  of  a
    loop in  a  sequential  program.  This  is  the  part  that  must  be
    repetitively computed in parallel.

\itemitem{`2)'}
    The shared data typically corresponds to the  data  of  a  sequential
    program that is not within the local scope of the task. Often this is
    a global data structure. In the case that the task is the inner  body
    of a loop, the shared data may be a  local  data  structure  that  is
    outside the local scope of the loop.
\endlist

`'& It is usually quite easy to identify the task and the shared data  of
    a sequential program  or  algorithm,  which  is  the  first  step  in
    parallelizing an algorithm.

`'& The  Master  Slave  parallel  model  described  here  has  also  been
    successfully used in~C  and  in  LISP.  It  has  been  used  both  in
    distributed memory and  shared  memory  environments,  although  this
    version in {\GAP} currently works only in a distributed  environment.
    In the C~language, this  parallel  model  is  known  as  TOP-C  (Task
    Oriented Parallel~C). For examples of the use of the TOP-C model  see
    \cite{Coo98},     \cite{CFTY94},     \cite{CH97},      \cite{CHLM97},
    \cite{CLMW96}, and \cite{CT96}.

`'& While no parallel software can eliminate the problem of designing  an
    algorithm that is efficient in  a  parallel  environment,  the  TOP-C
    abstraction eases the job by eliminating  programmer  concerns  about
    lower  level  details,  such  as  message  passing,   migration   and
    replication of data, load balancing, etc. This leaves the  programmer
    to concentrate on the primary goal:  maximizing  the  concurrency  or
    parallelism.

\enditems
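
As a small illustration of these two abstractions, consider the following
purely sequential fragment. It is a made-up sketch, not taken from the
{\ParGAP} distribution: the names `moduli' and `residues' are hypothetical,
and only standard {\GAP} functions are used.

\begintt
moduli := [ 7, 11, 13 ];;   # shared data: read by every iteration
residues := [];;
for n in [1..20] do
  # task: the loop body, which depends only on n and on the shared data
  residues[n] := List( moduli, m -> n^2 mod m );
od;
\endtt

Here each value of `n' would be one task input, while `moduli' is shared
data that every slave would need to see, for instance by reading the same
file on all processes with `ParRead()' (see~"ParRead").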

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Installing ParGAP}

\index{installation}
Installing {\ParGAP} should be relatively simple.  However,  since  there
are many interactions both with the  {\GAP}  kernel  and  with  the  UNIX
operating system, in a minority of cases,  manual  intervention  will  be
necessary. If you are part of  this  minority,  please  see  the  section
"Problems with Installation".  The  most  common  problem  is  the  local
security policy; {\ParGAP} is more pleasant to use when you don't have to
manually provide the password for each slave. See section "Problems  with
Passwords (Getting Around Security)" for suggestions in this respect.

To install the {\ParGAP} package, move  the  file  `pargap-<XXX>.zoo'  or
`pargap-<XXX>.tar.gz' (for some version number <XXX> of  {\ParGAP})  into
the `pkg' directory in which you plan to install {\ParGAP}. Usually, this
will be the directory `pkg' in the hierarchy of your version of {\GAP}~4.
(Currently it is not possible to have the `pkg' directory separate from
{\GAP}'s `pkg' directory. We hope to remedy this in future versions of
{\ParGAP}, so that it will also be possible to keep an additional `pkg'
directory in your private directories; section "ref:Installing GAP
Packages" of the GAP 4 reference manual gives details on how to do this,
when it's possible.)

Now change into  the  `pkg'  directory  in  which  you  plan  to  install
{\ParGAP}. If you got a `.zoo' file, unpack it with:

\){\kernttindent}unzoo -x pargap-<XXX>

If you got a `.tar.gz' file and  your  `tar'  command  supports  the  `z'
option, unpack it with:

\){\kernttindent}tar zxf pargap-<XXX>.tar.gz

or otherwise unpack in two steps with:

\){\kernttindent}gunzip pargap-<XXX>.tar.gz
\){\kernttindent}tar xvf pargap-<XXX>.tar

Whether you got the `.zoo' or `.tar.gz' archive you should now have a new
directory `pargap'. As for a generic {\GAP} package, do:

\begintt
cd pargap
./configure ../..
make
\endtt

If your version of {\GAP} is earlier than {\GAP}~4.3 you will first  need
to adjust {\GAP}'s `lib/init.g' file; see item~0.\  of  Section~"Problems
with Installation".

Your {\ParGAP} should now be ready to use.  Now  read  the  next  section
which describes how to  run  {\ParGAP}  (if  you  are  reading  this  from
{\GAP}'s on-line help, type: `?>').

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Running ParGAP}

After doing the `configure' and `make' steps of {\ParGAP}'s  installation
process (see Section~"Installing ParGAP"), you should find in {\ParGAP}'s
`bin' subdirectory a script

\begintt
pargap.sh
\endtt

which you should use to start {\ParGAP}. ({\ParGAP} can *not* be  started
by starting {\GAP}~4 in the usual way, and using `RequirePackage';  doing
so will result in `Info'-ed  advice  to  read  this  section.)  Edit  the
`pargap.sh' script if necessary, copy it to a standard path and rename it
according to how you intend to call {\ParGAP} (e.g. rename it: `pargap').
Also, in the `bin'  subdirectory  is  a  sample  `procgroup'  file  which
defines the master and slave processes that will be  used  by  {\ParGAP}.
When {\ParGAP} is started it looks for a file called `procgroup'  in  the
current directory, unless the `-p4pg' option is used. Thus if you renamed
your shell script `pargap', the following  are  valid  ways  of  starting
{\ParGAP}:

\begintt
pargap
\endtt

(if current directory contains the file: `procgroup'), or

\){\kernttindent}pargap -p4pg <myprocgroupfile>

(where <myprocgroupfile> is the complete path of your  procgroup  file --
there is no restriction on how you name it).

If you had trouble installing {\ParGAP}, see the  section~"Problems  with
Installation". Otherwise continue onto Section~"Extended Example" and try
out {\ParGAP}.

*Note:*
The script  `pargap.sh'  defines  the  program  that  runs  {\ParGAP}  as
`pargapmpi'. In fact, after installation `pargapmpi' is a  symbolic  link
to the {\GAP} binary named `gap'. The same binary runs  both  {\GAP}  and
{\ParGAP}; when the binary is invoked as `gap' {\GAP} runs in  the  usual
way without any parallel features; only when the  binary  is  invoked  as
`pargapmpi'    are    the    parallel    features    incorporated.    See
Section~"Modifying the GAP kernel" for more details.

Now you are ready to test your  installation:  try  the  example  in  the
following section (if you are reading this from  {\GAP}'s  on-line  help,
type: `?>').

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Extended Example}

After  installation,  try  it  out.  Invoke  {\ParGAP}  as  described  in
Section~"Running ParGAP" and try the example below (but  substitute  your
own program where you see `"/home/gene/myprogram.g"').  The  commands  in
this first example are also found in the `README' file. So, you may  wish
to copy text from the `README' file and paste it into a `ParGAP' session.
If you are using the unmodified `procgroup' file,  your  *remote  slaves*
will be other processes on your local machine. It is a good idea  to  run
only on your local machine for your first experiments and while  you  are
debugging parallel programs. When  you  wish  to  experiment  with  using
remote machines, you can then proceed to the following section, "Invoking
ParGAP with Remote Slaves".

\atindex{example!Slave Listener}{@example!Slave Listener}
\atindex{Slave Listener!example}{@Slave Listener!example}
\beginexample
gap> # This assumes your procgroup file includes two slave processes.
gap> PingSlave(1); #a `true' response indicates Slave 1 is alive
true
gap> # Print() on slave appears on standard output 
gap> # i.e. after the master's prompt.
gap> SendMsg( "Print(3+4)" );
gap> 7
gap> # A <return> was input above to get a fresh prompt.
gap> #
gap> # To get special characters (including newline: `\n')
gap> # into a string, escape them with a `\'.
gap> SendMsg( "Print(3+4,\"\\n\")" );
gap> 7

gap> # Again, a <return> was input above after the 7 and new-line
gap> # were printed to get a fresh prompt.
gap> #
gap> # Each SendMsg() is normally balanced by a RecvMsg().
gap> SendMsg( "3+4", 2);
gap> RecvMsg( 2 );
7
gap> # The following is equivalent to the two previous commands.
gap> SendRecvMsg( "3+4", 2);
7
gap> # Flush any messages that are pending. The response is
gap> # the number of messages flushed. (Above, the two
gap> # SendMsg("Print...") (to the default slave: 1) did not
gap> # have a corresponding RecvMsg() command.)
gap> FlushAllMsgs();
2
gap> # As with Print() the result of Exec() appears on standard
gap> # output. Print() and Exec() are each `no-value' functions,
gap> # and so the result of a RecvMsg() in these cases
gap> # is "<no_return_val>".
gap> SendRecvMsg( "Exec(\"pwd\")" ); # Your pwd will differ :-)
/home/gene
"<no_return_val>"
gap> # Put default slave into an infinite loop.
gap> SendMsg("while true do od");
gap> # Default slave can't execute the next command until it's 
gap> # finished with the previous command.
gap> SendMsg("Print(\"WAKE UP\\n\")");
gap> # Check to see if a message is waiting to be collected but
gap> # return immediately (i.e. don't get blocked by waiting for
gap> # a message to appear). A `false' response indicates the
gap> # infinite loop hasn't terminated and produced a value yet!
gap> ProbeMsgNonBlocking();
false
gap> # Send an interrupt to each slave, slave 1 will see the
gap> # following command and print `WAKE UP', and then all
gap> # pending messages are flushed.
gap> ParReset();
... resetting ...
WAKE UP
0
gap> # The return value, 0, from ParReset() indicates there
gap> # were 0 pending messages flushed, confirming correctness
gap> # of ProbeMsgNonBlocking() when it returned "false"
gap> SendRecvMsg( "a:=45; 3+4", 1 );
7
gap> # Note "a" is defined on slave 1, not slave 2.
gap> SendMsg( "a", 2 ); # Slave prints error, output on master
gap>  Variable: 'a' must have a value
gap> # <return> entered to get fresh prompt.
gap> RecvMsg( 2 ); # No value for last SendMsg() command
"<no_return_val>"
gap> RecvMsg( 1 );
45
gap> myfnc := function() return 42; end;;
gap> # Use PrintToString() to define myfnc on all slave processes
gap> BroadcastMsg( PrintToString( "myfnc := ", myfnc ) );
gap> SendRecvMsg( "myfnc()", 1 );
42
gap> FlushAllMsgs(); # There are no messages pending.
0
gap> # Execute analogue of GAP's List() in parallel on slaves.
gap> squares := ParList( [1..100], x->x^2 );
[ 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 
  289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 
  900, 961, 1024, 1089, 1156, 1225, 1296, 1369, 1444, 1521, 1600, 
  1681, 1764, 1849, 1936, 2025, 2116, 2209, 2304, 2401, 2500, 2601, 
  2704, 2809, 2916, 3025, 3136, 3249, 3364, 3481, 3600, 3721, 3844, 
  3969, 4096, 4225, 4356, 4489, 4624, 4761, 4900, 5041, 5184, 5329, 
  5476, 5625, 5776, 5929, 6084, 6241, 6400, 6561, 6724, 6889, 7056, 
  7225, 7396, 7569, 7744, 7921, 8100, 8281, 8464, 8649, 8836, 9025, 
  9216, 9409, 9604, 9801, 10000 ]
gap> # Ensure problem shared data is read into master and slaves.
gap> # Try one of your GAP program files instead.
gap> ParRead( "/home/gene/myprogram.g");
\endexample
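
The session above always passes commands as strings. As a hedged sketch
of why, compare a command whose text is fixed with one that is assembled
on the master. The variable `n' is hypothetical; `PrintToString()' is
used in the same two-argument form as in the session above, and slave~1
is assumed to be running.

\begintt
n := 5;;   # hypothetical variable, defined on the master only
# The string is shipped as is, so 3+4 is evaluated on slave 1.
SendRecvMsg( "3+4", 1 );
# PrintToString() builds the command text on the master, so the value
# of n is inserted before the message is sent to slave 1.
SendRecvMsg( PrintToString( "2^", n ), 1 );
\endtt

Sending the string `n+1' instead would make slave~1 look up its own `n',
which has no value there unless it was defined on that slave as well.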

Now that you have done a fairly rudimentary test of {\ParGAP} you  should
be ready to do something a little bit more interesting:

\beginexample
gap> ParInstallTOPCGlobalFunction( "MyParList",
> function( list, fnc )
>   local result, iter;
>   result := [];
>   iter := Iterator(list);
>   MasterSlave( function() if IsDoneIterator(iter) then return NOTASK;
>                           else return NextIterator(iter); fi; end,
>                fnc,
>                function(input,output) result[input] := output;
>                                       return NO_ACTION; end,
>                Error
>              );
>   return result;
> end );
gap> MyParList( [1..25], x->x^3 );
master -> 1:  1
master -> 2:  2
2 -> master: 8
1 -> master: 1
master -> 1:  3
master -> 2:  4
2 -> master: 64
1 -> master: 27
master -> 1:  5
master -> 2:  6
2 -> master: 216
1 -> master: 125
master -> 1:  7
master -> 2:  8
2 -> master: 512
1 -> master: 343
master -> 1:  9
master -> 2:  10
2 -> master: 1000
1 -> master: 729
master -> 1:  11
master -> 2:  12
2 -> master: 1728
1 -> master: 1331
master -> 1:  13
master -> 2:  14
2 -> master: 2744
1 -> master: 2197
master -> 1:  15
master -> 2:  16
2 -> master: 4096
1 -> master: 3375
master -> 1:  17
master -> 2:  18
2 -> master: 5832
1 -> master: 4913
master -> 1:  19
master -> 2:  20
2 -> master: 8000
1 -> master: 6859
master -> 1:  21
master -> 2:  22
2 -> master: 10648
1 -> master: 9261
master -> 1:  23
master -> 2:  24
2 -> master: 13824
1 -> master: 12167
master -> 1:  25
1 -> master: 15625
[ 1, 8, 27, 64, 125, 216, 343, 512, 729, 1000, 1331, 1728, 2197, 2744, 3375, 
  4096, 4913, 5832, 6859, 8000, 9261, 10648, 12167, 13824, 15625 ]
gap> ParInstallTOPCGlobalFunction( "MyParListWithAglom",
> function( list, fnc, aglomCount )
>   local result, iter;
>   result := [];
>   iter := Iterator(list);
>   MasterSlave( function() if IsDoneIterator(iter) then return NOTASK;
>                           else return NextIterator(iter); fi; end,
>                fnc,
>                function(input,output)
>                  local i;
>                  for i in [1..Length(input)] do
>                    result[input[i]] := output[i];
>                  od;
>                  return NO_ACTION;
>                end,
>                Error,  # Never called, can specify anything
>                aglomCount
>              );
>   return result;
> end );
gap> MyParListWithAglom( [1..25], x->x^3, 4 );
master -> 1: (AGGLOM_TASK): [ 1, 2, 3, 4 ]
master -> 2: (AGGLOM_TASK): [ 5, 6, 7, 8 ]
1 -> master: [ 1, 8, 27, 64 ]
2 -> master: [ 125, 216, 343, 512 ]
master -> 1: (AGGLOM_TASK): [ 9, 10, 11, 12 ]
master -> 2: (AGGLOM_TASK): [ 13, 14, 15, 16 ]
1 -> master: [ 729, 1000, 1331, 1728 ]
2 -> master: [ 2197, 2744, 3375, 4096 ]
master -> 1: (AGGLOM_TASK): [ 17, 18, 19, 20 ]
master -> 2: (AGGLOM_TASK): [ 21, 22, 23, 24 ]
1 -> master: [ 4913, 5832, 6859, 8000 ]
2 -> master: [ 9261, 10648, 12167, 13824 ]
master -> 1: (AGGLOM_TASK): [ 25 ]
1 -> master: [ 15625 ]
[ 1, 8, 27, 64, 125, 216, 343, 512, 729, 1000, 1331, 1728, 2197, 2744, 3375, 
  4096, 4913, 5832, 6859, 8000, 9261, 10648, 12167, 13824, 15625 ]
\endexample
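
The function arguments passed to `MasterSlave()' in `MyParList' can be
easier to follow when each is given a name. The following is a sketch of
the same function rewritten that way. The names `MyParListNamed',
`SubmitTaskInput' and `CheckTaskResult' are chosen here for illustration,
and the comments describe the roles as they appear from the trace above,
not a definitive specification.

\begintt
ParInstallTOPCGlobalFunction( "MyParListNamed",
function( list, fnc )
  local result, iter, SubmitTaskInput, CheckTaskResult;
  result := [];
  iter := Iterator(list);
  # Produces the next task input, or NOTASK when the list is exhausted.
  SubmitTaskInput := function()
    if IsDoneIterator(iter) then return NOTASK; fi;
    return NextIterator(iter);
  end;
  # Receives a task input together with the output computed for it.
  CheckTaskResult := function( input, output )
    result[input] := output;   # the input doubles as the index
    return NO_ACTION;
  end;
  # fnc is the task itself; Error is never called here, as noted above.
  MasterSlave( SubmitTaskInput, fnc, CheckTaskResult, Error );
  return result;
end );
\endtt

Note that, as in `MyParList', the task input itself is used as the index
into `result'. This is convenient for a list such as `[1..25]' but would
need adjusting for lists whose entries are not their own positions.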

If you wish  an  accelerated  introduction  to  the  models  of  parallel
programming provided here, you  might  wish  to  read  the  beginning  of
Chapter~"Slave Listener" through section~"Slave Listener  Commands",  and
then proceed immediately to Chapter~"Basic Concepts for the  TOP-C  model
(MasterSlave)".

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Author}

The {\ParGAP} package was designed and written by Gene Cooperman, College
of Computer Science, Northeastern University, Boston, MA, U.S.A.

If you use {\ParGAP} to solve a problem then please send a short email to
\Mailto{gene@ccs.neu.edu} about it, and cite  the  {\ParGAP}  package  as
follows:

\begintt
\bibitem[Coo99]{Coo99}
      Cooperman, Gene,
      {\sl Parallel GAP/MPI (ParGAP/MPI)}, Version 1,
      College of Computer Science, Northeastern University, 1999,
      \verb+http://www.ccs.neu.edu/home/gene/pargap.html+.
\endtt

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Invoking ParGAP with Remote Slaves}

{\ParGAP}, unlike {\GAP}, must be invoked under a  separate  name.  After
{\ParGAP} has been installed, a script  `bin/pargap.sh'  will  have  been
created  which   (after   any   changes   you   needed   to   make;   see
Section~"Installing ParGAP") you should use to invoke {\ParGAP}. This  is
similar  to  `<GAP_ROOT_DIR>/bin/gap.sh'  that  is  used  to  invoke  the
non-parallel {\GAP}. Installers are encouraged to  treat  `pargap.sh'  in
analogy to `gap.sh'. For example, if your site  has  copied  `gap.sh'  to
`/usr/local/bin/gap', then you  should  also  look  for  the  `pargap.sh'
script as `/usr/local/bin/pargap'.

In addition, when `pargap' (we'll assume that's how {\ParGAP} is  invoked
at your site) is called, there  must  be  a  file,  `procgroup',  in  the
current directory. Alternatively, if you wish to use a  single  procgroup
file for all jobs, and that procgroup file is in `/home/joe', then you can
alias `pargap' to `pargap -p4pg /home/joe/procgroup'.

The  procgroup  file  has  a  simple  syntax,  taken   from   the   MPICH
implementation of MPI (inherited from P4). A `\#' in column~1  introduces
a comment line. The first non-comment line should be `local 0', verbatim.
This line declares the master process as the local process.  Other  lines
are of the form:

\){\kernttindent}<host-machine> 1 <pargap-script>

e.g.

\begintt
regulus.ccs.neu.edu 1 /usr/local/bin/pargap
\endtt

The first field is the hostname for a remote process.  The  second  field
specifies one thread per process. ({\ParGAP} recognizes only the  value~1
for the second field.) The  third  field  is  an  absolute  pathname  for
{\ParGAP}, as it would be called on the remote process. Note that you can
repeat the same line twice if you want two remote {\ParGAP} processes  on
the same processor. The default `procgroup' provided in the  distribution
will have lines of form:

\){\kernttindent}localhost 1 <path-of-provided-pargap.sh>

If you change <path-of-provided-pargap.sh> to just, say,  `pargap',  this
will work only if `pargap'  is  in  your  path  on  the  remote  machine
(`localhost' in this case), as seen by your default shell. On most machines,
`localhost' is an alias for the local processor. This is a  good  default
for debugging, so that you don't disturb users on other machines.
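
Putting these rules together, a complete procgroup file for a master and
two slaves on the local machine might look as follows. This is only a
sketch: the path `/usr/local/bin/pargap' is borrowed from the example
above and is an assumption about where the script lives at your site.

\begintt
# procgroup: one master and two slave processes, all on this machine
local 0
localhost 1 /usr/local/bin/pargap
localhost 1 /usr/local/bin/pargap
\endtt

The two `localhost' lines provide the two slave processes assumed by the
examples in Section~"Extended Example".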

MPI will use a line

\){\kernttindent}<host-machine> 1 <pargap-script>

to create a UNIX subprocess executing:

\){\kernttindent}rsh <host-machine> <pargap-script>

Suppose <host-machine> is `regulus.ccs.neu.edu'  and  <pargap-script>  is
`/usr/local/bin/pargap', as in the above example. If  you  have  trouble
invoking {\ParGAP}, first try invoking  `rsh  regulus.ccs.neu.edu'  from
a UNIX prompt and, if that succeeds, then try executing the  full  `rsh'
command.

A typical problem is that the remote processor  requires  a  password  to
login.   MPI   requires   a   login   without    passwords.    Typically,
`/etc/hosts.equiv' has not been set up to remove the password requirement
for your remote host. Sometimes this can  be  solved  by  an  appropriate
`.rhosts' file in your home directory on the remote host. Sometimes,  PAM
is also used for user authentication (see `/etc/pam.conf'). `man in.rshd'
also has helpful information.  Consult  your  system  staff  for  further
analysis. In these days of hyper-security, `rsh' may be disabled at  your
site and you may have to use `ssh' instead; if so, add the following lines
to your `pargap.sh' script:

\begintt
#############################################################################
##
##  RSH . . . .. . . . . . . . . . . . . . . . .  remote shell used by ParGAP
##
##
RSH=ssh
export RSH
\endtt

before the `GAP' block with the `exec' line. (Of course, the  `\#'  lines
are not needed; they are comments.)

Note that the remote {\ParGAP} process will not read from standard input,
although signals such as SIGINT (`\^{}C') may be received by  the  remote
process. However, the remote {\ParGAP} process  will  write  to  standard
output, which is relayed to the local process. So,

\beginexample
gap> SendMsg("Exec(\"hostname\")", 2);
\endexample

will execute and print from the remote process.
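
Along the same lines, one can ask each slave where it is running and get
the answer back on the master as a return value instead of as printed
output. The sketch below assumes a procgroup file with two slaves and
relies on `UNIX_Hostname()', which is also used in Section~"Problems with
Installation"; the result depends on your site, so none is shown.

\begintt
SendRecvMsg( "UNIX_Hostname()", 1 );
SendRecvMsg( "UNIX_Hostname()", 2 );
\endtt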

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Problems with Installation}

If you still have problems, here is a list of things to check.

\beginlist
\item{0.}
    In  versions  of  {\GAP}  earlier  than  {\GAP}~4.3  some   {\ParGAP}
    ``hooks'' need to be added to {\GAP}'s `lib/init.g' file. Please add:

\begintt
PAR_GAP_SLAVE_START := fail;
\endtt

\item{}
    before the line:

\begintt
       READ(GAP_RC_FILE);
\endtt

\item{}
    and add:

\begintt
if PAR_GAP_SLAVE_START <> fail then PAR_GAP_SLAVE_START(); fi;
\endtt

\item{}
    at the end of the file.

\item{1.}
    Do you have enough swap space to support multiple {\GAP} processes? A
    simple way to check this is with the UNIX command, `top'.  The  Linux
    version of `top' sorts by memory usage if you type `M'.

\item{2.}
   `make' tries to automatically create:

\begintt
pkg/pargap/bin/pargap.sh
\endtt
       
\item{}
    and copy the parameters from `<GAP_ROOT>/bin/gap.sh'. <GAP_ROOT>  was
    specified when  you  executed  `./configure  <GAP_ROOT>'  to  install
    ParGAP. This can be error-prone if your site has an unusual setup. If
    you execute `<GAP_ROOT>/bin/gap.sh', does {\GAP} come up? If so, compare
    it with `pargap.sh' and check for correct settings in
    `.../pkg/pargap/bin/pargap.sh'.

\item{3.}
    Did {\ParGAP} find your `procgroup' file?
    [It looks in the current directory for `procgroup', or for:

\){\kernttindent}... -p4pg <PATH>/procgroup

\item{}
    on the command line.]

\item{4.}
    Were the remote slave processes able to start up? If so,  could  they
    connect back to  the  master?  To  test  connectivity  problems,  try
    manually starting a remote slave by executing a line in  the  script.
    Try a simple `rsh <remote-hostname>' to see  if  the  issue  is  with
    security. If your site uses `ssh' instead of `rsh', then there  is  a
    security issue. Read Section~"Problems with Passwords (Getting Around
    Security)", and possibly `man sshd'.

\item{5.}
    If  the  previous  step  failed  due  to  security  issues,  such  as
    requesting a password, you have several options. `man rshd' tells you
    the security model at your site (or possibly `man  ssh'  if  you  use
    that). Then read Section~"Problems  with  Passwords  (Getting  Around
    Security)".

\item{6.}
    Is the `procgroup' file in your current directory set correctly?
    Test it.  If you are calling it on a remote host, manually type:

\){\kernttindent}rsh <HOSTNAME> <ParGAP>

\item{}
    where <HOSTNAME> and <ParGAP> appear exactly as in `procgroup', e.g.
    
\){\kernttindent}rsh denali.ccs.neu.edu /usr/local/gap4r3/bin/pargap.sh

\item{}
    In some cases, `exec' is used to save process overhead. Also try:

\){\kernttindent}rsh <HOSTNAME> exec <ParGAP>

\item{}
    If you plan to call it on localhost, try just:   <ParGAP>

\item{}
    Note that if not all the slave processes succeed in connecting
    to the master, then {\ParGAP} writes out a file:

\begintt
/tmp/pargapmpi--rsh.$$
\endtt
       
\item{}
    where `\$\$' is replaced by the process id of the {\ParGAP}
    process.

\item{7.}
    Is `pargap' listed in `.../pkg/ALLPKG'?
    [It's needed to autostart slaves.]

\item{8.}
    Inside {\ParGAP}, has MPI been successfully initialized?
    Try:  
    
\beginexample
gap> MPI_Initialized();
\endexample

\item{9.}
    A remote (slave) {\ParGAP} process starts in your home directory  and
    tries to `cd'  to  a  directory  of  the  same  name  as  your  local
    directory. Check your assumptions about the remote machine. Try:

\beginexample
gap> SendRecvMsg("Exec(\"pwd\")"); SendRecvMsg("UNIX_Hostname()");
gap> SendRecvMsg("UNIX_Getpid()");
\endexample

\item{10.}
    If the connection dies at random, after some period of time,
    you can experiment with `SO_KEEPALIVE' and variants
    (see `man setsockopt').
    This periodically sends *null messages* so the  remote  machine  does
    not think that the originating  machine  is  dead.  However,  if  the
    remote machine fails to reply, the  local  process  sends  a  SIGPIPE
    signal to notify current processes of a broken  socket,  even  though
    there might have been only a temporary lapse in connectivity.
    `ssh' specifies `KeepAlive yes' by default, but setting `KeepAlive no'
    might get you through some transient lapses in  connectivity  due  to
    high congestion. 
    You may also want to experiment with: `setenv RSH "rsh -n"'

\item{11.}
    Read the documentation for further possible problems.

\endlist
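
The manual checks in items 4 to 6 can be collected into a small Bourne
shell helper. The following is only a sketch, not part of {\ParGAP}: the
function name `check_procgroup' is invented here, and it assumes that each
remote line of `procgroup' has the layout <HOSTNAME> <count> <ParGAP>,
with a first line beginning with `local'. Instead of starting {\ParGAP}
itself, it only asks the remote host whether <ParGAP> is executable there.

```shell
#!/bin/sh
# Sketch: check by hand that each remote entry of a procgroup-style file
# can be reached and that its ParGAP path is executable on that host.
# ASSUMED line layout (not stated in this section): <HOSTNAME> <count> <ParGAP>
# Blank lines, "#" lines and the line starting with "local" are skipped.
check_procgroup() {
    file=${1:-procgroup}
    launcher=${RSH:-rsh}     # honour the RSH override, e.g. RSH=ssh
    while read -r host count cmd rest; do
        case $host in
            ''|'#'*|local) continue ;;
        esac
        # </dev/null keeps the launcher from swallowing the rest of the file.
        if $launcher "$host" "test -x $cmd" </dev/null; then
            echo "ok: $host $cmd"
        else
            echo "FAILED: $host $cmd"
        fi
    done < "$file"
}
```

Calling `check_procgroup' with no argument reads `procgroup' in the
current directory. A line reported as failed is a candidate for the
manual `rsh <HOSTNAME> <ParGAP>' test described in item 6.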

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Problems with Hosts on Multiple Networks}

If a host is on multiple networks, it will have multiple IP addresses and
usually multiple hostnames. In  this  case,  the  master  process  cannot
always guess correctly which IP address (which internet  address)  should
be passed to the slave process, so that the slave process can  call  back
to the master. In such cases,  you  may  need  to  tell  {\ParGAP}  which
hostname or IP address to use for the callback. This is done  by  setting
the UNIX environment variable, `CALLBACK_HOST', as in the example below.

\begintt
# [ in sh/bash/... ]
CALLBACK_HOST=denali.ccs.neu.edu; export CALLBACK_HOST
# [ in csh/tcsh/... ]
setenv CALLBACK_HOST denali.ccs.neu.edu
\endtt

The appropriate  line  for  your  shell  can  be  placed  in  your  shell
initialization file. Alternatively, you can set this up for all users  by
placing the Bourne shell version (for `sh') somewhere between  the  first
and last line of `.../pkg/pargap/bin/pargap.sh'.
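
For the system-wide variant, one possible form of the Bourne shell line
is sketched below. It assumes that a value already exported by an
individual user should take precedence over the site default, which is a
design choice and not something {\ParGAP} requires. The host name is the
example host used above and should be replaced by your own.

```shell
# Sketch for insertion into pargap.sh: keep a CALLBACK_HOST that the user
# has already set, and otherwise fall back to a site-wide default.
# denali.ccs.neu.edu is only the example host from the text.
: "${CALLBACK_HOST:=denali.ccs.neu.edu}"
export CALLBACK_HOST
```

With this form, a user on a differently configured machine can still
override the default from the shell initialization file.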

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Problems with Passwords (Getting Around Security)}

There is a simple test to see if you need to read this  section.  Pick  a
remote machine, <HOSTNAME>, that you wish to execute on, and  type:  `rsh
<HOSTNAME>'. If this did not work, also try `ssh <HOSTNAME>'. If you were
asked for your password, then you and your system administrator may  need
to talk about security policy. If you were successful with `ssh' and  not
with `rsh' then set the environment variable, `RSH', to the value  `ssh',
as described in item~3 below.

\beginlist
\item{(1)}
    Ask your systems administrator to put the machines in a `hosts.equiv'
    file, so that logging in from one to the other  does  not  require  a
    password. (`man hosts.equiv')

\item{(2)}
    Add a `.rhosts' file to your home directory (or `.shosts' for `ssh').

\item{(3)}
    Hack around the problem: By default, the startup script uses `rsh' to
    start remote processes. However, if the  environment  variable  `RSH'
    was set, the script  uses  the  value  of  the  environment  variable
    instead of `rsh'. This may be useful, if you have  your  own  script,
    `myrsh', that automatically gets around  the  security  issues.  Then
    just type:

\begintt
RSH=myrsh; export RSH  # [ in sh/bash/... ]
setenv RSH myrsh       # [ in csh/tcsh/... ]
\endtt

\item{}
    The appropriate line for your shell  can  be  placed  in  your  shell
    initialization file. Alternatively, you can set this up for all users
    by placing the Bourne shell version (for `sh') somewhere between  the
    first and last line of `.../pkg/pargap/bin/pargap.sh'.  (The  example
    for `ssh' was given earlier.)

\item{(4)}
    `ssh': `man ssh' mentions some possibilities for giving the  password
    the first time, and then having ssh remember that  future  logins  to
    that machine are authorized for the duration of  the  session.  Don't
    overlook the use of `\$HOME/.ssh/config' to set  special  parameters,
    such as specifying a different login name on the remote machine. Some
    parameters of interest  might  be  `KeepAlive',  `RSAAuthentication',
    `UseRsh'. You may also find useful information in `man sshd'.

\item{(5)}
     After starting {\ParGAP}, manually call

\begintt
/tmp/pargapmpi--rsh.$$
\endtt

\item{}
    and repeatedly type in the password for each slave  process.  If  you
    find yourself doing this, you may  want  to  talk  with  your  system
    administrator, since it actually hurts system security  to  have  you
    repeatedly typing passwords with a  concomitant  risk  that  someone
    else will find out your password.

\endlist
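
As an illustration of item (3), a wrapper such as `myrsh' might simply
forward its arguments to `ssh'. The sketch below is an assumption about
what such a script could contain, not something shipped with {\ParGAP}.
The options `-x' and `-o BatchMode=yes' are standard OpenSSH options; the
variable `MYRSH_SSH' is invented here so that the underlying command can
be replaced while testing.

```shell
#!/bin/sh
# myrsh (hypothetical): run ssh where the startup script expects rsh.
# -x disables X11 forwarding; BatchMode=yes makes ssh fail at once
# instead of prompting for a password, so a hanging slave is easy to spot.
# MYRSH_SSH is an invented override for the command that is actually run.
myrsh() {
    "${MYRSH_SSH:-ssh}" -x -o BatchMode=yes "$@"
}
# When saved as an executable file named myrsh, end the file with:
#   myrsh "$@"
```

Because of `BatchMode', this wrapper only helps once password-free login
has been arranged, for example by one of the methods in items (1), (2)
or (4).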

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Section{Modifying the GAP kernel}

Note that this package modifies the {\GAP} `src'  and  `bin'  files,  and
creates a new {\GAP} kernel. This new {\GAP}  kernel  can  be  shared  by
traditional users of the old, sequential  {\GAP}  kernel,  and  by  those
doing parallel processing.

The {\GAP} kernel will have identical behavior to the old  {\GAP}  kernel
when invoked through  the  `gap.sh'  script  or  the  `bin/@GAParch@/gap'
binary. The new {\ParGAP} variables will appear to the end user *ONLY* if
the {\GAP} binary was invoked as `pargapmpi',  a  symbolic  link  to  the
actual {\GAP} binary. The script, `pargap.sh', does this.

So, in a multi-user environment, traditional users can  continue  to  use
`gap.sh'  without  noticing  any  difference.  Only  an   invocation   of
`pargap.sh' will add the new features.

In a future version of {\GAP}, it is hoped that the  {\GAP}  kernel  will
have enough ``hooks'', so that no modification of the  {\GAP}  kernel  is
required. At that time, it will also be possible to speed up the  startup
time for {\ParGAP}. Much of the startup time is  caused  by  waiting  for
{\GAP} to read its library files. It will be possible to use  the  {\GAP}
function, `SaveWorkspace()', to save a version with the  {\GAP}  library
pre-loaded. That saved version can then be used to  start  up  {\ParGAP}.
This is not currently possible, because {\ParGAP} needs  to  get  at  the
command line of {\GAP} before the {\GAP} kernel sees it.

Comments and contributions to a {\ParGAP} user library, or any other type
of assistance, are gratefully accepted.


Gene Cooperman
\Mailto{gene@ccs.neu.edu}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%E