Running Elan MPI jobs on QsNet Clusters --------------------------------------- If built with --with-qshell or --with-mqshell, pdsh may be used to run MPI jobs on a QsNet interconnect. This option requires that you have the Elan user space libraries installed (qsnetlibs or qswelan-r RPM for Linux) and that your kernel be patched to run the 'elan3' or 'elan4' and 'rms' device drivers. Pdsh can run independently of the RMS product (the 'rms' kernel module, which is used by pdsh, is a distinct facility from the RMS product). Quadrics has provided a PDSH FAQ which may answer some common questions about getting the qshell module to run MPI jobs. Please see http://web1.quadrics.com/twiki/bin/view/FAQs/SetupPDSH rms pdsh module --------------------------------------- Pdsh can also be run via the Quadrics RMS 'allocate' command such that allocate takes care of the node reservations and passes a batch ID through to pdsh via the RMS_RESOURCEID environment variable. Pdsh retrieves the list of allocated nodes out of the RMS database using the rmsquery command. This functionality is provided by the "rms" pdsh module (--with-rms). slurm pdsh module --------------------------------------- Similar to the rms pdsh module, the slurm module allows pdsh to target nodes based on SLURM allocations, either targetting an already running job or by running under ``srun --allocate'' The SLURM jobid can be passed to pdsh using the `-j' option provided by the module, or via the SLURM_JOBID environment variable, which is set by --allocate. The `/etc/elanhosts' config file --------------------------------------- Pdsh uses a simple config file, /etc/elanhosts, to describe hosts containing Elan adapters (and on which Elan MPI jobs may be run). The config file is also used by the daemons qshd and mqshd to initialize the Elan network error resolver thread. Parsing of the /etc/elanhosts file is accomplished by using the libelanhosts(3) library, upon which pdsh depends (when building for QsNet). See the elanhosts(5) man page for a description of the /etc/elanhosts file format. The libelanhosts package may be obtained from ftp://ftp.llnl.gov/pub/linux/libelanhosts/