<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head><title>Dolly - A program to clone disks / partitions</title></head>
<body>

<h1>Dolly &mdash; A program to clone disks / partitions</h1>

Version 0.57<br>
8 May 2003<br>
Felix Rauch &lt;<a href="mailto:rauch@inf.ethz.ch">rauch@inf.ethz.ch</a>&gt;

<p>
This document describes the program "dolly", its purpose and the
format of the required config-file.


<h2>Purpose</h2>

Dolly is used to clone the installation of one machine to (possibly
many) other machines. It can distribute image-files (even gnu-zipped),
partitions or whole hard disk drives to other partitions or hard disk
drives. As it forms a "virtual TCP ring" to distribute data, it works
best with fast switched networks (we were able to clone a 2 GB Windows
NT partition to 15 machines in our cluster over Gigabit Ethernet in
less than 4 minutes).
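<p>
As a rough check of what that figure implies, the arithmetic can be
sketched in a few lines of shell. The sketch assumes 1 GB = 1024 MB and
treats "less than 4 minutes" as exactly 240 seconds, so the results are
lower bounds derived from the numbers above, not measurements.

```shell
# Lower bound on the throughput implied by the figures quoted above.
size_mb=2048      # 2 GB partition, taking 1 GB = 1024 MB (assumption)
seconds=240       # "less than 4 minutes", taken as exactly 4 minutes
clients=15

# Data rate each machine has to sustain (receive, store, forward).
per_node=$(awk -v s="$size_mb" -v t="$seconds" \
    'BEGIN { printf "%.1f", s / t }')
# Total amount written to all client disks per second.
aggregate=$(awk -v s="$size_mb" -v t="$seconds" -v n="$clients" \
    'BEGIN { printf "%.0f", n * s / t }')

echo "per node:  at least $per_node MB/s"
echo "aggregate: at least $aggregate MB/s across $clients disks"
```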
<p>
As dolly clones whole partitions block-wise it works for most
filesystems. We used it to clone partitions of the following type:
Linux, Windows NT, Oberon, Solaris (most of our machines have multi
boot setups). We have a small (additional) Linux installation on all
of our machines or use a small one-floppy-disk-linux (e.g. muLinux) to
do the cloning. On newer machines we use PXE to boot a small system in
a RAM disk. From that system we then clone the hard disks in the
machines.

<h2>How it works</h2>

Setting up or upgrading a cluster of PCs typically leads to the
problem that many machines need the exact same files. There are
different approaches to distribute the setup of one "master" machine
to all the other machines in the cluster. Our approach is not
sophisticated, but simple and fast (at least for fast switched
networks). We send the data around in a "virtual TCP ring" from the
server to all the clients which store the received data on their local
disks.
<p>
One machine is the master and distributes the data to the others. The
master can be a machine of the cluster or some other machine (in the
current version of dolly it should be the same architecture
though). It stores the image of the partition or disk to be cloned or
has the partition on a local disk. The server should be on a fast
switched network (as all the other machines too) for fast cloning.
<p>
All other machines are clients. They receive the data from the ring,
store it to the local disk and send it to the next machine in the
ring.
<p>
The cloning process is depicted in the following two figures. Usually
there are more than two clients, but you get the idea:
<pre>
      +--------+  +----------+ +----------+
      | Master |  | Client 1 | | Client 2 |
      +----+---+  +---+------+ +----+-----+
            \         |            /
             \    +---+----+      /
              +---+ Switch |-----+
                  +--------+

</pre>
Above: Cloning process, physical network
<pre>


     +--------+  Data   +----------+  Data  +----------+
     | Master |-------->| Client 1 |------->| Client 2 |
     +--------+         +----------+        +----------+
         ^                   |                   |
         | Data              | Data              | Data
         |                   V                   V
      +------+            +------+            +------+
      | Disk |            | Disk |            | Disk |
      +------+            +------+            +------+

</pre>
Above: Cloning process, virtual network with TCP connections
<p>
We chose this method instead of a multicast scheme because it is
simple to implement, does not require writing a reliable multicast
protocol and works quite well with existing
technologies. One could also use the master as an NFS server and copy
the data to each client, but this puts quite a high load on the server
and makes it the bottleneck. Furthermore, it would not be possible to
directly clone partitions from one machine to some others without any
filesystem in the partition.
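<p>
The store-and-forward idea can be tried out locally with standard
tools. The following sketch is only an analogy, not Dolly's
implementation: each <tt>tee</tt> stage stands in for one client that
writes the stream to its "disk" (here an ordinary file in a temporary
directory, with made-up names) and passes the same bytes on to the next
stage. In Dolly the stages are separate machines linked by TCP
connections.

```shell
# Local analogy of the multi-drop chain; all file names are made up.
workdir=$(mktemp -d)
printf 'pretend this is a partition image\n' > "$workdir/master.img"

# master -> client 1 -> client 2 -> client 3 (the last one only stores)
cat "$workdir/master.img" \
    | tee "$workdir/client1.img" \
    | tee "$workdir/client2.img" \
    > "$workdir/client3.img"

# Every "client" should now hold an identical copy.
for copy in client1 client2 client3; do
    cmp -s "$workdir/master.img" "$workdir/$copy.img" \
        && echo "$copy: identical"
done
```

<p>
Note that the master sends the data only once, no matter how many
stages are appended to the pipeline. This is the property that keeps
the server from becoming the bottleneck.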


<h2>Different cloning possibilities</h2>

There are different possibilities to clone your master machine:
<ul>
<li>You already have an image of the partition which you want to clone
  on your master (raw or compressed). In this case you need Linux
  (some other UNIX might also work, but we haven't tested that yet) on
  your master and a Linux on each client.</li>

<li>You want to clone a partition which is on a local disk of your
  master. In this case you need Linux (or probably another UNIX, we
  haven't tried that) on your master as well as on all the clients.
  You can use any Linux installation as long as it's not the one you
  want to clone (i.e. you can not clone the Linux which you are
  currently running in. See the warning below).</li>

<li>You want to clone a whole disk including all the partitions. In this
  case you either need a second disk on all machines where your Linux
  used for the cloning process runs on (not the one you want to clone)
  or you need a small one-floppy-disk-Linux which you boot on all
  machines. In the latter case you also need dolly on all machines
  (copy it to your floppy disk or mount it with NFS) and the
  config-file on the master.</li>
</ul>
<b>WARNING</b>: You can NOT clone an OS which is currently in use. That's why
         we have a small second Linux installation on all of our
         machines so that we can boot this to clone our regular Linux
         partition.
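<p>
An image for the first scenario has to be created beforehand, while
the partition is not in use. One common way to do that, outside of
Dolly itself, is <tt>dd</tt> piped through <tt>gzip</tt>. The sketch
below is an assumption about a reasonable workflow; the helper
<tt>make_image</tt> and all paths are hypothetical. The output name
ends in ".gz" because Dolly warns about compressed input files that do
not. The demonstration runs against an ordinary scratch file so that it
is safe to try.

```shell
# make_image SOURCE TARGET.gz : block-wise copy of SOURCE, gzip-compressed.
make_image() {
    dd if="$1" bs=64k 2>/dev/null | gzip -c > "$2"
}

# Safe demonstration on a scratch file instead of a real partition.
scratch=$(mktemp -d)
dd if=/dev/zero of="$scratch/fake_partition" bs=1000 count=200 2>/dev/null
make_image "$scratch/fake_partition" "$scratch/fake_partition.gz"

# Verify that the image decompresses to the original bytes.
gzip -dc "$scratch/fake_partition.gz" \
    | cmp -s - "$scratch/fake_partition" \
    && echo "image verified"
```

<p>
On a real master one would pass the device instead of the scratch file,
for example <tt>make_image /dev/sda10 /images/cloneimages/sda10.gz</tt>,
from a system that does not have that partition mounted.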

<h2>Changes since version 0.2</h2>

We applied some changes to Dolly since version 0.2. Most of them are
not very important.
<ul>
<li>Dolly as a benchmarking tool.<br>
  Dolly can now be used to benchmark your network. In the dummy mode,
  Dolly will not access the hard disk, neither for reading nor for
  writing. It just transfers data between your machines. This might be
  useful for testing the throughput of your switch. The running time
  for such a run can be specified with the "-t" option on the command
  line. With the "-o" option you can specify a logfile where Dolly
  will write some statistical information.</li>

<li>Using extra network interfaces.<br>
  It's now possible to use multiple network interfaces for the data
  transfer. This is mostly useful if you have multiple network
  interfaces with similar speeds, e.g. two fast ethernet networks (one
  for administration/logins and the other for your applications
  communication). For example: If your machines are connected with two
  fast ethernet links, then you should be able to increase the
  throughput of the cloning process from 10 to 20 MB/s, therefore
  cutting the cloning-time by half.<br>
  You need the "add" option in the config file to use this feature.
  WARNING: This feature has only been tested with the linear network
  topology (no fanout option or "fanout 1" option in the config file).</li>

<li>Different networking topologies.<br>
  We tried different topologies (binary trees, ternary trees, ...) to
  get some more results for a paper, but the initial multi-drop chain
  (virtual TCP ring) is still the best. You will most likely not need
  this feature.</li>
</ul>
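<p>
A benchmark run might be set up as sketched below. The host names, the
log file name and the 60-second duration are placeholders, and the
script only writes the configuration file and prints the commands
instead of running them. It relies on the "dummy" config entry
described in the configuration file section. Whether the "-d" switch is
needed in addition is not spelled out in this document, so the sketch
leaves it out.

```shell
# Write a dummy-mode configuration for one master and three clients.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
dummy
server node0
firstclient node1
lastclient node3
clients 3
node1
node2
node3
endconfig
EOF

# "-s" has to come before "-f". "-t" sets the run time in seconds and
# "-o" names the logfile for the statistics.
server_cmd="dolly -v -s -t 60 -o bench.log -f $cfg"
echo "on node1..node3: dolly -v"
echo "on node0:        $server_cmd"
```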

<h2>Change in version 0.57</h2>

Besides some bug fixes and smaller improvements, it's now possible to
split an image into multiple files for archival and send the
multiple-file image to the clients. This makes it possible to store
arbitrarily large partitions on file systems with a file size limit. For details
and examples, see the section about the configuration file below
(parameters <em>infile</em> and <em>outfile</em>).
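<p>
To see how many pieces a split image ends up in, the file names can be
listed with a little shell arithmetic. <tt>chunk_names</tt> is a
hypothetical helper, not part of Dolly. It assumes that the numbering
starts at 0, as in the <tt>sda_all.gz_0</tt>, <tt>sda_all.gz_1</tt>
example given for "outfile", and it takes both sizes in the same unit.

```shell
# chunk_names BASE TOTAL CHUNK : print BASE_0, BASE_1, ... for TOTAL
# units of data stored in pieces of at most CHUNK units each.
chunk_names() {
    base=$1
    total=$2
    chunk=$3
    count=$(( (total + chunk - 1) / chunk ))   # round up
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "${base}_$i"
        i=$((i + 1))
    done
}

# A 9 GB data stream written with "split 2G" would need five files.
chunk_names /images/cloneimages/sda_all.gz 9 2
```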
 
<h2>Configuration file</h2>

You need a configuration file for the cloning process. Its format is
strict, but easy. It contains the following entries (note that the
order of the entries is fixed):<br>
(The text after "Syntax:" explains the syntax of the entry, the lines
following "EG:" are example lines)
<ol>
<li>The file/partition you want to clone, preceded by the keyword
   "infile", or by "compressed infile" in case of a compressed image.
   This file or partition needs to be available on the master only.
   Dolly will warn you if you try to use a compressed infile which
   does not end with ".gz". The compressed keyword is important so
   that the master can inform the clients when they have to use gunzip
   before writing a file. The optional keyword "split" after the
   filename instructs Dolly to read all files with the given name and
   an appended number, separated by an underscore.<br>
   <b>Syntax</b>: [compressed] infile <em>input file or device</em> [split]<br>
   EG: <tt>infile /dev/sda10</tt><br>
   -> Will just send the partition <tt>/dev/sda10</tt> to all clients.<br>
   EG: <tt>compressed infile /images/cloneimages/sda10_WinNTRes.gz</tt><br>
   -> Will send the given file compressed to all the clients,
   instructing them to uncompress the image before writing it.<br>
   EG: <tt>infile /images/cloneimages/sda split</tt><br>
   -> Will send all files of the form /images/cloneimages/sda_&lt;number&gt;
   in order to the clients.<br>
   EG: <tt>compressed infile /images/cloneimages/sda.gz split</tt><br>
   -> Will send all files of the form /images/cloneimages/sda.gz_&lt;number&gt;
   in order to the clients, instructing them to decompress the
   incoming stream before writing it.<br>
<p>
<li>The file or partition you want to write (usually it's a partition,
   but you can also write to a file) after the keyword "outfile". This
   file needs to be available on the clients only. The optional
   keyword "compressed" instructs the server to compress the data
   before sending it, so the client will store the data
   compressed. The optional keyword "split" after the filename,
   followed by a number and a multiplier, instructs the client to
   write the data in chunks of no more than the given size. This is
   useful if the file system on your client does not allow files
   greater than a certain size. The files will be stored with the
   given name and an appended <tt>_&lt;number&gt;</tt>.<br>
   <b>Syntax: [compressed] outfile <em>output file or device</em>
   [split <em>n</em>(k|M|G|T)]</b><br>
   EG: <tt>outfile /dev/sda10</tt><br>
   -> Will store the incoming data stream to the partition sda10.<br>
   EG: <tt>compressed outfile /images/cloneimages/sda10_SuSE81.gz</tt><br>
   -> Will store the compressed data stream in the given file.<br>
   EG: <tt>compressed outfile /images/cloneimages/sda_all.gz split 2G</tt><br>
   -> Will store the incoming compressed data stream in the directory
   <tt>/images/cloneimages/</tt> in files <tt>sda_all.gz_0</tt>,
   <tt>sda_all.gz_1</tt> and so on.<br>


<p>
   Instead of the first two entries ("infile" and "outfile") it is
   also possible to use the single line "dummy <em>[MB]</em>", where
   <em>MB</em> is  the number of Megabytes to transfer in dummy
   mode. If &lt;MB&gt; is set to 0, then the clients will just
   terminate. This is useful when benchmarking with different options,
   so the clients can run all the time. To finally terminate them on
   all clients, just set dummy to 0.<br>
   NOTE: It is probably better to use the newer "-t" switch on the
   server to specify the number of seconds the benchmarks should
   run. In that case you can leave the &lt;MB&gt; blank.<br>
   <b>Syntax: dummy <em>[MB]</em></b><br>
   EG: <tt>dummy 128</tt>
<p>
   The optional keyword "segsize" is mostly used to benchmark
   switches. It specifies the maximal size of TCP segments during the
   network transfer. Usually you don't need to specify this option at
   all.<br>
   <b>Syntax: segsize <em>TCP_MAXSEG_size</em></b><br>
   EG: <tt>segsize 128</tt>
<p>
   With the optional keyword "add" it is possible to add more
   interfaces to use. The network traffic is then evenly distributed
   across the interfaces. This option is useful if you have for
   example two fast ethernet interfaces in your machines: One for
   administrative purposes and one for your main application on the
   cluster. This option is not so useful if you have multiple
   interfaces with different bandwidths. In this case just use the
   fastest available.<br>
   You have to specify the number of additional interfaces and the
   suffixes of those interfaces. For example, in a cluster where the
   machines are named slave0..slave15 on their default interfaces and
   all the machines have a second interface named
   slave0-fast..slave15-fast, you should use the line specified below
   (EG).<br>
   <b>Syntax: add <em>nr</em>:<em>suffix</em>{:<em>suffix</em>}</b><br>
   EG: <tt>add 1:-fast</tt>
<p>
   The optional keyword "fanout" was mostly used during performance
   tests of different network topologies. You will rarely need it in
   practice. Fanout specifies the number of outlinks from the server
   and the following machines (except the leaves). A fanout of 1 is a
   linear list (the default behaviour of Dolly and usually the
   fastest), 2 is a binary tree, 3 is a ternary tree, etc. Dolly
   automatically connects all the specified clients with the desired
   topology.<br>
   <b>Syntax: fanout <em>fanout</em></b><br>
   EG: <tt>fanout 1</tt>
<p>
   The optional keyword "hyphennormal" instructs Dolly to treat the '-'
   character in hostnames as any other character. By default the
   hyphen is used to separate the base hostnames from the names of the
   different interfaces (e.g. "node12-giga"). You might use this
   parameter if your hostnames include a hyphen (e.g. "node-12").<br>
   <b>Syntax: <tt>hyphennormal</tt></b><br>
   EG: <tt>hyphennormal</tt><br>
<p>
<li> After the keyword "server" follows the hostname of the server (or
   master). This is required for the last machine in the ring to be
   able to send the end-acknowledge back to the server.<br>
   <b>Syntax: server <em>master machine</em></b><br>
   EG: <tt>server cluster-master</tt><br>
<p>
<li> This entry has the keyword "firstclient" followed by the hostname
   of the first client in the ring. You should use the hostname of the
   machine here, not the name of the interface where you want to
   connect.<br>
   <b>Syntax: firstclient <em>name of first machine</em></b><br>
   EG: <tt>firstclient cluster-1</tt><br>
<p>
<li> This entry has the keyword "lastclient" followed by the hostname of
   the last client in the ring. You should use the hostname of the
   machine here, not the name of the interface where you want to
   connect.<br>
   <b>Syntax: lastclient <em>name of last machine</em></b><br>
   EG: <tt>lastclient cluster-9</tt><br>
<p>
<li> This entry specifies how many clients are in the ring. The keyword
   is "clients" followed by the actual number of clients. This number
   does not include the master.<br>
   <b>Syntax: clients <em>number of clients</em></b><br>
   EG: <tt>clients 9</tt><br>
<p>
<li> The following lines contain the interface-names of the client
   machines. The number of machines must match the above number of
   clients (see 6.). You should use the name of the interface on
   which the machines will receive the data.<br>
   <b>Syntax: <em>name of client 1</em><br>
           <em>name of client 2</em><br>
           [...]<br>
           <em>name of client n</em></b><br>
   EG: <tt>cluster-1-giga<br>
       cluster-2-giga<br>
       [...]<br>
       cluster-9-giga</tt><br>
<p>
<li> The last entry in the config file consists of the keyword
   "endconfig" and marks the end of the configuration file.<br>
   <b>Syntax: endconfig</b><br>
   EG: <tt>endconfig</tt><br>
</ol>
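<p>
Because the order and the client count are easy to get wrong, a small
pre-flight check can help. The following sketch is not part of Dolly.
<tt>check_clients</tt> is a hypothetical helper which assumes one entry
per line, as in the examples above, and only compares the number given
after "clients" with the number of host lines before "endconfig". The
sample configuration reuses the host names from the examples above.

```shell
# check_clients FILE : compare the "clients" count with the host lines.
check_clients() {
    awk '
        $1 == "clients"   { want = $2; counting = 1; next }
        $1 == "endconfig" { ended = 1; counting = 0 }
        counting && NF    { have++ }
        END {
            if (!ended) { print "missing endconfig"; exit 1 }
            if (have != want) {
                printf "expected %d clients, found %d\n", want, have
                exit 1
            }
            print "ok: " want " clients"
        }
    ' "$1"
}

# Sample configuration with the entries in the required order.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
infile /dev/sda10
outfile /dev/sda10
server cluster-master
firstclient cluster-1
lastclient cluster-3
clients 3
cluster-1-giga
cluster-2-giga
cluster-3-giga
endconfig
EOF
check_clients "$cfg"
```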

<h2>Dolly options</h2>

Dolly has a few options which are explained here:
<ul>
<li>-h<br>
    Prints a short help and exits.
<br><br>
</li>
<li>-V<br>
    Prints the version number as well as the date of that version and exits.
<br><br>
</li>
<li>-v<br>
    This switches to verbose mode in which dolly prints out a little
    bit more information. This option is recommended if you want to
    know what's going on during cloning and it might be helpful during
    debugging.
<br><br>
</li>
<li>-s<br>
    This option specifies the server machine and should only be used
    on the master. Dolly will warn you if the config file specifies
    another master than the machine on which this option is set.
    This option must be specified before the "-f" option!
<br><br>
</li>
<li>-c<br>
    With this option it is possible to specify the uncompressed size
    of a compressed file. It's only needed for performance statistics
    at the end of a cloning process and not important if you are not
    interested in the statistics.
<br><br>
</li>
<li>-d<br>
    The "Dummy" option disables all disk accesses. It can be used to
    benchmark the throughput of your system (computers, network,
    switches). This option must be specified before the "-f" option!
<br><br>
</li>
<li>-t <em>seconds</em><br>
    When in dummy mode, this option lets you specify approximately
    how long the test run should take. Since the dummy mode is mostly
    used for benchmarking purposes and single runs might result in
    different speeds (especially with many machines and bad switches
    or with small TCP segment sizes), it's more convenient to specify
    the run length in seconds, as the benchmark time becomes more
    predictable.
<br><br>
</li>
<li>-f <em>config file</em><br>
    This option is used to select the config file for this cloning
    process. This option only makes sense on the master machine and
    the configuration file must exist on the master.
<br><br>
</li>
<li>-o <em>logfile</em><br>
    This option specifies the logfile. Dolly will write some
    statistical information into the logfile. It is mostly
    used when benchmarking switches. The format of the lines in the
    logfile is as follows:<br>
    Trans. data [MB], Segsize [Byte], Clients [#], Time [s], Dataflow
    [MB/s], Agg. dataflow [MB/s]
<br><br>
</li>
<li>-a <em>timeout</em><br>
    Sometimes it might be useful if Dolly would terminate instead of
    waiting indefinitely in case something goes wrong. This option
    lets you specify this timeout. If dolly could not transfer any
    data after <em>timeout</em> seconds, then it will simply print an error
    message and terminate. This feature might be especially useful for
    scripted and automatic installations (such as "CloneSys"), where
    you don't want to have dolly-processes hang around if a machine
    hangs.
<br><br>
</li>
</ul>
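<p>
For scripted installations, the "-a" option might be wrapped as
sketched below. Two things are assumptions rather than documented
behaviour: that "-a" is accepted on clients as well as on the server,
and that Dolly exits with a non-zero status after printing its error
message. The <tt>DOLLY</tt> variable is a hypothetical hook that lets
the wrapper be tried without a real Dolly binary.

```shell
# run_client TIMEOUT : start a Dolly client that gives up after TIMEOUT
# seconds without data, and report the outcome to the calling script.
# Assumes a non-zero exit status on failure (not documented here).
run_client() {
    if "${DOLLY:-dolly}" -v -a "$1"; then
        echo "cloning finished"
    else
        echo "cloning failed or timed out" >&2
        return 1
    fi
}

# Dry run with stand-ins for the real binary:
DOLLY=true  run_client 300
DOLLY=false run_client 300 || echo "installer would abort here"
```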

<h2>Starting the process</h2>

To start the cloning, you need to start dolly on each machine. It is
recommended to start it with the "-v" (verbose) option. The order in
which you start the programs on the master and the clients doesn't
matter. You must give the "-s" (server) option on exactly one machine
(the master).
<p>
When the machines have found each other and the ring is completed, the
cloning starts. Dolly will print some progress information every
10 MBytes.


<h2>Example</h2>

In this example we assume a cluster of 16 machines, named
node0..node15. We want to clone the partition sda5 from node0 to all
other nodes. The configuration file (let's name it dollytab.cfg)
should then look as follows:
<pre>
  infile /dev/sda5
  outfile /dev/sda5
  server node0
  firstclient node1
  lastclient node15
  clients 15
  node1
  node2
  node3
  node4
  node5
  node6
  node7
  node8
  node9
  node10
  node11
  node12
  node13
  node14
  node15
  endconfig
</pre>
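<p>
Typing fifteen host names by hand invites mistakes, so the same file
could also be generated. The loop below is a sketch using the host
names and the partition from this example. It writes to a temporary
file (copy it to dollytab.cfg if it looks right) and derives the
"clients" count from the same variable as the host list, so that the
two cannot drift apart.

```shell
last=15                      # highest node number; node0 is the master
cfg=$(mktemp)
{
    echo "infile /dev/sda5"
    echo "outfile /dev/sda5"
    echo "server node0"
    echo "firstclient node1"
    echo "lastclient node$last"
    echo "clients $last"
    # One line per client, in ring order.
    n=1
    while [ "$n" -le "$last" ]; do
        echo "node$n"
        n=$((n + 1))
    done
    echo "endconfig"
} > "$cfg"
cat "$cfg"
```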
Next, we start Dolly on all the clients. No options are required for
the clients (but you might want to add the "-v" option for verbose
progress reports). Finally, Dolly is started on the server as follows:<br>
<pre>  dolly -v -s -f dollytab.cfg</pre>
That's all.

<HR>
<A HREF="http://www.cs.inf.ethz.ch/">ICS</A> [Laboratory for
Computersystems], <A HREF="http://www.inf.ethz.ch/">DINFK</A>
[Dept. of Computer Science], <A HREF="http://www.ethz.ch/">ETHZ</A>
[Swiss Institute of Technology], <a href="http://www.cs.inf.ethz.ch/stricker/CoPs/patagonia/">Patagonia Cluster Project</a>.
<br><br>
<address>
Maintained by <a href="http://www.cs.inf.ethz.ch/~rauch/">Felix Rauch</a>.
</address>
Last changed: 8-may-2003
</body>
</html>