ka-deploy-server-host-0.92-23mdv2010.0.x86_64.rpm

What is Ka-deploy?
------------------

One of the many problems in setting up a cluster is installing the OS on all of the machines, and there can be MANY of them.
Ka-deploy is a tool that replicates one Linux machine onto many others simultaneously: it creates many clones of one machine in a single pass.
Typically you only need to install one machine, prepare a few autoconfiguration scripts that update a few parameters on the nodes, and then use Ka-deploy to replicate your system onto N nodes at the same time, and efficiently.

How does it work?
-----------------

Ka-deploy needs you to start a minimal Linux system (with initrd or nfs_root) on the nodes you want to install (the target nodes). This system then launches the Ka-deploy client, which connects to the Ka-deploy server. The Ka-deploy server must be running on the node you want to clone.
After the clients have contacted the server, they connect to each other to form a chain of TCP connections. The server, which sits at one end of the chain, then uses this chain to send the data (i.e. the system) to all the clients: the server uses the tar command to produce a stream of data corresponding to the system, and sends this stream through the chain. Each client reads the data arriving through the chain, feeds it to the tar command to recreate the system on the local disk, and also forwards it to the rest of the chain.

 tar             tar             tar             tar             tar
  |               ^               ^               ^               ^
  |               |               |               |               |
  V               |               |               |               |
server ------> client1 ------> client2 ------> client3 ------> client4 ---...
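
The chain can be imitated on a single machine to see how the data flows. The following is only a minimal sketch, not Ka-deploy's code: each "node" is a plain directory, shell pipes stand in for the TCP connections, and the function names relay, chain and clone_chain are made up for this illustration.

```shell
#!/bin/bash
# Local illustration of the chain: every "node" is just a directory.

# relay DIR: copy stdin to stdout unchanged while also unpacking it into DIR
# (the job of a client in the middle of the chain).
relay() {
    local dest=$1 tmp pid
    tmp=$(mktemp -d) || return 1
    mkfifo "$tmp/pipe"
    # Local tar reads its copy of the stream from the FIFO.
    tar -C "$dest" -xf - < "$tmp/pipe" > /dev/null &
    pid=$!
    tee "$tmp/pipe"      # duplicate the stream: local tar + next node
    wait "$pid"          # do not return before the local copy is complete
    rm -rf "$tmp"
}

# chain DIR...: unpack stdin into every DIR, passing the stream along.
chain() {
    local first=$1
    shift
    if [ "$#" -eq 0 ]; then
        tar -C "$first" -xf -        # last node: nothing to forward
    else
        relay "$first" | chain "$@"
    fi
}

# clone_chain SRC DIR...: the "server" end, which produces the tar stream.
clone_chain() {
    local src=$1
    shift
    tar -C "$src" -cf - . | chain "$@"
}
```

For instance, `clone_chain /some/tree /tmp/n1 /tmp/n2 /tmp/n3` (paths chosen for illustration, destination directories must exist) reads the source tree once and leaves a copy in each directory.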

Performance
-----------

On a switched, full-duplex Fast Ethernet network, this method gives the maximum bandwidth for sending the data from the server to all the machines, and the total time taken by the replication is almost independent of the number of target machines.
For instance, on our cluster of PIII-733 machines with IDE drives, I can install 60 nodes at the same time, sending 1.5 gigabytes at an average bandwidth of 6 MBytes/second.
In fact the limiting factor is the speed of the hard drives, and using tar to replicate the system does not help from a performance point of view.
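
To put these figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch under assumptions: 1.5 gigabytes is taken as 1500 MB, start-up costs are ignored, and the "one node after the other" baseline is a hypothetical comparison, not a measurement from this cluster.

```shell
#!/bin/bash
# chain_seconds SIZE_MB BW_MB_S
# Time for the stream to pass through the chain; it does not depend on the
# number of nodes, since every node receives the data at the same time.
chain_seconds() {
    echo $(( $1 / $2 ))
}

# sequential_seconds SIZE_MB BW_MB_S NODES
# Hypothetical baseline: the same image sent to each node in turn.
sequential_seconds() {
    echo $(( $3 * $1 / $2 ))
}

chain_seconds 1500 6            # prints 250   (a bit over 4 minutes)
sequential_seconds 1500 6 60    # prints 15000 (a bit over 4 hours)
```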

If you take a look at the server you will see that the clients do not in fact form a chain, but a tree. However, the best performance is obtained when the arity of this tree is 1, in which case the tree is simply a chain :)

How can I use it?
-----------------

* Server side

The server program just needs to be run on the node you want to clone. Be sure to run the server as root: otherwise, you will end up with bad permissions and missing files on the replicated systems. Be sure to check the EXTCOMMAND string in server.c before compiling.

The syntax for the server is:

ka-d-server [ -n nb_clients ] [ -a arity ] [ -s session_name ]

Where nb_clients is an integer: the number of clients to wait for.
Once nb_clients clients have contacted the server, the data transfer starts, and no other clients can connect.
Session name: the clients have two methods for locating the server. Either you give them the hostname of the server, or you give them a 'session name'. In the latter case, the clients try to find the server by sending broadcast UDP packets, and servers that have the same session name answer.
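
As an illustration of the syntax above, a small wrapper could check for root and then start the server. This is a sketch only: the function name start_server and the DRY_RUN variable are inventions of this example (DRY_RUN prints the command instead of running it), and the session name clusterinstall is just the one used in the client example.

```shell
#!/bin/bash
# start_server NB_CLIENTS SESSION_NAME
# Starts ka-d-server for the given number of clients and session name.
# DRY_RUN is a hypothetical switch of this sketch, not a Ka-deploy feature.
start_server() {
    local nb_clients=$1 session=$2
    if [ -z "${DRY_RUN:-}" ] && [ "$(id -u)" -ne 0 ]; then
        echo "ka-d-server must be run as root" >&2
        return 1
    fi
    set -- ka-d-server -n "$nb_clients" -s "$session"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"       # only show what would be run
    else
        "$@"
    fi
}
```

Running `DRY_RUN=1 start_server 60 clusterinstall` shows the command line without starting anything.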

* Client side

Ka-deploy itself only provides the pair of server/client programs. This means that extra material (i.e. a minimal system image and a few scripts) is needed on the client side. Personally I use sfdisk to repartition the hard drive of the machine, then I use mke2fs to format the partitions, and I mount the future root filesystem under /tmp/disk, which is where the tar command will put the data. Be sure to check the EXTCOMMAND string in client.c before compiling.
Read the boot.txt file for more information about the whole boot/install process.

Example:

#!/bin/bash
# Script to run on the nodes: prepare the hard drive, mount the partitions,
# and start the client. Reboots upon completion.

echo "Setting hard drive optimizations"
hdparm -c1 -d1 -K1 /dev/hda
echo "Partitioning hard drive"
/sbin/sfdisk /dev/hda -uM < /part
echo "Formatting partitions"
mke2fs /dev/hda1
mkswap /dev/hda2
mount /dev/hda1 /tmp/disk
# The name of the machine that runs the server is written in the file /servername.
# Another possibility: "/ka-d-client -s clusterinstall" tries to locate a server
# that has the session name clusterinstall.
if /ka-d-client -h "$(cat /servername)" ; then
    echo "OK"
else
    echo "Problem during installation"
fi
sync
# Keep retrying until the filesystem can be unmounted cleanly.
while ! umount /tmp/disk ; do
    echo "Umount failed"
    sleep 1
    echo "Trying again"
done
/sbin/reboot

When the scripts are written with care, it is even possible to duplicate a Linux system that uses several partitions (just mount them under /tmp/disk, for instance the /usr partition on /tmp/disk/usr).
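
With several partitions, the order of the mounts matters: the root partition has to be mounted before anything below it. The following sketch prints the mount commands in a safe order. The function name mount_plan and the device /dev/hda3 are assumptions made for this example, and paths containing spaces are not handled.

```shell
#!/bin/bash
# mount_plan DEVICE:MOUNTPOINT...
# Prints one "mkdir -p ... && mount ..." line per pair, with parent
# directories listed before the directories below them.
mount_plan() {
    local pair
    for pair in "$@"; do
        # Put the mount point first so that sorting orders by path.
        printf '%s %s\n' "${pair#*:}" "${pair%%:*}"
    done | LC_ALL=C sort | while read -r dir dev; do
        echo "mkdir -p $dir && mount $dev $dir"
    done
}

# Hypothetical layout: root on hda1, /usr on hda3.
mount_plan /dev/hda3:/tmp/disk/usr /dev/hda1:/tmp/disk
```

The printed lines can be reviewed and then piped to sh on the node once the partitions have been formatted.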

The syntax for the client is:

ka-d-client [ -g ] [ -h host_name ] [ -s session_name ]

Where host_name is the name of the machine where the server is running.
See above for an explanation of 'session_name'.

The -g option means 'go': sometimes you don't know in advance how many clients will take part in the installation (say a few of the clients failed to boot, but you want to install the clients that are ready). You can then force the server to start the transfer by running ka-d-client with the -g option on any machine: this client tells the server to start the transfer, and then exits. This client will NOT be part of the chain, nor will it receive the data.
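
One way to use -g is to give the nodes a fixed amount of time to boot and then start the transfer with whoever is ready. This is a sketch under assumptions: it supposes that -g can be combined with -s as the syntax line suggests, and the function name force_start and the KA_CLIENT variable (which lets you substitute another command for testing) are inventions of this example.

```shell
#!/bin/bash
# force_start DELAY_SECONDS SESSION_NAME
# Waits, then asks the server of the given session to start the transfer.
# KA_CLIENT is a hypothetical override of the client binary for dry runs.
force_start() {
    local delay=$1 session=$2
    sleep "$delay"
    "${KA_CLIENT:-ka-d-client}" -g -s "$session"
}
```

For example, `force_start 300 clusterinstall` would wait five minutes before telling the server to go.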

Mini System
-----------

Building the mini system needed by the client is a pain. Mine comes from the mini distribution Mininux (http://mininux.free.fr , sorry, it's in French). It can be found on the FTP space of the ka-tools project ( ftp://ka-tools.sourceforge.net/pub/ka-tools/ ), and is named nfsroot.tar.gz.