pcp-0.3.2-11mdk.src.rpm

http://www.cs.berkeley.edu/~bnc/pcp/docs/index.html

PCP Installation Instructions
Brent Chun

Copyright © 2002 by Brent Chun (bnc@caltech.edu)

0.1, 2002-02-27
Revision History
Revision 0.1 2002-02-27
Initial version.

pcp is a tool for replicating files on multiple nodes of a PC cluster. It builds an n-ary tree of TCP sockets and replicates data over it using parallelized, pipelined transfers with RSA authentication. This document briefly describes how to download, build, install, and use pcp.

Table of Contents
1. Downloading, Building, and Installing pcp
2. Using pcp
    2.1. Setting your PATH
    2.2. Writing a file to multiple nodes
    2.3. Checksumming a file on multiple nodes
    2.4. Deleting a file on multiple nodes
    2.5. Optimizing the Distribution Tree


1. Downloading, Building, and Installing pcp

The initial release of the software (v0.1) is available from my web page: pcp-0.1.tar.gz. Unpack, configure, and build it as follows:

   # tar xvfz pcp-0.1.tar.gz
   # cd pcp-0.1
   # ./configure
   # make

Installation is a two-step process. First, install the software by running make install on every node. This command does essentially all the work, including installing the configuration files that start both pcpd (through xinetd) and authd (through init).

   # make install

Second, we generate a new RSA public/private key pair using openssl and replicate it on all nodes of the cluster.

   # openssl genrsa -out auth_priv.pem
   # chmod 600 auth_priv.pem
   # openssl rsa -in auth_priv.pem -pubout -out auth_pub.pem

Once the new key pair has been created, copy both keys into /etc on every node.

   # scp auth_priv.pem node1:/etc/auth_priv.pem
   # scp auth_pub.pem  node1:/etc/auth_pub.pem
   # scp auth_priv.pem node2:/etc/auth_priv.pem
   # scp auth_pub.pem  node2:/etc/auth_pub.pem
   # scp auth_priv.pem node3:/etc/auth_priv.pem
   # scp auth_pub.pem  node3:/etc/auth_pub.pem
   .... and so on .... 
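
For larger clusters, the per-node scp commands above can be generated by a loop. The following is a minimal sketch, assuming the node names are passed as arguments and the key files are in the current directory. The function name copy_keys and the DRY_RUN variable are illustrative and are not part of pcp.

```shell
# Copy the key pair into /etc on every node given as an argument.
# copy_keys and DRY_RUN are illustrative names, not part of pcp.
# With DRY_RUN=1 the scp commands are printed instead of executed.
copy_keys() {
    for node in "$@"; do
        for key in auth_priv.pem auth_pub.pem; do
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "scp $key $node:/etc/$key"
            else
                # Stop at the first failed copy so it is not overlooked.
                scp "$key" "$node:/etc/$key" || return 1
            fi
        done
    done
}

# Example: preview the commands for three nodes, then run them.
# DRY_RUN=1; copy_keys node1 node2 node3
# DRY_RUN=0; copy_keys node1 node2 node3
```

Previewing with DRY_RUN=1 first is a cheap way to catch a mistyped node name before the private key is sent anywhere.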

With a make install and the copying of the new key pair, pcp should now be ready for use.

2. Using pcp

pcp can write a file to a set of nodes, checksum a file on those nodes, or delete a file from them.

2.1. Setting your PATH

Add the directory containing the pcp binaries to your PATH:

   # export PATH=$PATH:/usr/local/pcp/bin

2.2. Writing a file to multiple nodes

The following command copies the local file foo.txt to nodes tgl0, tgl1, tgl2, and tgl3 as /tmp/foo.txt. The -v option produces verbose output. Writing is the default operation, so no switch is needed. Output is shown below.

   # pcp -v foo.txt /tmp/foo.txt tgl0 tgl1 tgl2 tgl3

   ##################################################
   Write succeeded on tgl0.cacr.caltech.edu (131.215.145.40)
   Write succeeded on tgl1.cacr.caltech.edu (131.215.145.41)
   Write succeeded on tgl2.cacr.caltech.edu (131.215.145.42)
   Write succeeded on tgl3.cacr.caltech.edu (131.215.145.43)
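
Because pcp takes the target nodes as trailing arguments, a long node list can be kept in a file and expanded on the command line. A minimal sketch, assuming a hypothetical file nodes.txt with one hostname per line; the helper read_nodes is illustrative and is not part of pcp.

```shell
# Print the hostnames listed in a file, one per line, skipping blank
# lines and lines that start with '#'.
# read_nodes is an illustrative helper, not part of pcp.
read_nodes() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1"
}

# Example (nodes.txt is a hypothetical file):
# pcp -v foo.txt /tmp/foo.txt $(read_nodes nodes.txt)
```

The command substitution is left unquoted on purpose so that the shell splits the output into one argument per hostname.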

2.3. Checksumming a file on multiple nodes

The following command produces the SHA-1 hash of the remote file /tmp/foo.txt on tgl0, tgl1, tgl2, and tgl3. Output is shown below.

   # pcp -c /tmp/foo.txt tgl0 tgl1 tgl2 tgl3

   Checksum succeeded on tgl0.cacr.caltech.edu (131.215.145.40)
      SHA-1 = 1ccf4925fc3b8767986303a3b16c6c8dfaf7ee13
   Checksum succeeded on tgl1.cacr.caltech.edu (131.215.145.41)
      SHA-1 = 1ccf4925fc3b8767986303a3b16c6c8dfaf7ee13
   Checksum succeeded on tgl2.cacr.caltech.edu (131.215.145.42)
      SHA-1 = 1ccf4925fc3b8767986303a3b16c6c8dfaf7ee13
   Checksum succeeded on tgl3.cacr.caltech.edu (131.215.145.43)
      SHA-1 = 1ccf4925fc3b8767986303a3b16c6c8dfaf7ee13
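
Since every replica should hash to the same value, the output above can be checked mechanically. A minimal sketch, assuming the output format shown above (lines of the form "SHA-1 = <digest>"); the function name all_checksums_match is illustrative.

```shell
# Read `pcp -c` output on stdin and succeed only if at least one SHA-1
# line is present and all reported digests are identical.
# Assumes the "SHA-1 = <digest>" line format shown above.
all_checksums_match() {
    awk '$1 == "SHA-1" && $2 == "=" { seen[$3] = 1; n++ }
         END {
             distinct = 0
             for (d in seen) distinct++
             exit ((n > 0 && distinct == 1) ? 0 : 1)
         }'
}

# Example:
# pcp -c /tmp/foo.txt tgl0 tgl1 tgl2 tgl3 | all_checksums_match && echo "replicas agree"
```

To confirm that the replicas also match the original, compare the reported digest with the output of sha1sum foo.txt on the source machine.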

2.4. Deleting a file on multiple nodes

The following command deletes the remote file /tmp/foo.txt on tgl0, tgl1, tgl2, and tgl3. Output is shown below.

   # pcp -d /tmp/foo.txt tgl0 tgl1 tgl2 tgl3

   Delete succeeded on tgl0.cacr.caltech.edu (131.215.145.40)
   Delete succeeded on tgl1.cacr.caltech.edu (131.215.145.41)
   Delete succeeded on tgl2.cacr.caltech.edu (131.215.145.42)
   Delete succeeded on tgl3.cacr.caltech.edu (131.215.145.43)

2.5. Optimizing the Distribution Tree

pcp uses an n-ary tree and parallelized, pipelined data transfers for file distribution. However, given varying network, disk, and CPU speeds, the default parameters used to build this tree may not be optimal for all systems. To avoid locking users into suboptimal trees, pcp lets users set their own tree parameters explicitly in the configuration file $HOME/.pcprc.

   # cat ~/.pcprc
   fanout          4
   frag_size   32768

The example above specifies a 4-ary tree and fragments data transfers into 32 KB (32768-byte) chunks. The fragment size matters because it is a trade-off: smaller fragments mean more fragments and therefore more per-fragment processing cost, while larger fragments mean a longer store-and-forward delay per fragment.
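
To get a feel for these two parameters, the following sketch estimates how many fragments a file is split into and how many levels a tree needs for a given node count. It assumes a complete n-ary tree in which every node forwards to at most fanout children. How pcp actually lays out its tree is not described here, so treat the level count as an approximation. The function names are illustrative.

```shell
# Number of fragments needed for a file of $1 bytes with a fragment
# size of $2 bytes (ceiling division).
fragment_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Smallest number of levels of a complete tree with fanout $2 that can
# hold $1 nodes, counting the root alone as level 1.
# Assumes a complete n-ary tree; this is a model, not pcp's own code.
tree_levels() {
    nodes=$1
    fanout=$2
    [ "$fanout" -ge 1 ] || return 1   # a fanout below 1 cannot grow a tree
    levels=0
    capacity=0
    width=1
    while [ "$capacity" -lt "$nodes" ]; do
        capacity=$(( capacity + width ))
        width=$(( width * fanout ))
        levels=$(( levels + 1 ))
    done
    echo "$levels"
}

# Example: a 1 MiB file in 32768-byte fragments, 64 nodes with fanout 4.
# fragment_count 1048576 32768   # prints 32
# tree_levels 64 4               # prints 4
```

Under this model, a larger fanout shortens the tree but makes each node forward to more children, so the best values are worth measuring on the actual cluster.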