
Configuring a cluster of workers with pool disk on each worker

In this section we describe the configuration of a more advanced and realistic cluster composed of a number of worker nodes, each with a non-negligible amount of disk space to store data files locally. In this case we use xrootd not only to serve PROOF but also to manage the disk pool formed by the local disks. For this reason we need to run an additional daemon, called cmsd, which interconnects the disk pools and makes the whole system appear as a single disk.

In principle, since the configuration directives for the various component protocols (cmsd, xrootd, xproofd) differ, we would need separate configuration files. However, using conditional directives we can fit everything into one file, as sketched right below. The file example2.cf is dissected in the next sections.
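
As a reminder of the mechanism, the configuration language supports conditionals on the host name and on the executable reading the file; the fragment below is only a sketch, using the node names of this example and the two forms that appear later in example2.cf:

# Host conditional: applied only where the pattern matches the local host name
if node00
  # directives for the master/redirector only
else
  # directives for all the other nodes
fi
# Executable conditional: read only by the named daemon (here xrootd, ignored by cmsd)
if exec xrootd
  # directives for xrootd only
fi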

Global view of the system

Figure 1 shows a schematic view of the cluster and of the relevant components. The entry points for data serving and PROOF serving may in general be different; in our example we keep them on the same machine, which will be called the master in PROOF terminology, or the redirector in data-serving terminology. Separating master and redirector may be necessary when the PROOF load on the master node is large, which is typically the case when the number of concurrent PROOF users is large. In such a case it may even be necessary to reserve several machines for the role of master. It is planned to make this more transparent with automatic load balancing between the masters.

Figure 1. Schematic view of a cluster with local storage

In the following we will refer to node00 as the master/redirector node, and to node01 ... node0n as the worker/data-server nodes. In this example the disk space on each machine will be used to store data, to host the sandboxes for PROOF users and to hold the daemons' administrative files; for that purpose we need to structure it by creating three subdirectories (see the sketch after this list):

  • /pool/proofbox, for the sandboxes;
  • /pool/data, for the data files;
  • /pool/admin, for the daemons' admin paths.

Finally, we assume that ROOT is installed under /opt/root.
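
The layout above must exist on every node before the daemons are started. A minimal sketch of the commands, assuming the paths of this example and a hypothetical unprivileged account 'xrootd' for the daemons, could be:

# To be run on node00 and on every worker node0x
mkdir -p /pool/proofbox /pool/data /pool/admin
# The daemons usually run as an unprivileged account; adjust ownership to your site policy
chown -R xrootd:xrootd /pool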

Configuring the data-serving part

We dissect in this section the directives configuring the data-serving part.

  • Define the port on which the xrootd service on the redirector will be listening. Here we use the default (1094). This is where to change the port if it is already in use, or to configure a different setup, for example for test purposes.
#
# XRD port
xrd.port 1094
  • Define the admin paths for the daemons; the default is under /tmp, but this may lead to problems for long-running daemons, as the automatic cleanup of temporary space may delete important files.
#
# Admin paths
all.adminpath /pool/admin
  • Open File System section; here we tell the system to use the built-in version of the file-system abstraction, and we tell the redirector that it should forward any file request to the leaves.
xrootd.fslib /opt/root/lib/libXrdOfs.so
if node00
  ofs.forward all
fi
  • Define the paths exported to clients; by default only /tmp is exported, so any additional path should be declared here.
#
# Export /pool/data
all.export /pool/data
  • Clustering section: here we tell the system that the manager is node00, listening on port 3121, and that it accepts requests only from hosts whose name matches 'node*'.
#
# Clustering section
if node00
  all.role manager
else
  all.role server
fi
# Manager location (ignored by managers)
all.manager node00 3121
cms.allow host node*
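
With these directives in place, both daemons are run on every node with the same configuration file. As a sketch, assuming the file has been installed as /opt/xrootd/etc/example2.cf and that logs go under /var/log/xrootd (both paths are just examples):

# Start the data server and the cluster manager on node00 and on every worker
xrootd -b -l /var/log/xrootd/xrootd.log -c /opt/xrootd/etc/example2.cf
cmsd -b -l /var/log/xrootd/cmsd.log -c /opt/xrootd/etc/example2.cf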

Configuring PROOF

We dissect in this section the directives configuring the PROOF part.

  • Load the XrdProofd protocol, which processes requests incoming on the default port 1093 (the specification of the default port is not mandatory; in example2.cf we have indeed omitted it); we use the conditional on the executable name to prevent cmsd from loading the protocol.
### Load the XrdProofd protocol:
### using absolute paths (here /opt/root stands for the path to the ROOT distribution)
if exec xrootd
xrd.protocol xproofd:1093 /opt/root/lib/libXrdProofd.so
fi
  • Define the ROOT distribution to use; this is needed, for example, to locate the proofserv executable.
### ROOTSYS
xpd.rootsys /opt/root
  • User sandboxes under /pool/proofbox
###
### Working directory for sessions [/proof]
xpd.workdir /pool/proofbox
  • This defines the resource finder. For the time being the only possibility is a static file with the list of nodes to use, located in this case at /opt/root/etc/proof.conf (a sketch of such a file is given right after this block). Alternative finders are foreseen for the future.
###
### Resource finder
### NB: 'if <pattern>' not supported for this directive.
# xpd.resource static [<cfg_file>] [ucfg:<user_cfg>] [wmx:<max_workers>] [selopt:<selection_mode>]
#   "static", i.e. using a config file
#   <cfg_file>        path to an alternative config file
#                     [$ROOTSYS/proof/etc/proof.conf]
#   <user_cfg>        if "yes": enable user private config files at
#                     $HOME/.proof.conf or $HOME/<usr_cfg>, where
#                     <usr_cfg> is the second argument to
#                     TProof::Open("<master>","<usr_cfg>") ["no"]
#   <max_workers>     maximum number of workers to be assigned to a user
#                     session [-1, i.e. all]
#   <selection_mode>  if <max_workers> != -1, specify the way workers
#                     are chosen:
#                     "roundrobin"  round-robin selection in bunches
#                                   of n(=<max_workers>) workers.
#                                   Example:
#                                   N = 10 (available workers), n = 4:
#                                   1st (session): 1-4, 2nd: 5-8,
#                                   3rd: 9,10,1,2, 4th: 3-6, ...
#                     "random"      random choice (a worker is not
#                                   assigned twice)
xpd.resource static /opt/root/etc/proof.conf
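
For reference, a minimal sketch of such a static file, with the node names used in this example, could be (one 'worker' line per worker node):

# /opt/root/etc/proof.conf : static PROOF resource file for this example
master node00
worker node01
worker node02
# ... up to node0n
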
  • Define the role of the different components: node00 is the master, all the others are workers. Order matters: the last 'xpd.role' supersedes any previous setting.
###
### Server role (master, submaster, worker) [default: any]
### Allows control of the cluster structure.
### The following example will set node00 as master, and all
### the other node* hosts as workers
xpd.role worker 
if node00
  xpd.role master 
fi
  • Control which master(s) are allowed to connect to the worker nodes.
###
### Master(s) allowed to connect. Directive active only for Worker or
### Submaster session requests. Multiple 'allow' directives can
### be specified. By default all connections are allowed.
xpd.allow node00
  • Entry point for the data storage: this is communicated back to the clients and used as the default URL for storage.
###
### URL and namespace for the local storage if different from defaults.
### By default it is assumed that the pool space on the cluster is
### accessed via a redirector running at the top master under the common
### namespace /proofpool.
### Any relevant protocol specification should be included here.
xpd.poolurl root://node00
xpd.namespace /pool/proofpool
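
With these settings, clients are told that the pool space is reachable through the redirector on node00 under the namespace /pool/proofpool; assuming the standard xrootd URL composition, a file stored in the pool would therefore be referred to as root://node00//pool/proofpool/<file>.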