Starting LAM is a three-step procedure. In the first step, hboot(1) is invoked on each of the specified machines. In the second step, each machine allocates a dynamic port and communicates it back to lamboot, which collects the ports. In the third step, lamboot gives each machine the list of machines and ports so that they can form a fully connected topology. If any machine was not able to start, or if a timeout period expires before the first step completes, lamboot invokes wipe(1) to terminate LAM and reports the error.
The <bhost> file is a LAM boot schema written in the host file syntax. See bhost(5). Instead of being given on the command line, a boot schema can be named in the LAMBHOST environment variable. If neither is given, a default file, bhost.def, is used. LAM searches for <bhost> first in the local directory and then in the installation directory under etc/.
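The lookup order described above can be sketched as a small shell function. This is only an illustration of the stated precedence, not LAM's actual code: resolve_bhost is a hypothetical helper, LAM_PREFIX is a hypothetical variable standing in for the installation directory, and the sketch assumes a bare file name rather than a path.

```shell
# Sketch of the boot schema lookup; not taken from the LAM sources.
# LAM_PREFIX is a hypothetical stand-in for the installation directory.
resolve_bhost() {
    # Precedence: command-line argument, then LAMBHOST, then bhost.def.
    name=${1:-${LAMBHOST:-bhost.def}}

    # Search the local directory first, then etc/ under the installation
    # directory. Assumes "name" is a bare file name, not a path.
    for dir in . "${LAM_PREFIX:-/usr/local}/etc"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done

    # Nothing found in either location.
    return 1
}
```

Calling resolve_bhost with no arguments prints the file that this ordering would select, or returns a non-zero status if no candidate exists.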
In addition, lamboot uses a process schema for the individual LAM nodes. A process schema (see conf(5)) is a description of the processes which constitute the operating system on a node. In general, the system administrator maintains this file; LAM/MPI users will generally not need to change it. It is also possible for the user to customize the LAM software with a private process schema.
The remote shell program that is used to invoke commands on remote hosts is set when LAM is configured. It is typically rsh, but can be set to any program by the person who configured and compiled LAM. This program can be overridden at lamboot invocation time by setting the LAMRSH environment variable to a suitable remote shell command. For example, setting LAMRSH to "ssh -x" will force LAM to use the "ssh" client to invoke programs on remote nodes, and ensure that "ssh" uses the -x command line flag (to suppress the ssh 1.x client series standard information banner that is normally output to the standard error, which would cause lamboot to fail).
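A minimal sketch of such an override, assuming a Bourne-compatible shell and a boot schema file named hostfile (a placeholder name). The guard around lamboot is an addition for illustration, so that the snippet does nothing on a machine where LAM is not installed.

```shell
# Override the remote shell program for this session (Bourne-style syntax).
LAMRSH="ssh -x"
export LAMRSH

# Boot LAM with a boot schema named "hostfile" (placeholder name).
# The guard only keeps this sketch harmless where lamboot is absent.
if command -v lamboot >/dev/null 2>&1; then
    lamboot hostfile
fi
```

Users of csh or tcsh would set the variable with setenv LAMRSH "ssh -x" instead.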
Normally, lamboot uses two remote shell invocations to each node. The first remote shell invocation is used to determine the user's shell on the remote node. The second remote shell invocation is used to launch the desired LAM binary on the remote node. If the -b switch is used, lamboot will assume that the user's shell on all remote nodes is the same as it is on the local node, and therefore only one remote shell invocation is used, which is noticeably faster.
In either case, on remote nodes, if the user's shell is not csh, tcsh, or bash, .profile is invoked by LAM before invoking any LAM binary. This allows the user to set up paths and any necessary environment before LAM binaries are invoked (csh and tcsh users can put such setup in their $HOME/.cshrc or $HOME/.tcshrc files; bash users can put this setup in their $HOME/.bashrc file).
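As an illustration of the kind of setup meant here, the following lines could go in $HOME/.profile on a remote node. This is a sketch only: LAM_BIN and the directory $HOME/lam/bin are hypothetical, and should be replaced by wherever the LAM binaries are actually installed.

```shell
# Example lines for $HOME/.profile (Bourne-style syntax).
# LAM_BIN is a hypothetical location for the LAM binaries.
LAM_BIN="$HOME/lam/bin"

# Prepend the directory only if it is not already on PATH, so that
# sourcing the file repeatedly does not keep growing PATH.
case ":$PATH:" in
    *":$LAM_BIN:"*) ;;
    *) PATH="$LAM_BIN:$PATH" ;;
esac
export PATH
```

The same idea applies to $HOME/.bashrc for bash users; csh and tcsh users would use their own syntax in $HOME/.cshrc or $HOME/.tcshrc.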
The -s switch should be used when lamboot is itself invoked through a remote shell, for example:
    rsh somenode lamboot -s hostfile
This is because rsh waits for two conditions before exiting: lamboot to exit, and stdout / stderr to be closed. Without -s, stdout / stderr would not be closed, and rsh would hang even though lamboot had completed. -s causes the stdout / stderr of the local LAM daemon to be closed upon invocation, which will allow rsh to complete. Using -s will not affect lamboot in any other way, but it will prevent the tstdio(3) package from working properly.