Patrick M. Ryan
In a distributed computing environment, it is often advisable to keep
publicly needed disk resources -- such as manual pages and
software -- on a central server and allow other machines to
mount those disks. This has the advantage of freeing up valuable disk
space and eliminating redundant data. You can effect this sharing
of disk space via Sun's Network File System (NFS).
Certain types of disk resources can be thought of as static; for example,
manual pages and third-party software change only at infrequent
intervals. By contrast, some public domain packages, such as
Gnu programs, are updated regularly, and other types of software change
even more frequently, some on a weekly or even daily basis. In our
work environment, we use the Interactive Data Language (IDL) for most
of our data analysis. We maintain a suite of customized routines
for data visualization, specialized file I/O, and statistical analysis,
and we store these routines in a globally accessible directory. Some
of the routines are in a constant state of evolution.
In a small cluster of machines, just one machine suffices to act as
the server. In a large distributed environment, however, a single
machine can become loaded down and can cause a performance bottleneck for
the cluster. This is especially true in an environment where the logical
cluster covers several different subnets. In such cases, having two
or more machines act as disk servers helps to distribute the load more
evenly. With two or more servers, though, the problem becomes keeping
the disks in sync. If the software on the server is static (for example,
manual pages, which change rarely), then there is no problem. But
if local software resides on the disks and if that software changes
often, a strategy for keeping the disks in sync is required: this
is where disk mirroring comes in.
Disk mirroring is used to keep two or more sets of disk resources
in sync. A "disk resource" generally refers to a directory
tree residing on a single physical file system. With such a mirroring
system in place, the system manager need not worry about inconsistent
information from different machines. I have written a Perl script,
called mirror, to implement disk mirroring. The script is
designed to be run at regular intervals in a cron job. [Editor's
note: the mirror script is too large for publication here
but is available electronically. See the "Source Code" notice
on the cover to find information on electronic distribution.]
The mirror script exploits a special type of entry which can
be put in an automount map: an entry for the /net directory that
uses the -hosts keyword.
The -hosts keyword indicates to the automounter that references
to directories in the /net directory refer to machine names
and then to directories exported by those machines.
For example, assume
that a machine called jupiter is exporting directories /usr
and /export. Host europa is running automount and
includes an entry in its automount map as described above. On europa,
one can refer to directory /net/jupiter/export. When such a
reference is made, europa mounts everything that it can from
jupiter, in this case /export and /usr. It
is this capability to automount from a particular host which is exploited
in the mirror script.
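As a small illustration of the naming scheme just described, the sketch below builds the path under which an exported directory becomes visible through /net. It is written in Python rather than the Perl of the original script, and the function name master_path and its net_root parameter are hypothetical.

```python
import posixpath


def master_path(host, exported_dir, net_root="/net"):
    """Return the path at which the automounter exposes exported_dir on host.

    Assumes the automount map contains a -hosts entry for net_root.
    """
    return posixpath.join(net_root, host, exported_dir.lstrip("/"))


print(master_path("jupiter", "/export"))  # /net/jupiter/export
```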
The model of the mirror scheme is master-slave. One host is assumed
to have the definitive copy of the software or data. Each mirror host
then runs the mirror script to update its own copy of what
is on the master.
To be useful, any mirroring scheme must perform several
tasks, including:
Keep a log file of changes made to the slave.
Warn the system administrator about any inconsistencies that could
not be resolved or any unexpected system errors.
An earlier effort at disk mirroring took the form of a chain of tar
commands. Essentially, a giant tar pipeline, consisting of
everything on the server, would copy everything over to the slave.
While this accomplished the task of copying new and changed files
over, it did not clean up outdated (deleted) files. This method also
had the drawback of moving several hundred megabytes across the network.
The logic of the mirror script is to create two collections
of data. The first collection is a text file representing the contents
of the master directory, one file per line. The second is a database
of the contents of the slave server. This database is a DBM file indexed
on the full pathnames of the files (an approach inspired by Perl's
method of attaching an associative array to a DBM file). The script
generates the two files by using the output from the find
command (see Listing 1). Running find twice and storing the output in
temporary files can be a significant drain on resources, so it's best
to run the script in the middle of the night when temporary disk space
and CPU cycles are both somewhat more abundant.
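The two collections can be sketched roughly as follows. This is an illustration in Python, not the author's Perl: os.walk stands in for the find command, Python's dbm module stands in for the DBM-backed associative array, and the function names are hypothetical. The sketch also keys on paths relative to each root (an assumption made here so that master and slave entries compare directly), whereas the article indexes on full pathnames.

```python
import dbm
import os


def list_tree(root):
    """Yield every entry under root, relative to root, parents before children."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in dirnames + sorted(filenames):
            yield os.path.relpath(os.path.join(dirpath, name), root)


def write_master_list(master_root, list_path):
    """Write the master's contents to a text file, one entry per line."""
    with open(list_path, "w") as out:
        for rel in list_tree(master_root):
            out.write(rel + "\n")


def build_slave_db(slave_root, db_path):
    """Record the slave's contents in a new DBM file keyed on pathname."""
    with dbm.open(db_path, "n") as db:
        for rel in list_tree(slave_root):
            db[rel] = "1"  # only the key matters; the value is a placeholder
```

Entries whose names contain a newline would corrupt the one-per-line text file; the sketch does not guard against that.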
Once the two temporary files are created, the script traverses the
list of files from the master. As each file is examined, it is removed
from the associative array containing the slave's files. The script
performs different types of tests on three different types of files:
symbolic links, directories, and regular files. (The assumption is
that no one will be trying to mirror non-text files like device files.)
If the file is a symbolic link, one of several conditions is
true on the slave: (1) the symbolic link does not yet exist on the
slave; (2) the symbolic link does exist but does not point to the
right place; (3) the file exists but is not a symbolic link; or (4)
the symbolic link exists and points to the same place as on the master.
In case 1, a new symbolic link is created on the slave. In case 2,
the old symbolic link is deleted and a correct one created in its
place. Case 3 is a potential error condition; in this case, a message
is sent to the administrator and no action is taken. Case 4 obviously
requires no action. (See Listing 2 for the implementation.)
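The four cases can be sketched as a single function. This is a Python illustration rather than the Perl of Listing 2, which is not reproduced here; the function name sync_symlink, its return strings, and the dry_run flag are hypothetical, and reporting a conflict to the administrator is left to the caller.

```python
import os


def sync_symlink(master_path, slave_path, dry_run=False):
    """Make slave_path match the symbolic link at master_path.

    Returns "created", "replaced", "conflict", or "unchanged".
    """
    target = os.readlink(master_path)

    # Case 1: nothing exists on the slave yet.
    if not os.path.lexists(slave_path):
        if not dry_run:
            os.symlink(target, slave_path)
        return "created"

    # Case 3: something exists but it is not a symbolic link; take no action.
    if not os.path.islink(slave_path):
        return "conflict"

    # Case 4: the link already points to the right place.
    if os.readlink(slave_path) == target:
        return "unchanged"

    # Case 2: the link exists but points somewhere else.
    if not dry_run:
        os.remove(slave_path)
        os.symlink(target, slave_path)
    return "replaced"
```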
The script next checks to see if this is a directory. The find
program lists each directory before the entries beneath it (a pre-order
traversal), which is important when
a new directory subtree with several levels is created: higher-level
directories must be created before lower ones. If a directory does
not exist on the slave, it is created. In any other case, such as
if a file of the same name exists but is not a directory,
the administrator is warned.
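The directory check might look something like the sketch below, written in Python rather than the author's Perl. The name sync_directory and its return strings are hypothetical. It deliberately uses os.mkdir, which fails if the parent is missing, to show why the parents-first ordering of the master list matters.

```python
import os


def sync_directory(slave_path, dry_run=False):
    """Ensure a directory exists at slave_path.

    Returns "unchanged", "created", or "conflict". Assumes the parent
    directory has already been handled, as it is when the master list
    names directories before their contents.
    """
    if os.path.isdir(slave_path) and not os.path.islink(slave_path):
        return "unchanged"
    if os.path.lexists(slave_path):
        # Something of the same name exists but is not a directory.
        return "conflict"
    if not dry_run:
        os.mkdir(slave_path)  # raises FileNotFoundError if the parent is missing
    return "created"
```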
The third check (see Listing 3) is for regular files. The script makes
several comparisons here to determine what action is needed. A file
is considered "unchanged" if attributes such as its size, user,
and modification time match those of the version on the master. If
any of these differ, then a new version of the file is copied over
and the file's various attributes are set using Perl's built-in
chown, chmod, and utime functions. If the file does
not exist on the slave (it is new on the master), then it is copied
over and the appropriate attributes set.
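A rough Python equivalent of the regular-file check follows; it is not the Perl of Listing 3. The names is_unchanged and sync_file are hypothetical, the comparison covers only size, owner, and modification time, and the "conflict" result for a non-regular file on the slave is an assumption, since the article does not say how that case is handled.

```python
import os
import shutil
import stat


def is_unchanged(master_path, slave_path):
    """Compare size, owner, and modification time (to the second)."""
    m, s = os.stat(master_path), os.stat(slave_path)
    return (m.st_size, m.st_uid, int(m.st_mtime)) == (
        s.st_size, s.st_uid, int(s.st_mtime))


def sync_file(master_path, slave_path, dry_run=False):
    """Copy master_path to slave_path if it is new or has changed.

    Returns "copied", "updated", "unchanged", or "conflict".
    """
    if os.path.lexists(slave_path):
        if os.path.islink(slave_path) or not os.path.isfile(slave_path):
            return "conflict"  # assumption: leave it alone and report it
        if is_unchanged(master_path, slave_path):
            return "unchanged"
        action = "updated"
    else:
        action = "copied"

    if not dry_run:
        m = os.stat(master_path)
        shutil.copyfile(master_path, slave_path)
        try:
            os.chown(slave_path, m.st_uid, m.st_gid)
        except PermissionError:
            pass  # changing ownership generally requires root
        os.chmod(slave_path, stat.S_IMODE(m.st_mode))
        os.utime(slave_path, ns=(m.st_atime_ns, m.st_mtime_ns))
    return action
```

Setting the modification time last matters: the copy itself updates it, and the next run's comparison depends on it matching the master.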
As each file name in the master list is considered, it is removed
from the DBM file. Once the traversal is complete, whatever entries
remain in the DBM file must refer to files and directories that have
been deleted from the master. These files and directories are then
deleted from the slave. Note that the overhead of creating a DBM
file of the slave's contents is purely for the purpose of removing
old files. If not for that consideration, the code could simply check
for the presence of each master file and proceed accordingly.
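The cleanup step can be sketched as below, in Python rather than the original Perl, with a plain set standing in for the DBM file. The function names are hypothetical, and removing the deepest entries first (so that directories are empty by the time they are removed) is a choice made for this sketch, not something the article specifies.

```python
import os


def stale_entries(master_list, slave_entries):
    """Return slave entries absent from the master, deepest paths first."""
    remaining = set(slave_entries)
    for rel in master_list:
        remaining.discard(rel)  # seen on the master, so it stays
    return sorted(remaining, key=lambda p: (-p.count(os.sep), p))


def remove_stale(slave_root, stale, dry_run=False):
    """Delete the given entries (relative to slave_root) from the slave."""
    for rel in stale:
        path = os.path.join(slave_root, rel)
        if dry_run:
            continue
        if os.path.isdir(path) and not os.path.islink(path):
            os.rmdir(path)  # its contents were removed earlier in the loop
        elif os.path.lexists(path):
            os.remove(path)
```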
The script accepts several command-line options. They are:
-f file -- Log changes to named file. By
default, do not save changes in a log file.
-h host -- Use named host as the master.
-l dir -- Local mirror directory (a default applies if omitted).
-m user -- Send mail to user about any
changes made on the slave server (a default recipient applies if omitted).
-n -- Not Really mode. Report what is happening
but don't actually make any changes to the file system.
-t dir -- Use dir as the temporary
directory. Defaults to /tmp.
-v -- Be verbose. Tell the user exactly
what is happening. This option is used primarily for debugging.
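For readers who want to experiment with a similar interface, the options above could be declared along these lines. This is a Python argparse sketch, not the original Perl option handling; the destination names are hypothetical, and the defaults for -h, -l, and -m are left as None because their values are not given here.

```python
import argparse


def build_parser():
    """Build a parser mirroring the option letters listed above."""
    # add_help=False frees -h, which argparse otherwise reserves for help.
    parser = argparse.ArgumentParser(prog="mirror", add_help=False)
    parser.add_argument("-f", dest="logfile", metavar="file", default=None,
                        help="log changes to the named file")
    parser.add_argument("-h", dest="host", metavar="host", default=None,
                        help="use the named host as the master")
    parser.add_argument("-l", dest="local_dir", metavar="dir", default=None,
                        help="local mirror directory")
    parser.add_argument("-m", dest="mail_user", metavar="user", default=None,
                        help="send mail to user about changes")
    parser.add_argument("-n", dest="dry_run", action="store_true",
                        help="report actions without changing anything")
    parser.add_argument("-t", dest="tmp_dir", metavar="dir", default="/tmp",
                        help="temporary directory")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="be verbose")
    return parser
```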
This system has been in operation on our cluster now for several months.
All indications are that the file systems have remained in sync.
Although the script is working properly, it does not handle some
special cases, so some enhancements will be necessary in the future.
The most serious problem now is the case of files with multiple hard
links. Without keeping track of inode numbers, mirror has
no way of realizing that two or more file names may refer
to a single physical file. As a result, multiple copies of the same
physical file may be copied to the slave server. This problem could
be solved with a little more overhead in the directory traversal, provided
multiple hard links to a single file all reside in the same directory tree.
The mirror script could do a little bit of looking around
in case it is examining a file with a link count greater than 1. If
any hard links cannot be accounted for, the script can send a warning
to the system administrator.
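One way to track inode numbers during the traversal is sketched below in Python. This is not part of the mirror script, which by the author's account does not yet handle hard links; the function name and return shape are hypothetical. Files are grouped by device and inode number, and any group whose link count exceeds the number of names found in the tree is flagged as unaccounted for.

```python
import os
import stat
from collections import defaultdict


def group_hard_links(root):
    """Group regular files under root that share an inode.

    Returns (groups, unaccounted): groups maps (device, inode) to the
    list of paths found; unaccounted maps (device, inode) to the number
    of links that exist somewhere outside root.
    """
    groups = defaultdict(list)
    link_counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                key = (st.st_dev, st.st_ino)
                groups[key].append(path)
                link_counts[key] = st.st_nlink
    unaccounted = {key: link_counts[key] - len(paths)
                   for key, paths in groups.items()
                   if len(paths) < link_counts[key]}
    return dict(groups), unaccounted
```

With such a grouping, the first name in each group could be copied and the rest recreated on the slave as links to it, while a non-empty unaccounted result would trigger the warning described above.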
A second problem is that if large file systems are mirrored, the
temporary files can become very large. A possible solution is to peruse
the file systems on a per-directory basis and cache only one directory
at a time in temporary space.
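The per-directory idea might be sketched as follows. This is a Python illustration of the proposed enhancement, not something the mirror script does; the function names are hypothetical. Only one directory's listing from each side is held in memory at a time, and the walk descends only into directories present on both sides, on the assumption that a directory missing from the slave would simply be copied whole.

```python
import os


def compare_directory(master_dir, slave_dir):
    """Compare one directory level; return (missing, stale, common) names."""
    master_names = set(os.listdir(master_dir))
    slave_names = set(os.listdir(slave_dir)) if os.path.isdir(slave_dir) else set()
    return (sorted(master_names - slave_names),
            sorted(slave_names - master_names),
            sorted(master_names & slave_names))


def walk_in_step(master_root, slave_root):
    """Yield (relative_dir, missing, stale) one directory at a time."""
    pending = [""]
    while pending:
        rel = pending.pop()
        master_dir = os.path.join(master_root, rel)
        slave_dir = os.path.join(slave_root, rel)
        missing, stale, common = compare_directory(master_dir, slave_dir)
        yield rel, missing, stale
        for name in common:
            child = os.path.join(master_dir, name)
            if os.path.isdir(child) and not os.path.islink(child):
                pending.append(os.path.join(rel, name))
```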
Given time, the whole mirroring problem might be solved with some
kind of clever client-server protocol. A file transfer mechanism such
as FTP or RCP could be used to transfer files across the network.
To conclude, writing the mirror script was a useful exercise
in learning about the pitfalls of file management in a distributed
environment. It was also a chance for me to try my hand at writing
a large Perl script.
This script may be obtained via anonymous ftp to jaameri.gsfc.nasa.gov
in the directory /pub/sysadmin/mirror.sh. I welcome
enhancement suggestions and any questions about the script.
Wall, Larry and Randal Schwartz. Programming
Perl. Sebastopol, CA: O'Reilly & Associates, 1990.
System and Network Administration. Sun Microsystems
About the Author
Pat Ryan has been programming on UNIX systems of various kinds since
1986. He earned BS and MS degrees at St. Joseph's University,
Philadelphia. He is currently employed by Hughes STX Corporation and is
working as a programmer and system manager at NASA's Goddard Space
Flight Center.
Flight Center. He can be reached over the Internet at