How to Develop a Distributed Filesystem Model
Michael S. Hill
Sun's Network File System (NFS) was a major step in
the evolution of the
client-server capabilities of UNIX. NFS allows one host
to mount another
system's disk via the network, in essence grafting it
into the
filesystem just like a local physical disk. The remote
disk is accessed
as though it were local, and the process is transparent
to the user and
the system administrator. In fact, the same mount command
is used, with
a slight addition to the syntax to denote a remote disk
mount.
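For example, on a Solaris 2.x system, the only differences between a
local mount and an NFS mount are the filesystem type and the source
argument (the device and host names here are illustrative):

# a local disk
mount -F ufs /dev/dsk/c0t3d0s5 /mnt
# a directory mounted from an NFS server instead
mount -F nfs server:/export/home /mnt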
There are many benefits to using NFS: automounting,
diskless and
dataless clients, centralized backups, reduced disk
requirements,
uniform login environments, network client installation
(see sidebar),
centralized file control, and uniform filesystem architectures.
As with
most things, there are right and wrong ways to use NFS.
You could
arbitrarily mount directories all over the domain, creating
a convoluted
web of dependencies and an administrative nightmare.
Or, with some
forethought and careful planning, you can develop a
coherent and
consistent strategy that will facilitate the administration
of your
domain. This article shows how to use centralized file
control and
uniform filesystem architectures to simplify installation
and
maintenance of clients through NFS.
I have made several programs and files available across
my domain (gcc,
bash, ncftp, perl, sudo, info, olvwm, related
configuration/documentation files, and so forth). Some
of the files are
binaries, which must be grouped according to operating
system. Others
are scripts and data files that are independent of operating system
(although scripts must, of course, be written to handle OS-dependent
command differences).
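For instance, a shared script might branch on the output of uname (a
hypothetical fragment; ps is a classic case, since it takes BSD-style
options on SunOS 4 and SysV-style options on Solaris):

# choose the right ps invocation for this OS
case `uname -r` in
    4.*) PSCMD="ps aux" ;;
    5.*) PSCMD="ps -ef" ;;
esac
$PSCMD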
I used the following criteria in the design of this
file management
system. The system needed to achieve the goals of centralized
file
control and uniform filesystem architectures on the
server and all
clients, regardless of platform or operating system.
(Centralized file
control means that all of the files reside on one physical
disk attached
to a single server and are shared from that central
point. Uniform
filesystem architecture means that the path to a given
shared file is
the same on any client of any supported platform.) Also,
the system had
to provide for installing, instantly updating, and sharing
binaries and
common files without resorting to different directories on each system
for those two classes of files. These may sound like similar requirements,
but in
fact they impose quite different constraints on the
design.
I designed and implemented a distributed file management
system to meet
these requirements, and wrote three programs in the
process of building
it. These programs are specific to my requirements,
but are included
here as easy-to-customize examples of standardized configuration
management.
The Design of the Distributed Filesystem
To satisfy the requirement of centralized file control,
the target files
had to reside on a single system, the NFS server. To
satisfy the
requirement for uniform filesystem architecture (meaning that any given
file is accessed by the same pathname on any system), there had to be a
single mount point on all clients. I used /usr/local/share for the mount
for the mount
point. (On the NFS server itself, /usr/local/share is
a symbolic link to
/export/share/SunOS5/share, SunOS5 because the server
is running
Solaris.) And to satisfy the requirement for binaries
and common files
to be hosted in the same place, without copies of the
common files in
every share directory, the share directories and the
common files must
be located on the same partition so that hard links
can be used. The
filesystem I designed to satisfy all of these requirements
is depicted
in Figure 1.
A 1.1GB partition is mounted on the NFS server as /export/share
with
three main subdirectories: SunOS4, SunOS5, and generic. There is one
subdirectory for each supported OS; I based the directory names on the
value of bash's $OSTYPE variable. Each OS directory has two
subdirectories: one for building software and one solely for files to be
shared.
The share
directories should be NFS-shared read-only to all clients.
Note that
only the OS-specific share directories are shared; generic/share
is not.
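On a Solaris server, the corresponding entries in /etc/dfs/dfstab might
look like this (the netgroup names are illustrative; netgroups are
discussed under implementation below):

share -F nfs -o ro=sunos4-clients /export/share/SunOS4/share
share -F nfs -o ro=sunos5-clients /export/share/SunOS5/share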
For building software (e.g., compiling the latest version
of gcc), one
host of each OS type must be able to access the build
directory. For
example, since /export/share/SunOS5/build resides on
the NFS server, I
can build the Solaris versions on that server. For the
rest of the
supported operating systems, however, the respective
build and share
directories must be shared read/write to one system
of each OS type that
will be used for building the software. (On the SunOS host used for
building SunOS versions, the build directory is mounted on
/usr/local/build.) Because the share directory is mounted
read/write on
every build system, the procedure for installing the
software once it's
built is the same: install it in /usr/local/share. So,
when I build the
latest version of gcc, for example, I specify /usr/local/share
for the
installation base directory, instead of the default
of /usr/local. That
way, it gets installed in the right place and with the
right path
dependencies for all clients.
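For a GNU package like gcc, that amounts to something like the following
on the build host (the version number is just an example):

cd /usr/local/build/gcc-2.7.2
./configure --prefix=/usr/local/share
make
make install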
Note that, because this system is dependent on NFS and
hence on the
network and the NFS server, problems with either one
will make the whole
/usr/local/share structure unavailable. For some applications,
such as
gcc and kermit, this is acceptable (since the whole
idea of NFS, after
all, is to not have local copies of everything). For
others, such as
bash, olvwm, sudo, and perl, this is not acceptable,
and local copies of
a few critical binaries should be maintained and updated
manually when
necessary. Additionally, certain data files (such as
configuration
files) apply across the domain, and others are different
for each
machine. Therefore, you should choose carefully which
applications and
files in your environment should be hosted locally on
each system.
The generic directory hosts all of the files that can
be used without
modification on any OS or platform. These include shell
scripts, Perl
programs, GNU info files, and any kind of text or documentation
files.
Because only one copy of each file is needed across
all clients, it
would be wasteful to have copies in each OS-specific
share directory.
Instead, the files reside in generic/share and are hard-linked
into the
other share directories. Hard links are used because they can be
accessed across the NFS mount without making generic/share itself
available; symbolic links would not work, because a client would try to
resolve them to /export/share/generic/share, a path that exists only on
the server.
So, out of all of the directories in /export/share,
only one for each OS
type is visible to the clients (except the build machines).
Compiled
programs are installed into the share directories, and
scripts are
linked into them. Thus, everything becomes accessible
to each client.
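On the server, for example, linking a generic script into the Solaris
share tree takes one ln command; both paths lie on the /export/share
partition, which is what makes a hard link possible:

ln /export/share/generic/share/bin/baseline \
   /export/share/SunOS5/share/bin/baseline

Every Solaris client then sees the script as /usr/local/share/bin/baseline.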
The Implementation of the Distributed Filesystem
The OS-specific share directories (SunOS4/share and
SunOS5/share) are
shared to netgroups containing the clients for the respective
operating
systems. For security reasons, every shared directory
should be confined
to a netgroup that specifically lists all the hosts
to which it can be
shared. Directories should never be shared without restrictions,
or they
can potentially be accessed by any host -- in the worst
case, from any
machine on the Internet. This is a Bad Thing!
I use bin, lib, and other directories under generic/devel to maintain
scripts (usually under SCCS control) and data files, and then use
share-generic (Listing 1) to publish them: it makes a read-only copy of
each file in generic/share and hard-links that copy into SunOS4/share
and SunOS5/share. share-generic should be run from the
/export/share/generic/devel directory and given relative paths, such as
lib/perl/init.pl or bin/baseline, as arguments; it determines the
permissions for each file from its path prefix.
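Listing 1 contains the actual program; a minimal sketch of the idea,
with the permission logic simplified, might look like this:

#!/bin/sh
# share-generic (sketch): run from /export/share/generic/devel,
# with relative paths such as bin/baseline as arguments
TOP=/export/share
for file in "$@"
do
    # derive permissions from the path prefix
    case $file in
    bin/*) mode=555 ;;   # executables: read and execute
    *)     mode=444 ;;   # everything else: read-only
    esac
    # make the read-only master copy in generic/share
    cp $file $TOP/generic/share/$file
    chmod $mode $TOP/generic/share/$file
    # hard-link the copy into each OS-specific share tree
    for os in SunOS4 SunOS5
    do
        rm -f $TOP/$os/share/$file
        ln $TOP/generic/share/$file $TOP/$os/share/$file
    done
done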
I use two programs to standardize installation of new
clients, both to
ensure that they have local copies of critical binaries
and to provide
certain files like the domain-wide syslog.conf file
and extra startup
scripts in /etc/init.d that are used for local applications.
The first
program, install-basics (Listing 2), is run from the
central server (in
this case, the NFS server), and takes advantage of root-level
host
equivalence to copy files from the server to the client.
Root-level host
equivalence means that the three servers in my domain
are in every
system's /.rhosts file, so I can issue commands and
copy files from the
servers to the clients, but not vice versa. I set up
/.rhosts on the
client manually after installing the OS and before running
install-basics. install-basics is run with one or more
client names as
arguments:
install-basics dorothy toto wizard
It copies over root login files and Solaris startup
files -- in short,
any files that aren't shared via NFS -- to each machine
named.
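Listing 2 has the full program; its heart is a loop along these lines
(the file list is purely illustrative):

#!/bin/sh
# install-basics (sketch): run on the central server; relies on
# root-level host equivalence (/.rhosts) on each client
FILES="/.profile /etc/syslog.conf"   # illustrative list
for client in "$@"
do
    for f in $FILES
    do
        rcp $f ${client}:$f
    done
done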
The second program, initialize-client (Listing 3), is
run on the client
with no arguments and merely copies files from /usr/local/share
to the
local filesystems. These include the critical binaries
that should
reside locally on each machine. This program also makes
sure that the
logfiles defined in syslog.conf exist on the client.
I include the version number in the name of each program,
such as
bash-1.14.6 and gcc-2.7.2. I then create a symbolic
link for the generic
name of the program (bash, gcc). The two directory entries
then look
like the following:
lrwxrwxrwx 1 root root 11 May 7 15:36 bash -> bash-1.14.6
-rwxr-xr-x 1 root root 401976 Jan 24 13:48 bash-1.14.6
initialize-client maintains this convention on the clients.
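Listing 3 shows the real program; in outline, with the program names and
versions illustrative, it does something like this:

#!/bin/sh
# initialize-client (sketch): run on the client with no arguments
SRC=/usr/local/share/bin
DST=/usr/local/bin
for prog in bash-1.14.6 sudo-1.5.3   # illustrative versioned names
do
    # local copy of a critical binary
    cp $SRC/$prog $DST/$prog
    # recreate the generic-name link: bash -> bash-1.14.6
    base=`echo $prog | sed 's/-[0-9.]*$//'`
    rm -f $DST/$base
    ln -s $prog $DST/$base
done
# make sure the logfiles named in syslog.conf exist
awk '$2 ~ /^\// && $2 !~ /^\/dev\// {print $2}' /etc/syslog.conf |
while read log
do
    test -f "$log" || touch "$log"
done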
The functions performed by the two programs are logically
distinct
operations; that is, the same thing could not be accomplished
by running
only one program on the client. (It could be accomplished
by running one
program on the server, but that would be wasteful.)
After the OS is
installed and both programs have been run, the client
is standardized
and ready to use.
Incidentally, I keep share-generic and install-basics
in /usr/local/bin
on the NFS server (i.e., I don't share them), because
they are only used
on that system. initialize-client is kept in /usr/local/share/bin
because it needs to be available on the clients, but
doesn't need to
exist locally on each one.
Summary
I've been using this system for more than a year, and
the /export/share
partition is starting to get full. If I were designing it again, I might
use one common build directory for all operating systems. In fact, I
could move the build directories to another partition entirely, because
that portion of my scheme doesn't rely on hard links. Another
factor that will probably affect my implementation is
the planned
purchase of an NFS server from Network Appliance, on
which I will host
the shared files. It may change my current procedures for building and
maintaining software, because I'm not sure whether I will be able to use
hard links on the unit's RAID filesystem as I do on the /export/share
partition. But, the essence of my distributed filesystem
management
system will remain the same.
I hope that this article has shown you how to use NFS
in a way that
will bring order and convenience to the administration
of your domain.
Your implementation, and even design requirements, may
well differ from
mine, but these concepts should help you get the most
out of NFS for
managing distributed filesystems.
About the Author
Michael Hill is a software engineer and system administrator
for
Lockheed Martin Astronautics on the Titan IV program.
He works on Sun
servers in a Solaris environment, with the odd legacy
SunOS system.
Michael has been working with UNIX since his college
days. His
interests include programming, reading, and playing
with computers in
general.