Hidden Dangers of NFS Mounting Foreign Filesystems
Doug Morris
NFS (Network File System) facilitates file access transparency
but does
not assure semantic transparency. If the local and remote
filesystems do
not share the same characteristics and behaviors (i.e.,
the remote UNIX
filesystem is foreign), the resulting difference in
filesystem semantics
(semantic gap) can be a breeding ground for system problems.
These semantic differences (see sidebar "A Note on Terminology") are
generally more subtle than differences due to machine architecture
(e.g., word size/addressing and the representation of character and
numeric data types such as int, float, long, and real). NFS uses a
standard called XDR (External Data Representation) to encapsulate all
data types into standard representations that are encoded/decoded on
the client and the server.
XDR encapsulation can only be performed when the data types are known,
which generally is not the case for most files. Unless directed
otherwise, NFS treats file contents as an uninterpreted stream of bytes
and does not employ additional encoding/decoding. XDR is used primarily
for parameter passing in internal NFS remote procedure calls.
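To make the XDR idea concrete, here is a minimal sketch of how two simple
values are laid out under XDR's rules: integers as 4 big-endian bytes, and
strings as a 4-byte length followed by the bytes, zero-padded to a multiple
of four. The function names are hypothetical and chosen for illustration;
this is not code from any NFS implementation.

```python
import struct


def xdr_encode_int(value: int) -> bytes:
    """Encode a signed 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">i", value)


def xdr_decode_int(buf: bytes) -> int:
    """Decode the first 4 bytes of buf as a signed 32-bit integer."""
    return struct.unpack(">i", buf[:4])[0]


def xdr_encode_string(text: str) -> bytes:
    """Encode a string as length + bytes, zero-padded to a 4-byte boundary."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding
```

Because both sides agree on this layout, a filename or file offset passed
in a remote procedure call arrives intact regardless of either machine's
byte order. File contents get no such treatment, which is the point of the
paragraph above.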
What Causes the Semantic Gap?
NFS is a client/server protocol for transparent file
access. It
translates client-side I/O requests into NFS remote
procedure calls
interpreted on the NFS server. When the client and server
have identical
operating and filesystem environments, the local I/O
requests are
satisfied against the remote filesystem (server) in
a manner transparent
to the local operating environment. Unfortunately, when
differences
exist, a client-side I/O request can be satisfied against
the remote
server in a way completely unexpected by the local filesystem.
When this happens, a semantic gap (a difference in meaning) exists
between the local and remote filesystems.
Common Problems
The most commonly encountered problems relate to differences
in
permissions, support (or lack thereof) for sparse files,
and differences
in compilation or execution environments.
POSIX chown-restricted/unrestricted behavior. If the client-side
operating environment restricts chown to root (POSIX chown restricted)
and the server side does not (POSIX chown unrestricted), then the
client-side restriction is lost and remote files exhibit POSIX
chown-unrestricted behavior.
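A program can ask which behavior applies to a given path instead of
assuming it. The sketch below uses pathconf with the PC_CHOWN_RESTRICTED
name; the helper name is hypothetical. It assumes that the client reports
the server's setting for NFS-mounted paths, which may not hold on every
implementation.

```python
import os


def chown_is_restricted(path):
    """Return True or False for chown restriction on path, None if unknown."""
    name = "PC_CHOWN_RESTRICTED"
    if name not in os.pathconf_names:
        return None  # this platform does not expose the variable
    try:
        value = os.pathconf(path, name)
    except OSError:
        return None  # the query failed for this path
    # POSIX: any value other than -1 means the restriction is in effect.
    return value != -1
```

Running the check against both a local directory and the mount point, and
comparing the answers, is a cheap way to notice this particular gap before
a program trips over it.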
Sparse files. If the client-side operating environment
supports sparse
files (i.e., files that contain holes) and the server
side does not,
then file holes will be zero filled and read performance
could
significantly suffer.
Holes are created by writes preceded by lseeks beyond
the end of file. A
file system that supports holes will extend the file
logically but not
allocate disk space for the hole. A file with a size of 1 GB may
actually occupy only 4 KB of disk space. When a hole is read, binary
zeros are returned without actual disk reads being performed.
Sparse files can be very useful in random access applications
using
hashed keys. (Hashing is a file access technique in
which a key is
transformed into a file offset that can be directly
read.)
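The hole-creating sequence described above (seek past end of file, then
write) can be sketched as follows. The helper names are hypothetical, and
the allocated figure assumes st_blocks is counted in 512-byte units, as is
conventional on UNIX systems.

```python
import os


def make_sparse_file(path, hole_bytes, payload=b"x"):
    """Seek hole_bytes past the start of an empty file, then write payload."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.lseek(fd, hole_bytes, os.SEEK_SET)  # move beyond end of file
        os.write(fd, payload)                  # the skipped range is the hole
    finally:
        os.close(fd)


def logical_and_allocated(path):
    """Return (logical size in bytes, bytes of disk space actually allocated)."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512
```

On a filesystem that supports holes, the allocated figure stays far below
the logical size. On one that does not, the two are roughly equal, which
is the zero-filling cost described above.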
Program misunderstandings. It is common at compile/link
time for a
program to include system configuration parameter values.
Assuming the
program is compiled/linked on the client-side (or a
similarly configured
system), the parameter values will be valid on the client
but may not be
valid on the server. When this misunderstanding occurs,
the program may
act strangely or abort when accessing remote files.
Buffer sizes. Program read/write buffers are commonly
sized to the
MAX-BUFFER-SIZE kernel parameter. If the filesystem
block size on the
remote system (NFS server) is larger than MAX-BUFFER-SIZE
on the client,
then a block read (i.e., a read for BLKSZ bytes, where BLKSZ is the
block size returned by the stat system call) will return more bytes
than the client buffer will hold. The buffer overflow will overwrite
program or data areas, resulting in an error or incorrect results.
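A defensive program never requests more bytes than its buffer holds,
whatever block size stat reports. The following is a minimal sketch of
that clamp. CLIENT_BUFFER_SIZE and the function name are hypothetical
stand-ins for a compiled-in limit, and Python itself would not overflow
the buffer the way a C program can.

```python
import os

CLIENT_BUFFER_SIZE = 8192  # hypothetical limit fixed when the program was built


def bounded_block_read(fd, buffer):
    """Read one block into buffer, clamped to the buffer's real capacity."""
    blksize = os.fstat(fd).st_blksize   # may reflect the server's block size
    want = min(blksize, len(buffer))    # never ask for more than fits
    data = os.read(fd, want)
    buffer[:len(data)] = data
    return len(data)
```

The caller loops until the function returns 0, so a larger remote block
size costs extra reads rather than corrupting memory.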
Path/filename length limits. The server side could return a path or
filename longer than the client-side program expects. A simple non-UNIX
example would be an MS-DOS program receiving a UNIX filename when
accessing a UNIX directory using PC-NFS.
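One way to find trouble before a program does is to scan a mounted
directory for names longer than the client expects. The sketch below
assumes the MS-DOS 8.3 convention (8 characters, a dot, and a 3-character
extension, 12 characters in all) as the client limit; the constant and
function names are hypothetical.

```python
import os

DOS_NAME_LIMIT = 12  # 8-character name + "." + 3-character extension


def names_exceeding(directory, limit=DOS_NAME_LIMIT):
    """Return the sorted entries in directory whose names exceed limit."""
    return sorted(name for name in os.listdir(directory) if len(name) > limit)
```

Any name this returns is one the client-side program could truncate,
reject, or mishandle.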
Other sysconf/pathconf limits. UNIX defines two system
calls to retrieve
system and pathname limits, sysconf and pathconf. Sysconf
takes an
integer argument corresponding to an assigned system
variable name and
returns the currently configured value. The X/Open specification
for
sysconf lists more than 30 variables. Pathconf takes
a pathname and an
integer corresponding to an assigned variable name as
arguments and
returns the configured value for the filesystem containing
the pathname.
The X/Open specification for pathconf lists nine variables.
A difference between client and server in any key sysconf or pathconf
variable is a potential source of problems. A summary of these key
variables is listed in Table 1.
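Both calls are easy to exercise. The sketch below collects a handful of
values for a given path. The variable names chosen here are an
illustrative assumption rather than the contents of Table 1, and the
function name is hypothetical. Names a platform does not define are
skipped.

```python
import os

SYS_NAMES = ("SC_OPEN_MAX", "SC_PAGE_SIZE", "SC_CHILD_MAX")
PATH_NAMES = ("PC_NAME_MAX", "PC_PATH_MAX", "PC_LINK_MAX")


def query_limits(path, sys_names=SYS_NAMES, path_names=PATH_NAMES):
    """Return a dict of sysconf values and pathconf values for path."""
    limits = {}
    for name in sys_names:
        if name in os.sysconf_names:
            try:
                limits[name] = os.sysconf(name)         # system-wide value
            except (OSError, ValueError):
                limits[name] = None
    for name in path_names:
        if name in os.pathconf_names:
            try:
                limits[name] = os.pathconf(path, name)  # per-filesystem value
            except (OSError, ValueError):
                limits[name] = None
    return limits
```

Note the asymmetry: sysconf describes the machine the program runs on,
while pathconf describes the filesystem that holds the path, which over
NFS is a different machine.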
True NFS Horror Stories
I know of one computing installation in which the staff
used an SGI as a
Sun fileserver. The files contained on the SGI were
Sun files -- Sun
object files and data written by Sun programs. This
solution seemed very
safe to the installation staff. Unfortunately, this
misperception
resulted in many problems.
The MAX-BLOCK-SIZE on the Sun was 8192, and the SGI
was configured with
a filesystem block size of 32768. A popular UNIX project
management tool
on the Sun that ran fine against native Sun filesystems
failed when run
against Sun files on the SGI. This program unfortunately did a block
read based on the block size returned by stat. stat returned the block
size of the remote file, 32768, but this was much larger than the
program's buffer, which was sized to the 8192 MAX-BLOCK-SIZE of the Sun.
The Sun was configured to the default POSIX chown-restricted
behavior,
and the SGI was configured to the default POSIX chown-unrestricted
behavior. The Sun chown shell command checked for a
uid of root before
calling the chown system call; it aborted when any non-root
user tried
to change ownership of a file. Because it checked for
root before
calling the chown system call, the shell command preserved
POSIX-restricted behavior. However, for any program calling chown
directly, the behavior was different. The Sun system call on the remote
call on the remote
filesystem was translated by NFS into an SGI call. The
SGI was
configured POSIX chown unrestricted, so any non-root
user could change
ownership of the file as long as he/she owned the file.
At first, this inconsistency may not appear to be a
significant problem,
but most system utilities that need to change ownerships
use the system
call, not the chown command. tar (which on the Sun by
default attempts
to set ownership of extracted files to the archive owner)
suddenly
failed for all non-root users attempting to extract
files created under
another uid. When the files were native to the Sun,
the chown system
call failed with a return code that signaled tar to
extract the files
without preserving ownership. tar completion resulted
in all extracted
files being owned by the executing user. Over NFS, the
chown call
succeeded, and the high-level directories were created
with ownership
assigned to the archive owner. Subsequent file extractions
to these
directories failed because the user was not the owner,
and the users
were left with empty high-level directories they did
not own.
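A utility can avoid this trap by making the same check the Sun chown
command made, instead of relying on the system call to fail. The
following is a minimal sketch of that idea with a hypothetical function
name. It is not how the Sun tar was implemented.

```python
import os


def restore_owner_if_privileged(path, uid, gid):
    """Set ownership only when running as root; report whether it was set."""
    if os.geteuid() != 0:
        # A non-root user keeps ownership, so later writes into an
        # extracted directory cannot be locked out by a chown that
        # unexpectedly succeeds on a chown-unrestricted server.
        return False
    os.chown(path, uid, gid)
    return True
```

With this check, extraction by a non-root user behaves the same against
local and remote filesystems: every extracted file stays owned by the
executing user.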
SunOS supports sparse files. The filesystem native to
SGI Irix does not.
Sparse files that occupied little space on the native
Sun filesystem
were much larger on the SGI where the holes had to occupy
real disk
space.
Many other potential problems existed, although they
apparently were not
experienced. The Sun and SGI differ in a number of sysconf
and pathconf
variables. Each of these variables could have been the
source of
problems and inconsistent behavior.
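Differences of this kind can be listed mechanically by querying the same
pathconf variables for a local directory and for the mount point. The
sketch below uses hypothetical names and an illustrative choice of
variables. It assumes the client's pathconf reports the server's values
for NFS-mounted paths, which is not guaranteed on every implementation.

```python
import os

COMPARE_NAMES = ("PC_NAME_MAX", "PC_PATH_MAX", "PC_LINK_MAX",
                 "PC_CHOWN_RESTRICTED", "PC_NO_TRUNC")


def _pathconf_or_none(path, name):
    """Return the pathconf value, or None if the query fails."""
    try:
        return os.pathconf(path, name)
    except (OSError, ValueError):
        return None


def pathconf_differences(local_path, remote_path, names=COMPARE_NAMES):
    """Return {name: (local value, remote value)} for values that differ."""
    diffs = {}
    for name in names:
        if name not in os.pathconf_names:
            continue  # not defined on this platform
        local = _pathconf_or_none(local_path, name)
        remote = _pathconf_or_none(remote_path, name)
        if local != remote:
            diffs[name] = (local, remote)
    return diffs
```

An empty result does not prove the filesystems are semantically
identical, but a non-empty one names specific variables worth raising
with a software vendor.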
Summary
NFS is a reliable client/server protocol for transparent
access to
remote files. Usually NFS access presents few transparency
problems, but
only if the local and remote filesystems are semantically
similar. When
semantic differences exist, NFS becomes a potential
source for many
problems, including security inconsistencies and strange
program
behaviors.
Semantic similarity does not necessarily mean operating
environment
similarity. It simply means that the client and server
interpret
NFS-related system calls in the same way. The client
and server may run
different operating systems and have very different
hardware
architectures.
In general, you should avoid remote mounting filesystems
that have
different characteristics and behaviors than the local
filesystem. When
this cannot be avoided, proceed with caution, and seek
the advice of
your software vendors. Software vendors may not support such usage;
most expect their programs to be run against local filesystems or
identically configured remote filesystems.
References
X/Open CAE Specification, Protocols for X/Open Interworking.
XNFS. 1992,
Issue 4. X/Open Company Ltd., UK.
X/Open CAE Specification, System Interfaces and Headers.
XNFS. 1992,
Issue 4. X/Open Company Ltd., UK.
About the Author
Doug Morris has a B.S. in Mathematics and an M.B.A.
in Management
Information Systems. He is a UNIX enthusiast and long-time
Open Systems
advocate. He has been an active participant in X/Open, including holding
an elected office on the X/Open User Council Executive.
He has held
He has held
various positions in technology evaluation and management
at several
Fortune 500 companies, including his current position
as a Systems
Specialist/Software Engineer at a major international
oil company. He
can be reached at his personal email address of damorri.msn.com.