Article

rlock: A Distributed File Locking Utility

Chris Cox

File locking prevents multiple processes in UNIX from accessing all or part of a file at the same time. For example, file locking can prevent two processes from editing a file simultaneously. Some editors, such as emacs, support file locking during editing; other, such as vi, do not. File locking is not an issue limited to file editors, however. Without file locking, in UNIX there is always the risk of collision among processes trying to use the same file. UNIX provides several ways to manage file locking:

Opening files with exclusivity

Using fcntl or lockf (Linux, BSD, flock)

Setting semaphores.

Unfortunately, most of these solutions assume that the lock is valid only for the duration of the initiating process, and/or only for processes on the same host machine. Services like NFS create a situation in which processes on multiple hosts could attempt to lock the same file. Although NFS has been extended to allow file locking (via fcntl), this feature may not be universally implemented.

Another drawback to these solutions is that they are generally not very "shell friendly." Shell scripts involve the execution of several processes, but kernel-level file-locking mechanisms cease to be valid when the initiating process goes away. What's needed is a locking entity that is "shell friendly" and that works across hosts within a network.

Modern UNIX File Locking Using fcntl

The fcntl call, upon which lockf is typically built, supports the concept of file locking. In fact, fcntl supports locking portions of a file in addition to the whole file. Usually, calls are made to the higher level lockf function rathr than to fcntl. Here is the prototype declaration for lockf:

int lockf(int filedes, int function, long size);

The first argument to lockf is a file descriptor to a UNIX file opened for write (O_WRONLY) or read/write (O_RDWR). The second argument describes the function to perform:

#define F_ULOCK   0   /* unlocks previously locked areas */
#define F_LOCK    1   /* lock area, wait on exiting lock if necessary */
#define F_TLOCK   2   /* lock area, fail if area already locked */
#define F_TEST    3   /* test area for any existing locks */

The third argument specifies the number of bytes from the current file offset that are to be locked. This means that you can lock file areas in addition to the entire file. As mentioned earlier, virtually all NFS implementations support file locking using fcntl. Therefore, the fcntl-based lockf should work across networked machines, provided that the file being locked is from an NFS exported filesystem. Again, however, this mechanism may have limited success within a shell script environment, since these file locks are valid only while the initiating process exists.

fcntl provides for two kinds of file locking: advisory and mandatory. Advisory locks place the burden of determining the status of the lock on the process. In other words, the program must determine if a lock exists for an area before accessing the area. A mandatory lock forces IO operations to fail and therefore is typically maintained within the kernel. Advisory locks are more universally available than mandatory locks; the SGID (Set Group Id, 02000) bit must be active on the file being accessed to enable mandatory locking. However, in this implementation the group execution bit must be turned off in order to distinguish the mandatory locking flag from a SGID executable (which means something totally different). Therefore, it is not possible to have mandatory locks with a SGID executable. One way of turning the SGID bit on is to issue the following:

$ chmod +l filename
$ ls -l
-rw-rwlr--   1 user      user     1765 May 17 08:45 filename

All locks on filename will now be considered mandatory instead of advisory.

Older Techniques

Before fcntl, file locking had to be done some other way. The simplest mechanism was to use the process id of the locking process as the key for the lock. AT&T's Source Code Control System (SCCS) uses this style. Oddly, this technique is still being used within more modern program packages, such as emacs and the Internet Network News package.

The SCCS Method in Detail

AT&T's Source Code Control System (SCCS) uses a simple file-locking technique to prevent collision on access to its configuration files (s-files). It places the process id of the first process to access an s-file into what it calls the z-file (a z-file is simply the name of the configured file with a "z." prefix). Any subsequent process attempting to access the same s-file will check first for the existence of a z-file. If a z-file is found and contains a process id for an existing process, then the lock is enforced and access is denied. However, if no z-file exists, or the existing z-file refers to a non-active process id, then the z-file is replaced with the process id of the current process requesting the lock. Process ids are not unique across a network, so this simple technique is not sufficient on its own. However, I'll use a variation of this advisory locking technique in the solution presented here.

emacs and shlock

Some current programs handle file locking using a variant of the SCCS technique, among them emacs and standalone program called shlock, which is available in the Internet Network News (inn) package. The emacs locking function is in the filelock.c. It uses a common lock directory to store all of its locking files. The lock file names in this directory are either based upon the full pathname of the file being edited with "!" replacing "/" in the pathname or they are computed using a checksum of the pathname of the file (the latter technique is used on hosts where long filenames are not supported, e.g., the old sysV filesystem). Again, as with SCCS, the process id of the locking emacs process is placed in the lock file and this determines the lock file's validity. The other program, shlock, is closer to the goal I want in that it is designed to be callable by shell scripts. The shlock program is very flexible; it allows the caller to choose the process id and the full pathname of the lock file. However, in both cases (emacs and shlock), the locking mechanism assumes that the process ids and lock files are on a single host.

Presenting rlock

The solution I present here is similar to shlock but was developed without knowledge of the shlock program. It uses the SCCS technique of file locking, but takes the idea one step further by putting hostname information within the lock file to identify where the locking process resides. The algorithm is essentially the same except where a lock file exists and the hostname in the file does not match the hostname of the process accessing the lock file. In this case, a remote shell is invoked on the locking host, which in turn attempts to determine the existence of the process id. This solution assumes that trusted user access is enabled on the network of locking hosts. Since this may not be desirable under your existing security policy, you might consider other ways of implementing this service (see the sidebar, "Without Trusted User Access"). The source for rlock is in Listing 1.

Unlike emacs and/or shlock, rlock does not handle explicit lock file pathnames and does not use a lock file directory. I did, however, add the "-z" option, which allows rlock to output in a format which is compatible with SCCS. This makes it easy to create z-files (using -pz.) for SCCS s-files. I have found this useful when making global edits to s-files using non-SCCS routines. By default, rlock places a lock file in the same directory as the file mentioned as a parameter and prepends a "Z." to the front of the file name. The logic is that a "z." file is an SCCS lock file, and a "Z." file is a network-wide lock file.

A major shortcoming of rlock is that it assumes that a user has write access to the directory where the file being locked resides. shlock gets around this problem by allowing you to specify where the lock file resides. A single publicly writable directory could be created for shlock lock files. Likewise, emacs handles the problem by using lock files within a predefined lock directory. Either technique could easily be added to the rlock program

Applying rlock

In Listing 2, I present a very primitive script that calls vi. The script assumes that only file names are given on the command line, and it locks each file using rlock. This is not meant to be a robust example, merely a suggestion of possible use. I have also successfully used rlock as part of scripts wrapped around SCCS commands. Having the lock file allowed me to use several SCCS commands yet treat them as a single atomic operation. My experience suggests that the need to treat several file operations from within a shell script as an atomic operation is more common than most people think. System administrators usually accept the risk; with rlock, you don't have to worry about the risk.

Locking Things Down

I hope that you find rlock useful. If you use shell scripts and need a file-locking mechanism that works well with distributed file systems, rlock may meet your needs. If not, you may want to modify shlock or borrow code from emacs.

About the Author

Chris Cox is a Computer Science graduate of Texas A&M University with an Electrical Engineering minor. He has been working with UNIX since System III, and over the past ten years has been a software developer, configuration manager, porting engineer, and system administrator for heterogeneous UNIX environments. Chris is currently employed as a consultant for ARMS, Inc.