Sys Admin File Revision Control with RCS
Dave Plonka
System administrators do a lot of operating system configuration. In systems such as UNIX, this task is done almost exclusively by editing plain text files that determine the behavior of the system. Any sys admin who has been tasked with duplicating all the tweaks done to a mature system knows that it's often impossible to remember all the changes made to a system since the base OS install. Additionally, in many environments a team of systems administrators shares the responsibility of system maintenance. Certainly every sys admin in such an environment has been surprised at one time or another by an unexpected change done by a co-worker. Systems or processes that allow the participants to track configuration changes are not new. Sometimes they are loosely referred to as "configuration management" or, usually in software development and maintenance, simply as "revision control".
A revision control system used to track modifications to such configuration files could provide: (1) a "documentation" mechanism through which sys admins explain why a change was made; (2) a "locking" mechanism so that two or more sys admins don't overwrite each other's changes by simultaneously editing the same file; and (3) an audit trail or historical record of what has been done, and the ability to retrieve an earlier configuration if necessary. The RCS utility is the basis for the system that I will describe. (For a discussion of differences between SCCS and RCS see the sidebar RCS vs. SCCS.)
Introducing RCS
Most administrators know that RCS, the Revision Control System, manages file revisions. Although it can be used to maintain any group of plain text files, in UNIX environments these files are most often C language source files, shell scripts, and configuration files. In addition to managing source code, RCS can enable sys admins to:
- Store and retrieve revisions of text files.
- Maintain a history of changes.
- Resolve access conflicts when two or more sys admins wish to modify the same file.
Some UNIX systems (e.g., HP-UX, Red Hat Linux, Digital UNIX) supply a version of RCS with the system. You should be sure that you are using a fairly current version, such as 5.x or greater.
Obtaining and Installing RCS
As of this writing, the latest version of RCS is 5.7, packaged by the name rcs-5.7.tar.gz. It is available from many sites via anonymous ftp including ftp://ftp.cs.purdue.edu/pub/RCS/ and ftp://prep.ai.mit.edu/pub/gnu/. Installation is well documented. The only unusual bit is that the RCS author doesn't allow you to perform the build as root. The following is an example of how you might extract, build, and install RCS. If your system has a nobody user, you can substitute that for user below; otherwise use another non-root account, such as your own. (You should refer to the README and INSTALL files supplied in the distribution to verify that this is appropriate for your situation.)
# mkdir -p /usr/local/src
# cd /usr/local/src
# mv /tmp/rcs-5.7.tar.gz .
# gunzip -c ./rcs-5.7.tar.gz |tar xf -
# chown -R user:group rcs-5.7
# su - user
...
$ cd /usr/local/src/rcs-5.7
$ rm -f config.cache
$ ./configure
...
$ make
...
$ exit # back to root
# cd /usr/local/src/rcs-5.7
# make install clean
...
At this point, the RCS commands and man pages are installed. You may also wish to obtain and install the find_revisions script, discussed below. This RCS reporting tool is available at: http://net.doit.wisc.edu/~plonka/find_revisions/. There are a few prerequisites to take care of before sys admins begin using RCS as root.
Using RCS as root
In applications other than system administration, RCS is most often used by "normal", non-root users. RCS uses the user's login name (from the LOGNAME or USER environment variable) to uniquely identify the user when a lock is obtained (using the RCS co command) and in maintaining the RCS log information on check-in (using the ci command). Unfortunately, sys admins often must do their machine configuration work as root. On most systems, root is essentially an anonymous account because it doesn't identify which individual is currently operating as root. When using RCS in an environment with more than one sys admin, this is unacceptable.
A solution I've found for this problem is to create personal "root" accounts. That is, create multiple accounts with the user id of 0, each with a unique name identifying the sys admin who uses it. Because RCS uses the LOGNAME to identify the "locker", even though multiple accounts may have uid 0, RCS will be able to differentiate among them. If you will have multiple sys admins using RCS to perform maintenance tasks on a given machine, see the sidebar titled "Managing Personal root Accounts".
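To audit which personal root accounts exist on a machine, you can list every passwd entry whose user id is 0. The following is a minimal sketch: the function name list_uid0 is hypothetical, and it assumes the accounts are defined in a passwd-format file (it will not see accounts served by NIS or another name service).

```shell
# list_uid0 [passwd_file]
# Print the login name of every account whose uid field is 0.
# Defaults to /etc/passwd; pass another passwd-format file to check a copy.
list_uid0() {
    awk -F: '$3 == 0 { print $1 }' "${1:-/etc/passwd}"
}
```

Running list_uid0 with no argument on a machine set up as described should print root plus one line per personal root account, such as rootyou.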
One caveat of which sys admins should be aware when using RCS as root is that text editors, such as vi, will sometimes allow you to force the writing of a read-only file. This can be used to side-step RCS's locking mechanism, and could result in file modifications being lost. In practice, this need not be much of a problem since vi, for instance, will warn that the file is read-only - the user must use :w! to force the write. It should be sufficient just to remind users that they still must "play along" with RCS for it to work correctly. As is often the case, those with root privilege should not abuse it to work around the intended process.
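One way to remind yourself to "play along" is to check a file before opening it in an editor. The sketch below is an assumption of mine rather than part of RCS: the hypothetical function rcs_edit_ok looks for a ,v file beside the working file (or in an RCS sub-directory) and, if one exists, checks whether the working file's owner write bit is set. It inspects the mode bits because, for root, test -w typically succeeds even on a read-only file.

```shell
# rcs_edit_ok file
# Return 0 if it looks safe to edit file, 1 if file is under RCS but its
# working copy is read-only (i.e., probably not checked out with a lock).
rcs_edit_ok() {
    file=$1
    dir=$(dirname "$file")
    base=$(basename "$file")

    # Not under RCS at all: nothing to check.
    [ -f "$dir/RCS/$base,v" ] || [ -f "$dir/$base,v" ] || return 0

    # Look at the owner write bit in the mode string printed by ls.
    case $(ls -ld "$file") in
        ??w*) return 0 ;;
    esac
    echo "$file is read-only: run 'co -l $file' before editing" >&2
    return 1
}
```

For example, rcs_edit_ok /etc/aliases && vi /etc/aliases would refuse to start the editor until the file has been checked out with a lock.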
Which Files Should be Maintained Using RCS?
Any plain text file can be maintained using RCS. UNIX configuration files have formats that allow varying amounts of freedom. For instance, shell scripts such as /etc/profile allow the introduction of comments by beginning the line with the # character. This allows you to paste a comment to remind others that the file is being maintained using RCS. For example:
# $Id$
#
#     #                 ###
#     #  ######  #   #  ###
#     #  #        # #   ###
#######  #####     #     #
#     #  #         #
#     #  #         #    ###
#     #  ######    #    ###
# THIS FILE IS BEING MAINTAINED USING RCS!
# PLEASE USE the RCS co(1) and ci(1) commands to maintain it.
Other files, such as /etc/passwd, have a rigid format and cannot contain such reminder comments. There may be other requirements as well. For instance, /etc/passwd and /etc/shadow are special cases: they should always be writable because they are sometimes maintained using commands such as vipw, chsh, chfn, HP-UX's sam, AIX's smit, and Solaris' admintool. So, when using RCS with these files, leave them checked out and locked at all times, and periodically run rcsdiff on them to discover changes. When you wish to "checkpoint" these files, check them in with ci -l. Be sure to keep these configuration file formats and other issues in mind when using RCS to maintain critical system configuration files. Table 1 is a list of some common configuration files that usually reside on the root file system that you would maintain in RCS.
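The "leave it locked, rcsdiff it, then ci -l" routine for files like /etc/passwd can be wrapped in a small function. This is only a sketch under a few assumptions: the name rcs_checkpoint and the default log message are mine, the file has already been checked in once, and you hold the lock on it. It relies on rcsdiff's exit status (0 for no differences, 1 for differences, 2 for trouble).

```shell
# rcs_checkpoint file [log_message]
# For a file that is kept locked at all times: if the working file differs
# from the last checked-in revision, check it in and keep the lock.
rcs_checkpoint() {
    file=$1
    msg=${2:-"checkpoint of changes made outside RCS"}

    rcsdiff -q "$file" >/dev/null 2>&1
    case $? in
        0) return 0 ;;                               # no changes
        1) ci -l -q -m"$msg" "$file" </dev/null ;;   # changed: record them
        *) echo "rcsdiff failed for $file" >&2
           return 2 ;;
    esac
}
```

A better log message than the default is always worthwhile when you know why the file changed, for example rcs_checkpoint /etc/passwd "added account for new operator".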
An RCS Tutorial
Here is an example of how to use RCS when updating the sendmail aliases file. We will add an alias to be used as the recipient of the output of the find_revisions reporting script discussed below.
1. Log in using your own personal root account. Here I also show that you are really root (i.e., uid=0) and that your LOGNAME reflects a unique user name.
$ su - rootyou
Password:
# id -u
0
# echo $LOGNAME
rootyou
#
2. Change directories to that in which the aliases file resides. On some systems, such as Solaris, that will be /etc/mail rather than /etc as is shown here. Since this is the first time we are using RCS, we will make a sub-directory that will contain the RCS revision (,v) files. (The RCS sub-directory is strictly optional, but it keeps your working directories less cluttered.) We then check in the initial revision of this file as it exists now. This will be the baseline from which we will make our change. Upon initial check-in, RCS asks for a description of the file. Quite often for systems administrators, the file is self-explanatory, so either enter a short description or just type ^D (Control-d, or EOF) to indicate you are finished. Following this check-in, a read-only aliases file will be left in the working directory.
# cd /etc
# mkdir RCS
# ci -u aliases
RCS/aliases,v <-- aliases
enter description, terminated with single '.' or end of file:
NOTE: This is NOT the log message!
>> sendmail aliases file
>> ^D
initial revision: 1.1
done
#
3. Check out the aliases file, with a lock. This will leave a writable aliases file in your working directory. Use the editor of your choice to add an alias called root-revisions.
# co -l aliases
RCS/aliases,v --> aliases
revision 1.1 (locked)
done
# vi aliases
4. Examine the modification. (For this particular change, you should probably run newaliases or /usr/lib/sendmail -bi to integrate the change into sendmail's database. You can test this new alias with /usr/lib/sendmail -bv root-revisions.) The rcsdiff command will show differences between the "head" or "tip" revision and the working file. (For our purposes, the head revision is simply the most recently checked-in revision.) Assuming that you approve of the modification as displayed, check in the revision, unlock the file, and leave a read-only aliases file. The ci command will ask for a "log message". You should explain why you made the change. Remember that it's better to explain why a change was done rather than to describe the change itself. The latter can always be deduced from the rcsdiff output, but the former may have been known only to the author of the change. When interacting with RCS, if you find that you've made a typo, simply type the interrupt character (usually Control-c) and issue the ci command again.
# rcsdiff aliases
===================================================================
RCS file: RCS/aliases,v
retrieving revision 1.1
diff -r1.1 aliases
38a39,41
>
> # folks who want to be notified about RCS revisions on the root file-system
> root-revisions: you@your.org
# ci -u aliases
RCS/aliases,v <-- aliases
new revision: 1.2; previous revision: 1.1
enter log message, terminated with single '.' or end of file:
>> added "root-revisions" which is an e-mail alias to which
>> the output of the "find_revisions" script is e-mailed as specified in
>> root's crontab
>> ^D
done
5. You can examine the RCS revision file using the rlog command. Note that it displays the head revision, which user has a lock (if any), the description, and log messages.
# rlog aliases
RCS file: RCS/aliases,v
Working file: aliases
head: 1.2
branch:
locks: strict
access list:
symbolic names:
keyword substitution: kv
total revisions: 2; selected revisions: 2
description:
--------------
revision 1.2
date: YYYY/MM/DD HH:MI:SS; author: rootyou; state: Exp; lines: +3 -0
added "root-revisions" which is an e-mail alias to which the output of
the "find_revisions" script is e-mailed as specified in root's crontab
--------------
revision 1.1
date: YYYY/MM/DD HH:MI:SS; author: rootyou; state: Exp;
Initial revision
========================================================================
#
Although I was a bit verbose in clarifying some subtleties of the RCS commands, that was a basic example of an edit with RCS. Once you are more familiar with it, you'll see that you usually issue just this series of commands: co -l file, vi file, ci -u file.
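If you find yourself typing that series often, it can be collected into a shell function. The following is a minimal sketch, not a standard RCS command: the name rcsedit is hypothetical, and it assumes the file has already had its initial check-in.

```shell
# rcsedit file
# Lock file, edit it, show the pending change, then check it in and leave
# a read-only working copy. EDITOR defaults to vi.
rcsedit() {
    file=$1
    co -l "$file" || return 1          # fails if someone else holds the lock
    ${EDITOR:-vi} "$file" || return 1  # on failure the file stays locked
    rcsdiff "$file"                    # exit status 1 just means "differences"
    ci -u "$file"                      # prompts for the log message
}
```

If the editor exits without changing anything, ci should notice that the file is unchanged and simply release the lock rather than creating a new revision.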
Common mistakes when learning how to use RCS include editing a file without first obtaining a lock (then finding out when attempting to write that the file is read-only), typing "garbage" characters into the RCS log message during check-in, and accidentally checking in a revision. These problems can be resolved with RCS commands. For instance, the rcs administrative command allows you to delete selected revisions within a ,v file with the -o option.
Never change file permissions to work around RCS access control and never attempt to edit RCS ,v files directly. The revision files can become corrupt and unusable if you edit them - treat them as a "black box". With better understanding of the RCS commands' features, you will be able to manipulate revisions through RCS's interface and avoid exacerbating the problem that you're attempting to solve.
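As an illustration of fixing such mistakes through RCS's own interface, here is a sketch of two repairs using the rcs command's -m (replace a log message) and -o (delete a revision) options. The function names are hypothetical. Note that drop_head_revision assumes the revision is the head, is not locked, and that you really do want to discard the change: the working file is replaced by the previous revision.

```shell
# fix_log file rev message
# Replace the log message recorded for revision rev.
fix_log() {
    rcs -m"$2:$3" "$1"
}

# drop_head_revision file rev
# Delete revision rev (assumed to be the unlocked head revision), then
# refresh the read-only working file from the new head.
# The change made in rev is discarded.
drop_head_revision() {
    rcs -o"$2" "$1" && co "$1"
}
```

For example, fix_log aliases 1.2 "added root-revisions alias" would replace a garbled message on revision 1.2 without touching the file's contents.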
Useful RCS Commands
The following is a list of commands useful to systems administrators.
co [-M] file - Check out file for reading. The -M specifies that the file should be retrieved with the original modification time, rather than the current time. (This is sometimes nice when using make(1), because the file won't look as if it's been touched unless a real change has been made.)
co -l file - Check out file locked (i.e., for writing).
rcsdiff file - Show the differences between the working file and the most recently checked-in revision. That is, this command will show changes to file that are in progress.
ci -u file - Check in file, to deposit changes made in the RCS/file,v. Before checking in a file, you will probably want to create an RCS sub-directory, which will contain the revision (,v) files: mkdir RCS. For administrative usage, it is important to use the -u option. This causes RCS to leave a read-only copy of the working file in place. By default (without the -u), ci will remove the working file, which would be a Bad Thing for most UNIX configuration files.
ci -l file - Check in file, to deposit changes made in the RCS/file,v but retain a lock on the file. For administrative usage, this is an appropriate way to check in files that are edited by other utilities (e.g., /etc/passwd which is maintained using vipw(1)). It's important to retain the lock so that a writable working file will remain.
rcs -u file - Break a lock for file. If someone else holds a lock on the file, you will be prompted for an explanation of why you are breaking their lock, and a message will be emailed to them. This is particularly useful when another sys admin has locked a file out and possibly modified it, but has neglected to check in the change and unlock it. If that user has modified the file, you should be sure to save their working file before performing co -l yourself.
rcs -u file && rm file && co -M file - Break a lock for file, remove the file with the edits to be thrown away, and check out the latest revision for reading. This procedure would be used to "back out" of a change you were working on, but decided not to check in. (Another method is to do a co -u and answer with a yes when co asks if it's okay to overwrite the writable working file.) See the rcsintro man page for more information.
Implement When?
When is the right time to implement RCS control on your system's configuration files? The answer is "as soon as possible". Every time you perform an edit without RCS, you potentially lose information about how a sys admin performs her job. This information could be used for such purposes as training new administrators or even helping you to fill in your timesheet at the end of the week.
Ideally, to build a complete revision history, start with RCS as soon as the base operating system is installed. If your system does not provide RCS, you will have to install a compiler in order to build RCS from its source distribution. By starting early, you'll catch all those quick-and-dirty configurations and will be able to use them as a reference for configuring your next system from scratch. Another advantage to starting just after the operating system install is that the initial file revisions checked in with RCS will be those provided by the operating system vendor. Often, the initial configuration is useful as a reference to determine how to fix problems that have been created by customizations.
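Capturing the vendor-supplied files right after the install can be done in one pass. The sketch below is one way to do it, with assumptions labeled: the function name rcs_baseline and the wording of the description are mine, and the list of files to pass in is up to you. It uses ci's -t- option to supply the file description on the command line so that ci does not prompt for it.

```shell
# rcs_baseline file...
# Check in the current contents of each file as its first revision, unless
# it already has a revision file. A read-only working copy is left in place.
rcs_baseline() {
    for file in "$@"; do
        dir=$(dirname "$file")
        base=$(basename "$file")
        [ -d "$dir/RCS" ] || mkdir "$dir/RCS" || return 1
        if [ -f "$dir/RCS/$base,v" ]; then
            echo "$file: already under RCS, skipped" >&2
            continue
        fi
        ci -u -q -t-"$base as supplied by the OS vendor" "$file" </dev/null ||
            return 1
    done
}
```

For example, rcs_baseline /etc/inetd.conf /etc/syslog.conf would create /etc/RCS if needed and record revision 1.1 of each file. Files that must stay writable, such as /etc/passwd, should be checked in with ci -l instead.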
That said, the adage "better late than never" still applies. Even if you didn't start from the initial system install, at least you will be able to track recent modifications and familiarize yourself with revision control so that you can apply it to future projects.
Reporting
Now that a revision control system is in place, a reporting mechanism would be useful to periodically report on activity involving RCS-maintained files. Ideally, the report will help identify:
- Misuse, such as editing a file without first obtaining a lock
- Edits that are left "in-progress" for an extended period of time
- Poor explanations of why a file was edited
You could do this sort of reporting with a combination of system and RCS-provided commands such as find and rlog. For instance, to show revisions to files on the root file system since July 1, you might issue this command:
# find / -xdev -type f -name '*,v' -print |xargs rlog -d'>1998/07/01'
Although that may be sufficient in some instances, I wanted to fine tune the output format to show the RCS log information for revisions and to show the rcsdiff output for changes in progress. Additionally, if no changes were made, I didn't want any output. The result of my modifications is a Perl script called find_revisions mentioned above.
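Even without a dedicated script, the same find and rlog combination can flag edits that have been left in progress. The following sketch (the function name rcs_locked is hypothetical) uses rlog's -L option to skip RCS files that have no locks and -R to print only the RCS file name. Keep in mind that files deliberately kept locked, such as /etc/passwd, will always appear in its output.

```shell
# rcs_locked dir
# Print the name of every RCS file under dir (same file system only) that
# currently has a lock set, i.e., an edit that may still be in progress.
rcs_locked() {
    find "$1" -xdev -type f -name '*,v' -exec rlog -L -R {} +
}
```

Running rcs_locked / on the root file system gives a quick list of revision files to follow up on with rlog or rcsdiff.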
Here is an example of the output of find_revisions showing recent changes on the root file system:
# find_revisions -s 1998/07/01 -dx /
============================================================================
RCS file: /etc/RCS/group,v
revision 1.7
date: 1998-07-07 10:55:06-05; author: rootme; state: Exp; lines: +1 -1
added Jess to the "net" group
============================================================================
RCS file: /etc/RCS/vfstab,v
revision 1.3
date: 1998-08-04 10:11:52-05; author: rootme; state: Exp; lines: +6 -1
added mount of floppy
This is being used to hold the tripwire database
--------------
revision 1.2
date: 1998-07-23 15:07:15-05; author: rootme; state: Exp; lines: +14 -2
added comment to remind folks that this file is maintained using RCS
specified "nosuid" for "/export" and "/var/data"
(Apr 23)
============================================================================
RCS file: /etc/RCS/defaultrouter,v
revision 1.2
date: 1998-07-06 16:54:09-05; author: rootme; state: Exp; lines: +1 -1
changed the defaultrouter (May 8) now that the ATM LANE interface is up
============================================================================
RCS file: /etc/RCS/system,v
revision 1.2
date: 1998-07-23 15:08:17-05; author: rootme; state: Exp; lines: +21 -0
added System V IPC configuration (May 11)
============================================================================
find_revisions has a number of options similar to those of rlog, rcsdiff, and find. This is its usage information:
$ find_revisions -?
Unknown option: ?
usage: find_revisions [-hx] [-n newer_than_file|-s since_date] \
                      [-u user(s)] [-d|c] [find(1)_args]
  -h                  - help (shows this usage info)
                        mnemonic: 'h'elp
  -n newer_than_file  - show revisions newer than
                        the modification time of the
                        specified file
                        mnemonic: 'n'ewer
  -s since_date       - find revisions done since
                        the date specified.
                        The date format should be
                        that accepted by rlog(1)'s
                        "-d" option.
                        e.g. "%Y/%m/%d %H:%M:%S"
                        mnemonic: 's'ince
  -u user(s)          - find revisions done by the
                        user(s) specified.
                        The argument format should be
                        that accepted by rlog(1)'s
                        "-w" option.
                        mnemonic: 'u'ser(s)
  -d                  - show current rcsdiff(1)
                        output (if any) as well.
                        This is useful to discover
                        changes which are
                        in-progress, or simply
                        have yet to be checked-in
                        with ci(1).
                        mnemonic: 'd'iff
  -c                  - show current rcsdiff(1) "-c"
                        output (if any) as well.
                        (implies "-d")
                        mnemonic: 'c'ontext diff
  -x                  - restrict the search to the
                        file system containing the
                        directory specified.
                        mnemonic: -'x'dev
e.g.
  $ find_revisions -x /
  $ touch /var/tmp/find_revisions.touch
  $ # time passes...
  $ find_revisions -n /var/tmp/find_revisions.touch -x /
Below is a sample crontab entry, to run find_revisions every day at 5:30 a.m. It will attempt to find RCS revisions since the previous time find_revisions was run from this crontab entry. If revisions or work in progress are found, the output will be emailed to the recipient root-revisions, which we set up earlier in the sendmail aliases file. (You should create the file /var/tmp/find_revisions.touch when you initially submit this entry using crontab.) The entry is wrapped here with backslashes for readability; in the crontab itself it must be entered as a single line.
30 5 * * * /usr/bin/touch /var/tmp/find_revisions.next && /usr/local/bin/find_revisions \
-n /var/tmp/find_revisions.touch -dx / > /var/tmp/find_revisions.out && /usr/xpg4/bin/mv \
/var/tmp/find_revisions.next /var/tmp/find_revisions.touch && test -s /var/tmp/find_revisions.out \
&& /opt/local/bin/mutt -s "`/usr/bin/hostname`: recent revisions on / file-system" \
-i /var/tmp/find_revisions.out root-revisions
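Because a crontab entry this long is hard to read and to debug, you may prefer to move the logic into a small script and have cron call that instead. The function below is a sketch of the same time-stamp rotation, written so that the report command is passed in as arguments; the name run_report is hypothetical, and the use of mailx in the usage comment is my substitution for the mailer shown above.

```shell
# run_report stamp out command [args...]
# Run the report command with its output saved in out. On success, advance
# the time-stamp file so the next run covers only newer revisions.
# Returns 0 only if the report produced output worth mailing.
run_report() {
    stamp=$1
    out=$2
    shift 2
    # Take the new time stamp before the report starts, so that revisions
    # checked in while it runs are picked up by the next run.
    touch "$stamp.next" &&
        "$@" > "$out" &&
        mv -f "$stamp.next" "$stamp" &&
        test -s "$out"
}

# Possible use from a script run by cron:
#   run_report /var/tmp/find_revisions.touch /var/tmp/find_revisions.out \
#       find_revisions -n /var/tmp/find_revisions.touch -dx / &&
#     mailx -s "`hostname`: recent revisions on / file-system" \
#       root-revisions < /var/tmp/find_revisions.out
```

If the report command fails, the time-stamp file is left untouched, so the next run will cover the same period again rather than silently skipping it.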
Reality Check
In assessing the practicality of the system, be sure that it is not prohibitively cumbersome. It's important that sys admins can use such a system even amidst the "fire-fighting" that is sometimes characteristic of their work. Here are some suggestions to raise the level of awareness of RCS usage and to help remind sys admins to participate in the system:
- Remind administrators that RCS is being used to maintain the machine's configuration by placing comments within the files themselves when the file's format allows. Consider putting the full path to the real file (rather than a symbolic link) within the comment as well.
- Set up an email report to those interested in seeing recent revisions done by administrators.
- Make some simple documentation available showing a synopsis of typical RCS usage for sys admins. The man pages are the definitive reference; however, a simple tutorial showing the common "check-out, edit, check-in" cycle is less overwhelming. A good place to provide this would be within a FAQ.
Summary
RCS is an effective tool to track file modifications. Although it is often considered a programmers' tool, it is useful as a system administrators' tool as well. When used to track modifications to system configuration files, RCS provides a mechanism for administrators to cooperate in a team environment, to keep team members informed of changes, and to recall past configurations for problem solving.
About the Author
Dave is a benevolent hacker and UNIX aficionado. He is employed as a Network Engineering Technology Systems Programmer in the Division of Information Technology (DoIT), at the University of Wisconsin, Madison. He can be reached at: plonka@doit.wisc.edu or via http://net.doit.wisc.edu/~plonka/.