Centralized Logging with syslog
"Sys admin" is a title that rarely refers to the administration of a single system. Most organizations have from tens to thousands of computers, often with very high availability requirements, in their enterprise. All of these systems are often managed by only a handful of administrators. In this kind of environment, it is crucial for the administrator to have access to as much information about those systems as possible. Most of this information is found in log files kept on the individual systems. The question is: How can you realistically monitor these log files when they are spread out across so many machines? A solid approach to this problem is through a centralized logging scheme.
This article describes the implementation of such a logging scheme on a network of UNIX and NT systems by which all important system and application log messages are logged on a central log host using syslog. I also discuss two included Perl scripts, flog.pl and watch.pl. flog.pl (Listing 1) is used to force programs that use flat files for logging (like UUCP and some Web servers) to use syslog instead. watch.pl (Listing 2) is used to watch multiple system logs in real-time and highlight important log messages as they appear. (All listings for this article are available from: ftp.mfi.com in /pub/sysadmin.)
There are a number of benefits to using a centralized logging scheme, primarily ease of administration. When log files are spread out on the various machines that have generated them, it is difficult to track down the root causes to complex problems that involve multiple systems. With all of your log files on one server, it becomes a simple task to find the errors or debugging information reported by various pieces of the same puzzle. It also becomes possible to watch all of your system logs as they grow, in real-time. This can be a great way to head off problems. For example, instead of finding out after the fact that "SCSI Reset" errors have been appearing in your log files for weeks prior to a disk failure, you can see them minute by minute and do something about it. Additionally, having all your important log files in one place removes the need to rotate them on all of your systems. Backups of these logs also become more accurate, more reliable, and vastly more accessible. For example, if you were researching a suspected break-in, you would only have to restore from one tape to look at logs from the time of the event instead of restoring a tape from each machine involved. In short, the case for centralized logging is a good one - and a good tool for the job is syslog.
syslog first appeared as part of BSD UNIX from Berkely. Written by Eric Allman of sendmail fame, it eventually became the standard for system logging. Prior to syslog's introduction, logging was typically done to flat files kept in different places according to the application or system process that created them. syslog allowed the administrator to control where all these programs logged. It also gave some control over the amount of information being logged. In its current incarnation, syslog is a powerful logging system that has been incorporated into all modern versions of UNIX. It provides a hierarchical priority scheme and individual logging "facilities" for different applications. It can take actions based on the type and priority of the messages received - logging to files, sending alerts to users or the system console, or sending the message itself off to another syslog server for remote logging. This remote logging ability telescopes the usefulness of the syslog method by allowing you to connect all of your systems in a coordinated logging scheme.
The basic structure of the syslog system is composed of the programs that wish to log messages and the syslogd process, which receives the messages and acts on them. Programs typically will log messages to the syslog daemon on their local system using the standard library calls openlog, syslog, and closelog. Once the call to syslog has been made, the syslog daemon takes over. Each logged message is associated by the logger with a "facility" and a "level", which tells the daemon what to do with it. The facilities determine the class of program that made the logging call and the levels determine priority. While not exhaustive, the list of facilities covers most of the common system utilities and leaves the door open for customization with user-defined "local" facilities, the use of which is up to the individual administrator. The priority levels are used for each facility to define the importance of the particular message. These priorities range from "info" for simple status messages to "emerg" for the highest of importance (akin to the process dialing "911"). See Table 1 for a list of the facilities and levels that some common programs log to.
The Configuration File
The syslog.conf file is best defined in its manual page, but a few comments here are appropriate. Essentially, the syslog.conf file describes a relationship between types of messages and actions for the daemon to take. Message types consist of "facility.level" pairs, of which there can be many, on the same line separated by semicolons. An asterisk can be used as a wildcard representing all the facilities. Any message of a matching facility and an equal or greater priority level will cause the daemon to take the corresponding action. A level of "none" can be used to explicitly remove the indicated facility from a wildcard list. For example:
*.info; kern.none /var/adm/info_log
will log all messages with the level of "info" or higher to /var/adm/info_log, except for "kern" messages, which are logged in kern_all. You can use this type of rule to exclude noisy processes or to direct certain facilities to their own log file as above.
Actions can be one of four types: A file in which to log the message, a machine to forward the message to, one or more users to notify, or an asterisk to notify all logged-in users. Filenames must be fully qualified, and the file must exist before syslog can log to it. The file can be any type of file, including character devices. For example, you can attach a serial printer to your loghost and log to its device for a real-time printout of security messages. Machine names must be prefaced with an @ character, as in @loghost. Users will be notified if their username is listed and they are logged in. Notification is made to their terminal using a method similar to the write or wall programs. For example:
*.alert; user.none root
*.emerg; user.none *
This will direct all messages of "err" level and higher to the system console along with any "kern" or "auth" messages with "notice" (or higher) priority. All "alert" and higher messages, other than those from the "user" facility, will be displayed on the root user's terminal. Any message with a level of "emerg" is broadcast to all the users on the machine at once. Additionally, every single message is forwarded to the central log host - see below for why you might want to do this.
When editing the syslog.conf file, be careful to separate the actions from the message types with Tabs. Cutting and pasting is a good way to lose Tabs, and is a common cause of difficulties configuring syslog. Once you have edited the configuration file, send a HUP signal to the syslog daemon to make it re-read the file and reconfigure itself. This will also cause syslogd to close and re-open all of its log files.
Some syslog daemons pre-process the configuration file using the m4 macro language. This allows an increase in flexibility, because you can create one configuration file that will behave differently depending on the machine on which it is installed. Check your man page for more information.
When setting up a centralized logging scheme, the first thing to consider is which machine will be your log host. It is important to choose a machine that is easily accessible from your other machines. Choose one without a lot of other duties if possible - it is not necessary that it be a powerful system, just well-connected and with ample disk space. This is a perfect application for a Linux box if you are of the freeware frame of mind. You have complete control over how much data is actually stored on the box, so you can make do with less disk space if you need to. The more available space the better, however, because then you will have a longer time window before you need to start deleting data. On our network of 24 servers and 6 NT boxes logging to syslog, we accumulate approximately 45MB of (uncompressed) data in the span of a month. Obviously every site will have its own needs.
The next step is to configure the loghost. This machine should have an alias of "loghost" or something equally relevant in your name resolution system. This will allow the clients to refer to it by a name that can be easily transferred to another host if necessary. Additionally, if you use an alias of "loghost", the m4 macro processor will set the variable LOGHOST when processing the syslog.conf file. You can make use of this to alter the behavior of the clients and servers given the same configuration file. Remember that by default, most systems refer to themselves as their own log host through the default loghost entry in the /etc/hosts file. Look for such entries in the /etc/hosts file of your client systems and remove them.
Configuration of the syslog daemons on the server and clients is really somewhat of an art. The idea is to balance the needs of comprehensive logging with the disk space and "junk message" issues. You want to see the truly important messages immediately and still have access to the interesting messages while dropping the useless and redundant ones to save space. One simple strategy is to start by logging absolutely everything from the clients to the server and gradually pare down the amount logged based on what you feel is necessary. This method assures you aren't dropping messages from a program that you actually want to see, because you start by seeing everything. On the down side, it can be a heavy network and disk load if done with a lot of clients. Start with one or two clients and keep an eye on the size of your logs.
Since you can still keep your clients logging locally while you run these tests, you can make a gradual progression from an all-local scheme to a centralized system. It is a good idea to keep high priority and security oriented (auth) messages logging both locally and remotely because you may lose connectivity with the log server from time to time. If you are especially interested in security, Wietese Venema has written a number of syslog enabled replacements for standard system utilities. Find them at: ftp://ftp.win.tue.nl/pub/security/index.html.
So far, I have discussed the configuration of UNIX clients and servers. It is possible to get your NT boxes into the game as well - although with a little bit of extra work. Windows NT does not make use of syslog as its standard logging utility. Instead, it uses the NT System Eventlog. Programs can log messages here using the event logging functions "OpenEventLog", "ReportEvent", and "CloseEventLog" (to name a few). While not the exact counterpart to syslog, the Eventlog performs the same essential functions. It provides for three distinct types of messages: "Application", "System", and "Security". Messages contain a timestamp, process name, username, and the name of the source computer along with a text description of the event. See Figure 1 for an example of a logon event shown by the Event Viewer.
NT clients can be configured to log all kinds of interesting information to the Eventlog. Any file on the system can be set to log an event when it is accessed, changed, deleted, etc. Logins and logouts, system reboots, security policy changes, process starts and stops can all be logged as events. The trick is getting these events out of the NT Eventlog and into your syslog logging scheme. To do this, you need to run a process as a service on your NT box that periodically reads the Eventlog and forwards new messages to your syslog server. A shareware utility called EvntSlog is available from Adiscon GmbH (www.adiscon.com) that does just this.
As of this writing, EvntSlog is fully functional in its second version, although still a little rough around the edges. Configuration is done by editing registry entries to define the name of the syslog server and the interval between Eventlog checks. All events are forwarded to the syslog server with the default priority of "user.notice". I was able to beta test version three, which allows you to define the priority for events and can log to different priorities based on the type of event. Apparently a GUI is in the works as well. See Figure 2 for the syslog message generated by the event in Figure 1.
Once you have configured EvntSlog (or another similar program), you will need to adjust the types of logging your server is doing. Editing the properties on any file will give you access to its auditing settings. You can define auditing events based on the type of action and the user generating it. User auditing can be set with the User Manager, and auditing for an entire disk is possible in the Disk Administrator. Many applications have their own settings for the Eventlog; check your documentation or online help. As with the UNIX clients, begin logging everything possible on a test system and gradually reduce the flow of information to the right level.
If you would like to set up a syslog server on an NT box instead of a UNIX box, there are a number of shareware solutions available for this as well. Adiscon offers one named "NTSlog", which allows logging from remote hosts to the local Eventlog - essentially reversing the flow of "EvntSlog". Franz Krainer (www.netal.com) offers a similar product called "SL4NT 1.1" with a slightly enhanced feature set allowing logging to an ODBC database as well as the Eventlog. These tools are all being enhanced and new ones are appearing, so check the Internet to see what is currently available.
Once you have your syslog server set up and accepting messages from your various systems, you will begin to notice that there is a good deal of interesting information being logged all the time. So interesting, in fact, that you may be tempted to try something like:
/var/adm> tail -f *
This unfortunately, won't work quite the way you may expect. After trying this myself, I wrote a small Perl script, watch.pl, that can be used to watch multiple files as they grow. Additionally, the script will highlight lines containing strings that you find important.
The script can be installed anywhere and can be run by any user. In its simplest form, it will watch all the files that are given as arguments. Watching in this sense means that it will output any lines added to any of the files soon after they are added. For example:
watch.pl /var/adm/messages /var/adm/kern_info /var/adm/sendmail.log
will watch the three logs in /var/adm. If you leave this running in a window on your screen, you can keep an eye on a multitude of systems and programs all doing their thing as you do yours.
The configuration file (Listing 3) is used to set up the message highlighting. By default it is found in /usr/local/lib/watch.cfg or ~/.watchrc with the latter taking precedence over the former. The file defines a mapping between a search string and an escape code to highlight the message. Definitions can be used to make the file more readable by giving an English name to an unintelligible terminal escape sequence. The example configuration file defines some standard xterm escape sequences such as bold and underline. The define labeled "bold_and_underline" will turn on both of these, and the "bold_and_underline_and_beep" adds a ^G character at the end to sound the terminal bell. When "file system full" appears in one of the logs being watched, I want to know about it.
The escape sequences listed are prefaced with an escape character by the script before they are printed and are followed by the escape code for normal text. These should be adjusted as necessary if you are running in another type of terminal. Color terminals are especially nice, and it is possible to generate a pleasant rainbow when everything is breaking at once.
The second Perl script included here, flog.pl, can be used to make programs log to syslog when they were originally designed to use a flat file. flog.pl works in much the same way as the watch.pl script, and watches a given set of files for new data. When new data arrives, flog.pl uses the logger program to log the message into syslog at a given priority. UUCP logs are a good target for this script, as are Apache Web Server logs, News logs, and XDM logs. You can also log from the root shell history file if you think something odd is going on.
The configuration file for flog.pl (Listing 4) should be put in /usr/local/lib/flog.cfg, although this is easily changed in the script if need be. The file itself contains a list of files to watch, a tag to log messages with, and a facility and level at which to log. The script should be started on boot and run in the background. The messages will be logged directly to the local syslog daemon, which will then forward them based on its configuration.
About the Author
Joshua Weinberg has been involved in UNIX administration for the last 8 years at various organizations around the Denver area, including The Root Group Inc. and lately at IBM Global Services working on the Lucent account. He has a degree in Computer Science from the University of Colorado and is a Certified Solaris Expert. Josh can be reached at email@example.com.