Cover V01, I01

Where did that Core File Come From?

Chris Hare

Whenever a UNIX process encounters a fatal error, the system dumps an image of the halted process into a file named core. Many different types of errors can cause a core dump. The most common are memory violations, illegal instructions, bus errors, and user-generated quit signals. Core files can be nasty disk consumers, especially since many core files are essentially worthless -- the product of quit signals from overzealous users (who think the system is hung if their 100-megabyte file doesn't sort instantly).

However, some core files are extremely important. To a working developer, the core file holds a vast wealth of debugging information: by examining the core file, he can find out exactly what his program was doing when it was halted.

For the most part, however, when we system administrators find these annoying core files, we have no idea what caused them and thus no reliable basis for deciding whether to delete them.

In 1988, a program called psc appeared on USENET. This tool looks inside the core file and extracts information about the core dump, including important clues to the origin of the core file. This article explains psc and the related file structures.

As part of each process, UNIX maintains a per-process user area, which contains the user environment. This user environment defines the process and is saved as part of a core dump. The per-process user area also includes the registers as they were at the time of the fault. The actual size of the user area is implementation-dependent, but is defined in the system include file /usr/include/sys/param.h. The remainder of the core file represents the actual contents of the user's memory when the image was written to disk.

The system header file /usr/include/sys/user.h describes the per-process user area, and /usr/include/sys/reg.h gives the location of the register values.

The program is fairly straightforward, amounting to only 81 lines (Listing 1).

By default, psc opens a file named "core" if it exists in the current directory. Alternatively, the user may supply a path as a command-line argument to specify which file is to be examined. psc opens the file and loads the per-process user structure with the data from the file's user area.
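The open-and-load step described above can be sketched in a few lines of C. This is not the code from Listing 1: the structure demo_user and the functions core_path and load_user_area are hypothetical stand-ins invented for illustration, and the real struct user in /usr/include/sys/user.h is far larger and differs from system to system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct user from <sys/user.h>.
 * The real layout is system-specific; these fields are illustrative only. */
struct demo_user {
    unsigned short uid;   /* effective user ID */
    unsigned short ruid;  /* real user ID */
    char comm[14];        /* command name */
};

/* Pick the file to examine: the first argument if given, otherwise "core". */
const char *core_path(int argc, char **argv)
{
    return (argc > 1) ? argv[1] : "core";
}

/* Read the user area from the start of the core file into *u.
 * Returns 0 on success, -1 if the file cannot be opened or is too short. */
int load_user_area(const char *path, struct demo_user *u)
{
    FILE *fp;
    size_t got;

    assert(path != NULL && u != NULL);
    memset(u, 0, sizeof *u);       /* leave a zeroed structure on failure */

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    got = fread(u, sizeof *u, 1, fp);
    fclose(fp);
    return (got == 1) ? 0 : -1;
}
```

A caller would typically pass core_path(argc, argv) straight to load_user_area and report an error if it returns -1. Reading the structure in a single fread works only because the user area sits at the very start of the core file, as described earlier.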

Figure 1 shows a typical psc report. The effective and real user ID numbers tell which user was running the process when the process died. Note that if the effective user ID is different from the real user ID, a core image will not be generated when the fault occurs.

The "process times" section reports the parent and child process times which had accumulated prior to the dump. The user time indicates the amount of time which was spent operating in user mode (as opposed to kernel or system mode).
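As a rough illustration of how such times can be turned into something readable, the sketch below assumes that the kernel stores them as clock ticks, as the times() system call does, and that the tick rate is known. The structure demo_times, both function names, and the tick rate passed in are assumptions made for this example; they are not taken from Listing 1.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical copy of the four accumulated times, in clock ticks. */
struct demo_times {
    long utime;   /* parent, user mode */
    long stime;   /* parent, system mode */
    long cutime;  /* children, user mode */
    long cstime;  /* children, system mode */
};

/* Convert a tick count to seconds; hz is the number of ticks per second. */
double ticks_to_seconds(long ticks, long hz)
{
    assert(hz > 0);
    return (double)ticks / (double)hz;
}

/* Print the parent and child times in seconds. */
void print_times(const struct demo_times *t, long hz)
{
    printf("parent: user %.2f  system %.2f\n",
           ticks_to_seconds(t->utime, hz), ticks_to_seconds(t->stime, hz));
    printf("child:  user %.2f  system %.2f\n",
           ticks_to_seconds(t->cutime, hz), ticks_to_seconds(t->cstime, hz));
}
```

The tick rate varies between systems, so it is passed in as a parameter rather than hard-coded.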

The "process misc" section gives the tty major and minor numbers and the address of the process structure for the process which created this core dump. On its own, the process structure address may not be useful, but armed with this information and a debugger, you can peruse the entire process slot entry. The controlling tty major and minor numbers tell who was executing the program, and from what terminal.
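Splitting a device number into its major and minor parts is a matter of shifting and masking. The sketch below assumes the classic 16-bit device number, with the major number in the high byte and the minor number in the low byte. Your system's own major() and minor() macros are the authority, and the names demo_major, demo_minor, and print_tty are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed layout: 16-bit device number, major in the high byte,
 * minor in the low byte. Check your system's major()/minor() macros. */
unsigned demo_major(unsigned dev)
{
    assert(dev <= 0xFFFFu);
    return (dev >> 8) & 0xFFu;
}

unsigned demo_minor(unsigned dev)
{
    assert(dev <= 0xFFFFu);
    return dev & 0xFFu;
}

/* Print the controlling tty as a major/minor pair. */
void print_tty(unsigned dev)
{
    printf("tty major %u, minor %u\n", demo_major(dev), demo_minor(dev));
}
```

With the major and minor numbers in hand, a long listing of /dev shows which terminal device they belong to.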

The IPC section reports on the active interprocess communication locks. A value of "proc" in this section indicates that the process was locked into RAM, "text" indicates that the text portion was locked, and "data" indicates that the data portion was locked. Unless the offending program specifically asked the kernel to lock its text or data area, this section will report "unlocked."
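Turning a lock field into the words reported in this section might look something like the following. The flag values here are assumptions chosen for the example, as are the names DEMO_PROCLOCK, DEMO_TXTLOCK, DEMO_DATLOCK, and describe_lock. Check the lock header on your own system for the real definitions, and Listing 1 for how psc itself does it.

```c
#include <assert.h>
#include <string.h>

/* Assumed flag values, for illustration only. */
#define DEMO_PROCLOCK 1   /* whole process locked into RAM */
#define DEMO_TXTLOCK  2   /* text portion locked */
#define DEMO_DATLOCK  4   /* data portion locked */

/* Append name to buf, separated by a space, without overflowing len bytes. */
static void append_name(char *buf, size_t len, const char *name)
{
    size_t used = strlen(buf);

    if (used > 0 && used + 1 < len) {
        buf[used++] = ' ';
        buf[used] = '\0';
    }
    strncat(buf, name, len - used - 1);
}

/* Describe the lock flags as "proc", "text", "data", or "unlocked". */
void describe_lock(int lock, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    if (lock & DEMO_PROCLOCK)
        append_name(buf, len, "proc");
    if (lock & DEMO_TXTLOCK)
        append_name(buf, len, "text");
    if (lock & DEMO_DATLOCK)
        append_name(buf, len, "data");
    if (buf[0] == '\0')
        append_name(buf, len, "unlocked");
}
```

Testing each bit separately lets the function report a process that has more than one lock set.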

The FILE I/O section reports the I/O parameters which were active when the program crashed. The base I/O address points to the I/O control structure. "Offset" is the file offset at the time. "Bytes remaining" reports how much data was left when the process aborted.

This section also reports what umask was being applied to files which were created by this program. The ulimit value indicates the maximum file size (in blocks) which could be created by this application.
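To see what these two values mean in practice, the sketch below applies a umask to a requested file mode and converts a ulimit from blocks to bytes. The function names are hypothetical, and the block size is left as a parameter because it depends on the system; 512 bytes is a common value, but treat that as an assumption.

```c
#include <assert.h>

/* Permissions a new file actually receives: the requested mode
 * with every bit named in the umask cleared. */
unsigned effective_mode(unsigned requested, unsigned cmask)
{
    return requested & ~cmask & 07777u;
}

/* Largest file the process could create, in bytes.
 * block_size is an assumption supplied by the caller. */
long ulimit_bytes(long blocks, long block_size)
{
    assert(blocks >= 0 && block_size > 0);
    return blocks * block_size;
}
```

For example, a program that asks for mode 0666 under a umask of 022 creates files with mode 0644, and a ulimit of 2048 blocks of 512 bytes each caps files at 1,048,576 bytes.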

The "accounting" section reports which process created this core file, how much memory was used by the process, whether the process was created via a fork or an exec system call, and when the process started.

Conclusion

Though psc is a simple program, it is an important system administrator's tool. The information psc reports can prove invaluable in identifying the cause of a core dump and in determining whether a particular core file needs to be preserved.

About the Author

Chris Hare is Ottawa Technical Services Manager for Choreo Systems, Inc. He has worked in the UNIX environment since 1986 and in 1988 became one of the first SCO authorized instructors in Canada. He teaches UNIX introductory, system administration, and programming classes. His current focus is on networking, Perl, and X.