Using fsck to Check ufs File Systems
Tom Clark
Introduction
A file system is a mechanism for locating files on the
recording medium of a storage device, where the device
implements a compatible internal data
storage structure (e.g., random access disks with fixed
block sizes
and sequential access tapes with variable block sizes).
The accessible
storage can be considered as an organized "pool"
of data storage
resources. There are several common types of UNIX file
systems [1-2],
but this article discusses only the UNIX ufs file system.
The concepts
presented are applicable to all UNIX file systems being
checked using
fsck.
File systems are generally logically independent of
the required underlying
physical storage devices, although their operation is
directly affected
by these storage devices (for example, errors affecting
the operation
of the devices commonly affect the file system structure
or the files
maintained by the file system). However, not all file
system errors
can be related to an error reported by the underlying
storage device
(some other storage device, or source of error, may
be the root cause).
For simplicity, my discussion will be limited to storage
devices attached
to the host computer via the SCSI (Small Computer System
Interface)
bus. These devices support an addressable linear array
of identical
logical blocks, each block containing 512 bytes of data
(other common
block sizes are 1024 and 2048). Focusing on SCSI systems
dictates
to some extent problem determination and error recovery/restart
procedures,
since these procedures are designed to support the selected
storage
devices. In conjunction with the discussion of using
fsck to
check ufs file systems, I present a system administrator's
script
for checking multiple file systems.
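The linear-array model described above can be illustrated with a short sketch. It assumes a POSIX shell and the standard dd command; the function names read_block and block_offset are my own and are not part of any system utility. Because logical blocks are identical in size, block N starts at byte offset N times the block size, and dd's skip= operand counts in exactly those units.

```shell
# block_offset LBA [BLOCK_SIZE]
# Print the byte offset at which a logical block starts.
# BLOCK_SIZE defaults to 512 bytes, the size assumed in this article.
block_offset() {
    echo $(( $1 * ${2:-512} ))
}

# read_block SOURCE LBA [BLOCK_SIZE]
# Copy one logical block from SOURCE (a device node or an ordinary
# file) to standard output.
read_block() {
    src=$1
    lba=$2
    bs=${3:-512}
    # skip= counts input blocks of size bs, so skip=LBA positions the
    # read at byte offset LBA * bs.
    dd if="$src" bs="$bs" skip="$lba" count=1 2>/dev/null
}
```

Pointing read_block at an ordinary file is a safe way to try the idea before using it on a raw device.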
SCSI Disk Error Model
A very high percentage of SCSI disk errors occur when
data is retrieved
from the medium. Errors that occur when storing data
are usually not
detected until the data is retrieved, although if the
device is unable
to locate the proper block on a write operation it can
immediately
signal a write error.
Even though the majority of accesses to a disk are data
retrieval
operations, complete media testing entails testing both
storage and
retrieval operations. A device that supports defect
management can link
around defective blocks. Typically, this service
is performed
either in response to the execution of a disk maintenance
command
or at the direction of a disk device driver in response
to a "hard
error" reported by the device.
A "hard error" exists: 1) where the device
signals that it
is unable to recover data (e.g., too many errors); or
2) where successive
unsuccessful attempts to correctly retrieve data (errors
reported
along with the data) are halted by the device driver.
The disk driver may have previously requested that the
device automatically
reassign a "bad" block during an initialization
phase. If
not, it may issue a separate command to the device to
reassign the
known "bad" block. It may also choose to do
nothing, leaving
a hole in the linear array and presenting an opportunity
for another
utility to attempt a patch of the linear array.
Successive unsuccessful attempts to retrieve data followed
by a single
successful data retrieval can occur before the device
driver halts
the data retrieval operations. Each such unsuccessful
attempt is termed
a "soft error." Excessive "soft errors"
are indicative
of future problems and should be corrected during preventive
maintenance.
The reassignment of a data block may result in data
loss even though
performed without additional errors. This usually happens
when the
device is unable to correctly recover data from the
"bad"
block and transfer it to the patch block.
A device driver typically stores whatever data has been
retrieved
from a "bad" block and writes it to the patch
block after
completion of the reassignment. Where no data is retrieved,
a block
of ZEROs is written to the patch block. Little else
can be done to
recover data directly; instead, restoring files from
recent backups
should be considered.
Device maintenance commands such as format can be used
to
"patch" the linear array of logical blocks.
You can perform
surface analysis to detect and reassign "bad"
blocks and should
do this periodically since the recording medium degrades
with time.
ufs File System
A ufs file system can be represented simply as a large
data structure
composed of sequences of one or more smaller data structures
(referred
to as cylinder groups), each composed of the following [1]:
-- Offset
-- Super-block
-- Cylinder Group Map
-- Inodes
-- Storage Blocks
The "super-block" contains information on
the size and status
of the file system, the label (obtained from block 0
of a SCSI disk),
and the cylinder group. Multiple "super-blocks"
are created
and used to repair ones that are bad.
Inodes contain all information about a file except its
name (kept
in a directory). Typically, one inode is created for
every 2048 bytes
of available storage (this can be altered when the file
system is
built; refer to the mkfs user command). An inode contains
information on the:
-- file type (regular, directory, block, character,
symbolic link or FIFO/pipe),
-- file permissions,
-- number of hard links,
-- user-id and group-id,
-- number of bytes,
-- first 12 disk block addresses,
-- three indirect pointers to additional disk block
addresses, and
-- file time-related data.
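The separation between a file's name (kept in a directory) and its inode can be seen from the shell. The following is a minimal sketch using only standard ls and awk; the helper names inode_of and link_count are my own. Two names created with ln share one inode, and the link count stored in that inode rises to 2.

```shell
# inode_of PATH
# Print the inode number that the directory entry PATH refers to.
inode_of() {
    ls -di "$1" | awk '{ print $1 }'
}

# link_count PATH
# Print the number of hard links recorded in the inode
# (the second field of a long listing).
link_count() {
    ls -ld "$1" | awk '{ print $2 }'
}
```

For example, after "ln a b", inode_of a and inode_of b print the same number, and link_count reports 2 for either name.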
The majority of blocks in a cylinder group (1-to-32
cylinders/group)
are allocated to storage blocks. The UNIX user command
fsck is used
to quickly check the super-blocks and the inodes for
file system inconsistencies:
1) during the "install" phase where the operating
system is
loaded onto the system disk and configured for normal
operation,
2) during a "bring-up" of the operating system,
3) when adding a new disk to the system,
4) when analyzing problems associated with a file system
supported
by a specific physical disk,
5) during repair procedures prior to returning a disk
device to normal
service, and
6) during preventive and predictive maintenance procedures
(such as
a disk probing tool that looks for file system-related
inconsistencies).
fsck Check Sequence
fsck sequences through the following phases:
1) Check Blocks and Sizes (file system inode list),
2) Check Pathnames (directory entries),
3) Check Connectivity (one directory for each inode and
multiple links make sense),
4) Check Reference Counts and Cylinder Groups (link count
and alterations made previously), and
5) Check the Free List (blocks are allocated to an inode
or the free block list).
(Refer to Thomas and Farrow [5] for a more detailed
description of
the check sequence. The information presented here has
been condensed
from the references listed at the end of this article.)
An underlying presumption regarding the use of fsck
is that
files with corrupted inodes should be replaced from
backup copies
and that the user is able to keep track of the files
needing such
action. The following key factors also affect the use
of fsck:
-- mounted (active) file systems may change while
being examined by fsck,
-- any change that occurs in a file system while
running fsck can produce inconsistencies,
-- inconsistencies may be minor enough to result
in an automatic repair action by fsck,
-- major inconsistencies require user intervention,
and
-- false inconsistencies are treated as though they
were actual inconsistencies.
To avoid related problems, fsck does not work on mounted
file
systems, the exception being the root (/) file system.
fsck
can be run on root while in single-user mode.
However, there are two interfaces to a block storage
device (block
and character, or raw), and it is possible to run fsck
on
a mounted file system through the raw interface. Doing
so makes the
check vulnerable to the problems that could arise if
fsck
were run on an "active" or mounted file system.
All sources
strongly recommend that fsck be run on unmounted file
systems,
and on the root file system when in single-user mode.
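One way to respect this recommendation in a script is to test whether a device is mounted before fsck is started. The sketch below is an assumption-laden illustration, not part of fsck: the function name is_mounted is my own, and because the layout of mount output differs between systems, the function simply looks for the device name as a whole whitespace-separated field in whatever listing it is given on standard input.

```shell
# is_mounted DEVICE
# Succeed (exit status 0) if DEVICE appears as a complete field on any
# line of the mount listing read from standard input; fail otherwise.
is_mounted() {
    awk -v dev="$1" '
        { for (i = 1; i <= NF; i++) if ($i == dev) found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

A typical use would be "mount | is_mounted /dev/dsk/c0t5d0s6 && echo mounted", where the device name is only an example and must be spelled the way your system's mount command prints it (usually the block device rather than the raw one).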
The "-y" Option
The "-y" option allows fsck to assume yes
as a response
to all queries about repair actions. This option should
not be used
on a file system that contains important user data.
As I explain later
(in the "Repairs" section), there are cases
where fsck
should not be allowed to perform a repair action.
Install Phase
The install phase yields a fully operational, configured
operating
system upon successful completion. It requires the identification
of at least one operable disk (the system disk) upon
which the necessary
number of supported, operable file systems (determined
using fsck)
can be built. This in turn requires that the host computer
and hardware
supporting the system disk be operable (this can often
be checked
at a lower level of functionality than the install phase,
e.g., in
cases where a PROM monitor supports communication with attached
devices).
At all points in the install phase, system disk-related
problems can
result in failures and aborts that require user analysis
to determine
proper responses. Since install processes rarely attempt
a detailed
verification of the underlying hardware, the user must
often rely
upon past history to select an appropriate error recovery
procedure
(i.e., the presumption is that the install would be
performed only
on a fully operational system).
A successful install process does not guarantee that
the system disk
is completely operational. The install process transfers
files to
the system disk and prepares it for normal operation,
a procedure
which requires many storage operations that are not
followed by the
retrieval operations that would normally detect disk-related
errors.
Errors encountered after a successful install may actually
relate
back to the time of the install.
Since fsck may be the only diagnostic tool used by
the install
process to check system disk operability, errors reported
in an install
should result in a more detailed test of the host computer
and attached
storage devices (some computer system vendors provide
very detailed
system operability test packages). You should not, under
such circumstances,
attempt to continue the install process.
A full destructive surface analysis of the media before
installing
an operating system on a SCSI disk or after subsequent
disk-related
errors have been encountered will detect most latent
media-related
errors that could cause "install" phase and
future errors.
This will also require that the user re-install the
operating system.
The following are useful guidelines for the install
phase:
-- If an error is reported by the system disk or any other
hardware component, analyze the problem to see if a
service call is necessary (can the error be corrected via
simple user action?).
-- If the install has been corrupted by the reporting of
an error, try restarting the install process.
-- If fsck has reported errors, the likely suspect
is the underlying storage device, since it is allegedly
operable, a new file system has just been built on it, and
little time has passed for data loss or corruption to occur.
-- If the install has not completed or initial checks have
not completed successfully, take the system down to
single-user mode and use fsck as a diagnostic probe.
-- Consider any errors that occur during an install phase
to be indicative of an abnormal condition.
Bring-up Phase
Presuming the successful completion of an install, the
system will
at some time be booted and a "bring-up" phase
initiated. During
this phase the function of fsck is again a very quick
check
of file system inconsistencies. If minor problems are
found, fsck
may be able to correct them and continue. If not, the
"bring-up"
process may require that fsck be run independently,
i.e., some repair actions
will be necessary.
fsck is important here because the system may have been
shut
down incorrectly, may have encountered a "panic"
condition,
or may have shut down due to errors or equipment modifications.
Earlier
statements regarding fsck apply here as well, but there
is
one notable additional factor -- recorded past history
in the system
log.
Problems reported by fsck during a bring-up phase should
prompt
the user to scan the system log for device-related errors.
Both "soft" and "hard" errors should
be analyzed,
since soft errors can evolve into hard errors.
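Scanning the log for a particular device can be reduced to a small filter. This is a sketch only: the function name device_errors is my own, and the idea that error lines contain the word "error" in some letter case is an assumption about the log format that you should check against your own system log.

```shell
# device_errors NAME
# Read log lines on standard input and print those that mention the
# literal string NAME and also contain "error" in any letter case.
device_errors() {
    grep -F -e "$1" | grep -i 'error'
}
```

For example, "device_errors sd3 < logfile" would list the candidate lines for a device the log calls sd3; both the device name and the log file name here are placeholders.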
Where difficult errors are present, especially those
affecting the
system disk, a suggested approach is to boot an operating
system image
over the net or from a local CDROM device, or integrate
into the host
computer a known good disk containing an operating system,
and use
it to perform further checking and data recovery.
Problem Determination
fsck is not a particularly useful tool for problem determination.
Errors reported by fsck require a good deal of interpretation
based upon intimate knowledge of the file system and
the underlying
storage device. Such errors should always be analyzed
in conjunction
with the system log, past history, experience, and a
good backup/restore
facility. The proper next step is rarely obvious from
the query/responses
provided by fsck.
Maintenance
Preventive and predictive maintenance procedures should
use fsck
to probe for file system inconsistencies. Such procedures
should first
perform full media "non-destructive" testing
to ensure that
no blocks are accessible that could cause a future "hard"
or "soft" error. "Destructive" media
testing should
be undertaken only after full data recovery has been
completed.
It is possible for maintenance procedures themselves
to encounter
errors (even system "panic" conditions) that
have no relation
to a recorded storage device error (e.g., data corruption
has occurred
and the maintenance procedure is processing garbage).
This should
not dissuade you from developing and performing them,
but should help
you to see the importance of executing them in sufficiently
secure
environments (e.g., full backups performed).
Repairs
fsck has a repair feature that can correct minor problems
but can also, if used inappropriately, do major damage
to a file system.
fsck is presumed to be an expert on the file system
and to
have the proper repair action selected before requesting
user permission
to proceed. The "-y" option, for those who
have complete faith
in its ability, causes it to automatically repair any
system-related
errors it encounters.
In a number of cases, however, fsck's suggested repair
is
not appropriate. In most of these cases, fsck will be
asking
for permission to remove a file or clear an inode; authorizing
the
repairs without investigating can result in data loss.
The following
cases have been identified empirically.
Case 1
SORRY: NO lost+found DIRECTORY or NO SPACE in lost+found
DIRECTORY
Clearly there is a problem with the lost+found directory
that requires
immediate attention. fsck should be terminated and other
actions
taken to resolve problems.
If no such directory exists, create one and then rerun
fsck.
If there is no space left in the directory, and the
number of files
in the directory is large, rerunning fsck is inappropriate
(instead, find out why so many files were placed in
this directory).
Where large files exist in the directory, attempt to
determine if
the large files are valid and are not file fragments.
An example would
be a copy operation that could not complete successfully,
with the
result that only a portion of a file remains in the
target directory.
Case 2
DUP TABLE OVERFLOW
The DUP table stores a list of inodes with duplicate
blocks. This
message occurs when the table runs out of space. Such
table overflow
should put the user on notice that unusual conditions
have occurred
and the potential for data loss is high. The recommended
action here
is to "write down the inode numbers of inodes with
duplicate blocks
found after this point, and don't REMOVE any of the
filenames connected
to the inodes or CLEAR these inodes." [5].
An alternate suggestion, given that the overflow represents
an abnormal
condition, is to stop at this point and attempt to recover
whatever
files are accessible before proceeding. The recovery
action could
simply be a raw disk copy (using the dd user command).
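A raw copy of this kind might look like the following sketch. The function name raw_copy is my own; the dd operands used (if=, of=, bs=, and conv=noerror,sync) are standard. With conv=noerror dd continues after a read error, and conv=sync pads each short or failed input block with NUL bytes so that later blocks keep their original offsets in the copy.

```shell
# raw_copy SOURCE TARGET [BLOCK_SIZE]
# Copy SOURCE to TARGET block by block, continuing past read errors.
# BLOCK_SIZE defaults to 512 bytes.
raw_copy() {
    dd if="$1" of="$2" bs="${3:-512}" conv=noerror,sync
}
```

A call such as "raw_copy /dev/rdsk/c0t5d0s6 /backup/c0t5d0s6.img" uses hypothetical paths; the target must be on a different, healthy device with enough free space. Note that blocks padded by conv=sync contain NULs rather than the original data, so files occupying them remain suspect.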
Case 3
Read, Write or Seek Errors
These messages indicate that a "hard" error
has occurred,
and fsck is asking if it should continue in the presence
of
such errors. Only if you are very familiar with the
attached physical
devices and with what fsck is attempting to accomplish
should
you attempt to continue. A possible alternative is to
perform a "non-destructive"
surface analysis of the attached storage device.
The surface analysis should reassign a "bad"
block if it encounters
a "hard" error (refer to the "SCSI Disk
Error Model"
section earlier). Since data may be lost in the procedure,
you should
record all reassigned blocks, rerun fsck, and mark
as suspect
all files affected by the reassignment. Upon completion,
you should
consider restoring files from recent backups.
Case 4
PARTIALLY ALLOCATED INODE I = 14
CLEAR?
Legal inode types are given in reference [5]. A partially
allocated
inode is one that has a type of 0, but some information
appears in
the mode word. This often indicates a block containing
garbage. The
occurrence of many of these suggests that the file system
is likely
to have widespread damage, including corrupted files.
A good practice here is to record the inode numbers
so that, if they
become needed in phase 2, the filenames linked to the
inodes can be
found. Data recovery after completion, as well as close
scrutiny of
the file system during preventive and predictive maintenance
procedures,
are also recommended. If actual data corruption is found,
the storage
device should be re-evaluated and the file system rebuilt
before returning
the storage device to service.
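Recording inode numbers by hand is error-prone, so if the fsck dialogue has been saved to a file, the numbers can be pulled out afterward. The sketch below assumes only that the messages carry the inode number in the "I = 14" or "I=13" form shown in this article; the function name inode_numbers is my own.

```shell
# inode_numbers
# Read saved fsck output on standard input and print each inode number
# that appears in the form "I = N" or "I=N", sorted, without duplicates.
inode_numbers() {
    sed -n 's/.*I *= *\([0-9][0-9]*\).*/\1/p' | sort -n -u
}
```

Running "inode_numbers < fsck.log", where fsck.log is a hypothetical transcript of the session, gives a list to carry into phase 2.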
Case 5
LINK COUNT TABLE OVERFLOW,
CONTINUE?
This message indicates that there is no more room to
store inodes
that have a zero link count and will recur for all subsequent
inodes
with zero link counts. If this is the only error reported,
you can
allow fsck to continue, but if multiple errors are reported,
fsck should be terminated. Upon completion or termination,
a file recovery procedure should be performed.
Case 6
EXCESSIVE BAD BLOCKS I=13
CONTINUE?
Ten bad blocks have been detected while checking this
inode's blocks.
Something is seriously wrong, and it is time to do something
other
than continue with fsck.
Initialization
Initialization errors are worth mentioning because they
can be triggered
by recent problems with devices that have been performing
normally
over some period of time (storage devices can fail suddenly).
When
fsck is initiated, perhaps in response to an error message,
the following error can be quite surprising:
Cannot stat <device name>
Here fsck cannot obtain information on the file
system supported by <device name>. It is possible
that the
file system does not exist, cannot be opened due to
permissions, or
has been removed from the device tree by the device
driver (e.g.,
the device no longer responds to commands). The appropriate
response
is to immediately initiate a problem determination procedure
on the
underlying storage device.
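Before starting problem determination, it can help to narrow down which of these possibilities applies. The following sketch uses only the standard shell test operators; the function name check_device and the four status words it prints are my own conventions.

```shell
# check_device PATH
# Print one word describing why a device path might not be usable:
#   missing       - the name does not exist (or is a dangling link)
#   not-a-device  - the name exists but is not a block or character device
#   unreadable    - the device exists but the caller cannot read it
#   ok            - none of the above
check_device() {
    dev=$1
    if [ ! -e "$dev" ]; then
        echo "missing"
    elif [ ! -b "$dev" ] && [ ! -c "$dev" ]; then
        echo "not-a-device"
    elif [ ! -r "$dev" ]; then
        echo "unreadable"
    else
        echo "ok"
    fi
}
```

A result of "ok" does not mean the device is healthy; it only rules out the simplest explanations for the message.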
Repair Summary
Minor problems automatically detected and corrected
by fsck
rarely result in data loss or corruption. However, the
potential for
data loss or corruption increases up to a certainty
if the user chooses
to continue running fsck in the presence of clearly
dangerous
major errors. By placing complete faith in the ability
of fsck
to detect and correct errors (the "-y" option),
you lose control
over the repair process and may remain unaware of the
presence of
major problems that require immediate attention, e.g.,
widespread
data corruption.
Administrator's Bourne Shell Script
When a system administrator has to check multiple file
systems simultaneously,
a tool for performing the checks and highlighting the
problems becomes
handy. The mfsck script (Listing 1) allows you to specify
a large number of disks and partitions/slices to check
simultaneously.
It's important that you avoid overloading the OS when
doing this by
initiating too many processes.
mfsck supports a very simple interface which is displayed
if no arguments are provided:
mfsck <task file>
where,
<task file> is the name of a file containing the disks and the file systems to check.
The syntax of the file <task file> is
# comment line
<logical device> <partition> ... <partition>
as in
#
c0t5d0 0 1 6
#
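To make the syntax concrete, here is a small sketch of how such a file could be expanded into one device name per partition. It is not taken from Listing 1: the function name expand_tasks is my own, and the /dev/rdsk/<disk>s<partition> naming is an assumption about the target system's raw device paths.

```shell
# expand_tasks
# Read task-file lines on standard input. Skip blank lines and lines
# starting with "#". For each remaining line, treat the first word as
# the logical device and every following word as a partition, and
# print one raw device path per partition.
expand_tasks() {
    while read -r disk parts; do
        case $disk in
            ''|'#'*) continue ;;
        esac
        for p in $parts; do
            echo "/dev/rdsk/${disk}s${p}"
        done
    done
}
```

In this sketch a line that names a disk but no partitions produces no output.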
The <task file> below contains a single entry
and points to a nonexistent disk:
# cat task.file
c8t0d0 3
#
Executing the script with this <task file> yields
the
output in Figure 1.
A single-entry <task file> that contains
a valid disk
looks like
# cat task.file
c1t2d0
#
and yields results like those in Figure 2.
The fsck operations are performed in the background
and the
script checks for completion. The script can be broken
up into smaller
files, as originally designed, and further simplified.
The present
form is intended to be simple and straightforward, facilitating
modifications
by users.
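The background-and-wait pattern described here can be sketched in a few lines. This is not the mfsck script itself, which is in Listing 1; the function name run_checks, its argument order, and the one-log-per-device naming are my own assumptions. The checks are started in batches of at most MAX so that the system is not flooded with processes, and each batch is allowed to finish before the next begins.

```shell
# run_checks MAX LOGDIR CHECKER DEVICE...
# Run CHECKER once per DEVICE in the background, at most MAX at a time,
# writing each run's output to LOGDIR/<device basename>.log.
# CHECKER is a single command or shell function name; standard input is
# taken from /dev/null because a background job cannot answer queries.
run_checks() {
    max=$1
    logdir=$2
    checker=$3
    shift 3
    running=0
    for dev in "$@"; do
        log="$logdir/$(basename "$dev").log"
        "$checker" "$dev" > "$log" 2>&1 < /dev/null &
        running=$((running + 1))
        if [ "$running" -ge "$max" ]; then
            wait            # let the current batch finish
            running=0
        fi
    done
    wait                    # collect the final, possibly partial, batch
}
```

Passing a small wrapper function as CHECKER keeps the choice of fsck options in one place and makes it easy to substitute a harmless command while testing the script.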
Multiple <task file> specifications can be built
in anticipation
of future requirements. Log files generated by the script
can be removed
or saved at the user's option. Where errors are encountered,
the log
files should be scanned to determine where the error(s)
occurred.
Conclusion
fsck is a tool for checking file systems that requires
a high
level of skill and experience to use effectively. fsck
can
be considered, in most cases, as an expert in selecting
the best "next
step" in a repair process. However, it has a very
limited, and
in some cases confusing, user interface. Moreover, it
requires in
some cases that the user record data to be used at a
later stage.
fsck also has a high potential for data loss and corruption,
since checking is limited to the file system. It can
be used effectively
as an indicator of possible current damage, but provides
no assistance
in the development of appropriate error recovery procedures.
Even given its limitations, however, when combined with
a detailed
understanding of the underlying storage devices and
an understanding
of the file system, fsck remains a very useful tool
in the
system administrator's toolbox.
References
1. UNIX Software Operation. UNIX System V Release
4 System Administrator's Guide. Englewood Cliffs, N.J.:
Prentice-Hall,
1990.
2. Bach, Maurice J. The Design of the UNIX Operating
System. Englewood Cliffs, N.J.: Prentice-Hall, 1986.
3. Nemeth, Evi, Garth Snyder, and Scott Seebass.
UNIX System Administration Handbook. Englewood Cliffs,
N.J.:
Prentice-Hall, 1989.
4. Fiedler, David, and Bruce H. Hunter. UNIX
System Administration. Indianapolis, IN: Hayden Books,
1987.
5. Thomas, Rebecca, and Rik Farrow. UNIX Administration
Guide for System V. Englewood Cliffs, N.J.: Prentice-Hall,
1989.
About the Author
Tom Clark has been working with UNIX since 1984 as
an applications
developer. He is currently working as a system software
architect
for Sun Microsystems SMCC System Software Quality Assurance
group
in Mountain View, CA. Tom has a B.S.E.E. from the University
of New
Mexico, an M.S.E.E. from Wichita State University, an
M.S.C.S. and
Engineering degree from the University of Southern California,
and
a B.S.L. and J.D. from Peninsula University. He can
be reached at
Thomas.Clark@sun.eng.COM.