When Inodes Go Bad
Emmett Dulaney
A user complains that the system is acting sluggish,
or data corruption
has occurred. You fire off fsck and the first thing
that comes
up in Phase 1 is a line similar to:
POSSIBLE FILE SIZE ERROR I = 317
Besides knowing that messages in all capital letters
are considered
shouting, it's hard to know what to make of this "possible"
error.
Phases 2 and 3 pass relatively uneventfully, then Phase
4 (checking
connectivity counts) shouts that there is an unreferenced
file, and
gives the inode number of that file. The message further
asks if you
would like to remove the file. After you answer, a message
informs
you that the free inode count is now wrong in the superblock.
Before answering the very first question of whether
or not to remove
the file, it is important to understand the concept
of inodes and
the superblock. The purpose of this article is to explain
each of
these items and describe the utilities used to reference
them.
Superblocks and Inodes
Filesystems are groups of files which exist within a
partition on
a disk -- the partition may be physical or logical.
On every filesystem,
there is an initial sector on the disk where all files
logically map
to. This is known as the superblock.
The superblock, of which there is only one per filesystem,
contains
information about the filesystem, and the physical data
comprising
it. In the PC world of DOS, the superblock is known
as the File Allocation
Table. Both serve the same purpose of mapping file locations
to physical
addresses.
The superblock, which is block 0 of the filesystem,
is broken into
smaller components, or individual entries for the filesystem,
known
as inodes. The physical location of the inode list is
directly after
that of the superblock. For every entry that shows up
in a directory
listing, there is a corresponding inode providing that
information.
Inodes, or "index nodes," consist of between
64 and 200 bytes
(depending upon the vendor), and are the focus of all
file\directory
activity in UNIX; they function as pointers to the physical
data.
Inodes can be thought of as address books giving information
about
the files. Just as one address can apply to two or more
people, one
file can be referenced by several names (links). Two
linked files
will share the same inode number; thus when you execute
a link, the
original set of data is pointed to and called.
While the exact ordering of information occasionally
differs by vendor,
each inode entry consists of the following information
about a file
or directory:
1. A unique inode number.
2. Type of entry. There are seven types, and each type
is assigned a two-digit value:
01 -- a named pipe
02 -- a character device file
04 -- a directory
06 -- a block device file
10 -- an ordinary file
12 -- a symbolic link
14 -- a socket
3. Permissions on the entity. Permissions are four digits,
with the first indicating whether a special mode is
set (1=sticky
bit, 2=SGID, 4=SUID), and the remaining three telling
read, write,
and execute permissions.
4. Actual physical size of the file.
5. Number of links to the entry.
6. Owner of the file (often in numeric form -- the
information returned by the id command).
7. The group possessing the file -- again, quite
often, in numeric form.
8. Times of creation, modification, and access. This
is referenced in directory listings, as well as when
the find
command looks for files meeting certain criteria.
9. The actual physical addresses where the file resides.
These are the components of inode entries, but the ordering
may be
different than as shown here. To check the ordering
on your system,
look for the file /usr/include/sys/inode.h.
Inode Functionality
Directories consists of inode entries that pair them
with files. Structuring
the filesystem this way, makes it incredibly easy to
move entire directories
in UNIX. In reality, all that changes is the name associated
with
the inode.
When files are linked, multiple references to the same
inode occur.
Deleting files removes only the references, not the
physical data,
so long as other files still point to it. Figure 1 shows
an example
of linking a file to multiple names. During the process,
the directory
listing makes it appear as if more files are being created,
but
the inode numbers tell differently. The link numbers
associated with
the files tell differently as well. Notice how the numbers
increment
with the links.
Figure 2 shows the same set of files, but now the files
are being
deleted. The original file goes first, then the first
link. The second
link is now no longer a link, but it still contains
the same information
-- by virtue of still referencing that same inode.
Inode numbers must be unique for physical data. The
way this uniqueness
is traditionally achieved is by incrementing the number
for each new
entry. Thus the first data placed on the filesystem
has the lowest
inode numbers. Figure 3 shows the numbers associated
with files on
a sample system. As a good rule of thumb, anytime fsck
finds
a problem with an inode number that is small, there
is a serious problem
with the machine.
The amount of space allocated for the inode table is
determined when
you make the filesystem (mkfs). If too much space is
allocated,
you are wasting room that could be used elsewhere. If
not enough space
is allocated, you will fill the table. The table contains
entries
-- one for every piece of data out there -- so it doesn't
matter
if the hard drive has plenty of space left on it. If
there are thousands
of tiny files, those thousands of entries can fill the
inode table.
Once an inode table is full, there is nothing to do
but rebuild the
filesystem.
Tools to Use
A directory listing that includes inode numbers can
be performed by
using the "-i" option with ls. To get a full
listing
of the file, complete with inode numbers, use ls -li
(as shown
in Figure 1).
Most standard UNIX packages include several vendor utilities
intended
for use when dealing with inodes. The first is /etc/ncheck.
When used with the "-a" option, as in ncheck
-a, it
generates a filesystem list similar to that created
by du,
but including inode numbers in place of used data. Figure
3 shows
a file list generated using this option.
The "-i" option is used to find only the filenames
matching
a specified inode number. The fsck operation at the
beginning
of this article showed a problem with inode 317.
ncheck -i 317
would find the file, or files, owning that inode number.
In order for ncheck to work properly, the name of the
filesystem
must be contained within a text file, /etc/checklist.
Another useful utility is /etc/clri, which clears an
inode
entry by writing zeros on the space occupied by the
entry. After clri
executes, any blocks that were previously occupied will
show up as
being unaccountable in a fsck operation. This utility
should
be used with extreme caution: it is recommended only
for removing
entries that show up as being occupied, yet point to
files within
no directories (this can create a loop where a new file
gets assigned
the inode, yet it points to the old garbage).
The last utility is /etc/mknod. As its name implies,
it is
useful for making directory entries, and the corresponding
inodes,
for special files. Special files may be, among other
things, block
devices, character devices, or named pipes.
Troubleshooting
The next four examples will document problems that can
occur with
inodes, how the problems show up in fsck listings, and
how to correct
the problems.
The first example, shown in Figure 4, is that of a "possible"
file size error. Neither owner nor group is reported,
and the only
reliable information is the inode number. ncheck -i
suggests
that the culprit is a database file -- with a link --
that is
giving erroneous information. The most likely scenario
here is
that someone failed to exit the database completely
and corrupted
the inode. The solution is to remove the link (because
both point
to the same inode, it doesn't matter which one is the
true file and
which one is the link -- removing either suffices),
then copy the
file over to a new file. The causes the data to be read
and a new
inode to be created. Last, move the new file over the
old one and
recreate the link. With a new inode pointing to the
file, all is well
-- as shown by rerunning fsck.
In the second example, illustrated in Figure 5, an unallocated
inode
is hanging out in cyberspace. The inode number is given,
as well as
the owner. The mode indicates the file type and permissions.
Since
what is being referred to is a "non-file",
the mode is zero.
The size is also zero, but the time of modification
and name of the
entry that once resided there are shown. Since the size
of the file
is zero, fsck reports that it has taken action and removed
the file. The dichotomy is that it cannot remove something
that is
not there -- "(EMPTY)". What it is trying
to do is remove
the reference. What actually happened on this machine
was that a hard
drive crashed and all data went south -- thus explaining
the references
to files that are no longer there after an attempt to
restore the
drive.
After the last phase, fsck it reports that there is
a Bad
Free List, and asks whether it should be salvaged. In
a real-world
situation, you always try to salvage what you can and
should answer
yes. I answered no only for purposes of this illustration.
The message
comes up to boot the operating system without performing
a sync operation.
Notice, however, the number of files, the blocks used,
and the blocks
free (41,402).
I booted the operating system again, made no changes
whatsoever, and
ran fsck again. The first five phases flew by with no
problems.
The Bad Free List still remains, however; this time,
I answered yes
and it was salvaged. Notice now the number of blocks
free (61480).
Even though the unallocated inode entry was corrected
on the first
fsck pass, the space was not returned to the system
until
the free list was salvaged.
Figure 6 shows the third example. Here a file is unreferenced,
meaning
the inode table shows it as being in use while there
is no corresponding
entry in any directory. The owner of the file is root
(user 0). If
it were user karen, it would show up as her ID number
(from the /etc/passwd
file). The mode here is important -- the first two characters
indicate
that it is a regular file, and the last three digits
show its permissions
as being read and write for the owner, and read for
everyone else
(the third digit, by being zero, shows the sticky bit
is not set).
It is safe to say that this is not an executable file.
A good guess
as to what happened here is that a file was deleted,
and processing
was interrupted in the midst of the deletion. With a
file size of
zero, it is safe to clear the file and correct the inode
table.
In the last example, in Figure 7, an inode shows up
as unreferenced
and fsck prompts for removal. Though I answer yes to
the prompt,
the inside remains there, as subsequent runnings of
fsck show.
It turns out that the inode is linked to the console.
On a UNIX machine,
the console basically has permissions over everything
else. The file
/dev/console is a special, character-based file, referencing
the only terminal allowed to login during sensitive
maintenance operations.
The example shown here is a real one, taken from a system
that was
dropped from the back of a truck as it was being moved
from one site
to another. The unreferenced file was keeping the system
from changing
to multiple user state.
After managing to get another terminal logged into the
machine, fsck
still failed to clear the inode. Attempts were made
to remove the
file (and thus clean up the entries in the inode table),
but they
failed, and it become necessary to use clri. After the
mode
was removed, the character file was rebuilt with mknod,
and
relinked to the aliases it uses. With the operation
done, fsck
was able to fly through its checks without uncovering
any problems,
and the system was able to go to multi-user state without
further
delay.
Summary
The inode table stores information on every physical
piece of data
stored on a UNIX filesystem. Within this listing is
kept information
on disc block locations, size, ownership, permissions,
and so forth.
If an inode degrades, the associated data can be damaged,
or completely
lost. This article has discussed some of the utilities
of use in looking
at inodes, as well as in clearing and recreating them.
About the Author
Emmett Dulaney has contributed to several books on
UNIX, and is currently
a product developer for New Riders Publishing. He can
be reached on the
Internet as Edulaney@Newrider.mhs.compuserve.com.
|