Understanding
Oracle Backup & Recovery
W. Curtis Preston
This month's article will explain the elements of Oracle
architecture that make backup and recovery possible. Understanding
these architectural elements is key to being able to successfully
backup and recover Oracle.
Historically, Oracle did not have a standalone backup utility
like Informix's ontape or Sybase's dump,
opting instead for commands that allow the DBA to use any backup
utility. Oracle7 introduced the EBU, or Enterprise Backup Utility,
but it is designed to work only with other commercial backup utilities.
(Oracle8 now comes bundled with an OEM version of Legato Networker,
which means that you do have another free option now, but this setup
is still not as easy to use as ontape or dump.) Oracle8
introduced the Recovery Manager (rman), which also is designed
to work with commercial backup utilities and added a lot more functionality.
Environments without a commercial utility must use backup scripts
of some kind. This method is certainly the least user-friendly and
most difficult to learn if you are new to Oracle and scripting,
but also allows for the greatest flexibility during both backup
and restore. This complexity, of course, requires a bit more explanation.
This article will use the Oracle8 command svrmgr for interfacing
with Oracle databases. If you are running Oracle7, the command is
sqldba.
Oracle Architecture
It is important to understand the design of the database that
is being backed up. We will start with the power user's view
of the database, and then continue with that of the database administrator.
This article uses Oracle-specific terms. As much as possible, these
architectural elements are presented in a "building-block"
order. Elements that are used to explain other elements will be
presented first. For example, I explain what a segment is before
explaining what the rollback segment is.
Power User's View
Unless a power user wants to start doing the DBA's job of
putting a database together, the following terms should be all he
or she needs to know. This view also could be called the "logical"
view, since many of the elements described in this view don't
exist in a physical sense.
Instance -- An instance is a set of processes through which
the Oracle database talks to shared memory. In UNIX, there is often
more than one instance on a single system. On a UNIX system, an
instance can be identified by a set of processes with the pattern
ora_ORACLE_SID_, where ORACLE_SID is the instance
name. When an Oracle instance is started, the database within it
becomes available. On a UNIX system, an instance is started with
the dbstart command and shut down with the dbshut
command. NT databases are started via the Service Startup Utility.
Database -- The database is what most people think about when
they are using Oracle because it is the database that contains the
data. It contains all the tables, indexes, and other important database
objects. In Oracle, there is a one-to-one relationship between instances
and databases. A database resides in only one instance, and there
is only one database within an instance. That is why an Oracle DBA
or user may use the two terms interchangeably. (You will find the
terms used interchangeably in this article as well. The term "instance"
actually is used rather sparingly, since the term "database"
is more widely known.) Technically, however, the instance is a set
of processes through which the database talks to shared memory,
whereas the database is the collection of data itself.
Table -- A table is a collection of related rows that all
have the same attributes. In Oracle, a table can be "partitioned,"
or spread across multiple tablespaces. Other than that, Oracle tables
are the same as any other RDBMSs.
Index -- A database index is analogous to an index in a book
-- it allows Oracle to find data quickly. Again, an Oracle index
is the same as anyone else's index, and it presents no unique
backup requirements. An index is a derived table. It is created
based on the attributes in another table, so it could be re-created
during a restore. However, it almost always is going to be quicker
to restore it than to re-create it.
BLOB datatypes -- Oracle8 has special datatypes called BLOB,
CLOB, and BFILE for storing large objects such as text or graphics.
The Binary Large OBject (BLOB) and Character Large OBject (CLOB)
datatypes present no special backup requirements since they are
stored within the database itself. (A BLOB typically contains image
files, and a CLOB normally contains text data.) However, the BFILE
datatype stores only a pointer inside the database to a file that
actually resides somewhere in the filesystem. This does require
some special attention during backups.
Row -- A row is a collection of related attributes, such as
all the information about a specific customer. Oracle DBAs also
may refer to this as a "record."
Attribute -- An attribute is any specific value (also known
as a "column" or "field") within a row.
DBA's View
Now that I have covered the logical structure of an Oracle database,
I will concentrate on the physical structure. Since only the DBA
should need to know this information, I will call it the "DBA's
View".
Block -- A block is the smallest piece of data that can be
moved within the database. Oracle allows a custom block size for
each instance; the size can range from 1024 to 8096 bytes. A block
is referred to as a page in other RDBMSs.
Extent -- An extent is a collection of Oracle blocks that
are treated as one unit. The size of each extent is determined by
the DBA.
Segment -- A segment is the collection of extents dedicated
to a database object (table). Depending on the type of table, extents
may be allocated or taken away to meet the storage needs of a given
table. A perfect example is the rollback segment, described later,
which would be all the extents on which the rollback logs are stored.
The size of the rollback segment may increase or decrease depending
on how many uncommitted transactions are currently open. Oracle
adds extents to (or subtracts extents from) the rollback segment
as it needs them.
Datafile -- An Oracle datafile can be either a raw (disk device)
or cooked (filesystem) file. Once they are created, the syntax to
work with raw and cooked datafiles is the same. However, backup
scripts do have to take the type of datafile into account. If the
backup script is going to support datafiles on raw partitions, it
will need to use dd or some other command that can back up
a raw partition. Using cp or tar will not work, since
they support only filesystem files.
Each Oracle datafile contains a special header block that holds
that datafile's System Change Number (SCN). This SCN is updated
every time a change is made to the datafile, and the controlfile
keeps track of the current SCN. When an instance is started, the
current SCN is checked against the SCN markers in each datafile.
(See the definition of controlfile later in this article.)
Tablespace -- This is the virtual area onto which a DBA creates
tables. It consists of several datafiles and is created by the create
tablespace tablespace_name on devicea, deviceb, devicec
command. A tablespace may contain several tables. The space that
each table occupies within that tablespace is a segment (see the
earlier definition of segment). Every Oracle instance has at least
one tablespace -- the system tablespace. The files that make
up the system tablespace must be specified when creating a new Oracle
instance. The system tablespace stores the data dictionary, PL/SQL
programs, view definitions, the system rollback segment, and other
types of instance-wide information. When it comes to backup and
recovery, the main difference between the system tablespace and
the rest of the tablespaces is that it must be recovered offline
because the instance cannot be brought online without the system
tablespace. Other tablespaces can be recovered after the instance
has been brought online.
Partition -- A table can be spread out across multiple tablespaces.
When this is done, each tablespace is referred to as a partition.
Controlfile -- The controlfile is a database (of sorts) that
keeps track of the status of all the objects within the database.
It knows about all tablespaces, datafiles, and redologs within the
database. It also knows the current state of each of these objects
by tracking each object's SCN. Every time it makes a change
to a file, the SCN gets incremented both in the controlfile and
in the actual datafile. (See the definition of datafile earlier
in the article.) That way, when the system reboots and the instance
is starting up, the controlfile has a record of what SCN the file
should be at, and it checks that against the SCN that the file has.
This is how it "notices" that a file is older than the
controlfile and is in need of media recovery. Also, if an older
controlfile is put in place, Oracle will see that the SCN of the
datafiles are higher than those that it has recorded in the controlfile.
That's when Oracle displays the "datafile is more recent
than the controlfile" error.
Controlfiles can be backed up using the backup controlfile
to filename command in svrmgr, but restoring controlfiles
is a bit tricky. The mechanics of this recovery are well beyond
the scope of this article. It is best to avoid having to recover
or rebuild a controlfile. A new feature introduced in Oracle7 provided
a way to do this with the mirrored controlfile feature in which
there can be multiple copies of the controlfile, each of which is
updated simultaneously by Oracle. Make sure that this feature is
being used. Mirrored controlfiles take up almost no space, and provide
an incredible amount of recovery flexibility.
Transaction -- A transaction is any activity by a user or
a DBA that changes one or more attributes in an Oracle database.
(If a set of commands is contained between a begin transaction and
end transaction statement, the entire set of commands is treated
as one transaction.) Logically, a transaction modifies one or more
attributes, but what actually occurs physically is a modification
to one or more blocks within the Oracle database.
Rollback segment -- Remember that a segment comprises all
the extents allocated to a database object. A rollback segment,
then, is all the extents allocated to a rollback log. Before a page
is physically changed on disk, the "before" image (its
image before it was changed) needs to be recorded in case the transaction
must be "rolled back". This before image is stored in
a rollback log, which is contained within a rollback segment. (There
can be several rollback segments within a given instance, and a
transaction may even be told which rollback segment to use.) Oracle
writes to only one extent within the rollback segment at a time.
It also writes to these extents in a cyclical fashion, filling each
extent one by one until all the extents are full, then returning
to the "first" extent and overwriting it. However, it
cannot start writing to an extent if there is an uncommitted transaction
whose "before" images are found in that extent, because
the before images must be preserved until the transaction is committed.
Oracle then must assign additional extents so that additional before
images can be saved. (This typically happens with a long transaction
whose before images will span several extents within the rollback
segment.) Once all transactions that need the before images in a
particular extent are committed, that extent then is available for
use by the rollback segment. If the number of extents needed by
the rollback segment decreases, Oracle can release extents as necessary
to shrink the rollback segment.
There is always at least one rollback segment created -- the
system rollback segment -- which is stored in the system tablespace.
Neither this rollback segment nor the tablespace in which it is
stored is sufficient for a normal production database. Therefore,
the DBA will create additional rollback segments in other tablespaces
and take the system rollback segment offline. A common practice
is to create a tablespace that will contain nothing but rollback
segments. Oracle assigns rollback segments to transactions on a
round-robin basis or to a specific rollback segment specified manually
by the transaction. Taking the system rollback segment offline makes
sure that no transactions will be assigned to it. This allows the
system tablespace to concentrate on other matters, without being
slowed down to record rollback information.
The main reason to understand rollback segments (and where they
go) is their unique roll in a database recovery. Remember that the
rollback segments store the before images of all changed blocks.
After a crash or recovery, these pages are essential to return the
database to a consistent state. They are needed in order to roll
back any uncommitted transactions and return the necessary blocks
to their before-transaction status. (This is the entire purpose
of the rollback segment.)
The result of this restriction is that, while a rollback segment
can be recovered online, a normal tablespace cannot be brought online
until the rollback segment that it uses is completely restored.
Therefore, Oracle does not allow the instance to be brought online
unless all defined rollback segments are available. If you try to
open the database without any of them, Oracle gives the error "rollback
segment segment_name specified not available". This
is covered in more detail later.
Checkpoint -- A checkpoint is the point at which all data
kept in memory is flushed to disk. In Oracle, a DBA can force a
checkpoint with the alter system checkpoint command, but
a checkpoint also is done automatically every time the database
switches redolog files.
Redolog -- If the rollback segment contains a rollback log,
the redolog could be called a "roll-forward" log. Every
time that Oracle needs to change a block on disk, it records the
change vector in the redolog; that is, it records how it changed
the block, not the value it changed it to. A mathematical explanation
may be helpful here.
Suppose that you had a variable with a value of 100 and added
1 to it. To record the change vector, you would record +1; to record
the changed value, you would record 101. This is how Oracle records
information to the redologs during normal operation. When a tablespace
is in backup mode, however, Oracle starts recording the changed
value (e.g., 101 in the example above), rather than the changed
vector.
In times of recovery, the redolog is used to "redo"
transactions that have occurred since the last checkpoint or since
the backup that is being used for a restore. Oracle has both online
redologs and offline (archived) redologs. The online and archived
redologs are essential to recovering from a crash or disk failure.
Learn everything you can about how they work and protect them as
if they were gold.
Originally, the online redologs were three or more files to which
Oracle wrote the logs of each transaction. (Oracle requires only
two logs, but the typical practice is to have three or more. That
allows one log to be active, one to be completely inactive, and
one to be in the process of being archived.) The problem with this
approach is that the log to which Oracle was currently writing always
contained the only copy of the most recent transaction logs. If
the disk on which this log was stored were to crash, Oracle would
not be able to recover up to the point of failure.
Log Groups
Oracle7 introduced the concept of log groups. A log group is a
set of two or more files that are written to simultaneously by Oracle
-- essentially a mirror for the redologs. A set of log files
is called a "log group", and the separate files within
that log group are referred to as "members". Each log
group is treated as a single log file, and all transaction records
are simultaneously written to all disks within the currently active
log group. Now, instead of three or more separate files, any one
of which could render the database useless if damaged, there are
three or more separate log groups of mirrored files. If each log
group is assigned more than one member, every transaction is being
recorded in more than one place. After a crash, Oracle can read
any one of these members to perform crash recovery.
Oracle writes to the log groups in a cyclical fashion. It writes
to one log group until that log group is full. It then performs
a log switch and starts writing to the next log group. As soon as
this happens, the log group that was just filled is then copied
to an archived redolog file, if automatic archiving is enabled.
If automatic archiving is not enabled, this file is not copied and
is simply overwritten the next time that Oracle needs to write to
that log. Each of the online redologs is copied to the filename
pattern specified by the LOG_ ARCHIVE_DEST parameter in the
initORACLE_SID.ora file, followed by an incremented string
specified by the LOG_ARCHIVE_FORMAT parameter in the initORACLE_
SID.ora file.
For example, assume that LOG_ARCHIVE_DEST was set to /archivelogs/arch
and LOG_ARCHIVE_FORMAT is set to %s.log, where %s
is Oracle's variable for the current sequence number. If the
current sequence number is 293, a listing of the archivelogs directory
might show the following:
# cd /archivelogs
# ls -l arch*
arch291.log
arch292.log
arch293.log
Depending on how much activity a database has, there may be hundreds
of files in the archive log destination directory over time. Nothing
is done by Oracle to manage this area, so a cron job must be
set up to clean this directory. As long as these files are being backed
up to some kind of backup media, they can be removed after a few days.
However, the more logs there are on disk, the better off the database
will be. That is because it sometimes may be necessary to restore
from a backup that is not the most current one. (For example, this
could happen if the current backup volume is damaged.) If all the
archive logs since the time that old backup was taken are online,
it's a breeze. If they aren't, they have to be restored
as well. That can create an available-space problem, which is why
I recommend having enough space to store enough archive logs to span
two backup cycles. For example, if the system does a full database
backup once a night, there should be enough space to have at least
two days' worth of redologs online. If it backs up once a week,
then there should be enough storage for two weeks' worth of transaction
logs. (This is yet another reason for backing up every night.)
In summary, the online redologs are usually three or more log
groups that Oracle cycles through to write the current transaction
log data. A log group is a set of one or more logs that Oracle treats
as one redolog. (Oracle always uses the term "log groups",
even if a log group has only one member.) Log groups should have
more than one member, since that minimizes the chance for data corruption
in the case of disk failure. Once Oracle fills up one online redolog
group, it copies that redolog to the archive log destination as
a separate file with a sequence number contained in the filename.
It makes this copy only if automatic archiving is enabled.
As you can see, there are many parts to Oracle's architecture,
each of which is integral in the backup and recovery process. Hopefully
this column has helped you to understand these pieces of architecture
a little bit better.
W. Curtis Preston has specialized in storage for over eight
years, and has designed and implemented storage systems for several
Fortune 100 companies. He is the owner of Storage Designs, the Webmaster
of Backup Central (http://www.backupcentral.com), and the
author of two books on storage. He may be reached at curtis@backupcentral.com.
(Portions of some articles may be excerpted from Curtis's books.)
|