Oracle Database Backup
Curtis Preston
Trying to back up today's databases is like trying to
catch a charging
rhino with a butterfly net. Real world databases have
become screaming
behemoths that are very difficult to capture (back up),
and standard
UNIX utilities are like butterfly nets. If you try to
use these
utilities for the job, not only will you not capture
the data -- you'll
break your arm! This article is my attempt to build
a better rhino trap.
It uses many advanced shell scripting features, so you
may find it
useful even if you do not have Oracle.
Hot or Cold Backups?
One of the first decisions a database administrator
must make is whether
to perform "cold" or "hot" backups.
A cold backup involves:
1. Shutting down the instance
2. Backing up its filesystems or raw devices
3. Starting up the instance
Here are the steps to perform a hot backup. (Note that
a hot backup
requires much more knowledge of Oracle's internal structure,
so a basic
definition of helpful terms is given in the sidebar
"Definitions.")
First, you must determine the name of each tablespace
in the instance
and the files that are within it. Then, for each tablespace,
you:
1. Put that tablespace in backup mode;
2. Backup that tablespace's files or raw partitions (using cp, tar, dd, etc.);
3. Take that tablespace out of backup mode.
Once all tablespaces are done:
1. Switch redolog files; and
2. Backup the control file.
It is much easier to perform cold backups, so why do
hot backups at all?
Simply stated, they increase your backup window, or
the time in which
you are allowed to perform backups. Because cold backups
require a
shutdown of the database, your users will probably allow
only a very
small backup window. Hot backups allow you to back up
without shutting
down, remaining virtually invisible to the user, and
thereby
significantly enlarging your window. I have even performed
hot backups
of production data during primetime hours. Try that
with a cold backup!
Be Careful
If you are going to use hot backups, though, you must
follow certain
precautions. Failure to do so could mean you would only
be able to
restore up to your last cold backup -- if you made one.
During a cold backup, all data files have been synchronized.
A restore
simply requires copying all files into place and restarting
the
database. However, during a hot backup, each file is
backed up at a
different time; thus each has a different time stamp.
You must have some
method of syncing all of those files to the same point
during a restore.
That is what Oracle's Redo Logs, (also known as archiving,
or Archive
Logs) are for. Archiving continually records all database
changes in
what is called the On-line Redo log. Once that log is
full, it is copied
to an Archived Redo log. (For more information, see
the sidebar "Should
I Use Archiving?")
When restoring from a hot backup, you first replace
the damaged database
files with their respective backup copies. (You only
need to replace the
damaged files.) You then "play" the redo logs,
effectively "redoing" all
the transactions that affected those files. All files
are then
synchronized, and the database can be brought on line.
With all the logs in place, this restore can be done
with one command.
However, if just one redolog is lost, your restore could
range from
extremely difficult to impossible. If you do not have
the set of redo
logs that span the time of your backup, you will not
recover at all.
This is why you must treat your redo logs as if they
were gold. Some of
the methods of protecting them are:
Mirroring the filesystem on which they are stored
Storing them on an NFS filesystem mounted from another
server (can
result in a performance loss)
Performing hourly incremental backups (to tape or
disk) of that
filesystem
It is not uncommon, or unwise, to use all of the methods
listed above to
protect your redo logs.
A Bird's Eye View
To automate Oracle backups, you must know the location
of tablespaces,
raw partitions, data files, control files, redolog files,
and whether
archiving is running. To make things more difficult,
each installation
will have different answers to these questions. Oraback.sh
(Listing 1)
addresses all of the problems above by obtaining most
of its information
from Oracle. The script will do the following for each
instance:
Use sqldba to obtain configuration information
Perform a cold or hot backup (as described above),
based on the day of
the week
Make a backup copy of essential redologs
Oraback.sh supports these features:
Backups to disk or tape
Backups of raw partitions
Multitasking, which can reduce backup time by up to
75%
Mail-based error notification
The script assumes that all instances want hot backups,
which require
redologs. Oraback.sh will warn you of any instances
that do not have
logs. It will continue to warn you until you correct
the problem or
customize your setup for that instance. Customization
is done with
oraback.conf (Listing 2).
Looking in the Backup Window
Now that I've explained the overall purpose of oraback.sh,
here is a
description of each step within the script. Oraback.sh
begins by
checking oraback.conf to see whether: the user running
the backup is
Oracle, or a user listed in field 7. It then looks to
see whether the
word "skip" is in field 2. If so, it skips
the backup once, and removes
the word from oraback.conf. The script then does one
of the following,
depending on which arguments, if any, it received:
$ oraback.sh at -- If called with at as an argument
(you would usually do
this with cron), the script checks oraback.conf to see
whether the day
in field 3 (the cold backup day) is today. If so, it
will set the
variable $TIME to the time in field 4. If not, it will
set it to the
time in field 5. The script then schedules an at job
that will run
oraback.sh at the time specified by $TIME.
$ oraback.sh [$SID1 $SID2...] -- If given one or more
instance names
(ORACLE_SID's, or SID's for short) as arguments, the
script will perform
a backup of each of them. If given no arguments, it
performs a backup of
all SID's listed in Oracle's oratab file. (The latter
is what the at job
will do.)
Oraback.sh first checks the $HOST.master line in oraback.conf
to find
out if: (a) there is a tape device in field 6. If so,
it labels the
tape, making sure the no-rewind device is being used;
and (b) there is a
number in field 8. If so, it will perform that many
simultaneous copies
of database files.
Next, oraback.sh asks Oracle questions that will determine
how (or if)
the SID will be backed up. It asks:
if the instance is on-line; if so, oraback.sh makes
a set of
configuration files needed later
if archiving is running
if the instance is excluded from hot backups
if this is a cold backup; if so, it shuts down the
instance
Hot Backup
If the instance is online, logs are on, and the instance
is not excluded
from hot backups, the script will put each tablespace
into backup mode
by using the sqldba command begin backup. The script
then copies that
tablespace's files to the backup device. When the copying
is done, it
takes the tablespace out of backup mode with the sqldba
command end
backup.
These three steps are the core of the hot backup. While
the tablespace
is in backup mode, the files will not change. Any changes
made to the
tablespace while it is in backup mode are sent to the
redo logs, and are
applied once it is taken out of backup mode. This is
all done
automatically by Oracle.
The script supports simultaneous copying of individual
data files to
dramatically increase the speed of the backup. Depending
on how many
files there are per tablespace, there may be one or
more tablespaces in
backup mode at once. This will be done in order to perform
the number of
concurrent file copies (using dd) that are specified
in field 8 of
oraback.conf, if you are using this feature.
Next, the script forces a checkpoint and archive log
switch, which
causes the online redo log to be sent to an archived
redo log. It then
backs up the control files to disk using both sqldba
and a manual copy.
Finally, the script makes copies of the essential redo
logs and
compresses them. (Essential logs are those that span
the time of your
backup.) If the backup device is a tape, it then backs
up all files that
were sent to disk to the tape device.
Cold Backup
If the instance is off-line, oraback.sh only needs to
copy the datafiles
to the backup device. The script then backs up the control
files and
redo logs the same way the hot backup does.
Special Cases
If the instance is online, but archiving is off, it
checks oraback.conf
for a line that reads:
$HOST:$SID:NOARCHIVELOG
If that line is not found, oraback.sh complains and
sends you mail. If
the line is found, it looks for the word "offline"
on that same line. If
"offline" is there, oraback.sh will shut down
the instance, perform a
cold backup, and then restart the instance. (This is
one of the ways you
can customize oraback.sh for different instances.)
If the instance is online, oraback.sh also looks for
a line in
oraback.conf that reads
$HOST:$SID:::nohotbackup
If that word is found, the script will skip this instance
when
performing a hot backup.
In summary, oraback.sh can back up from one to all of
the instances in
oratab. To back up all instances, run it with no arguments.
To back up
one or more, run it with those SID's as arguments. To
schedule a backup
for the time in oraback.conf, run it with the at argument.
Installing Oraback.sh
Check the following values in the site-specific section
at the top of
the script. They must be changed to fit your site.
a. BINDIR must be set to where you install oraback.sh.
b. ORATAB must be set to the name and location of Oracle's
oratab file.
(It must be located in a directory that Oracle can write
to.)
c. ORACONF must be set to the name and location of your
oraback.conf
file.
You should also review and confirm all other preference
variables in the
site-specific section, including ORADIR, TMP, LOG, and
PATH.
You must also verify that:
a. You know which instances are using archiving
b. There is a valid archive_dest in each instances's
config.ora (not the
default /?/dbs/arch)
c. Oracle can create directories in the filesystem that
you are using
for backups
Customizing Your Backups
Oraback.conf is the main configuration file for oraback.sh.
If you want
to use the at argument, you must have the following
line in that file:
$HOST.master::Sun:1200:1730:::5 :/dmp::Mail -s Error_oracle:
These and other fields are described below:
$HOST=Hostname generated by uname -n
Field 2: If you want all backups skipped on this host
tonight, put
"skip" here
Field 3: The day you want to do cold backups. This
can be a day of the
week (Fri) or month (03), or blank if you do not want
cold backups.
Field 4: The time to do cold backups
Field 5: The time to do hot backups
Field 6: No-rewind tape device, if you want to backup
to tape. (Blank
for disk-only backup)
Field 7: A "|" separated list of userid's
permitted to run sqldba,
such as "oracle|dba" (Blank allows only Oracle)
Field 8: The number of simultaneous copies you want
to run. (Blank=1)
Field 9: Must be set to a filesystem that has enough
room for a
compressed copy of all database files. (If you are backing
up to tape,
you still need some space on disk.)
Field 10: A "Y" means to compress the files
before they are written to
tape or disk
Field 11: Set this to a complete mail command. (e.g.,
/usr/ucb/Mail -s
Backup_Error oracle, dba@herworkstation.com)
There are four predefined functions, unused by default,
which are listed
in the site-specific section of oraback.sh. They are
listed below, with
their respective usage times:
Preback: Runs before entire backup starts
Preshut: Runs before shutdown of each instance
Poststart:Runs after startup of each instance
Postback: Runs after entire backup finishes
If you wanted to restart orasrv, for instance, after
an instance is
restarted (with a script called rc.orasrv.sh,) you would
change the
Poststart function to read:
Poststart(){
rc.orasrv.sh
}
Customizing Backups of an Instance
If you have an instance that will not run archiving
or hot backups, you
need a line in oraback.conf that starts with:
$HOST:$SID::::
Put "NOARCHIVELOG" in field 3 if the instance
does not use archiving
Put "offline" in field 4 if the instance
does not use archiving, and
you want a cold backup done every time
Put "nohotbackups" in field 5 to skip hot
backups for this instance
Testing the Backup
For testing, select a small instance and run the following
command as
Oracle:
$ /usr/local/bin/oraback.sh <instance>
Be aware that if the word "offline," is in
oraback.conf, or if the day
that you specified for cold backups is today, the instance
will be shut
down!
If this test functions properly, you can put an entry
in Oracle's cron.
Call this command:
/usr/local/bin/oraback.sh at
at around 1200. It will check the oraback.conf and schedule
an at job
that will perform the backup.
Testing the Restore
Never accept that a new backup strategy works until
you have fully
tested it under all circumstances! Make sure you test
all of the
following until you are comfortable:
Complete restore from a cold backup
Complete restore from a hot backup
Partial restore from a hot backup, restoring only
a few of your data
files.
Simulate the loss of a redo log, following the instructions
in the
Oracle manual for this scenario. This one is extremely
painstaking!
Restore from an older backup, using the redo logs
to roll forward to
today
Once your restore test is successful, you will have
caught and tamed the
rhino! This script represents more than three years
of work, and its
development is a continual process. I hope you find
it useful.
About the Author
Curtis Preston is a consultant for Pencom Systems Administration,
and
has been doing Unix Systems Administration for 3 years.
He specializes
in automating backups of heterogeneous networks. He
may be reached at
(312) 781-1111 or curtis@pencom.com.
|